Anime Landscape Text-to-Video Generation

This repository contains the steps and scripts needed to generate anime-style videos with the Anime_Landscape text-to-video model, which applies LoRA (Low-Rank Adaptation) weights to a Wan 2.1 base model. Given a text prompt, it produces videos in an anime landscape style.

Prerequisites

Before proceeding, ensure that you have the following installed on your system:

• Ubuntu (or a compatible Linux distribution)
• Python 3.x
• pip (Python package manager)
• Git
• Git LFS (Git Large File Storage)
• FFmpeg

Installation

Update and Install Dependencies

sudo apt-get update && sudo apt-get install cbm git-lfs ffmpeg

Clone the Repository

git clone https://huggingface.co/svjack/Anime_Landscape_wan_2_1_14_B_text2video_lora
cd Anime_Landscape_wan_2_1_14_B_text2video_lora

Install Python Dependencies

pip install torch torchvision
pip install -r requirements.txt
pip install ascii-magic matplotlib tensorboard huggingface_hub datasets
pip install moviepy==1.0.3
pip install sageattention==1.0.6
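
Two of the packages above are pinned to exact versions. As a minimal sketch, the following compares the installed versions against those pins. The helper name `check_pins` is hypothetical and not part of this repository.

```python
from importlib import metadata

# Version pins copied from the install commands above.
PINNED = {"moviepy": "1.0.3", "sageattention": "1.0.6"}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    problems = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            problems[name] = (expected, installed)
    return problems


if __name__ == "__main__":
    for name, (expected, installed) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means both pins are satisfied in the current environment.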

Download Model Weights

wget https://huggingface.co/Wan-AI/Wan2.1-T2V-14B/resolve/main/models_t5_umt5-xxl-enc-bf16.pth
wget https://huggingface.co/DeepBeepMeep/Wan2.1/resolve/main/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth
wget https://huggingface.co/Wan-AI/Wan2.1-T2V-14B/resolve/main/Wan2.1_VAE.pth
wget https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_t2v_1.3B_bf16.safetensors
wget https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_t2v_14B_bf16.safetensors
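
The wget URLs above all follow the same `resolve/main` pattern. The sketch below rebuilds that pattern for the three files that the 14B commands in the Usage section reference. The CLIP and 1.3B files are downloaded above but do not appear in those commands. The helper names `resolve_url` and `local_name` are hypothetical.

```python
def resolve_url(repo_id: str, path: str, revision: str = "main") -> str:
    """Build a direct-download URL in the form used by the wget commands above."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{path}"


def local_name(path: str) -> str:
    """File name wget writes to the current directory (last path component)."""
    return path.rsplit("/", 1)[-1]


# (repository, path inside the repository), copied from the commands above.
WEIGHTS = [
    ("Wan-AI/Wan2.1-T2V-14B", "models_t5_umt5-xxl-enc-bf16.pth"),
    ("Wan-AI/Wan2.1-T2V-14B", "Wan2.1_VAE.pth"),
    (
        "Comfy-Org/Wan_2.1_ComfyUI_repackaged",
        "split_files/diffusion_models/wan2.1_t2v_14B_bf16.safetensors",
    ),
]

if __name__ == "__main__":
    for repo_id, path in WEIGHTS:
        print(local_name(path), "<-", resolve_url(repo_id, path))
```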

Usage

To generate a video, use the wan_generate_video.py script with the appropriate parameters.

Interactive Mode

For experimenting with different prompts:

python wan_generate_video.py --fp8 --task t2v-14B --video_size 480 832 --video_length 81 --infer_steps 35 \
--save_path save --output_type both \
--dit wan2.1_t2v_14B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight ani_landscape_w14_outputs/ani_landscape_w14_lora-step00006000.safetensors \
--lora_multiplier 1.0 \
--interactive
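
When the same command is run repeatedly with different LoRA files or frame counts, it can help to assemble it programmatically. The sketch below is only a convenience wrapper: the function `build_command` is hypothetical, and every flag and value in it is copied from the command above.

```python
import shlex


def build_command(lora_weights, lora_multipliers, video_length=81, infer_steps=35):
    """Return the argument list for the interactive command shown above."""
    if len(lora_weights) != len(lora_multipliers):
        raise ValueError("give exactly one multiplier per LoRA weight file")
    return [
        "python", "wan_generate_video.py", "--fp8",
        "--task", "t2v-14B",
        "--video_size", "480", "832",
        "--video_length", str(video_length),
        "--infer_steps", str(infer_steps),
        "--save_path", "save", "--output_type", "both",
        "--dit", "wan2.1_t2v_14B_bf16.safetensors",
        "--vae", "Wan2.1_VAE.pth",
        "--t5", "models_t5_umt5-xxl-enc-bf16.pth",
        "--attn_mode", "torch",
        "--lora_weight", *lora_weights,
        "--lora_multiplier", *[str(m) for m in lora_multipliers],
        "--interactive",
    ]


if __name__ == "__main__":
    cmd = build_command(
        ["ani_landscape_w14_outputs/ani_landscape_w14_lora-step00006000.safetensors"],
        [1.0],
    )
    # Print the shell-quoted command; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would launch the script without going through a shell, which avoids quoting problems in file paths.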

Sunrise Horizon

In the style of anime landscape ,The video begins with a view of Earth from space, showcasing the planet's curvature against the blackness of space. The sun is just beginning to rise over the horizon, casting a bright orange glow that gradually illuminates the atmosphere and the surface of the planet. As the video progresses, the light intensifies, revealing more details of the Earth's surface, including landmasses and cloud formations. The colors transition from dark blues and blacks to vibrant oranges and yellows as the sun rises higher. The video captures the dynamic interplay of light and shadow on the Earth's surface, highlighting the natural beauty of our planet as seen from space.

Milky Way

In the style of anime landscape , As the scene progresses, the Milky Way band rotates slowly across the dome, nebula particles flow like veils, with occasional meteors streaking across the dark blue skyline.

Purple Moon

In the style of anime landscape , As the scene progresses, a pink lunar disc gradually emerges from pitch-black night, with violet clouds rolling like waves, the unearthly moonlight casting shifting halos across the drifting cloud sea.

Mountains and Sea

In the style of anime landscape , As the scene progresses, molten-hued clouds surge between mountain silhouettes, sunbeams dynamically penetrate cloud gaps, with snow peaks reflecting liquid gold highlights.

Sea of Clouds

In the style of anime landscape ,The video presents a series of images depicting a sunrise or sunset over a mountainous landscape. The sky is filled with clouds that are illuminated by the warm hues of the sun, creating a gradient of colors ranging from deep oranges and reds to soft pinks and purples. The sun itself is partially obscured by the clouds, casting a bright glow that reflects off the peaks of the mountains. The mountains appear rugged and dark against the vibrant backdrop of the sky. There is no visible movement in the video, suggesting a stillness in the scene.
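
Every example prompt above starts with the same trigger phrase. As a small sketch, the helper below prepends it to a scene description when it is missing. The name `with_trigger` is hypothetical, and keeping the exact spacing seen in the examples (a space before the comma) is an assumption about what the LoRA expects, not something the repository states.

```python
# Trigger phrase copied from the example prompts, including their spacing.
TRIGGER = "In the style of anime landscape ,"


def with_trigger(description: str) -> str:
    """Prefix a scene description with the trigger phrase unless it already has it."""
    description = description.strip()
    if description.lower().startswith(TRIGGER.lower().rstrip(" ,")):
        return description  # already starts with the trigger phrase
    return f"{TRIGGER} {description}"


if __name__ == "__main__":
    print(with_trigger("a misty valley at dawn, fog drifting between pine trees"))
```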

Using Wan 14B T2V

Using Wan FusionX 14B

Character 1 (一条芝士)

wget https://huggingface.co/svjack/Xiang_Lookalike_wan_2_1_14_B_text2video_lora/resolve/main/XiangLooklike_w14_outputs/XiangLooklike_w14_lora-000005.safetensors

https://huggingface.co/svjack/Xiang_Lookalike_wan_2_1_14_B_text2video_lora

Interactive Mode

For experimenting with different prompts:

python wan_generate_video.py --fp8 --task t2v-14B --video_size 480 832 --video_length 45 --infer_steps 35 \
--save_path save --output_type both \
--dit wan2.1_t2v_14B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight XiangLooklike_w14_lora-000005.safetensors ani_landscape_w14_outputs/ani_landscape_w14_lora-step00006000.safetensors \
--lora_multiplier 0.8 0.2 \
--interactive

In the style of anime landscape ,一个年轻的男子在喝奶。
In the style of anime landscape ,一个年轻男子在舔棒棒糖。
In the style of anime landscape ,一个年轻的男子赤裸全身站在镜头前，正在吃冰淇凌。

Character 2 (王翔, Wang Xiang)

wget https://huggingface.co/svjack/Xiang_Handsome_wan_2_1_14_B_text2video_lora/resolve/main/Xiang_Handsome_outputs/Xiang_Handsome_w14_lora-000067.safetensors

https://huggingface.co/svjack/Xiang_Handsome_wan_2_1_14_B_text2video_lora

Interactive Mode

For experimenting with different prompts:

python wan_generate_video.py --fp8 --task t2i-14B --video_size 480 832 --infer_steps 35 \
--save_path save --output_type video \
--dit wan2.1_t2v_14B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Xiang_Handsome_w14_lora-000067.safetensors ani_landscape_w14_outputs/ani_landscape_w14_lora-step00006000.safetensors \
--lora_multiplier 0.7 0.3 \
--interactive

In the style of anime landscape ,一个戴眼镜的年轻的男子在喝奶。
In the style of anime landscape ,一个戴眼镜的年轻男子在舔棒棒糖。
In the style of anime landscape ,一个戴眼镜的年轻的男子赤裸全身站在镜头前，正在吃冰淇凌。

Key Parameters

--fp8: Enable FP8 precision (recommended)
--task: Model variant (t2v-14B in the main example above)
--video_size: Output resolution (e.g., 480 832)
--video_length: Number of frames (typically 81)
--infer_steps: Number of inference steps; more steps trade speed for quality (35-50)
--lora_weight: Path to one or more LoRA weight files (here, the Anime Landscape LoRA)
--lora_multiplier: Strength of each LoRA, one value per file given to --lora_weight (1.0 recommended for a single LoRA)
--prompt: Should include "In the style of anime landscape" for best results
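
To illustrate what a LoRA multiplier does, the sketch below merges low-rank updates into a weight matrix as W + sum of m_i * (B_i A_i). This is the general LoRA idea in plain Python, not the code used by wan_generate_video.py, and it leaves out the alpha/rank scaling that training tools usually also apply. The function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_loras(weight, loras, multipliers):
    """Return weight + sum_i multipliers[i] * (up_i @ down_i).

    `loras` is a list of (up, down) pairs; up has shape (out, rank) and
    down has shape (rank, in), so their product matches `weight`.
    """
    if len(loras) != len(multipliers):
        raise ValueError("give exactly one multiplier per LoRA")
    merged = [row[:] for row in weight]  # copy so the base weight is untouched
    for (up, down), m in zip(loras, multipliers):
        delta = matmul(up, down)
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += m * value
    return merged


if __name__ == "__main__":
    base = [[1.0, 0.0], [0.0, 1.0]]
    lora = ([[1.0], [0.0]], [[0.0, 2.0]])  # rank-1 update
    print(apply_loras(base, [lora], [0.5]))
```

A multiplier of 0 leaves the base weight unchanged, and pairs such as 0.8 and 0.2 weight two LoRAs against each other in the same sum.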

Style Characteristics

For optimal results, prompts should describe:

Natural or celestial scenery (mountains, cloud seas, the Milky Way, Earth from space)
Lighting conditions such as sunrise, sunset, or moonlight
Color palettes and how they shift over the course of the clip
Gradual motion, such as drifting clouds, a slowly rotating sky, or streaking meteors

Output

Generated videos and frames are saved in the directory given by --save_path as:

MP4 video file
Individual frames as PNG images
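
The length of the resulting MP4 follows from --video_length and the frame rate. The sketch below works this out under two assumptions that this README does not state: that output is written at 16 frames per second, and that valid frame counts have the form 4n + 1. Both match the 81- and 45-frame values used above, but check them against the script before relying on them.

```python
ASSUMED_FPS = 16.0  # assumption: not stated in this README


def is_valid_length(frames: int) -> bool:
    """True if `frames` has the assumed 4n + 1 form (1, 5, ..., 45, ..., 81)."""
    return frames >= 1 and (frames - 1) % 4 == 0


def duration_seconds(frames: int, fps: float = ASSUMED_FPS) -> float:
    """Clip duration in seconds for a given frame count."""
    return frames / fps


if __name__ == "__main__":
    for frames in (45, 81):
        print(frames, is_valid_length(frames), f"{duration_seconds(frames):.2f} s")
```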

Troubleshooting

• Verify all model weights are correctly downloaded
• Ensure sufficient GPU memory (>=12GB recommended)
• Check for version conflicts in Python packages
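
As a quick way to act on the first point, the sketch below lists the files named in the main 14B command that are missing or empty in a given directory. It checks only presence and size, not checksums, and the helper name `missing_or_empty` is hypothetical.

```python
from pathlib import Path

# File names copied from the main 14B command in the Usage section.
REQUIRED = [
    "wan2.1_t2v_14B_bf16.safetensors",
    "Wan2.1_VAE.pth",
    "models_t5_umt5-xxl-enc-bf16.pth",
    "ani_landscape_w14_outputs/ani_landscape_w14_lora-step00006000.safetensors",
]


def missing_or_empty(root, names=REQUIRED):
    """Return the names under `root` that are absent or have zero size."""
    root = Path(root)
    problems = []
    for name in names:
        path = root / name
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(name)
    return problems


if __name__ == "__main__":
    problems = missing_or_empty(".")
    print("all files present" if not problems else f"check these files: {problems}")
```

A truncated download can still pass this check, so a file that loads with errors is worth downloading again.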

License

This project is licensed under the MIT License.

Acknowledgments

• Hugging Face for model hosting
• Wan-AI for base models
• svjack for LoRA adaptation

For support, please open an issue in the repository.