Xiang_Handsome Text-to-Video Generation
This repository contains the necessary steps and scripts to generate videos using the Xiang_Handsome text-to-video model. The model leverages LoRA (Low-Rank Adaptation) weights and pre-trained components to create high-quality anime-style videos based on textual prompts.
Prerequisites
Before proceeding, ensure that you have the following installed on your system:
• Ubuntu (or a compatible Linux distribution)
• Python 3.x
• pip (Python package manager)
• Git
• Git LFS (Git Large File Storage)
• FFmpeg
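A quick way to confirm these tools are on your PATH before installing anything is sketched below. The helper name missing_tools is hypothetical, and the binary names (python3, pip, git-lfs) are assumptions about how the tools are installed; on some systems pip is only available as pip3.

```shell
# Print the name of every listed command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tool names follow the prerequisites list; the exact binary names are assumptions.
missing=$(missing_tools python3 pip git git-lfs ffmpeg)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing prerequisites:" $missing
else
    echo "All prerequisites found."
fi
```

Anything reported as missing can be installed with the package manager before continuing.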
Installation
Update and Install Dependencies
sudo apt-get update && sudo apt-get install cbm git-lfs ffmpeg
Clone the Repository
git clone https://huggingface.co/svjack/Xiang_Handsome_wan_2_1_1_3_B_text2video_lora
cd Xiang_Handsome_wan_2_1_1_3_B_text2video_lora
Install Python Dependencies
pip install torch torchvision
pip install -r requirements.txt
pip install ascii-magic matplotlib tensorboard huggingface_hub datasets
pip install moviepy==1.0.3
pip install sageattention==1.0.6
Download Model Weights
wget https://huggingface.co/Wan-AI/Wan2.1-T2V-14B/resolve/main/models_t5_umt5-xxl-enc-bf16.pth
wget https://huggingface.co/DeepBeepMeep/Wan2.1/resolve/main/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth
wget https://huggingface.co/Wan-AI/Wan2.1-T2V-14B/resolve/main/Wan2.1_VAE.pth
wget https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_t2v_1.3B_bf16.safetensors
wget https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_t2v_14B_bf16.safetensors
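Interrupted downloads can leave missing or empty files behind, so it may help to check the results before generating anything. The sketch below assumes the files were downloaded into the current directory; the helper name missing_weights is hypothetical, and the file names are taken from the URLs above.

```shell
# Print every expected weight file that is missing or empty in the given directory.
missing_weights() {
    dir=$1
    shift
    for name in "$@"; do
        [ -s "$dir/$name" ] || printf '%s\n' "$name"
    done
}

# Assumes the downloads landed in the current directory; prints nothing if all are present.
missing_weights . \
    models_t5_umt5-xxl-enc-bf16.pth \
    models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth \
    Wan2.1_VAE.pth \
    wan2.1_t2v_1.3B_bf16.safetensors \
    wan2.1_t2v_14B_bf16.safetensors
```

This only checks that each file exists and is non-empty; it does not verify checksums.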
Usage
To generate a video, run the wan_generate_video.py script with the appropriate parameters. The prompts below are examples for the Xiang_Handsome model, written as a sequence of steps.
Sequential Steps
[1] In the style of Xiang InfiniteYou Handsome , Xiang, a young person with short, black hair and glasses, Wear black and white plaid clothes and jeans on desktop. facing the left camera
[2] In the style of Xiang InfiniteYou Handsome , Xiang, a young person with short, black hair and glasses, Wear black and white plaid clothes and jeans on desktop. open a book and take notes.
[3] In the style of Xiang InfiniteYou Handsome , Xiang, a young person with short, black hair and glasses, Wear black and white plaid clothes and jeans on desktop. Take off his glasses and wipe them with a glasses cloth
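The three prompts above could be run one after another with a small wrapper, sketched here. The flag names come from the Parameters section of this README; every value (resolution, frame count, step count, LoRA multiplier, weight paths, the LoRA file name, and the output directories) is a placeholder assumption, and the wrapper name generate is hypothetical. By default the sketch only prints each command; set DRY_RUN=0 to actually run it.

```shell
# Placeholder values (assumptions): point these at your own files.
PYTHON="${PYTHON:-python3}"
DIT="wan2.1_t2v_1.3B_bf16.safetensors"
VAE="Wan2.1_VAE.pth"
T5="models_t5_umt5-xxl-enc-bf16.pth"
LORA="path/to/lora_weights.safetensors"   # hypothetical file name

generate() {
    # $1 = text prompt, $2 = output directory.
    set -- "$PYTHON" wan_generate_video.py \
        --fp8 --task t2v-1.3B \
        --video_size 1024 1024 --video_length 81 --infer_steps 20 \
        --save_path "$2" --output_type both \
        --dit "$DIT" --vae "$VAE" --t5 "$T5" \
        --attn_mode torch \
        --lora_weight "$LORA" --lora_multiplier 1.0 \
        --prompt "$1"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"      # dry run: show the command only
    else
        "$@" </dev/null         # keep the script from consuming the prompt list on stdin
    fi
}

# Run the prompts in order, one output directory per step.
step=1
while IFS= read -r prompt; do
    generate "$prompt" "save/step_$step"
    step=$((step + 1))
done <<'EOF'
In the style of Xiang InfiniteYou Handsome , Xiang, a young person with short, black hair and glasses, Wear black and white plaid clothes and jeans on desktop. facing the left camera
In the style of Xiang InfiniteYou Handsome , Xiang, a young person with short, black hair and glasses, Wear black and white plaid clothes and jeans on desktop. open a book and take notes.
In the style of Xiang InfiniteYou Handsome , Xiang, a young person with short, black hair and glasses, Wear black and white plaid clothes and jeans on desktop. Take off his glasses and wipe them with a glasses cloth
EOF
```

Keeping one output directory per step makes it easier to tell the clips apart afterwards.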
Parameters
• --fp8: Enable FP8 precision (optional).
• --task: Specify the task (e.g., t2v-1.3B).
• --video_size: Set the resolution of the generated video (e.g., 1024 1024).
• --video_length: Define the length of the video in frames.
• --infer_steps: Number of inference steps.
• --save_path: Directory to save the generated video.
• --output_type: Output type (e.g., both for video and frames).
• --dit: Path to the diffusion model weights.
• --vae: Path to the VAE model weights.
• --t5: Path to the T5 model weights.
• --attn_mode: Attention mode (e.g., torch).
• --lora_weight: Path to the LoRA weights.
• --lora_multiplier: Multiplier for LoRA weights.
• --prompt: Textual prompt for video generation.
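Because --video_length is counted in frames, the duration of a clip depends on the frame rate of the output. The conversion is simply frames divided by frames per second, as in this sketch. The helper name frames_to_seconds is hypothetical, and the 16 fps figure is an assumption rather than a value stated in this README, so check the frame rate of your own output.

```shell
# Convert a frame count to seconds: duration = frames / fps.
frames_to_seconds() {
    awk -v frames="$1" -v fps="$2" 'BEGIN { printf "%.2f\n", frames / fps }'
}

# 16 fps is an assumed frame rate, not a value taken from this README.
frames_to_seconds 81 16
```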
Output
The generated video and frames will be saved in the specified save_path directory.
Troubleshooting
• Ensure all dependencies are correctly installed.
• Verify that the model weights are downloaded and placed in the correct locations.
• Check for any missing Python packages and install them using pip.
License
This project is licensed under the MIT License. See the LICENSE file for details.
Acknowledgments
• Hugging Face for hosting the model weights.
• Wan-AI for providing the pre-trained models.
• DeepBeepMeep for contributing to the model weights.
Contact
For any questions or issues, please open an issue on the repository or contact the maintainer.