ALGOTECH committed on
Commit c64d752 · verified · 1 Parent(s): b78b09e

Update README.md

Files changed (1)
  1. README.md +66 -7
README.md CHANGED
@@ -1,14 +1,73 @@
  ---
- base_model:
- - Wan-AI/Wan2.1-VACE-14B
  ---
- Combined and quantized models for WanVideo, originating from here:
 
- https://huggingface.co/Wan-AI/
 
- Can be used with: https://github.com/kijai/ComfyUI-WanVideoWrapper and ComfyUI native WanVideo nodes.
 
  ---
- clip_vision_h: https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/clip_vision
 
- FantasyTalking: https://github.com/Fantasy-AMAP/fantasy-talking
+ ### 🚀 WanVideo Model Suite
+ **Combined & Quantized Models for ComfyUI Workflows**
+ *Derived from `Wan-AI/Wan2.1-VACE-14B`*
+
+ ---
+
+ ## 📋 Overview
+ This repository provides combined and quantized models for WanVideo, the video generation models published by [Wan-AI](https://huggingface.co/Wan-AI/). Quantization lowers memory requirements while aiming to preserve visual quality. The models are intended for use in ComfyUI via:
+ - **[WanVideo Wrapper](https://github.com/kijai/ComfyUI-WanVideoWrapper)** (third-party extension)
+ - Native **WanVideo nodes** in ComfyUI
+
+ ---
+
+ ## 🔧 Key Components
+
+ ### 1. **Core Diffusion Models**
+ | File | Precision | Description |
+ |------|-----------|-------------|
+ | `wan2.1_i2v_720p_14B_fp8_e4m3fn.safetensors` | FP8 (e4m3fn) | Image-to-video generation model (14B parameters, 720p). |
+ | `fantasytalking_fp16.safetensors` | FP16 | Specialized model for expressive dialogue animation. |
+
+ ### 2. **Text & Vision Encoders**
+ | File | Type | Role |
+ |------|------|------|
+ | `umt5-xxl-enc-bf16.safetensors` | Text encoder (UMT5-XXL, BF16) | Encodes multilingual text prompts. |
+ | `clip_vision_h.safetensors` | Vision encoder | Encodes input images for conditional generation. |
+
  ---
+
+ ## 📁 ComfyUI Setup Guide
+ Place the files in these directories within your ComfyUI installation:
+ ```text
+ models/
+ ├── diffusion_models/
+ │   ├── wan2.1_i2v_720p_14B_fp8_e4m3fn.safetensors
+ │   └── fantasytalking_fp16.safetensors
+ ├── clip_vision/
+ │   └── clip_vision_h.safetensors
+ └── text_encoders/
+     └── umt5-xxl-enc-bf16.safetensors
+ ```
+
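
The directory layout in the Setup Guide can be checked mechanically before launching ComfyUI. The following is a minimal sketch, not part of the commit: the helper name `find_missing_models` and the `EXPECTED_FILES` table are illustrative assumptions, and the file names are copied from the tree in the README.

```python
from pathlib import Path

# Expected layout, copied from the directory tree in the Setup Guide.
# The dictionary itself is an illustrative assumption, not a ComfyUI API.
EXPECTED_FILES = {
    "diffusion_models": [
        "wan2.1_i2v_720p_14B_fp8_e4m3fn.safetensors",
        "fantasytalking_fp16.safetensors",
    ],
    "clip_vision": ["clip_vision_h.safetensors"],
    "text_encoders": ["umt5-xxl-enc-bf16.safetensors"],
}


def find_missing_models(comfyui_root):
    """Return 'subdir/filename' entries that are absent under <root>/models."""
    models_dir = Path(comfyui_root) / "models"
    missing = []
    for subdir, filenames in EXPECTED_FILES.items():
        for name in filenames:
            if not (models_dir / subdir / name).is_file():
                missing.append(f"{subdir}/{name}")
    return missing
```

Calling `find_missing_models("/path/to/ComfyUI")` returns an empty list when all four files are in place; any entries it returns name the files still to be downloaded or moved.
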
  ---
 
+ ## 🔗 Dependencies & Resources
+ 1. **Vision Encoder Resources**
+    - Download `clip_vision_h.safetensors` from:
+      [Comfy-Org/Wan_2.1_ComfyUI_repackaged](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/clip_vision)
+
+ 2. **FantasyTalking Model**
+    - Source code & usage: [GitHub Repository](https://github.com/Fantasy-AMAP/fantasy-talking)
+
+ 3. **Base Model**
+    - Full-precision version: [Wan-AI/Wan2.1-VACE-14B](https://huggingface.co/Wan-AI/Wan2.1-VACE-14B)
+
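
The vision encoder download listed under Dependencies & Resources can also be scripted. This is a hedged sketch using `hf_hub_download` from the `huggingface_hub` package. The in-repo path `FILE_IN_REPO` is an assumption assembled from the linked folder and the file name given in the README, and the helper names are hypothetical.

```python
import shutil
from pathlib import Path

REPO_ID = "Comfy-Org/Wan_2.1_ComfyUI_repackaged"
# Assumed path: the folder from the README link plus the file name it mentions.
FILE_IN_REPO = "split_files/clip_vision/clip_vision_h.safetensors"


def clip_vision_target(comfyui_root):
    """Destination path following the Setup Guide layout (models/clip_vision/)."""
    return Path(comfyui_root) / "models" / "clip_vision" / Path(FILE_IN_REPO).name


def fetch_clip_vision(comfyui_root):
    """Download the encoder into the Hugging Face cache, then copy it into place."""
    # Imported lazily so the module loads without the dependency installed
    # (pip install huggingface_hub).
    from huggingface_hub import hf_hub_download

    cached = hf_hub_download(repo_id=REPO_ID, filename=FILE_IN_REPO)
    target = clip_vision_target(comfyui_root)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(cached, target)
    return target
```

Copying from the cache keeps the flat `models/clip_vision/` layout that the Setup Guide expects, rather than reproducing the repository's `split_files/` subfolders inside the ComfyUI tree.
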
+ ---
 
+ ## 💡 Usage Notes
+ - **Quantization Benefits**: FP8 stores each weight in one byte instead of the two bytes used by FP16, roughly halving weight memory and making 720p generation more practical on consumer GPUs.
+ - **Workflow Compatibility**: Combine with `Text-to-Video`, `Image-to-Video`, or `FantasyTalking` nodes in ComfyUI.
+ - **Multilingual Prompts**: The UMT5-XXL encoder supports multilingual prompts (e.g., English, Chinese).
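
The roughly 50% figure in the Quantization Benefits note follows from bytes per weight. The sketch below works through that arithmetic. It assumes exactly 14 billion parameters (taken from the "14B" in the file name) and counts weights only, so activations, the text encoder, and other components are not included.

```python
# Approximate parameter count, read off the "14B" in the model file name.
NUM_PARAMS = 14_000_000_000

# Storage cost per weight for each precision mentioned in the README.
BYTES_PER_PARAM = {"fp16": 2, "bf16": 2, "fp8_e4m3fn": 1}


def weight_memory_gib(num_params, dtype):
    """Memory needed for the weights alone, in GiB (2**30 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


for dtype in ("fp16", "fp8_e4m3fn"):
    print(f"{dtype}: {weight_memory_gib(NUM_PARAMS, dtype):.1f} GiB")
```

Under these assumptions the weights take about 26 GiB in FP16 and about 13 GiB in FP8. Actual VRAM use during generation is higher and depends on resolution, frame count, and offloading settings.
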
 
  ---
 
+ ## ⚖️ License
+ *Inherited from the parent models; see the [Wan-AI model card](https://huggingface.co/Wan-AI/Wan2.1-VACE-14B) for the license terms. Until those terms are verified, non-commercial or research use is recommended.*
+
+ ---
+
+ **✨ Pro Tip**: For optimal results, pair with WanVideo's temporal consistency modules to reduce frame flickering in long sequences.
+
+ ---
+ *Model Card curated by the ComfyUI community. Maintained for reproducibility and ease of deployment.*