nielsr HF Staff committed on
Commit 8a92890 · verified · 1 Parent(s): 32ae8dc

Improve model card: Add Radial Attention paper, project, code links and update metadata


This PR improves the model card for `vrgamedevgirl/Wan14BT2VFusioniX` by:
- Linking to the associated research paper: [Radial Attention: $O(n\log n)$ Sparse Attention with Energy Decay for Long Video Generation](https://huggingface.co/papers/2506.19852).
- Adding links to the Radial Attention project page (https://hanlab.mit.edu/projects/radial-attention) and its GitHub repository (https://github.com/mit-han-lab/radial-attention).
- Updating metadata: explicitly setting `pipeline_tag: text-to-video`, adding `library_name: diffusers`, and correcting `license` to `cc-by-nc-sa-4.0` to reflect the non-commercial usage restrictions mentioned in the model card.
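
To double-check the metadata bullet above after merging, a minimal sketch along these lines could be used. It is not part of this PR: the helper names `front_matter_scalars` and `missing_or_wrong` are hypothetical, and the parser only reads top-level `key: value` lines from the front matter (it is not a full YAML parser).

```python
import re

# Expected top-level metadata values, taken from the PR description above.
EXPECTED = {
    "license": "cc-by-nc-sa-4.0",
    "pipeline_tag": "text-to-video",
    "library_name": "diffusers",
}


def front_matter_scalars(readme_text: str) -> dict:
    """Hypothetical helper: collect top-level `key: value` pairs from the
    front matter block delimited by `---` lines at the start of a README."""
    match = re.match(r"^---\n(.*?)\n---", readme_text, flags=re.DOTALL)
    if match is None:
        return {}
    scalars = {}
    for line in match.group(1).splitlines():
        # Only unindented `key: value` lines; list items, nested keys and
        # keys without an inline value (e.g. `tags:`) are skipped.
        pair = re.match(r"^([A-Za-z_][\w-]*):\s*(\S.*)$", line)
        if pair:
            scalars[pair.group(1)] = pair.group(2).strip()
    return scalars


def missing_or_wrong(readme_text: str) -> dict:
    """Return the expected keys whose value is absent or different,
    mapped to the value actually found (None when absent)."""
    found = front_matter_scalars(readme_text)
    return {k: found.get(k) for k, v in EXPECTED.items() if found.get(k) != v}


if __name__ == "__main__":
    # Assumes a local copy of the model card named README.md.
    with open("README.md", encoding="utf-8") as fh:
        print(missing_or_wrong(fh.read()))
```

An empty dictionary means all three fields match; any other result lists the fields that still need attention.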

Files changed (1)
  1. README.md +40 -22
README.md CHANGED
@@ -1,46 +1,64 @@
 ---
+base_model:
+- Wan-AI/Wan2.1-T2V-14B
+license: cc-by-nc-sa-4.0
+pipeline_tag: text-to-video
+library_name: diffusers
 tags:
 - text-to-video
 - diffusion
 - merged-model
 - video-generation
 - wan2.1
-
 widget:
-- text: >-
-    Prompt: A gritty close-up of an elven princess kneeling in a rocky ravine, calming a wounded, desert dragon. Its scales are cracked, dry, She wears a crimson sash over bone-colored armor, her auburn hair half-tied back. The camera dollies in rapidly as she reaches for its eye ridge. Lighting comes from golden sunlight reflecting off surrounding rock, casting a warm, earthy hue with no artificial glow.
+- text: 'Prompt: A gritty close-up of an elven princess kneeling in a rocky ravine,
+    calming a wounded, desert dragon. Its scales are cracked, dry, She wears a crimson
+    sash over bone-colored armor, her auburn hair half-tied back. The camera dollies
+    in rapidly as she reaches for its eye ridge. Lighting comes from golden sunlight
+    reflecting off surrounding rock, casting a warm, earthy hue with no artificial
+    glow.'
   output:
     url: videos/Video_00063.mp4
-
-- text: >-
-    Prompt: Tight close-up of her smiling lips and sparkling eyes, catching golden hour sunlight. She wears a white sundress with floral prints and a wide-brimmed straw hat. Camera pulls back in a dolly motion, revealing her twirling under a cherry blossom tree. Petals flutter in the air, casting playful shadows. Soft lens flares enhance the euphoric, dreamlike vibe. (Before vs After — Left: Wan2.1 | Right: Merged model Wan14BT2V_MasterModel)
+- text: 'Prompt: Tight close-up of her smiling lips and sparkling eyes, catching golden
+    hour sunlight. She wears a white sundress with floral prints and a wide-brimmed
+    straw hat. Camera pulls back in a dolly motion, revealing her twirling under a
+    cherry blossom tree. Petals flutter in the air, casting playful shadows. Soft
+    lens flares enhance the euphoric, dreamlike vibe. (Before vs After — Left: Wan2.1
+    | Right: Merged model Wan14BT2V_MasterModel)'
   output:
     url: videos/AnimateDiff_00001.mp4
-
-- text: >-
-    Prompt: A gritty close-up of a dwarven beastmaster’s face, his grey beard braided tightly, brows furrowed as he looks just off-camera. The camera dollies out over his shoulder, revealing a perched gryphon watching him from a boulder, its feathers rustling slightly in the breeze. The moment holds stillness and mutual trust. Lighting is early daylight, clean and sharp with strong environmental clarity.
+- text: 'Prompt: A gritty close-up of a dwarven beastmaster’s face, his grey beard
+    braided tightly, brows furrowed as he looks just off-camera. The camera dollies
+    out over his shoulder, revealing a perched gryphon watching him from a boulder,
+    its feathers rustling slightly in the breeze. The moment holds stillness and mutual
+    trust. Lighting is early daylight, clean and sharp with strong environmental clarity.'
   output:
     url: videos/FusionX_00012.mp4
-
-- text: >-
-    Prompt: A gritty close-up of a jungle tracker crouching low, face flushed with focus as she watches a perched macaw a few feet ahead. Her cheek twitches as she shifts forward, beads of sweat visible on her brow. The camera slowly dollies in from below her line of sight, capturing the moment her eyes widen in fascination. Lighting is rich and directional from above, creating a warm glow over her face with minimal shadows.
+- text: 'Prompt: A gritty close-up of a jungle tracker crouching low, face flushed
+    with focus as she watches a perched macaw a few feet ahead. Her cheek twitches
+    as she shifts forward, beads of sweat visible on her brow. The camera slowly dollies
+    in from below her line of sight, capturing the moment her eyes widen in fascination.
+    Lighting is rich and directional from above, creating a warm glow over her face
+    with minimal shadows.'
   output:
     url: videos/FusionX_00005.mp4
-
-- text: >-
-    Prompt: A gritty close-up of a battle-worn ranger kneeling in a scorched clearing, calming a wounded gryphon whose wing is torn and bloodied. Its feathers are dusky bronze with streaks of ash-gray. She wears soot-covered hunter green armor, her blonde hair pulled into a loose braid. The camera dollies in as her hand brushes the creature's sharp beak. Lighting comes from late afternoon sun filtering through smoke, casting a burnt-orange haze across the frame.
+- text: 'Prompt: A gritty close-up of a battle-worn ranger kneeling in a scorched
+    clearing, calming a wounded gryphon whose wing is torn and bloodied. Its feathers
+    are dusky bronze with streaks of ash-gray. She wears soot-covered hunter green
+    armor, her blonde hair pulled into a loose braid. The camera dollies in as her
+    hand brushes the creature''s sharp beak. Lighting comes from late afternoon sun
+    filtering through smoke, casting a burnt-orange haze across the frame.'
   output:
     url: videos/Video_00069.mp4
-
-
-
-base_model:
-- Wan-AI/Wan2.1-T2V-14B
-license: apache-2.0
 ---
 
 # 🌀 Wan2.1_14B_FusionX
 
+This model, Wan2.1_14B_FusionX, incorporates advancements from the research on [Radial Attention: $O(n\log n)$ Sparse Attention with Energy Decay for Long Video Generation](https://huggingface.co/papers/2506.19852).
+
+Project Page: https://hanlab.mit.edu/projects/radial-attention
+Code: https://github.com/mit-han-lab/radial-attention
+
 **High-Performance Merged Text-to-Video Model**
 Built on WAN 2.1 and fused with research-grade components for cinematic motion, detail, and speed — optimized for ComfyUI and rapid iteration in as few as 6 steps.
 
@@ -256,4 +274,4 @@ For commercial use or monetization, please consult a legal advisor and verify al
 
 And thanks to the open-source community!
 
----
+---