How to use 100percentrobot/LTX-2.3-Audio-Reactive-LORA with Diffusers:
pip install -U diffusers transformers accelerate
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image, export_to_video
# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained("Lightricks/LTX-2.3", dtype=torch.bfloat16, device_map="cuda")
pipe.load_lora_weights("100percentrobot/LTX-2.3-Audio-Reactive-LORA")
prompt = "continuous audio-reactive video, audio-reactive"
input_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/guitar-man.png")
# the pipeline returns a list of videos; take the frames of the first one
output = pipe(image=input_image, prompt=prompt).frames[0]
export_to_video(output, "output.mp4")

LTX-2.3 Audio Reactive LORA V1
This is still in its early stages and is really just a proof of concept. It was created to make changing visual elements in generated videos respond to, and stay synchronized with, the musical elements of the audio, AKA "audio-reactive" content.
V1 shows a marked improvement over the base model, but further improvements are expected with more fine-tuning.
Trained exclusively on custom synthetic data (which I may open source for training other multimodal models). I expect V2 to be much improved, with a broader, less abstract sampling of audio-reactive content and adjusted training settings.
If you want to submit training samples, I am accepting high-quality, well-labeled data.
Recommended LORA weight:
- 1.0 to 2.0; 1.4 or above is recommended for better motion.
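A minimal sketch of applying the recommended weight with Diffusers. It assumes the pipeline loaded for LTX-2.3 exposes the standard Diffusers LoRA methods `load_lora_weights` and `set_adapters`, which has not been verified for this model; the adapter name `audio_reactive` and the helper `apply_audio_reactive_lora` are hypothetical names chosen for illustration.

```python
LORA_REPO = "100percentrobot/LTX-2.3-Audio-Reactive-LORA"
ADAPTER_NAME = "audio_reactive"  # hypothetical label, any name works


def apply_audio_reactive_lora(pipe, weight=1.4):
    """Load the LoRA into `pipe` and set its strength.

    Assumes `pipe` supports the standard Diffusers LoRA API
    (`load_lora_weights` and `set_adapters`).
    """
    # The model card recommends weights between 1.0 and 2.0.
    if not 1.0 <= weight <= 2.0:
        raise ValueError(f"weight {weight} is outside the recommended 1.0 to 2.0 range")
    pipe.load_lora_weights(LORA_REPO, adapter_name=ADAPTER_NAME)
    pipe.set_adapters([ADAPTER_NAME], adapter_weights=[weight])
    return pipe
```

With a loaded pipeline, this would be called as `apply_audio_reactive_lora(pipe, weight=1.4)` in place of a bare `pipe.load_lora_weights(...)` call.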
Prompt template:
- A continuous audio-reactive video that transitions smoothly from [Phase 1: Initial Subject & Action] to [Phase 2: First Evolution & Motion], then warps/morphs into [Phase 3: Second Evolution & Morphing] before warping/morphing into [Phase 4: Final Chaotic/Complex State], with every [Type of Visual Motion/Deformation] perfectly synchronized to the [Musical Element 1], [Musical Element 2], and [Musical Element 3] of the [Music Genre & Vibe] track.
It's recommended to have 2-4 different 'phases' for optimal motion, since this is the template the current (V1) training data uses; I expect this to improve with future versions and more diverse data. The base LTX-2.3 model still has some issues with low motion on shorter videos, so clips of 10-20 seconds produce better results. I usually find that a batch of 4-8 generations will contain some semi-usable results.
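The prompt template can be filled in programmatically. The sketch below is one possible way to do it, not the method used to label the training data; `build_prompt` and its parameter names are hypothetical, and the connecting words ("then morphs into", "before warping into") are one choice among the alternatives the template allows.

```python
def _join_elements(items):
    """Join items as 'a', 'a and b', or 'a, b, and c'."""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return f"{items[0]} and {items[1]}"
    return ", ".join(items[:-1]) + f", and {items[-1]}"


def build_prompt(phases, motion, musical_elements, track):
    """Fill the template with 2-4 phases, a motion type, musical elements, and a track description."""
    if not 2 <= len(phases) <= 4:
        raise ValueError("use 2 to 4 phases, matching the V1 training template")
    if not musical_elements:
        raise ValueError("at least one musical element is required")

    text = (
        "A continuous audio-reactive video that transitions smoothly "
        f"from {phases[0]} to {phases[1]}"
    )
    if len(phases) >= 3:
        text += f", then morphs into {phases[2]}"
    if len(phases) == 4:
        text += f" before warping into {phases[3]}"
    text += (
        f", with every {motion} perfectly synchronized to the "
        f"{_join_elements(list(musical_elements))} of the {track} track."
    )
    return text


example = build_prompt(
    phases=["a glowing sphere pulsing", "a rippling liquid ring"],
    motion="pulse",
    musical_elements=["kick drum", "bassline"],
    track="minimal techno",
)
```

Because the result always begins with "A continuous audio-reactive video", the trigger phrase is included automatically.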
I2V was trained into the LORA but I haven't tested it extensively, and I expect motion to be less stable.
Trained with the Ostris AI Toolkit RunPod template.
Trigger words
You should use "continuous audio-reactive video" to trigger the video generation.
You should use "audio-reactive" to trigger the video generation.
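For free-form prompts that do not follow the template, a small helper can make sure the trigger phrases are present. This is only an illustrative sketch; `with_trigger_words` is a hypothetical name, and prepending the phrases (rather than appending them) is an arbitrary choice.

```python
TRIGGER_PHRASES = ("continuous audio-reactive video", "audio-reactive")


def with_trigger_words(prompt):
    """Return `prompt` with any missing trigger phrase prepended."""
    lowered = prompt.lower()
    # "audio-reactive" is a substring of the longer phrase, so a prompt that
    # already contains the longer phrase needs no changes.
    missing = [phrase for phrase in TRIGGER_PHRASES if phrase not in lowered]
    if not missing:
        return prompt
    return ", ".join(missing + [prompt])
```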
Download model
Download the LoRA weights in the Files & versions tab.
Model tree for 100percentrobot/LTX-2.3-Audio-Reactive-LORA
Base model
Lightricks/LTX-2.3