Spaces:
Running
on
Zero
How do I use the FP8 version?
#25
by
ben1995
- opened
Hi,
I am currently using LTX Video to generate videos; it is the best model I have ever encountered.
But I have a problem: I installed the FP8 kernels, yet I still cannot run the model. These are the commands I ran:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
pip install packaging wheel ninja setuptools
pip install --no-build-isolation git+https://github.com/Lightricks/LTX-Video-Q8-Kernels.git
The error I encountered is:
Moving models to cuda for inference (if not already there)...
Calling multi-scale pipeline (eff. HxW: 1024x768, Frames: 57 -> Padded: 57) on cuda
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/gradio/queueing.py", line 626, in process_events
response = await route_utils.call_process_api(
File "/usr/local/lib/python3.10/dist-packages/gradio/route_utils.py", line 350, in call_process_api
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 2235, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1746, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 2470, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 967, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 917, in wrapper
response = f(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 917, in wrapper
response = f(*args, **kwargs)
File "/root/ltx-video-distilled/app.py", line 304, in generate
result_images_tensor = multi_scale_pipeline_obj(**multi_scale_call_kwargs).images
File "/root/ltx-video-distilled/ltx_video/pipelines/pipeline_ltx_video.py", line 1859, in __call__
result = self.video_pipeline(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/root/ltx-video-distilled/ltx_video/pipelines/pipeline_ltx_video.py", line 1197, in __call__
noise_pred = self.transformer(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
File "/root/ltx-video-distilled/ltx_video/models/transformers/transformer3d.py", line 478, in forward
hidden_states = block(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
File "/root/ltx-video-distilled/ltx_video/models/transformers/attention.py", line 255, in forward
attn_output = self.attn1(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
File "/root/ltx-video-distilled/ltx_video/models/transformers/attention.py", line 710, in forward
return self.processor(
File "/root/ltx-video-distilled/ltx_video/models/transformers/attention.py", line 997, in __call__
query = attn.to_q(hidden_states)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/linear.py", line 125, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: self and mat2 must have the same dtype, but got BFloat16 and Float8_e4m3fn
The latest commit on main should resolve it:
https://github.com/Lightricks/LTX-Video/commit/ccfc0309327b3ad9ee86457f078148fff27db5dc