# Liger Kernel Integration
Section under construction. Feel free to contribute!
Liger Kernel is a collection of Triton kernels designed specifically for LLM training. It can effectively increase multi-GPU training throughput by 20% and reduce memory usage by 60%, which in turn makes it possible to train with up to 4x longer context lengths, as shown in the benchmark below. The library provides Hugging Face-compatible implementations of RMSNorm, RoPE, SwiGLU, CrossEntropy, and FusedLinearCrossEntropy, with more to come. The kernels work out of the box with Flash Attention, PyTorch FSDP, and Microsoft DeepSpeed.
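For context, this is roughly how those kernels are applied outside of TRL. The sketch below follows the patching API shown in the Liger Kernel README (`apply_liger_kernel_to_llama` and its keyword flags); exact flags and defaults may vary between versions, and the model id is only a placeholder:

```python
import transformers
from liger_kernel.transformers import apply_liger_kernel_to_llama

# Monkey-patch the Hugging Face Llama modeling code with Liger's Triton
# kernels. This must run before the model is instantiated.
apply_liger_kernel_to_llama(
    rope=True,                        # Liger RoPE
    rms_norm=True,                    # Liger RMSNorm
    swiglu=True,                      # Liger SwiGLU
    fused_linear_cross_entropy=True,  # Liger FusedLinearCrossEntropy
)

# Placeholder model id; substitute your own.
model = transformers.AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
```

When `use_liger_kernel` is enabled in TRL (see below), this patching is handled for you.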
With this memory reduction, you can potentially turn off `cpu_offloading` or gradient checkpointing to further boost performance.
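As a sketch, that could look like the following in `SFTConfig`, where `gradient_checkpointing` is the standard `transformers.TrainingArguments` flag and the output path is only an example:

```python
from trl import SFTConfig

training_args = SFTConfig(
    output_dir="./sft-liger",      # example output path
    use_liger_kernel=True,         # Liger kernels cut activation memory...
    gradient_checkpointing=False,  # ...so activation recomputation can be skipped
)
```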
*(Benchmark figures: speed up and memory reduction.)*
- To use Liger Kernel in `SFTTrainer`, first install it:

  ```bash
  pip install liger-kernel
  ```
- Once installed, set `use_liger_kernel=True` in `SFTConfig`. No other changes are needed!

  ```python
  from trl import SFTConfig

  training_args = SFTConfig(
      use_liger_kernel=True,
      ...
  )
  ```
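Putting it together, a minimal end-to-end sketch might look like this; the model and dataset names are placeholders, so substitute your own:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Any SFT-formatted dataset works here; this one is just an example.
dataset = load_dataset("trl-lib/Capybara", split="train")

training_args = SFTConfig(
    output_dir="Qwen2.5-0.5B-SFT-Liger",  # example output path
    use_liger_kernel=True,                # enable Liger's Triton kernels
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",  # placeholder model id
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```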
To learn more about Liger Kernel, visit their [official repository](https://github.com/linkedin/Liger-Kernel).