Jina AI org
edited Jun 4

I implemented a custom peft module for a Linear layer with multiple adapters. This enables efficient multi-task inference: all task-specific LoRA adapters are kept in memory simultaneously and selected dynamically per example, which eliminates switching adapter states between tasks and allows optimal throughput for mixed-task batches.
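A minimal sketch of the idea, not the actual peft module from this PR: a Linear layer that stacks one low-rank (A, B) pair per task and indexes into them with a per-example adapter id, so a single batch can mix tasks. All class, parameter, and argument names here are assumptions for illustration.

```python
import torch
import torch.nn as nn

class MultiAdapterLinear(nn.Module):
    """Linear layer holding several LoRA adapters in memory at once.

    Each example in a batch selects its adapter by index, so
    mixed-task batches need no adapter-state switching.
    (Illustrative sketch, not the implementation in this PR.)
    """

    def __init__(self, in_features, out_features, num_adapters, rank=8, alpha=16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        # One low-rank (A, B) pair per task-specific adapter.
        # B starts at zero, so untrained adapters are a no-op (standard LoRA init).
        self.lora_a = nn.Parameter(torch.randn(num_adapters, in_features, rank) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(num_adapters, rank, out_features))
        self.scaling = alpha / rank

    def forward(self, x, adapter_ids):
        # x: (batch, seq, in_features); adapter_ids: (batch,) long tensor
        out = self.base(x)
        a = self.lora_a[adapter_ids]  # (batch, in_features, rank)
        b = self.lora_b[adapter_ids]  # (batch, rank, out_features)
        # Per-example low-rank update x @ A @ B, batched over the batch dim.
        return out + torch.bmm(torch.bmm(x, a), b) * self.scaling

layer = MultiAdapterLinear(64, 64, num_adapters=3)
x = torch.randn(4, 10, 64)
adapter_ids = torch.tensor([0, 2, 1, 0])  # each example picks its own task adapter
y = layer(x, adapter_ids)  # (4, 10, 64)
```

Because selection is just an index into stacked adapter tensors, no weights are merged or swapped between forward passes; the trade-off is holding every adapter in memory.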

jupyterjazz changed pull request status to open
Jina AI org

ToDo: upload final weights, update README

jupyterjazz changed pull request status to merged
