Jina AI org
edited Jun 4

I implemented a custom peft module for a Linear layer with multiple adapters. This enables efficient multi-task inference: all task-specific LoRA adapters are kept in memory simultaneously and selected dynamically per example, which eliminates switching adapter states between tasks and allows optimal throughput for mixed-task batches.
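A minimal sketch of the idea, not the actual peft module from this PR: a Linear layer that stacks one low-rank (A, B) pair per task and indexes into them with a per-example adapter id, so a single batch can mix tasks. All class, parameter, and argument names here are assumptions for illustration.

```python
import torch
import torch.nn as nn

class MultiAdapterLinear(nn.Module):
    """Linear layer holding several LoRA adapters in memory at once.

    Each example in a batch selects its adapter by index, so
    mixed-task batches need no adapter-state switching.
    (Illustrative sketch, not the implementation in this PR.)
    """

    def __init__(self, in_features, out_features, num_adapters, rank=8, alpha=16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        # One low-rank (A, B) pair per task-specific adapter.
        # B starts at zero, so untrained adapters are a no-op (standard LoRA init).
        self.lora_a = nn.Parameter(torch.randn(num_adapters, in_features, rank) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(num_adapters, rank, out_features))
        self.scaling = alpha / rank

    def forward(self, x, adapter_ids):
        # x: (batch, seq, in_features); adapter_ids: (batch,) long tensor
        out = self.base(x)
        a = self.lora_a[adapter_ids]  # (batch, in_features, rank)
        b = self.lora_b[adapter_ids]  # (batch, rank, out_features)
        # Per-example low-rank update x @ A @ B, batched over the batch dim.
        return out + torch.bmm(torch.bmm(x, a), b) * self.scaling

layer = MultiAdapterLinear(64, 64, num_adapters=3)
x = torch.randn(4, 10, 64)
adapter_ids = torch.tensor([0, 2, 1, 0])  # each example picks its own task adapter
y = layer(x, adapter_ids)  # (4, 10, 64)
```

Because selection is just an index into stacked adapter tensors, no weights are merged or swapped between forward passes; the trade-off is holding every adapter in memory.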

jupyterjazz changed pull request status to open
Jina AI org

ToDo: upload final weights, update README

jupyterjazz changed pull request status to merged
