did anyone get this working in vllm or sglang?
#2 by MikaSouthworth - opened
I sure didn't.
Hi, can you tell us what the problem is?
This model worked for me: https://huggingface.co/cpatonn/Ring-flash-2.0-AWQ-4bit. It needed vLLM 0.11.0.
This one, however, does not work: inclusionAI/Ring-flash-linear-2.0-GPTQ-int4.
vLLM seems to be missing the BailingMoeLinearV2ForCausalLM architecture.
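For reference, a minimal sketch of the two serve attempts described above, using the model IDs from this thread; the exact flags your deployment needs may differ, so treat this as illustrative rather than a verified recipe:

```shell
# The AWQ quant reportedly needed vLLM 0.11.0
pip install "vllm==0.11.0"

# This one loads on the public vLLM build:
vllm serve cpatonn/Ring-flash-2.0-AWQ-4bit

# This one fails on public vLLM builds, because the
# BailingMoeLinearV2ForCausalLM architecture is not registered there:
vllm serve inclusionAI/Ring-flash-linear-2.0-GPTQ-int4
```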
Hi, are you using our pre-built vLLM wheel from the quickstart?
Our hybrid linear model has not yet been merged into vLLM, so using the public vLLM release will result in a missing-architecture error for BailingMoeLinearV2ForCausalLM.