Did anyone get this working in vLLM or SGLang?

#2
by MikaSouthworth - opened

I sure didn't.

Hi, can you tell us what the problem is?

This model worked for me: https://huggingface.co/cpatonn/Ring-flash-2.0-AWQ-4bit. It needed vLLM 0.11.0.

This one, however, does not work: inclusionAI/Ring-flash-linear-2.0-GPTQ-int4. vLLM seems to be missing the BailingMoeLinearV2ForCausalLM architecture.
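
For reference, here is a minimal sketch of how I load the working AWQ model, assuming vLLM 0.11.0 or newer and enough GPU memory for the 4-bit weights; the prompt and sampling settings are illustrative only:

```python
# Minimal sketch: load the AWQ quant that worked, assuming vLLM >= 0.11.0.
from vllm import LLM, SamplingParams

llm = LLM(
    model="cpatonn/Ring-flash-2.0-AWQ-4bit",
    trust_remote_code=True,   # Ring models ship custom modeling code
    tensor_parallel_size=1,   # raise this to shard across more GPUs
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Hello, who are you?"], params)
print(outputs[0].outputs[0].text)
```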


Hi, are you using our pre-built vLLM wheel from the quickstart?

Our hybrid linear model has not yet been merged into vLLM, so using the public vLLM version will result in a missing model error.
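
If you want to verify which build you are on, a quick check like this sketch (assuming your vLLM version exposes the ModelRegistry API) shows whether the architecture is registered:

```python
# Sketch: check whether this vLLM build registers the hybrid linear
# architecture. The public release will not include it until the model
# is merged; the pre-built wheel from the quickstart should.
from vllm import ModelRegistry

arch = "BailingMoeLinearV2ForCausalLM"
if arch in ModelRegistry.get_supported_archs():
    print(f"{arch} is registered; the GPTQ model should load.")
else:
    print(f"{arch} is missing; install the pre-built wheel from the quickstart.")
```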
