AWQ version
#3 · opened by devops724
Please provide an AWQ version of this model; the FP8 model only works on GPUs like the 4090/H100.
I get this error on an RTX 3090:
ValueError: FP8 quantized models is only supported on GPUs with compute capability >= 8.9 (e.g 4090/H100), actual = 8.6
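For anyone hitting the same wall, a quick local check (just an illustration, not part of the model card) is to ask PyTorch what compute capability the card reports:

```python
import torch

# Query the compute capability of the first CUDA device.
# FP8 weights (e.g. in vLLM) need >= 8.9; an RTX 3090 reports 8.6.
major, minor = torch.cuda.get_device_capability(0)
print(f"Compute capability: {major}.{minor}")
if (major, minor) < (8, 9):
    print("FP8 is not supported on this GPU; an AWQ/INT4 build would be needed.")
```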
x2 please ❤️🔥🚀
awq, plz❤️❤️❤️
Looking forward to an AWQ version as well.
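In the meantime, for anyone with enough VRAM who wants to try producing the AWQ export themselves, a minimal sketch using the AutoAWQ library might look like the following; the paths are placeholders and the default quantization config has not been tested against this particular model:

```python
# Rough AutoAWQ quantization sketch (placeholder paths, default calibration).
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "path/to/original-model"   # placeholder: original (unquantized) weights
quant_path = "path/to/model-awq"        # placeholder: output directory
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the full-precision model and tokenizer.
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Run AWQ calibration/quantization and save the 4-bit weights.
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```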