AWQ version

#3
by devops724 - opened

Please provide an AWQ version of this model; the FP8 model only works on GPUs like the 4090/H100.
I get this error on an RTX 3090:
ValueError: FP8 quantized models is only supported on GPUs with compute capability >= 8.9 (e.g 4090/H100), actual = 8.6
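For context, this check is just the GPU's compute capability; a minimal way to verify it yourself with PyTorch (assuming a CUDA build of PyTorch is installed):

```python
import torch

# FP8 kernels generally require compute capability >= 8.9 (Ada/Hopper).
# The RTX 3090 is Ampere, compute capability 8.6, hence the error above.
major, minor = torch.cuda.get_device_capability(0)
print(f"Compute capability: {major}.{minor}")      # 8.6 on an RTX 3090
print("FP8 supported:", (major, minor) >= (8, 9))  # False on Ampere
```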

x2 please ❤️🔥🚀

@Haihao the AutoRound version does not work with SGLang. I would appreciate a GPTQ-Int4 version.

awq, plz❤️❤️❤️

Looking forward to an AWQ version as well.
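In the meantime, for anyone who wants to try producing a 4-bit quant locally, here is a minimal AutoAWQ sketch. The paths and quant settings are placeholders, not an official recipe, and it assumes AutoAWQ supports this model's architecture and that the full-precision weights fit in memory for calibration:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "path/to/original-model"  # placeholder: the unquantized checkpoint
quant_path = "path/to/output-awq"      # placeholder: where to save the AWQ weights
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the full-precision model and tokenizer
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Run AWQ calibration/quantization, then save the 4-bit checkpoint
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```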
