wenhua cheng
wenhuach
AI & ML interests
Model Compression, CV
Recent Activity
updated a model 4 days ago: Intel/Qwen3-8B-GGUF-Q2KS-AS-AutoRound
published a model 4 days ago: Intel/Qwen3-8B-GGUF-Q2KS-AS-AutoRound
reacted to their post 5 days ago
AutoRound keeps evolving its LLM quantization algorithm!
After enhancing W2A16 quantization, we now offer a fast algorithm that generates mixed-bit/data-type schemes (~2 minutes for an 8B model), well suited to MXFP4 and W2A16.
Learn more: https://github.com/intel/auto-round/blob/main/docs/step_by_step.md#autoscheme