wenhua cheng

wenhuach

wenhuach21

AI & ML interests

Model Compression, CV

Recent Activity

new activity 3 days ago

Intel/Ling-flash-2.0-gguf-q2ks-mixed-AutoRound:Practical performance feedback

reacted to their post with 🚀 4 days ago

🚀 AutoRound (https://github.com/intel/auto-round) is now supported by SGLang! After integrations with TorchAO, Transformers, and vLLM, AutoRound-quantized models are now officially compatible with SGLang, bringing faster and more flexible deployment to your LLM workflows. 💡 We’ve also enhanced the RTN mode (--iters 0), cutting quantization costs significantly for low-resource users. ⭐ Star our repo and stay tuned for more exciting updates!

new activity 6 days ago

Intel/Mistral-Small-3.2-24B-Instruct-2506-int4-AutoRound:Works good with vLLM, just no tool calling

Organizations

wenhuach's datasets

None public yet