Unchun Yang
ucyang
AI & ML interests
None yet
Recent Activity
liked a model about 2 hours ago
burtenshaw/Qwen3-Code-Lite
reacted to burtenshaw's post with 👍 about 2 hours ago
Qwen 3 fine-tuning >> MoE. Updated the experiment thread to include the config and script for fine-tuning the Qwen3-30B-A3B model.
The goal is to make a low-latency, non-thinking model for daily-driver coding, so 3 billion active parameters should be perfect.
✔️ training running
✔️ evals running
⏭️ improve dataset
The MoE isn't going to fit into Colab's A100 even with quantization (🙏 @UnslothAI), so I've been working on HF Spaces' H100s for this. Everything is available in the thread and I'll share more tomorrow.
https://huggingface.co/burtenshaw/Qwen3-Code-Lite/discussions/1
upvoted an article about 9 hours ago
Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face