Jiachen Du

Baphomet666

AI & ML interests

Using large language models (LLMs) to support Tarot reading

Recent Activity

upvoted a paper 20 days ago

Latent Refinement Decoding: Enhancing Diffusion-Based Language Models by Refining Belief States

reacted to alandao's post with 🔥 5 months ago

Don’t give up 🔥 Do you know what I was planning to do this time last week? I was preparing to write a report declaring that Jan Nano was a failed project because the benchmark results didn’t meet expectations. But I thought — it can’t be. When loading the model into the app, the performance clearly felt better. So why were the benchmark results worse? That’s when I reviewed the entire benchmark codebase and realized something fundamental: agentic or workflow-based approaches introduce a huge gap and variation when benchmarking. Jan-nano was trained with an agentic setup — it simply can’t be benchmarked using a rigid workflow-based method. I made the necessary changes, and the model ended up performing even better than before the issues arose. Turns out the previous benchmarking method conflicted with the way the model was trained. What if I had given up? That would’ve meant 1.5 months of training and a huge amount of company resources wasted. But now, this is officially the most successful and biggest release for the whole team — all thanks to Jan-nano. https://huggingface.co/Menlo/Jan-nano

upvoted a paper 6 months ago

AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning

upvoted a paper 20 days ago

Latent Refinement Decoding: Enhancing Diffusion-Based Language Models by Refining Belief States

Paper • 2510.11052 • Published 22 days ago • 51

upvoted a paper 6 months ago

AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning

Paper • 2505.11896 • Published May 17 • 58