view article Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • 22 days ago • 591
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization Paper • 2507.06181 • Published 21 days ago • 39
Hybrid Linear Attention Research Collection All 1.3B & 340M hybrid linear-attention experiments. • 60 items • Updated 22 days ago • 8
A Survey on Vision-Language-Action Models: An Action Tokenization Perspective Paper • 2507.01925 • Published 27 days ago • 35
ERNIE 4.5 Collection collection of ERNIE 4.5 models. "-Paddle" models use PaddlePaddle weights, while "-PT" models use Transformer-style PyTorch weights. • 25 items • Updated 19 days ago • 156
Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation Paper • 2506.09991 • Published Jun 11 • 56
YuE: Scaling Open Foundation Models for Long-Form Music Generation Paper • 2503.08638 • Published Mar 11 • 68
CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models Paper • 2502.16614 • Published Feb 23 • 27
Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models Paper • 2411.07140 • Published Nov 11, 2024 • 36