Aligning LLMs for Multilingual Consistency in Enterprise Applications Paper • 2509.23659 • Published Sep 28 • 20
RCI: A Score for Evaluating Global and Local Reasoning in Multimodal Benchmarks Paper • 2509.23673 • Published Sep 28 • 20
PCRI: Measuring Context Robustness in Multimodal Models for Enterprise Applications Paper • 2509.23879 • Published Sep 28 • 20
AccessEval: Benchmarking Disability Bias in Large Language Models Paper • 2509.22703 • Published Sep 22 • 20
Agent Lightning: Train ANY AI Agents with Reinforcement Learning Paper • 2508.03680 • Published Aug 5 • 93
SonicVerse: Multi-Task Learning for Music Feature-Informed Captioning Paper • 2506.15154 • Published Jun 18 • 9
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs Paper • 2503.01743 • Published Mar 3 • 89
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning Paper • 2504.17192 • Published Apr 24 • 120
Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math Paper • 2504.21233 • Published Apr 30 • 49
Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning Paper • 2505.17813 • Published May 23 • 57
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers Paper • 2505.21497 • Published May 27 • 108
SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models Paper • 2506.04180 • Published Jun 4 • 33