ScreenEnv: Deploy your full stack Desktop Agent • By A-Mahla and 1 other • 18 days ago • 53
SmolLM3: smol, multilingual, long-context reasoner • By loubnabnl and 22 others • 20 days ago • 587
ScreenSuite - The most comprehensive evaluation suite for GUI Agents! • Jun 6 • 52
Trace & Evaluate your Agent with Arize Phoenix • By m-ric and 2 others • Feb 28 • 41
Open-source DeepResearch – Freeing our search agents • By m-ric and 4 others • Feb 4 • 1.28k
Introducing smolagents: simple agents that write actions in code. • By m-ric and 2 others • Dec 31, 2024 • 1.09k
Expert Support case study: Bolstering a RAG app with LLM-as-a-Judge • By m-ric and 2 others • Oct 28, 2024 • 27
Our Transformers Code Agent beats the GAIA benchmark! • By m-ric and 1 other • Jul 1, 2024 • 93
Extracting Concepts from LLMs: Anthropic’s recent discoveries 📖 • By m-ric • Jun 20, 2024 • 26
CodeAgents + Structure: A Better Way to Execute Actions • By akseljoonas and 1 other • May 28 • 70
License to Call: Introducing Transformers Agents 2.0 • By m-ric and 2 others • May 13, 2024 • 132