How Far Are We from Believable AI Agents? A Framework for Evaluating the Believability of Human Behavior Simulation Paper • 2312.17115 • Published Dec 28, 2023 • 2
Towards Dynamic Theory of Mind: Evaluating LLM Adaptation to Temporal Evolution of Human States Paper • 2505.17663 • Published May 23 • 15
LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling Paper • 2505.19187 • Published May 25 • 13 • 3
LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling Paper • 2505.19187 • Published May 25 • 13
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6 • 180
ZeroSearch: Incentivize the Search Capability of LLMs without Searching Paper • 2505.04588 • Published May 7 • 66 • 8
ZeroSearch: Incentivize the Search Capability of LLMs without Searching Paper • 2505.04588 • Published May 7 • 66
GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning Paper • 2504.00891 • Published Apr 1 • 14 • 3
GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning Paper • 2504.00891 • Published Apr 1 • 14
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published Mar 31 • 299
Improved Visual-Spatial Reasoning via R1-Zero-Like Training Paper • 2504.00883 • Published Apr 1 • 66