ytaewon

hamzzi

AI & ML interests

None yet

Recent Activity

commented on a paper 2 months ago

LIMI: Less is More for Agency

upvoted a paper 2 months ago

LIMI: Less is More for Agency

upvoted a paper 4 months ago

Group Sequence Policy Optimization

Organizations

commented on a paper 2 months ago

LIMI: Less is More for Agency

Paper • 2509.17567 • Published Sep 22 • 100

upvoted a paper 2 months ago

LIMI: Less is More for Agency

Paper • 2509.17567 • Published Sep 22 • 100

upvoted a paper 4 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 311

commented on a paper 4 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 311

upvoted 2 papers 4 months ago

How Far Are We from Believable AI Agents? A Framework for Evaluating the Believability of Human Behavior Simulation

Paper • 2312.17115 • Published Dec 28, 2023 • 2

Towards Dynamic Theory of Mind: Evaluating LLM Adaptation to Temporal Evolution of Human States

Paper • 2505.17663 • Published May 23 • 15

commented on a paper 4 months ago

LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling

Paper • 2505.19187 • Published May 25 • 13

upvoted a paper 4 months ago

LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling

Paper • 2505.19187 • Published May 25 • 13

updated a model 5 months ago

hamzzi/DeepSeek-R1-Distill-Qwen-1.5B-GRPO

2B • Updated Jun 25 • 2

published 3 models 5 months ago

commented on a paper 6 months ago

Learning from Peers in Reasoning Models

Paper • 2505.07787 • Published May 12 • 45

upvoted a paper 6 months ago

Learning from Peers in Reasoning Models

Paper • 2505.07787 • Published May 12 • 45

upvoted a paper 7 months ago

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6 • 187

commented on a paper 7 months ago

ZeroSearch: Incentivize the Search Capability of LLMs without Searching

Paper • 2505.04588 • Published May 7 • 65

upvoted a paper 7 months ago

ZeroSearch: Incentivize the Search Capability of LLMs without Searching

Paper • 2505.04588 • Published May 7 • 65

commented on a paper 8 months ago

GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning

Paper • 2504.00891 • Published Apr 1 • 14

upvoted 2 papers 8 months ago

GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning

Paper • 2504.00891 • Published Apr 1 • 14

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Paper • 2504.01990 • Published Mar 31 • 300
