The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements Paper • 2506.22419 • Published Jun 27 • 14
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning Paper • 2506.24119 • Published 30 days ago • 46
the-acorn-ai/Qwen3-4B-Base-4K-SimpleNegotiation-Self-Role-0531-Benjamin-step160 4B • Updated Jun 1 • 2
the-acorn-ai/Qwen3-4B-Base-4K-SimpleNegotiation-Self-Role-0531-Benjamin-step512 4B • Updated Jun 1 • 2