MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs
MARS: Reinforcing Multi-Agent Reasoning of LLMs through Self-Play in Strategic Games
Paper • 2510.15414 • Published
nics-efc/MARSHAL-Generalist-Qwen3-4B
Text Generation • 4B • Updated • 43
nics-efc/MARSHAL-Generalist-Qwen3-8B
Text Generation • 8B • Updated • 32
nics-efc/MARSHAL-Tic-Tac-Toe-Qwen3-4B
Text Generation • 4B • Updated • 39