Entropy-Regularized Process Reward Model
Paper • 2412.11006
AgentSPEX: An Agent SPecification and EXecution Language
GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL