Article Argunauts Training Phase III: RLVF with Hindsight Instruction Relabeling, Self-Correction and Dynamic Curriculum By ggbetz • 7 days ago • 1
Reasoning Model is Stubborn: Diagnosing Instruction Overriding in Reasoning Models Paper • 2505.17225 • Published May 22 • 65
Article syncIAL🍏: A Multi-Purpose Synthetic Debate and Argument Mapping Corpus By ggbetz • Feb 4 • 4
Article Argunauts Training Phase II: Selfplay Finetuning Line-By-Line By ggbetz • Feb 19 • 5
Article Argunauts Training Phase I: Continual Pretraining on Synthetic Data By ggbetz • Feb 18 • 2
Article Argunauts: Open LLMs that Master Argument Analysis with Argdown By ggbetz • Feb 14 • 7