OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization Paper • 2506.18880 • Published Jun 23
ImpliRet: Benchmarking the Implicit Fact Retrieval Challenge Paper • 2506.14407 • Published Jun 17