---
title: README
emoji: ๐Ÿ‘
colorFrom: green
colorTo: purple
sdk: static
pinned: false
---

Fine-grained evaluation of Large Reasoning Models that fail because of reasoning rigidity.
ConditionedMath (AIME & MATH500) · PuzzleTrivial · Zero-shot pipelines

---

## 📜 Why ReasoningTrap?

> Current RL-tuned reasoning LLMs excel at *producing* answers but often ignore explicit user constraints.
> **ReasoningTrap** surfaces these failure modes with carefully crafted, *conditioned* problems.

* **Modified from famous math reasoning benchmarks** – AIME & MATH500 problems altered with minimal constraints that divert the usual reasoning path.
* **Puzzles trivialized by subtle modifications** – Well-known puzzles where a small change turns a challenging problem into a trivial one.
* **Plug-and-play** – Evaluate any 🤗 Transformers model with vLLM by following a few simple instructions.
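
To make the plug-and-play idea concrete, here is a minimal sketch of what a zero-shot evaluation loop with vLLM could look like. It is an illustration under stated assumptions, not the project's actual pipeline: the helper names (`build_prompt`, `extract_boxed`, `accuracy`, `evaluate`), the `question`/`answer` field names, the `\boxed{}` answer format, and exact-match scoring are all hypothetical. It also sends raw prompts to the model, whereas many reasoning models expect a chat template.

```python
from typing import Dict, List, Optional


def build_prompt(question: str) -> str:
    """Zero-shot prompt: the conditioned problem plus an answer-format hint (assumed format)."""
    return (
        f"{question}\n\n"
        "Please reason step by step, and put your final answer within \\boxed{}."
    )


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in `text`, allowing nested braces."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    chars: List[str] = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars).strip()
        chars.append(ch)
    return None  # braces never closed


def accuracy(predictions: List[Optional[str]], references: List[str]) -> float:
    """Exact-match accuracy; a missing prediction counts as wrong."""
    if not references:
        return 0.0
    correct = sum(
        pred is not None and pred.strip() == ref.strip()
        for pred, ref in zip(predictions, references)
    )
    return correct / len(references)


def evaluate(model_name: str, problems: List[Dict[str, str]]) -> float:
    """Generate one answer per problem with vLLM and score it (requires a GPU and vLLM)."""
    from vllm import LLM, SamplingParams  # imported lazily so the helpers work without vLLM

    llm = LLM(model=model_name)
    params = SamplingParams(temperature=0.0, max_tokens=4096)  # assumed decoding settings
    prompts = [build_prompt(p["question"]) for p in problems]  # hypothetical field name
    outputs = llm.generate(prompts, params)
    predictions = [extract_boxed(o.outputs[0].text) for o in outputs]
    return accuracy(predictions, [p["answer"] for p in problems])  # hypothetical field name
```

Calling `evaluate("<model-id>", problems)` with a list of `{"question": ..., "answer": ...}` dictionaries would return the fraction of problems answered correctly under these assumptions. A real setup would likely need answer normalization beyond exact string match.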