Article Understanding Model Reasoning Through Thought Anchors: A Comparative Study of Qwen3 and DeepSeek-R1 By codelion • 5 days ago • 3
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2 • 173
REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards Paper • 2505.24760 • Published May 30 • 66
Article Fast LoRA inference for Flux with Diffusers and PEFT By sayakpaul and 1 other • 5 days ago • 24
Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning Paper • 2507.16784 • Published 5 days ago • 106
Article TimeScope: How Long Can Your Video Large Multimodal Model Go? By orrzohar and 3 others • 5 days ago • 27
Medical & Clinical NER Collection State-of-the-art medical, biomedical, and clinical Named Entity Recognition models • 389 items • Updated 9 days ago • 23
Article Unlocking Healthcare AI: I'm Releasing State-of-the-Art Medical Models for Free. Forever. By MaziyarPanahi • 11 days ago • 122
OpenReasoning-Nemotron Collection Collection of models for OpenReasoning-Nemotron which are trained on 5M reasoning traces for Math, Code and Science. • 6 items • Updated 6 days ago • 37
A Survey of Context Engineering for Large Language Models Paper • 2507.13334 • Published 10 days ago • 198
Seed-X Collection A powerful open-source multilingual translation language model series, including instruction and reasoning models. • 3 items • Updated 11 days ago • 59
Article OpenReasoning-Nemotron: A Family of State-of-the-Art Distilled Reasoning Models By nvidia and 3 others • 9 days ago • 45
Article Back to The Future: Evaluating AI Agents on Predicting Future Events By vinid and 6 others • 11 days ago • 26
Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs Paper • 2507.09477 • Published 15 days ago • 73
EXAONE 4.0: Unified Large Language Models Integrating Non-reasoning and Reasoning Modes Paper • 2507.11407 • Published 12 days ago • 49
Article ScreenEnv: Deploy your full stack Desktop Agent By A-Mahla and 1 other • 18 days ago • 53
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination Paper • 2507.10532 • Published 13 days ago • 78