Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning Paper • 2507.05255 • Published 23 days ago • 68
Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning Paper • 2507.05255 • Published 23 days ago • 68
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time Paper • 2505.24863 • Published May 30 • 96
Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets Paper • 2505.07747 • Published May 12 • 61
Perception-R1: Pioneering Perception Policy with Reinforcement Learning Paper • 2504.07954 • Published Apr 10
Step1X-Edit: A Practical Framework for General Image Editing Paper • 2504.17761 • Published Apr 24 • 93
Step1X-Edit: A Practical Framework for General Image Editing Paper • 2504.17761 • Published Apr 24 • 93
Taming Teacher Forcing for Masked Autoregressive Video Generation Paper • 2501.12389 • Published Jan 21 • 10
Taming Teacher Forcing for Masked Autoregressive Video Generation Paper • 2501.12389 • Published Jan 21 • 10
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model Paper • 2409.01704 • Published Sep 3, 2024 • 84
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model Paper • 2409.01704 • Published Sep 3, 2024 • 84
Exploring Recurrent Long-term Temporal Fusion for Multi-view 3D Perception Paper • 2303.05970 • Published Mar 10, 2023
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation Paper • 2406.16855 • Published Jun 24, 2024 • 58
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation Paper • 2406.16855 • Published Jun 24, 2024 • 58
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation Paper • 2406.16855 • Published Jun 24, 2024 • 58 • 4