Xiaoji Zheng

Student-Xiaoji

https://www.zhihu.com/people/dong-dong-dong-49-89-76

SEU-zxj

AI & ML interests

None yet

Recent Activity

liked a model 7 days ago

Qwen/Qwen2.5-VL-7B-Instruct

Organizations

None yet

upvoted a paper about 12 hours ago

F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions

Paper • 2509.06951 • Published 3 days ago • 22

upvoted a paper 8 days ago

From reactive to cognitive: brain-inspired spatial intelligence for embodied agents

Paper • 2508.17198 • Published 18 days ago • 8

upvoted a paper 23 days ago

Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model

Paper • 2508.13009 • Published 24 days ago • 24

upvoted 2 papers 28 days ago

VertexRegen: Mesh Generation with Continuous Level of Detail

Paper • 2508.09062 • Published 30 days ago • 35

Matrix-3D: Omnidirectional Explorable 3D World Generation

Paper • 2508.08086 • Published about 1 month ago • 70

upvoted 3 papers 29 days ago

Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning

Paper • 2508.08221 • Published about 1 month ago • 45

Reinforcement Learning in Vision: A Survey

Paper • 2508.08189 • Published about 1 month ago • 27

A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

Paper • 2508.07407 • Published Aug 10 • 92

upvoted a paper about 1 month ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7 • 175

upvoted 2 papers about 2 months ago

TokensGen: Harnessing Condensed Tokens for Long Video Generation

Paper • 2507.15728 • Published Jul 21 • 7

Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Paper • 2507.13344 • Published Jul 17 • 56

upvoted 5 papers 2 months ago

Scaling RL to Long Videos

Paper • 2507.07966 • Published Jul 10 • 157

A Survey on Vision-Language-Action Models for Autonomous Driving

Paper • 2506.24044 • Published Jun 30 • 14

DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge

Paper • 2507.04447 • Published Jul 6 • 43

MemOS: A Memory OS for AI System

Paper • 2507.03724 • Published Jul 4 • 153

Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact

Paper • 2507.00951 • Published Jul 1 • 23

upvoted 3 papers 3 months ago

WorldVLA: Towards Autoregressive Action World Model

Paper • 2506.21539 • Published Jun 26 • 39

MADrive: Memory-Augmented Driving Scene Modeling

Paper • 2506.21520 • Published Jun 26 • 36

V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning

Paper • 2506.09985 • Published Jun 11 • 30