MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds
Paper
•
2508.14879
•
Published
•
68
VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space
Paper
•
2508.19247
•
Published
•
43
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels
Paper
•
2508.17437
•
Published
•
38
Multi-View 3D Point Tracking
Paper
•
2508.21060
•
Published
•
23
SpatialVID: A Large-Scale Video Dataset with Spatial Annotations
Paper
•
2509.09676
•
Published
•
33
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling
Paper
•
2509.12201
•
Published
•
105
3D-LLM: Injecting the 3D World into Large Language Models
Paper
•
2307.12981
•
Published
•
38
ARTDECO: Towards Efficient and High-Fidelity On-the-Fly 3D Reconstruction with Structured Scene Representation
Paper
•
2510.08551
•
Published
•
33
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation
Paper
•
2510.08673
•
Published
•
125
Seed3D 1.0: From Images to High-Fidelity Simulation-Ready 3D Assets
Paper
•
2510.19944
•
Published
•
19
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations
Paper
•
2510.23607
•
Published
•
177
Error-Driven Scene Editing for 3D Grounding in Large Language Models
Paper
•
2511.14086
•
Published
•
6
Depth Anything 3: Recovering the Visual Space from Any Views
Paper
•
2511.10647
•
Published
•
96
Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning
Paper
•
2510.27606
•
Published
•
28
NaTex: Seamless Texture Generation as Latent Color Diffusion
Paper
•
2511.16317
•
Published
•
15
MiMo-Embodied: X-Embodied Foundation Model Technical Report
Paper
•
2511.16518
•
Published
•
25
Lotus-2: Advancing Geometric Dense Prediction with Powerful Image Generative Model
Paper
•
2512.01030
•
Published
•
19
DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling
Paper
•
2512.03000
•
Published
•
36
SIMA 2: A Generalist Embodied Agent for Virtual Worlds
Paper
•
2512.04797
•
Published
•
24
Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image
Paper
•
2512.05044
•
Published
•
16
ProPhy: Progressive Physical Alignment for Dynamic World Simulation
Paper
•
2512.05564
•
Published
•
5
Voxify3D: Pixel Art Meets Volumetric Rendering
Paper
•
2512.07834
•
Published
•
43
MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos
Paper
•
2512.10881
•
Published
•
29
SS4D: Native 4D Generative Model via Structured Spacetime Latents
Paper
•
2512.14284
•
Published
•
13
Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation
Paper
•
2512.16913
•
Published
•
33
WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling
Paper
•
2512.14614
•
Published
•
67
Towards Seamless Interaction: Causal Turn-Level Modeling of Interactive 3D Conversational Head Dynamics
Paper
•
2512.15340
•
Published
•
2
GroundingME: Exposing the Visual Grounding Gap in MLLMs through Multi-Dimensional Evaluation
Paper
•
2512.17495
•
Published
•
19
3D-RE-GEN: 3D Reconstruction of Indoor Scenes with a Generative Framework
Paper
•
2512.17459
•
Published
•
11
4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation
Paper
•
2512.17012
•
Published
•
42
PhysBrain: Human Egocentric Data as a Bridge from Vision Language Models to Physical Intelligence
Paper
•
2512.16793
•
Published
•
72
MatSpray: Fusing 2D Material World Knowledge on 3D Geometry
Paper
•
2512.18314
•
Published
•
8
QuantiPhy: A Quantitative Benchmark Evaluating Physical Reasoning Abilities of Vision-Language Models
Paper
•
2512.19526
•
Published
•
10