IR3D-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering Paper • 2506.23329 • Published about 1 month ago • 5
JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent Paper • 2506.17612 • Published Jun 21 • 61
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning Paper • 2505.17667 • Published May 23 • 89
meta-llama/Llama-4-Scout-17B-16E-Instruct Image-Text-to-Text • 109B • Updated May 22 • 701k • 1.03k
Chat with DeepSeek-VL2-small 🌍 Generate responses using images and text input • 507
X^{2}-Gaussian: 4D Radiative Gaussian Splatting for Continuous-time Tomographic Reconstruction Paper • 2503.21779 • Published Mar 27 • 4
A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor? Paper • 2409.15277 • Published Sep 23, 2024 • 39
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation Paper • 2408.12528 • Published Aug 22, 2024 • 52