MV-RAG: Retrieval Augmented Multiview Diffusion • Paper • 2508.16577 • Published 20 days ago • 36
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency • Paper • 2508.18265 • Published 17 days ago • 185
Skywork-UniPic2 • Collection • A Unified DiT Multimodal Model for Image Generation, Editing, and Understanding • 8 items • Updated 20 days ago • 10
SVDQuant • Collection • Models and datasets for "SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models" • 20 items • Updated May 29 • 61
Skywork UniPic: Unified Autoregressive Modeling for Visual Understanding and Generation • Paper • 2508.03320 • Published Aug 5 • 60
Kimi k1.5: Scaling Reinforcement Learning with LLMs • Paper • 2501.12599 • Published Jan 22 • 123
Skywork-UniPic • Collection • Unified Autoregressive Modeling for Visual Understanding and Generation • 2 items • Updated 29 days ago • 12
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models • Paper • 2502.10458 • Published Feb 12 • 37
Multimodal DeepResearcher: Generating Text-Chart Interleaved Reports From Scratch with Agentic Framework • Paper • 2506.02454 • Published Jun 3 • 6
CSVQA: A Chinese Multimodal Benchmark for Evaluating STEM Reasoning Capabilities of VLMs • Paper • 2505.24120 • Published May 30 • 49
ImgEdit: A Unified Image Editing Dataset and Benchmark • Paper • 2505.20275 • Published May 26 • 18
🌸 April 2025 - Open releases from the Chinese community • Collection • 42 items • Updated 10 days ago • 13
Skywork-VL Reward: An Effective Reward Model for Multimodal Understanding and Reasoning • Paper • 2505.07263 • Published May 12 • 30
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation • Paper • 2503.21979 • Published Mar 27 • 3
Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning • Paper • 2504.16656 • Published Apr 23 • 58