Manan Shah

cs-mshah

https://cs-mshah.github.io/

AI & ML interests

Computer Vision

Recent Activity

upvoted a paper about 24 hours ago

Choreographing a World of Dynamic Objects

Paper • 2601.04194 • Published 3 days ago • 10

upvoted a paper 3 days ago

VINCIE: Unlocking In-context Image Editing from Video

Paper • 2506.10941 • Published Jun 12, 2025 • 4

upvoted an article 4 days ago

Generalist Robot Policy Evaluation in Simulation with NVIDIA Isaac Lab-Arena and LeRobot

Article • 5 days ago

upvoted an article 5 days ago

NVIDIA Cosmos Reason 2 Brings Advanced Reasoning To Physical AI

Article • 5 days ago

upvoted a paper 6 days ago

Evaluating Parameter Efficient Methods for RLVR

Paper • 2512.23165 • Published 13 days ago • 24

upvoted 2 papers 11 days ago

ProEdit: Inversion-based Editing From Prompts Done Right

Paper • 2512.22118 • Published 15 days ago • 17

LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation

Paper • 2512.23576 • Published 12 days ago • 64

upvoted a paper 14 days ago

Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models

Paper • 2512.20557 • Published 18 days ago • 49

upvoted an article 16 days ago

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

Article • Jun 3, 2025 • 306

upvoted an article about 1 month ago

We Got Claude to Fine-Tune an Open Source LLM

Article • Dec 4, 2025 • 570

upvoted an article about 2 months ago

Continuous batching from first principles

Article • Nov 25, 2025 • 300

upvoted 2 collections about 2 months ago

MetaCLIP2 Multilingual

Collection

8 items • Updated Nov 12, 2025 • 16

📄 FinePDFs

Collection

82 items • Updated about 18 hours ago • 26

upvoted a paper 3 months ago

Robot Learning: A Tutorial

Paper • 2510.12403 • Published Oct 14, 2025 • 120

upvoted an article 4 months ago

Metric and Relative Monocular Depth Estimation: An Overview. Fine-Tuning Depth Anything V2 👐 📚

Article • Jul 10, 2024

upvoted 2 papers 4 months ago

Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents

Paper • 2507.04009 • Published Jul 5, 2025 • 51

StableAvatar: Infinite-Length Audio-Driven Avatar Video Generation

Paper • 2508.08248 • Published Aug 11, 2025 • 27

upvoted an article 6 months ago

Efficient MultiModal Data Pipeline

Article • Jul 8, 2025

upvoted an article 7 months ago

GRPO for GUI Grounding Done Right

Article • Jun 11, 2025

upvoted a paper 8 months ago

LightLab: Controlling Light Sources in Images with Diffusion Models

Paper • 2505.09608 • Published May 14, 2025 • 36
