Simon Kotchou

Simon-Kotchou

AI & ML interests

Self supervised learning, Computer vision

Recent Activity

upvoted a paper about 2 hours ago

Yume: An Interactive World Generation Model

Paper • 2507.17744 • Published 4 days ago • 65

upvoted a paper 3 days ago

Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities

Paper • 2503.03983 • Published Mar 6 • 25

liked a dataset 3 days ago

nvidia/PhysicalAI-Autonomous-Vehicle-Cosmos-Drive-Dreams

Updated Jun 15 • 14.8k • 17

liked a model 3 days ago

nvidia/Cosmos-Embed1-448p

1B • Updated Jun 10 • 2.34k • 1

upvoted a paper 10 days ago

Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control

Paper • 2503.14492 • Published Mar 18 • 20

upvoted a paper about 1 month ago

Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion

Paper • 2506.08009 • Published Jun 9 • 26

liked a Space about 1 month ago

Self Forcing Wan 2.1

🎥 Real-time video generation • 309

liked a dataset about 1 month ago

Lixsp11/Sekai-Project

Viewer • Updated about 1 month ago • 344k • 541 • 25

upvoted a paper about 1 month ago

Scaling Language-Free Visual Representation Learning

Paper • 2504.01017 • Published Apr 1 • 32

liked a model about 1 month ago

gdhe17/Self-Forcing

Text-to-Video • Updated Jun 12 • 106

liked 2 models about 2 months ago

facebook/vjepa2-vitg-fpc64-384

Video Classification • 1B • Updated Jun 17 • 6.29k • 27

facebook/webssl-dino7b-full8b-518

Image Feature Extraction • 6B • Updated Apr 24 • 24 • 12

upvoted 2 papers 3 months ago

Intuitive physics understanding emerges from self-supervised pretraining on natural videos

Paper • 2502.11831 • Published Feb 17 • 20

Perception Encoder: The best visual embeddings are not at the output of the network

Paper • 2504.13181 • Published Apr 17 • 35

liked a model 3 months ago

nari-labs/Dia-1.6B

Text-to-Speech • Updated Jun 1 • 59.2k • 2.66k

upvoted a paper 3 months ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20 • 146

liked a model 5 months ago

Wan-AI/Wan2.1-T2V-14B

Text-to-Video • Updated Mar 12 • 90.7k • 1.37k

upvoted a paper 5 months ago

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

Paper • 2502.10248 • Published Feb 14 • 56

liked a model 5 months ago

stepfun-ai/stepvideo-t2v

Text-to-Video • Updated Feb 19 • 122 • 469

liked a dataset 5 months ago

facebook/natural_reasoning

Viewer • Updated Feb 21 • 1.15M • 1.44k • 512
