Yizhuo Li

liyz

authored a paper 3 months ago

Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies

Paper • 2508.20072 • Published Aug 27 • 31

authored 10 papers 6 months ago

Aligning Latent Spaces with Flow Priors

Paper • 2506.05240 • Published Jun 5 • 27

Unmasked Teacher: Towards Training-Efficient Video Foundation Models

Paper • 2303.16058 • Published Mar 28, 2023

VideoChat: Chat-Centric Video Understanding

Paper • 2305.06355 • Published May 10, 2023 • 3

Harvest Video Foundation Models via Efficient Post-Pretraining

Paper • 2310.19554 • Published Oct 30, 2023

MVBench: A Comprehensive Multi-modal Video Understanding Benchmark

Paper • 2311.17005 • Published Nov 28, 2023 • 2

InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation

Paper • 2307.06942 • Published Jul 13, 2023 • 23

UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer

Paper • 2211.09552 • Published Nov 17, 2022

InternVideo: General Video Foundation Models via Generative and Discriminative Learning

Paper • 2212.03191 • Published Dec 6, 2022

DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models

Paper • 2412.04446 • Published Dec 5, 2024

AnimeShooter: A Multi-Shot Animation Dataset for Reference-Guided Video Generation

Paper • 2506.03126 • Published Jun 3 • 22

authored 2 papers 12 months ago

Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation

Paper • 2412.04432 • Published Dec 5, 2024 • 16

Moto: Latent Motion Token as the Bridging Language for Robot Manipulation

Paper • 2412.04445 • Published Dec 5, 2024 • 23