Zesen Cheng
ClownRat
AI & ML interests
Multi-modal foundation models; segmentation, detection, and tracking
Recent Activity
liked
a dataset
9 days ago
OpenGVLab/VideoChat-Flash-Training-Data
liked
a Space
17 days ago
lixin4ever/VideoRefer-VideoLLaMA3
upvoted
a paper
about 2 months ago
Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning