plmsmile's Collections
mllm datasets

updated Jun 21, 2024

  • TextSquare: Scaling up Text-Centric Visual Instruction Tuning

    Paper • 2404.12803 • Published Apr 19, 2024 • 31

  • OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

    Paper • 2406.08418 • Published Jun 12, 2024 • 31

  • SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages

    Paper • 2406.10118 • Published Jun 14, 2024 • 33