kiennt's Collections

Vision Language Model

updated Nov 12, 2023

  • LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing

    Paper • 2311.00571 • Published Nov 1, 2023 • 43

  • Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?

    Paper • 2311.00047 • Published Oct 31, 2023 • 10

  • From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models

    Paper • 2310.08825 • Published Oct 13, 2023 • 1

  • On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving

    Paper • 2311.05332 • Published Nov 9, 2023 • 13

  • LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents

    Paper • 2311.05437 • Published Nov 9, 2023 • 51