Improve model card: Add pipeline tag, library_name, paper, code, usage, and additional tags
#1 · opened by nielsr (HF Staff)
This PR significantly enhances the model card for `Senqiao/VisionThink-General` by:
- Adding `pipeline_tag: image-text-to-text` to enable better discoverability for multimodal tasks on the Hugging Face Hub.
- Specifying `library_name: transformers`, as the model is compatible with the Hugging Face Transformers library.
- Including additional relevant tags such as `vision-language-model`, `multimodal`, and `qwen`.
- Providing a detailed description of the model, summarizing its core contributions from the paper.
- Including a direct link to the official paper: VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning.
- Adding a direct link to the official GitHub repository for the code: https://github.com/dvlab-research/VisionThink.
- Incorporating key highlights of the model's capabilities.
- Adding installation instructions and a practical Python code snippet for quick inference using `transformers`.
- Including the citation and acknowledgement sections.
These additions will make the model more discoverable, informative, and user-friendly for researchers and practitioners.
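For reference, the metadata changes listed above would land in the model card's YAML front matter roughly as follows (a sketch assembled from this PR description, not the exact diff):

```yaml
---
pipeline_tag: image-text-to-text
library_name: transformers
tags:
  - vision-language-model
  - multimodal
  - qwen
---
```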
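The quick-inference snippet mentioned above could be sketched as follows. This is a minimal, hedged example: it assumes the checkpoint works with the Transformers `image-text-to-text` pipeline (the `pipeline_tag` this PR adds) and that the chat format follows the Qwen-style convention the model is based on; `build_messages` and `answer` are illustrative helper names, not part of the model's API.

```python
# Hypothetical quick-inference sketch for Senqiao/VisionThink-General.
# Assumes the "image-text-to-text" pipeline and a Qwen-style chat format.

MODEL_ID = "Senqiao/VisionThink-General"


def build_messages(image_url: str, question: str) -> list:
    """Build one user turn containing an image and a text question."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},
                {"type": "text", "text": question},
            ],
        }
    ]


def answer(image_url: str, question: str, max_new_tokens: int = 128) -> str:
    """Run quick inference. Requires `pip install transformers`, network
    access to download the weights, and ideally a GPU."""
    from transformers import pipeline  # deferred: keeps the helper importable

    pipe = pipeline("image-text-to-text", model=MODEL_ID)
    out = pipe(
        text=build_messages(image_url, question),
        max_new_tokens=max_new_tokens,
    )
    return out[0]["generated_text"]
```

Usage would look like `answer("http://images.cocodataset.org/val2017/000000039769.jpg", "Describe this image.")`; the deferred import keeps the message-building helper usable without the heavy dependency loaded.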