Improve model card: Add GitHub link and more tags
#13
by nielsr (HF Staff) - opened

README.md CHANGED
@@ -1,14 +1,19 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - meta-llama/Llama-3.2-11B-Vision-Instruct
 datasets:
 - Xkev/LLaVA-CoT-100k
-
+language:
+- en
 library_name: transformers
+license: apache-2.0
+pipeline_tag: image-text-to-text
+tags:
+- llava
+- reasoning
+- vqa
 ---
+
 # Model Card for Model ID
 
 <!-- Provide a quick summary of what the model is/does. -->
@@ -24,6 +29,8 @@ The model was proposed in [LLaVA-CoT: Let Vision Language Models Reason Step-by-
 - **License:** apache-2.0
 - **Finetuned from model:** meta-llama/Llama-3.2-11B-Vision-Instruct
 
+**Code:** [https://github.com/PKU-YuanGroup/LLaVA-CoT](https://github.com/PKU-YuanGroup/LLaVA-CoT)
+
 ## Benchmark Results
 
 | MMStar | MMBench | MMVet | MathVista | AI2D | Hallusion | Average |
@@ -95,5 +102,5 @@ Using the same setting should accurately reproduce our results.
 
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
 
-The model may generate biased or offensive content, similar to other VLMs, due to limitations in the training data.
+The model may generate biased or offensive content, similar to other VLMs, due to limitations in the training data.
 Technically, the model's performance in aspects like instruction following still falls short of leading industry models.