Improve model card: Add GitHub link and more tags

#13
by nielsr (HF Staff) - opened
Files changed (1)
README.md +12 -5
README.md CHANGED
@@ -1,14 +1,19 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - meta-llama/Llama-3.2-11B-Vision-Instruct
 datasets:
 - Xkev/LLaVA-CoT-100k
-pipeline_tag: image-text-to-text
+language:
+- en
 library_name: transformers
+license: apache-2.0
+pipeline_tag: image-text-to-text
+tags:
+- llava
+- reasoning
+- vqa
 ---
+
 # Model Card for Model ID
 
 <!-- Provide a quick summary of what the model is/does. -->
@@ -24,6 +29,8 @@ The model was proposed in [LLaVA-CoT: Let Vision Language Models Reason Step-by-
 - **License:** apache-2.0
 - **Finetuned from model:** meta-llama/Llama-3.2-11B-Vision-Instruct
 
+**Code:** [https://github.com/PKU-YuanGroup/LLaVA-CoT](https://github.com/PKU-YuanGroup/LLaVA-CoT)
+
 ## Benchmark Results
 
 | MMStar | MMBench | MMVet | MathVista | AI2D | Hallusion | Average |
@@ -95,5 +102,5 @@ Using the same setting should accurately reproduce our results.
 
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
 
-The model may generate biased or offensive content, similar to other VLMs, due to limitations in the training data.
+The model may generate biased or offensive content, similar to other VLMs, due to limitations in the training data.
 Technically, the model's performance in aspects like instruction following still falls short of leading industry models.
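
For reference, the model card's YAML frontmatter after this change should resolve to the block below. This is reconstructed by applying the diff above; every key and value comes from the diff itself, with keys in the order shown in the new revision:

---
base_model:
- meta-llama/Llama-3.2-11B-Vision-Instruct
datasets:
- Xkev/LLaVA-CoT-100k
language:
- en
library_name: transformers
license: apache-2.0
pipeline_tag: image-text-to-text
tags:
- llava
- reasoning
- vqa
---

The net effect is that no metadata is lost: license, language, and pipeline_tag are moved into alphabetical key order, and the new tags plus the pipeline_tag control how the Hub indexes and surfaces the model.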