jingyiZ00 commited on
Commit
85b9b9e
·
verified ·
1 Parent(s): daae4f3

Create README.md

Browse files
Files changed (1) hide show
  1. README.md +15 -0
README.md ADDED
@@ -0,0 +1,15 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: apache-2.0
3
+ datasets:
4
+ - HuanjinYao/Mulberry-SFT
5
+ base_model:
6
+ - Qwen/Qwen2-VL-2B-Instruct
7
+ pipeline_tag: image-text-to-text
8
+ library_name: transformers
9
+ ---
10
+ # R1-VL-2B
11
+ R1-VL-2B is a reasoning model trained with step-wise group relative policy optimization (StepGRPO).
12
+
13
+ ### Paper: https://arxiv.org/pdf/2503.12937
14
+ ### Github: https://github.com/jingyi0000/R1-VL
15
+ ### Base model: https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct