MRAGU lbourdois committed on
Commit 53950c2 · verified · 1 Parent(s): 4a8b5a5

Improve language tag (#1)


- Improve language tag (6797c83afdd7b372f9e41b936f51cc12ab328d71)


Co-authored-by: Loïck BOURDOIS <[email protected]>

Files changed (1)
  1. README.md +65 -53
README.md CHANGED
@@ -1,54 +1,66 @@
- ---
- license: mit
- language:
- - vi
- datasets:
- - 5CD-AI/Vietnamese-cosmos-qa-gg-translated
- base_model:
- - Qwen/Qwen2.5-0.5B
- library_name: transformers
- tags:
- - text-generation-inference
- ---
-
- <div align="center">
- <img src="https://github.com/bloomifycafe/blossomsAI/blob/main/assets/logo.png?raw=true" alt="Logo"/>
- </div>
- </br>
- <div align="center">
-
- # 🌟 BloomVN-0.5B-ppo
-
- </div>
-
- ### A fine-tuned multilingual model for Vietnamese language
-
- ## 📋 Overview
-
- This model serves as a small-scale experiment (0.5B parameters) testing the Reinforcement Learning capabilities of veRL framework. The implementation uses PPO (Proximal Policy Optimization) method on a limited training dataset to evaluate veRL's performance and training behavior.
-
- ## 🔧 Method
-
- The experimentation process was conducted using [veRL](https://github.com/volcengine/verl), focusing on:
- - Implementation of PPO algorithm with a 0.5B parameter model
- - Running training experiments on a small dataset
- - Testing veRL's framework capabilities in handling RL tasks
- - Evaluating training efficiency and model behavior
-
- This lightweight approach allowed us to assess veRL's performance in a controlled, small-scale environment.
-
- ## 📊 VLMU Benchmark
-
- | EVALUATION DATE | STEM 🔬 | SOCIAL SCIENCE 🌍 | HUMANITIES 📚 | OTHERS 🎯 | AVG ⭐ |
- |----------------|--------|------------------|---------------|-----------|--------|
- | 07/02/2025 | 23.18 | 32.84 | 32.71 | 33.67 | 29.43 |
-
-
- ## 🤝 Contributors
-
- Developed with ❤️ by [BlossomAI](https://github.com/BlossomAI)
-
- ---
- <div align="center">
- <sub>Star ⭐️ this repo if you find it valuable!</sub>
+ ---
+ license: mit
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ datasets:
+ - 5CD-AI/Vietnamese-cosmos-qa-gg-translated
+ base_model:
+ - Qwen/Qwen2.5-0.5B
+ library_name: transformers
+ tags:
+ - text-generation-inference
+ ---
+
+ <div align="center">
+ <img src="https://github.com/bloomifycafe/blossomsAI/blob/main/assets/logo.png?raw=true" alt="Logo"/>
+ </div>
+ </br>
+ <div align="center">
+
+ # 🌟 BloomVN-0.5B-ppo
+
+ </div>
+
+ ### A fine-tuned multilingual model for Vietnamese language
+
+ ## 📋 Overview
+
+ This model serves as a small-scale experiment (0.5B parameters) testing the Reinforcement Learning capabilities of veRL framework. The implementation uses PPO (Proximal Policy Optimization) method on a limited training dataset to evaluate veRL's performance and training behavior.
+
+ ## 🔧 Method
+
+ The experimentation process was conducted using [veRL](https://github.com/volcengine/verl), focusing on:
+ - Implementation of PPO algorithm with a 0.5B parameter model
+ - Running training experiments on a small dataset
+ - Testing veRL's framework capabilities in handling RL tasks
+ - Evaluating training efficiency and model behavior
+
+ This lightweight approach allowed us to assess veRL's performance in a controlled, small-scale environment.
+
+ ## 📊 VLMU Benchmark
+
+ | EVALUATION DATE | STEM 🔬 | SOCIAL SCIENCE 🌍 | HUMANITIES 📚 | OTHERS 🎯 | AVG ⭐ |
+ |----------------|--------|------------------|---------------|-----------|--------|
+ | 07/02/2025 | 23.18 | 32.84 | 32.71 | 33.67 | 29.43 |
+
+
+ ## 🤝 Contributors
+
+ Developed with ❤️ by [BlossomAI](https://github.com/BlossomAI)
+
+ ---
+ <div align="center">
+ <sub>Star ⭐️ this repo if you find it valuable!</sub>
  </div>
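
The only substantive change in this diff is the `language` list in the YAML front matter: the single two-letter tag `vi` is replaced by thirteen three-letter codes. As a rough illustration (not part of the commit), the sketch below checks a list of language tags against those thirteen codes and maps each one back to its two-letter ISO 639-1 equivalent. The names `ISO_639_3_TO_1` and `check_language_tags` are hypothetical, and the mapping table is an assumption limited to the codes that appear in this diff.

```python
# Hypothetical helper, not part of the commit: checks the language tags
# introduced by this diff and maps them to two-letter ISO 639-1 codes.

# Mapping limited to the thirteen three-letter codes added in the new front matter.
ISO_639_3_TO_1 = {
    "zho": "zh", "eng": "en", "fra": "fr", "spa": "es", "por": "pt",
    "deu": "de", "ita": "it", "rus": "ru", "jpn": "ja", "kor": "ko",
    "vie": "vi", "tha": "th", "ara": "ar",
}


def check_language_tags(tags):
    """Return (two_letter_codes, unknown_tags) for a list of language tags."""
    known, unknown = [], []
    for tag in tags:
        code = ISO_639_3_TO_1.get(tag.strip().lower())
        if code is None:
            # Tag is not one of the three-letter codes used in this diff.
            unknown.append(tag)
        else:
            known.append(code)
    return known, unknown


# The list as it appears on the "+" side of the diff.
new_tags = ["zho", "eng", "fra", "spa", "por", "deu", "ita",
            "rus", "jpn", "kor", "vie", "tha", "ara"]
known, unknown = check_language_tags(new_tags)
print(known)
print(unknown)
```

Passing the old value `["vi"]` to the same helper reports it as unknown, since the table only recognizes the three-letter form.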