lbourdois committed
Commit 18e7b7f · verified · 1 Parent(s): 7e49de0

Improve language tag

Hi! As the model is multilingual, this PR adds languages other than English to the language tag to improve referencing. Note that 29 languages are announced in the README, but only 13 are explicitly listed, so I was only able to add those 13 languages.
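For context, these tags feed the Hub's language filter, so tagged models show up in language-scoped searches. A minimal sketch of what that enables, assuming a recent `huggingface_hub` client where `list_models` accepts `language` and `limit` parameters:

```python
# Minimal sketch: once language tags are in the model card metadata,
# the model becomes discoverable through the Hub's language filter.
# Assumes a recent huggingface_hub release.
from huggingface_hub import HfApi

api = HfApi()
# "fra" is one of the 13 tags added in this PR; any of them works the same way.
for model in api.list_models(language="fra", limit=5):
    print(model.id)
```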

Files changed (1):
1. README.md  +71 -57
README.md CHANGED
@@ -8,65 +8,79 @@ tags:
 - trl
 - sft
 - generated_from_trainer
+language:
+- zho
+- eng
+- fra
+- spa
+- por
+- deu
+- ita
+- rus
+- jpn
+- kor
+- vie
+- tha
+- ara
 model-index:
 - name: trained_model
   results: []
 ---
-
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-
-# trained_model
-
-This model is a fine-tuned version of [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) on the generator dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.5432
-- Bertscore Precision: 0.9305
-- Bertscore Recall: 0.9338
-- Bertscore F1: 0.9321
-
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
-## Training procedure
-
-### Training hyperparameters
-
-The following hyperparameters were used during training:
-- learning_rate: 0.0001
-- train_batch_size: 2
-- eval_batch_size: 2
-- seed: 42
-- gradient_accumulation_steps: 8
-- total_train_batch_size: 16
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- num_epochs: 5
-
-### Training results
-
-| Training Loss | Epoch  | Step | Validation Loss | Bertscore Precision | Bertscore Recall | Bertscore F1 |
-|:-------------:|:------:|:----:|:---------------:|:-------------------:|:----------------:|:------------:|
-| No log        | 0.9664 | 18   | 1.1003          | 0.8802              | 0.8897           | 0.8849       |
-| 1.7123        | 1.9866 | 37   | 0.6787          | 0.9207              | 0.9228           | 0.9218       |
-| 1.7123        | 2.9530 | 55   | 0.5895          | 0.9300              | 0.9330           | 0.9315       |
-| 0.5828        | 3.9732 | 74   | 0.5516          | 0.9330              | 0.9355           | 0.9342       |
-| 0.4501        | 4.8322 | 90   | 0.5432          | 0.9305              | 0.9338           | 0.9321       |
-
-
-### Framework versions
-
-- PEFT 0.13.0
-- Transformers 4.45.1
-- Pytorch 2.5.1+cpu
-- Datasets 3.0.1
+
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+
+# trained_model
+
+This model is a fine-tuned version of [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) on the generator dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.5432
+- Bertscore Precision: 0.9305
+- Bertscore Recall: 0.9338
+- Bertscore F1: 0.9321
+
+## Model description
+
+More information needed
+
+## Intended uses & limitations
+
+More information needed
+
+## Training and evaluation data
+
+More information needed
+
+## Training procedure
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 2
+- eval_batch_size: 2
+- seed: 42
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 16
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 5
+
+### Training results
+
+| Training Loss | Epoch  | Step | Validation Loss | Bertscore Precision | Bertscore Recall | Bertscore F1 |
+|:-------------:|:------:|:----:|:---------------:|:-------------------:|:----------------:|:------------:|
+| No log        | 0.9664 | 18   | 1.1003          | 0.8802              | 0.8897           | 0.8849       |
+| 1.7123        | 1.9866 | 37   | 0.6787          | 0.9207              | 0.9228           | 0.9218       |
+| 1.7123        | 2.9530 | 55   | 0.5895          | 0.9300              | 0.9330           | 0.9315       |
+| 0.5828        | 3.9732 | 74   | 0.5516          | 0.9330              | 0.9355           | 0.9342       |
+| 0.4501        | 4.8322 | 90   | 0.5432          | 0.9305              | 0.9338           | 0.9321       |
+
+
+### Framework versions
+
+- PEFT 0.13.0
+- Transformers 4.45.1
+- Pytorch 2.5.1+cpu
+- Datasets 3.0.1
 - Tokenizers 0.20.0
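Since the card lists PEFT among the framework versions, the checkpoint is presumably an adapter on top of the base model rather than full weights. A minimal loading sketch, assuming a hypothetical repo id `your-username/trained_model` (the card does not give the actual one):

```python
# Minimal sketch of loading the PEFT adapter on top of the base model.
# "your-username/trained_model" is a hypothetical repo id; substitute the real one.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
model = PeftModel.from_pretrained(base, "your-username/trained_model")

# The language tags above suggest prompts in any of the 13 listed languages.
inputs = tokenizer("Bonjour, comment ça va ?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```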