Commit 4747172 by agentlans · verified · 1 Parent(s): 8087b42

Update README.md

Files changed (1)
  1. README.md +26 -0
README.md CHANGED
@@ -124,6 +124,32 @@ These sentences were randomly selected from the [agentlans/high-quality-english-
  3. **Avoid extremely short inputs:** Single words or one-word answers rarely produce useful questions.
  4. **Check generated questions:** While the model is powerful, review outputs for accuracy and relevance, especially for educational or professional use.

+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - optimizer: AdamW (`OptimizerNames.ADAMW_TORCH_FUSED`) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
+ - lr_scheduler_type: linear
+ - num_epochs: 20.0
+
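To make the `lr_scheduler_type: linear` setting concrete, here is a minimal sketch of a linear learning-rate schedule starting from the listed `learning_rate` of 5e-05. The README does not state a warmup length or the total number of optimizer steps, so `warmup_steps=0` and the step counts in the usage lines are assumptions for illustration only. The function `linear_lr` is a hypothetical helper, not part of any library.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    `warmup_steps=0` is an assumption; the README does not list a warmup value.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical run of 1000 optimizer steps (the real step count is not given).
start_lr = linear_lr(0, 1000)
mid_lr = linear_lr(500, 1000)
end_lr = linear_lr(1000, 1000)
```

Under these assumptions the rate starts at the full 5e-05, is halved at the midpoint, and reaches zero at the final step.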
140
+ ### Training results
141
+
142
+ The model was trained for 20 epochs on over 153k samples, processing 221M tokens. It achieved a training loss of 0.64 and an evaluation loss of 1.30.
143
+
144
+ Training was efficient, with ~385 samples/sec and ~27k tokens/sec, and evaluation ran at ~820 samples/sec.
145
+
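The reported figures can be cross-checked with some back-of-the-envelope arithmetic. The sketch below rests on two assumptions that the README does not confirm: that the losses are mean per-token cross-entropy in nats (so perplexity is `exp(loss)`), and that the 221M tokens are the total processed across the whole run. All inputs are the rounded values from the text, so the results are rough estimates only.

```python
import math

# Rounded figures quoted in the README.
train_loss = 0.64
eval_loss = 1.30
total_tokens = 221e6      # assumed to be the total over the whole run
tokens_per_sec = 27e3
samples_per_sec = 385


def perplexity(loss: float) -> float:
    """Perplexity, assuming `loss` is mean per-token cross-entropy in nats."""
    return math.exp(loss)


train_ppl = perplexity(train_loss)
eval_ppl = perplexity(eval_loss)

# Average tokens per sample implied by the two throughput figures.
tokens_per_sample = tokens_per_sec / samples_per_sec

# Rough wall-clock training time implied by total tokens and token throughput.
train_hours = total_tokens / tokens_per_sec / 3600

print(f"train perplexity ~ {train_ppl:.2f}, eval perplexity ~ {eval_ppl:.2f}")
print(f"~{tokens_per_sample:.0f} tokens/sample, ~{train_hours:.1f} h of training")
```

Under these assumptions the evaluation perplexity comes out near 3.7, samples average about 70 tokens, and the run takes a little over two hours.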
146
+ ### Framework versions
147
+
148
+ - Transformers 4.57.1
149
+ - Pytorch 2.9.0+cu128
150
+ - Datasets 4.3.0
151
+ - Tokenizers 0.22.1
152
+
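When trying to reproduce the run, it can help to compare a local environment against the versions listed above. This is a minimal sketch using the standard library's `importlib.metadata`. The distribution names in `EXPECTED` (for example `torch` for PyTorch) are assumptions about how the packages were installed, and `compare_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in the README; distribution names are assumed PyPI names.
EXPECTED = {
    "transformers": "4.57.1",
    "torch": "2.9.0+cu128",
    "datasets": "4.3.0",
    "tokenizers": "0.22.1",
}


def compare_versions(expected: dict) -> dict:
    """Map each package to (wanted, installed or None, exact match?)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


report = compare_versions(EXPECTED)
for name, (wanted, installed, ok) in report.items():
    status = "ok" if ok else "differs"
    print(f"{name}: wanted {wanted}, found {installed} ({status})")
```

An exact string match is deliberately strict. A different CUDA build tag such as `+cu128` will be reported as a difference even when the base version is the same.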
  ## Licence

  Apache 2.0