3. **Avoid extremely short inputs:** Single words or one-word answers rarely produce useful questions.
4. **Check generated questions:** While the model is powerful, review outputs for accuracy and relevance, especially for educational or professional use.
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 20.0
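
The hyperparameters above can be collected into a plain Python mapping. The key names below follow the Hugging Face `TrainingArguments` fields they appear to correspond to — an assumption about how the run was configured, not a verbatim copy of the training script:

```python
# Sketch of the training configuration above, keyed by assumed
# transformers.TrainingArguments field names.
training_config = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch_fused",  # fused AdamW optimizer
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 20.0,
}
```

With `transformers` installed, this could be unpacked as `TrainingArguments(output_dir="out", **training_config)` to set up a comparable run.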
### Training results
The model was trained for 20 epochs on over 153k samples, processing 221M tokens. It achieved a training loss of 0.64 and an evaluation loss of 1.30.
Training was efficient: throughput was roughly 385 samples/sec (~27k tokens/sec), and evaluation ran at about 820 samples/sec.
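
As a back-of-envelope consistency check (using only the approximate figures quoted above), the sample and token throughputs imply a similar overall training time:

```python
# Rough wall-clock estimate from the reported figures; all inputs are
# approximate values taken from the text above.
samples = 153_000            # "over 153k samples"
epochs = 20
samples_per_sec = 385
tokens = 221_000_000
tokens_per_sec = 27_000

seconds_by_samples = samples * epochs / samples_per_sec  # ~7,950 s
seconds_by_tokens = tokens / tokens_per_sec              # ~8,190 s
print(f"~{seconds_by_samples / 3600:.1f} h vs ~{seconds_by_tokens / 3600:.1f} h")
```

Both estimates land near 2.2 hours, so the two throughput figures agree with each other.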
### Framework versions
- Transformers 4.57.1
- PyTorch 2.9.0+cu128
- Datasets 4.3.0
- Tokenizers 0.22.1
## Licence
Apache 2.0