nasos10 committed on
Commit 7c59473 · verified · 1 Parent(s): 7ed2c14

Update README.md

Files changed (1)
  1. README.md +20 -3
README.md CHANGED
@@ -1,3 +1,20 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ datasets:
+ - openai/gsm8k
+ base_model:
+ - meta-llama/Meta-Llama-3-8B
+ library_name: transformers
+ ---
+
+ #### MuToR: Multi-Token prediction with Registers
+ Arxiv: [https://arxiv.org/abs/2505.10518](https://arxiv.org/abs/2505.10518)
+
+ **TL;DR**: **MuToR** is a simple, plug-and-play approach for multi-token prediction.
+ It leverages dummy register tokens to predict multiple future targets, enriching the supervisory signal and improving performance across diverse settings and modalities. The register tokens are discarded at inference, leaving generation speed unchanged.
+
+ ---
+
+ #### Model Description
+ This model is a finetuned version of **Llama 3 8B**. It was finetuned using the MuToR method for 5 epochs on the GSM8K training split.
+ Please refer to our [code](https://github.com/nasosger/MuToR) for guidelines on how to use the models to reproduce our results.
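
As an illustration of the TL;DR in this README (not part of the committed file), here is a toy sketch of how register tokens could add extra prediction targets during training and then be dropped at inference. Everything in it is an assumption made for illustration: the `<reg>` placeholder, the helper names `interleave_registers` and `strip_registers`, the choice to place one register after every token, and the single fixed `offset`. It omits attention masking and uses plain strings instead of a real tokenizer; the linked repository is the reference for the actual method.

```python
from typing import List, Optional, Tuple

# Hypothetical placeholder for a register token (not a real vocabulary entry).
REG = "<reg>"


def interleave_registers(
    tokens: List[str], offset: int
) -> Tuple[List[str], List[Optional[str]]]:
    """Build a training sequence with one register after every regular token.

    Assumed convention for this sketch:
      * the regular token at position i keeps the usual next-token
        target tokens[i + 1];
      * the register placed right after it targets tokens[i + offset].
    A target of None means "no loss at this position" (out of range).
    """
    if offset < 1:
        raise ValueError("offset must be >= 1")

    inputs: List[str] = []
    targets: List[Optional[str]] = []
    n = len(tokens)
    for i, tok in enumerate(tokens):
        # Regular token: standard next-token prediction.
        inputs.append(tok)
        targets.append(tokens[i + 1] if i + 1 < n else None)
        # Register token: extra supervision from a token further ahead.
        inputs.append(REG)
        targets.append(tokens[i + offset] if i + offset < n else None)
    return inputs, targets


def strip_registers(inputs: List[str]) -> List[str]:
    """Drop registers, recovering the sequence the model sees at inference."""
    return [tok for tok in inputs if tok != REG]


if __name__ == "__main__":
    seq = ["a", "b", "c", "d"]
    train_inputs, train_targets = interleave_registers(seq, offset=2)
    for tok, tgt in zip(train_inputs, train_targets):
        print(f"{tok!r:8} -> {tgt!r}")
    print(strip_registers(train_inputs))
```

In this sketch the registers only exist in the training sequence, so the inference-time input is the original token list, consistent with the README's claim that generation speed is unchanged.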