This repository contains one of the models analyzed in our paper "Reverse-Engineering the Retrieval Process in GenIR Models".
Training
The model is based on T5-large and was trained on the TriviaQA dataset as an atomic GenIR model, reproducing the Differentiable Search Index (DSI).
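As a rough illustration of how an atomic GenIR model of this kind might be queried, the sketch below loads the checkpoint with the `transformers` library and scores all output tokens in a single decoding step. It is not taken from the paper's code. It assumes that the repository id is `AnReu/DSI-large-TriviaQA`, that the repository ships a compatible tokenizer, and that each document identifier is a single token in the output vocabulary. The helper names `top_k_ids` and `retrieve` are hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumption: repository id as displayed on the model page.
MODEL_ID = "AnReu/DSI-large-TriviaQA"


def top_k_ids(scores: Sequence[float], k: int = 10) -> List[Tuple[int, float]]:
    """Return the k highest-scoring (token_id, score) pairs, best first."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def retrieve(query: str, k: int = 10) -> List[Tuple[int, float]]:
    """Run one decoder step and rank output-vocabulary tokens by logit.

    Assumption: with atomic identifiers, each document corresponds to one
    output token, so the first decoding step scores every document at once.
    """
    # Imported lazily so top_k_ids stays usable without these packages.
    import torch
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = T5ForConditionalGeneration.from_pretrained(MODEL_ID)
    model.eval()

    inputs = tokenizer(query, return_tensors="pt")
    # T5 begins decoding from decoder_start_token_id (its pad token).
    start = torch.tensor([[model.config.decoder_start_token_id]])
    with torch.no_grad():
        logits = model(**inputs, decoder_input_ids=start).logits[0, 0]

    # Returns raw token ids; mapping them to documents is left to the caller.
    return top_k_ids(logits.tolist(), k)
```

Calling `retrieve("who wrote the novel Dracula")` would return token ids with their logits. The ranking covers the whole output vocabulary, so ordinary word tokens may appear alongside document identifiers; restricting it to identifier tokens requires knowing their id range, which this card does not state.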
Model Overview
| Model | Hugging Face URL | 
|---|---|
| NQ10k | DSI-large-NQ10k | 
| NQ100k | DSI-large-NQ100k | 
| NQ320k | DSI-large-NQ320k | 
| Trivia-QA | DSI-large-TriviaQA | 
| Trivia-QA QG | DSI-large-TriviaQA QG | 
Citation
@inproceedings{Reusch2025Reverse,
  author = {Reusch, Anja and Belinkov, Yonatan},
  title = {Reverse-Engineering the Retrieval Process in GenIR Models},
  year = {2025},
  isbn = {9798400715921},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3726302.3730076},
  doi = {10.1145/3726302.3730076},
  booktitle = {Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  pages = {668--677},
  numpages = {10},
  location = {Padua, Italy},
  series = {SIGIR '25}
}