# Massive Text Embeddings Benchmark

Related blog posts:
- Introducing RTEB: A New Standard for Retrieval Evaluation
- HUME: Measuring the Human-Model Performance Gap in Text Embedding Task
- Maintaining MTEB: Towards Long Term Usability and Reproducibility of Embedding Benchmarks
MTEB is a Python framework for evaluating embedding and retrieval systems for both text and images. It covers more than 1,000 languages and a diverse range of tasks, from classics such as classification and clustering to use-case-specialized tasks such as legal, code, or healthcare retrieval.
To get started with mteb, check out our documentation.
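As a quick illustration, the minimal sketch below evaluates a small sentence-embedding model on a single task. It assumes a recent mteb release that exposes `get_model`, `get_tasks`, and `MTEB` at the package level; the model and task names are only examples, and the exact API may differ between versions, so check the documentation for your installed release.

```python
import mteb

# Pick a model by name; any model wrapper known to mteb
# (or a SentenceTransformer-compatible name) should work.
model = mteb.get_model("sentence-transformers/all-MiniLM-L6-v2")

# Select one or more tasks to evaluate on.
tasks = mteb.get_tasks(tasks=["Banking77Classification"])

# Run the evaluation; per-task JSON results are written to the output folder.
evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model, output_folder="results")
print(results)
```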
| Overview | |
|---|---|
| Leaderboard | The interactive leaderboard of the benchmark |
| Get Started | |
| Get Started | Overview of how to use mteb |
| Defining Models | How to use existing models and define custom ones |
| Selecting tasks | How to select tasks, benchmarks, splits, etc. (see the sketch after this table) |
| Running Evaluation | How to run evaluations, including cache management, speeding up evaluations, etc. |
| Loading Results | How to load and work with existing model results (see the sketch after this table) |
| Overview | |
| Tasks | Overview of available tasks |
| Benchmarks | Overview of available benchmarks |
| Models | Overview of available models |
| Contributing | |
| Adding a model | How to submit a model to MTEB and to the leaderboard |
| Adding a dataset | How to add a new task/dataset to MTEB |
| Adding a benchmark | How to add a new benchmark to MTEB and to the leaderboard |
| Contributing | How to contribute to MTEB and set it up for development |
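As a rough illustration of the task-selection and result-loading entries above, the sketch below filters tasks, resolves a named benchmark, and loads previously computed results. It assumes the `get_tasks`, `get_benchmark`, and `load_results` helpers available in recent mteb releases; the filter values and the `"MTEB(eng, v2)"` benchmark name are examples only.

```python
import mteb

# Select tasks by filters rather than by explicit name,
# e.g. all retrieval tasks that cover English.
retrieval_tasks = mteb.get_tasks(task_types=["Retrieval"], languages=["eng"])

# Benchmarks are named, curated collections of tasks.
benchmark = mteb.get_benchmark("MTEB(eng, v2)")
print(f"{benchmark.name}: {len(benchmark.tasks)} tasks")

# Load existing model results (by default from the public results repository).
results = mteb.load_results()
print(results)
```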