OpenRubrics: Towards Scalable Synthetic Rubric Generation for Reward Modeling and LLM Alignment Paper • 2510.07743 • Published 25 days ago • 8
Article Introducing MTEB v2: Evaluation of embedding and retrieval systems for more than just text By isaacchung and 2 others • 13 days ago • 33
Article Vocabulary is the most important element of Sparse Retrieval By yjoonjang • 29 days ago • 8
Article Reinforcement Learning for Large Language Models: Beyond the Agent Paradigm By royswastik • Mar 19 • 8
RaDeR training datasets Collection These are some of the retrieval training datasets used for training RaDeR models, consisting of different types of query combinations. • 3 items • Updated Jun 12 • 1
JinaVDR (Visual Document Retrieval) Collection max. ~1000 images and OCR text included • 42 items • Updated Jul 20 • 6
Solving math word problems with process- and outcome-based feedback Paper • 2211.14275 • Published Nov 25, 2022 • 10
Article MedEmbed: Fine-Tuned Embedding Models for Medical / Clinical IR By abhinand • Oct 20, 2024 • 51
BioClinical ModernBERT Collection This project was a collaboration between members of the Dana-Farber Cancer Institute, LightOn, MIT, OpenEvidence and Microsoft. • 3 items • Updated Sep 9 • 11
Article Mitigating False Negatives in Multiple Negatives Ranking Loss for Retriever Training By dragonkue • May 25 • 20