Improve model card with pipeline tag, library name, and GitHub link (#1)
- Improve model card with pipeline tag, library name, and GitHub link (298b937a8562c2c1d8da7a158a610096915569d1)
Co-authored-by: Niels Rogge <[email protected]>
README.md
CHANGED
@@ -1,3 +1,17 @@
----
-license: cc-by-sa-4.0
-
+---
+license: cc-by-sa-4.0
+pipeline_tag: question-answering
+library_name: transformers
+---
+
+# Search and Refine During Think: Autonomous Retrieval-Augmented Reasoning of LLMs
+
+The model was presented in the paper [Search and Refine During Think: Autonomous Retrieval-Augmented Reasoning of LLMs](https://huggingface.co/papers/2505.11277).
+
+# Paper abstract
+
+Large language models have demonstrated impressive reasoning capabilities but are inherently limited by their knowledge reservoir. Retrieval-augmented reasoning mitigates this limitation by allowing LLMs to query external resources, but existing methods often retrieve irrelevant or noisy information, hindering accurate reasoning. In this paper, we propose AutoRefine, a reinforcement learning post-training framework that adopts a new ``search-and-refine-during-think'' paradigm. AutoRefine introduces explicit knowledge refinement steps between successive search calls, enabling the model to iteratively filter, distill, and organize evidence before generating an answer. Furthermore, we incorporate tailored retrieval-specific rewards alongside answer correctness rewards using group relative policy optimization. Experiments on single-hop and multi-hop QA benchmarks demonstrate that AutoRefine significantly outperforms existing approaches, particularly in complex, multi-hop reasoning scenarios. Detailed analysis shows that AutoRefine issues frequent, higher-quality searches and synthesizes evidence effectively.
+
+# Code
+
+The code for this project is available on GitHub: [https://github.com/volcengine/verl](https://github.com/volcengine/verl) (Note: this links to the base project mentioned in acknowledgements, a more specific repo link should be added if available within the model repo itself.)
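The metadata this commit adds sits in the README's front matter, a block of `key: value` lines between two `---` markers. As a rough illustration of how such a block can be read, here is a minimal sketch. The helper name `parse_front_matter` is hypothetical, and the parser assumes flat `key: value` pairs only; real model cards may use nested YAML, which would need a proper YAML parser.

```python
def parse_front_matter(text):
    """Split a README into (metadata, body).

    Assumption: the front matter holds only flat `key: value` lines.
    Returns an empty dict and the full text if no front matter is found.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text

    # Find the closing '---' marker after the opening one.
    end = None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            end = i
            break
    if end is None:
        return {}, text

    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            meta[key.strip()] = value.strip()

    body = "\n".join(lines[end + 1:])
    return meta, body


# The front matter as it appears after this commit.
README = """---
license: cc-by-sa-4.0
pipeline_tag: question-answering
library_name: transformers
---

# Search and Refine During Think: Autonomous Retrieval-Augmented Reasoning of LLMs
"""

meta, body = parse_front_matter(README)
print(meta["pipeline_tag"], meta["library_name"])
```

Before this commit only the `license` key was present; `pipeline_tag` and `library_name` are the two fields the change introduces.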