---
library_name: transformers
language:
- en
base_model:
- allenai/olmOCR-7B-0225-preview
license: apache-2.0
---

# olmOCR-7B-faithful

This is a fine-tuned version of the olmOCR-7B-0225-preview model that aims to extract all information from a given document, including headers and footers.

More information on how we fine-tuned the model can be found in our [blog post](https://huggingface.co/blog/tngtech/finetuning-olmocr-to-be-a-faithful-ocr-engine).

## Acknowledgment
We thank the Allen Institute for AI and Alibaba Cloud for their great open-source work, which enabled this fine-tuning project.

Improved using Qwen.