@prithivMLmods on Hugging Face: "The POINTS-Reader, a vision-language model for end-to-end document conversion…"

prithivMLmods

posted an update Sep 10, 2025

POINTS-Reader is a powerful, distillation-free vision-language model for end-to-end document conversion that sets new SoTA benchmark results. The demo is now available on HF (Extraction, Preview, Documentation). The input is a fixed prompt plus a document image, and the output is a single string: the text extracted from the document image. 🔥🤗
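
For anyone who wants to picture that input/output contract in code, here is a minimal sketch. It does not call the real model: `FIXED_PROMPT`, `ConversionRequest`, `build_request`, `convert`, and the `run_model` callable are all hypothetical names made up for illustration, and the prompt text is a placeholder (the actual fixed prompt is on the model page).

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable

# Placeholder only: the real fixed prompt is documented on the model page.
FIXED_PROMPT = "Extract all text from this document image."

# File extensions this sketch accepts (an assumption, not a model limit).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


@dataclass(frozen=True)
class ConversionRequest:
    """One request: the fixed prompt plus one document image."""
    prompt: str
    image_path: Path


def build_request(image_path: str, prompt: str = FIXED_PROMPT) -> ConversionRequest:
    """Pair the fixed prompt with a document image path."""
    path = Path(image_path)
    if path.suffix.lower() not in IMAGE_SUFFIXES:
        raise ValueError(f"not a supported image file: {path.name}")
    return ConversionRequest(prompt=prompt, image_path=path)


def convert(request: ConversionRequest,
            run_model: Callable[[str, Path], str]) -> str:
    """Run a model callable and enforce the 'output is only a string' contract.

    `run_model` is a stand-in for whatever actually invokes the model
    (the Space, a local pipeline, and so on).
    """
    text = run_model(request.prompt, request.image_path)
    if not isinstance(text, str):
        raise TypeError("expected the model to return a plain string")
    return text
```

To try it, pass any callable as `run_model`, for example a stub that returns a fixed string, and swap in a real model call later.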

✦ Space/App: prithivMLmods/POINTS-Reader-OCR
✦ Model: tencent/POINTS-Reader
✦ Paper: https://arxiv.org/pdf/2509.01215

🤗 The app is done and ready to go brrrr on ZeroGPU. Thank you, @merve!

To learn more, visit the app page or the model page!

YuanLiuuuuuu

Sep 11, 2025

Big thumbs up to @prithivMLmods for the new Hugging Face demo "POINTS-Reader"! 🚀 A clean, intuitive demo that makes it easy to try the model.

prithivMLmods

Sep 11, 2025

You guys did an awesome job! @YuanLiuuuuuu
I've been grinding through papers since last night, and it’s been benevolent. 😇

https://arxiv.org/pdf/2509.01215
