ArabicOCR-Qwen2.5-VL-7B-Vision
This repository contains the merged float16 version of a Vision-Language Model (VLM), fine-tuned by loay for Optical Character Recognition (OCR) of Arabic text in images.
The model was created by fine-tuning unsloth/Qwen2.5-VL-7B-Instruct-bnb-4bit with LoRA adapters. Training was accelerated with the Unsloth library, and the adapters were then merged back into the base model for easy deployment, as sketched below.
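A minimal sketch of that workflow, assuming Unsloth's FastVisionModel API; the LoRA hyperparameters, training step, and output directory are illustrative, not the exact values used for this model:

```python
from unsloth import FastVisionModel

# Load the 4-bit quantized base model.
model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/Qwen2.5-VL-7B-Instruct-bnb-4bit",
    load_in_4bit=True,
)

# Attach LoRA adapters to the vision and language layers
# (rank and alpha are illustrative placeholders).
model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers=True,
    finetune_language_layers=True,
    r=16,
    lora_alpha=16,
)

# ... fine-tune on (image, Arabic transcription) pairs ...

# Merge the trained adapters into the base weights and save in float16.
model.save_pretrained_merged(
    "ArabicOCR-Qwen2.5-VL-7B-Vision",
    tokenizer,
    save_method="merged_16bit",
)
```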
Model Details
- Fine-tuned by: loay
- Base Model: unsloth/Qwen2.5-VL-7B-Instruct-bnb-4bit
- Fine-tuning Task: Arabic Optical Character Recognition (OCR)
- Training Data: The model was trained on a curated dataset of images containing Arabic text and their corresponding transcriptions.
- Output Format: This is a float16-precision model, ideal for inference on GPUs with sufficient VRAM (>14 GB required); see the example below.
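A minimal inference sketch using the Qwen2.5-VL classes from Hugging Face transformers (a recent transformers release with Qwen2.5-VL support is assumed; the image path and prompt wording are placeholders, not the exact prompt used in training):

```python
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

model_id = "loay/ArabicOCR-Qwen2.5-VL-7B-Vision"
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# Placeholder: an image containing Arabic text.
image = Image.open("arabic_sample.png")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Extract the Arabic text from this image."},
        ],
    }
]
prompt = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=512)
# Strip the prompt tokens so only the generated transcription remains.
new_tokens = output_ids[:, inputs.input_ids.shape[1]:]
print(processor.batch_decode(new_tokens, skip_special_tokens=True)[0])
```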
Model tree for loay/ArabicOCR-Qwen2.5-VL-7B-Vision
- Base model: Qwen/Qwen2.5-VL-7B-Instruct
- Quantized variant (fine-tuning starting point): unsloth/Qwen2.5-VL-7B-Instruct-bnb-4bit