textract-ai - FIXED VERSION ✅

🎉 FIXED: Hub loading now works properly!

A high-accuracy OCR model based on Qwen2-VL-2B-Instruct, now with proper Hugging Face Hub support.

✅ What's Fixed

  • Hub Loading: AutoModel.from_pretrained() now works correctly
  • from_pretrained Method: Proper implementation added
  • Configuration: Fixed model configuration for Hub compatibility
  • Error Handling: Improved error handling and fallbacks

🚀 Quick Start (NOW WORKS!)

from transformers import AutoModel
from PIL import Image

# Load model from Hub (FIXED!)
model = AutoModel.from_pretrained("BabaK07/textract-ai", trust_remote_code=True)

# Load image
image = Image.open("your_image.jpg")

# Extract text
result = model.generate_ocr_text(image, use_native=True)

print(f"Text: {result['text']}")
print(f"Confidence: {result['confidence']:.1%}")
print(f"Success: {result['success']}")
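The result fields printed above (text, confidence, success) can also gate downstream processing. The following is a minimal sketch, not part of the model's API: the helper name accept_text and the 0.8 threshold are hypothetical, and it assumes confidence is a float between 0 and 1, which matches the percentage formatting used in the Quick Start.

```python
from typing import Any, Mapping, Optional


def accept_text(result: Mapping[str, Any], min_confidence: float = 0.8) -> Optional[str]:
    """Return the stripped OCR text, or None if the call failed or confidence is too low.

    `accept_text` and `min_confidence` are illustrative names, not part of the model.
    """
    # Treat a missing 'success' key as a failed extraction.
    if not result.get("success", False):
        return None
    # Assumption: 'confidence' is a float in the range 0..1.
    confidence = float(result.get("confidence", 0.0))
    if confidence < min_confidence:
        return None
    return str(result.get("text", "")).strip()
```

With the Quick Start variables in scope, accept_text(result) would return the text for a confident extraction and None otherwise; tune the threshold to your own documents.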

📊 Performance

  • 🎯 Accuracy: Confidence scores of up to 95%
  • ⏱️ Speed: ~13 seconds per image (high quality)
  • 🌍 Languages: Multi-language support
  • 💻 Device: CPU and GPU support
  • 📄 Documents: Excellent for complex documents
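Per-image speed depends on hardware and image size, so it may be worth measuring on your own setup. This sketch times a single call with the standard library; the helper name timed_ocr is hypothetical, and it assumes only the generate_ocr_text(image, use_native=...) call shown elsewhere in this README.

```python
import time
from typing import Any, Dict, Tuple


def timed_ocr(model: Any, image: Any, use_native: bool = True) -> Tuple[Dict[str, Any], float]:
    """Run one OCR call and return (result, elapsed_seconds).

    `timed_ocr` is an illustrative helper, not part of the model.
    """
    start = time.perf_counter()
    result = model.generate_ocr_text(image, use_native=use_native)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

Calling timed_ocr(model, image) once per mode gives a rough comparison between use_native=True and use_native=False on your machine.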

🛠️ Features

  • ✅ Hub Loading: Works with AutoModel.from_pretrained()
  • ✅ High Accuracy: Based on Qwen2-VL-2B-Instruct
  • ✅ Multi-language: Supports many languages
  • ✅ Document OCR: Excellent for invoices, forms, documents
  • ✅ Robust Processing: Multiple extraction methods
  • ✅ Production Ready: Error handling included

📝 Usage Examples

Basic Usage

from transformers import AutoModel
from PIL import Image

model = AutoModel.from_pretrained("BabaK07/textract-ai", trust_remote_code=True)
image = Image.open("document.jpg")
result = model.generate_ocr_text(image, use_native=True)

High Accuracy Mode

result = model.generate_ocr_text(image, use_native=True)  # Best accuracy

Fast Mode

result = model.generate_ocr_text(image, use_native=False)  # Faster processing
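One way to combine the two modes is to try the fast mode first and rerun in high accuracy mode only when the fast result looks weak. This is a sketch under assumptions: the helper name ocr_with_fallback and the 0.8 threshold are hypothetical, and it relies only on the use_native flag and the result keys shown in this README.

```python
from typing import Any, Dict


def ocr_with_fallback(model: Any, image: Any, min_confidence: float = 0.8) -> Dict[str, Any]:
    """Try fast mode first; fall back to high accuracy mode on failure or low confidence.

    `ocr_with_fallback` and `min_confidence` are illustrative, not part of the model.
    """
    fast = model.generate_ocr_text(image, use_native=False)
    # Keep the fast result only if it succeeded with enough confidence.
    if fast.get("success") and fast.get("confidence", 0.0) >= min_confidence:
        return fast
    # Otherwise pay the extra time for the high accuracy pass.
    return model.generate_ocr_text(image, use_native=True)
```

This trades a possible second pass on hard images for faster processing on easy ones; whether that is a net win depends on how often the fast mode falls below the threshold.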

File Path Input

result = model.generate_ocr_text("path/to/your/image.jpg")
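Because file paths are accepted directly, a batch of files can be processed in a loop. The sketch below is illustrative only: the helper name ocr_many is hypothetical, and the "error" key added to failed entries is an assumption of this example, not something the model returns.

```python
from pathlib import Path
from typing import Any, Dict, Iterable, Union


def ocr_many(model: Any, paths: Iterable[Union[str, Path]]) -> Dict[str, Dict[str, Any]]:
    """Run OCR on each path and return a dict keyed by the path string.

    A failure on one file is recorded and does not stop the batch.
    `ocr_many` and the 'error' key are illustrative, not part of the model.
    """
    results: Dict[str, Dict[str, Any]] = {}
    for path in paths:
        key = str(path)
        try:
            results[key] = model.generate_ocr_text(key)
        except Exception as exc:  # e.g. unreadable or missing file
            results[key] = {"text": "", "confidence": 0.0, "success": False, "error": str(exc)}
    return results
```

For example, ocr_many(model, sorted(Path("scans").glob("*.jpg"))) would process every JPEG in a hypothetical scans folder.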

🔧 Installation

pip install torch transformers pillow

📈 Model Details

  • Base Model: Qwen/Qwen2-VL-2B-Instruct
  • Model Size: ~2.5B parameters
  • Architecture: Vision-Language Transformer
  • Optimization: OCR-specific processing
  • Training: Custom OCR pipeline

🆚 Comparison

| Feature         | Before (Broken) | After (FIXED)      |
|-----------------|-----------------|--------------------|
| Hub Loading     | ❌ ValueError   | ✅ Works perfectly |
| from_pretrained | ❌ Missing      | ✅ Implemented     |
| AutoModel       | ❌ Failed       | ✅ Compatible      |
| Configuration   | ❌ Invalid      | ✅ Proper config   |

🎯 Use Cases

  • High-Accuracy OCR: When accuracy is most important
  • Document Processing: Complex invoices, forms, contracts
  • Multi-language Text: International documents
  • Professional OCR: Business and enterprise use
  • Research Applications: Academic and research projects

🔗 Related Models

📞 Support

For issues or questions, please check the model repository or contact the author.


Status: ✅ FIXED and ready for production use!

