PaddleOCR ONNX Models

Summary

This repository provides ONNX exports of PaddleOCR models for multilingual text detection and recognition. The models are exported from the original PaddleOCR framework so that they can be run with ONNX-compatible runtimes across a range of deployment scenarios.

The repository contains two types of models: text detection models, which locate text regions in an image, and text recognition models, which convert each detected region into text. The detection models are segmentation-based: they produce a heat map of likely text pixels, and post-processing (score thresholding, contour detection, and polygon approximation) turns that heat map into text-region boxes.
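
The post-processing idea can be illustrated with a deliberately simplified sketch in pure Python. It is not the library's implementation: a flood fill stands in for contour detection, an axis-aligned bounding box stands in for polygon approximation, and the function name `boxes_from_heatmap` is hypothetical. The two thresholds mirror the `heat_threshold` and `box_threshold` parameters listed under Model Configuration.

```python
from collections import deque


def boxes_from_heatmap(heat, heat_threshold=0.3, box_threshold=0.7):
    """Return [(x0, y0, x1, y1, score), ...] for connected regions of a 2D heat map.

    Simplified stand-in for segmentation post-processing (assumption, not the
    library's code): pixels above heat_threshold are grouped into 4-connected
    regions, and a region is kept only if its mean heat reaches box_threshold.
    """
    height, width = len(heat), len(heat[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if seen[y][x] or heat[y][x] <= heat_threshold:
                continue
            # Breadth-first flood fill collects one connected region.
            queue = deque([(x, y)])
            seen[y][x] = True
            pixels = []
            while queue:
                cx, cy = queue.popleft()
                pixels.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and not seen[ny][nx] and heat[ny][nx] > heat_threshold:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            xs = [px for px, _ in pixels]
            ys = [py for _, py in pixels]
            # Score the region by its mean heat, then apply the box threshold.
            score = sum(heat[py][px] for px, py in pixels) / len(pixels)
            if score >= box_threshold:
                boxes.append((min(xs), min(ys), max(xs), max(ys), score))
    return boxes


# Toy heat map: one confident 2x2 region and one weak 1x2 region.
heat_map = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.9, 0.0, 0.4, 0.0],
    [0.0, 0.9, 0.9, 0.0, 0.4, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
print(boxes_from_heatmap(heat_map))
```

In this toy input the weak region passes the heat threshold but is dropped by the box threshold, which is the precision/recall trade-off the two parameters control.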

For text recognition, the models cover multiple languages and scripts, including Chinese, English, Japanese, Korean, Arabic, Cyrillic, Devanagari, and several others. Each recognition model ships with its own character dictionary (dict.txt) tailored to its language or script family. The recognition pipeline normalizes the cropped text image, extracts features, and decodes the resulting sequence into a text transcription with a confidence score.
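
As a rough illustration of the sequence decoding step, here is a minimal greedy decoder in pure Python. It assumes CTC-style output with the blank symbol at index 0, which is a common convention for PaddleOCR recognition models but is not confirmed by this document. The function name, the `remove_duplicates` flag, and the toy dictionary are hypothetical; in practice the character list would come from the model's dict.txt.

```python
def ctc_greedy_decode(probabilities, charset, blank_index=0, remove_duplicates=False):
    """Decode per-timestep class probabilities into (text, mean confidence).

    Assumption: CTC-style output where `blank_index` marks "no character".
    With remove_duplicates=True, consecutive repeats of the same class are
    collapsed into a single character.
    """
    chars, scores = [], []
    previous = None
    for step in probabilities:
        # Greedy choice: the most probable class at this timestep.
        index = max(range(len(step)), key=step.__getitem__)
        is_repeat = remove_duplicates and index == previous
        previous = index
        if index == blank_index or is_repeat:
            continue
        chars.append(charset[index])
        scores.append(step[index])
    confidence = sum(scores) / len(scores) if scores else 0.0
    return "".join(chars), confidence


# Hypothetical dictionary: index 0 is reserved for the blank symbol.
charset = ["<blank>", "a", "b", "c"]
steps = [
    [0.10, 0.80, 0.05, 0.05],  # "a"
    [0.10, 0.70, 0.10, 0.10],  # "a" again (a repeat of the previous step)
    [0.90, 0.05, 0.03, 0.02],  # blank
    [0.10, 0.60, 0.20, 0.10],  # "a" after a blank, so not a repeat
    [0.10, 0.10, 0.70, 0.10],  # "b"
]
collapsed_text, collapsed_confidence = ctc_greedy_decode(steps, charset, remove_duplicates=True)
raw_text, _ = ctc_greedy_decode(steps, charset, remove_duplicates=False)
print(collapsed_text, raw_text)
```

The confidence here is the mean probability of the kept characters, which is one plausible way to produce a per-line score; the actual pipeline may compute it differently.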

The models come in several variants for different deployment targets, from lightweight mobile models to higher-accuracy server models. Supported features include rotated text detection, optional duplicate character removal, and configurable confidence thresholds for trading precision against recall.

Usage

The models can be used through the dghs-imgutils library:

# Installation (run in a shell)
pip install dghs-imgutils

# Basic OCR usage (Python)
from imgutils.ocr import ocr, list_det_models, list_rec_models

# List available models
print("Detection models:", list_det_models())
print("Recognition models:", list_rec_models())

# Perform OCR on an image
results = ocr('your_image.jpg')
for bbox, text, confidence in results:
    print(f"Text: {text}, Confidence: {confidence:.4f}, BBox: {bbox}")

# Custom model selection
results = ocr('your_image.jpg',
              detect_model='ch_PP-OCRv4_det',
              recognize_model='japan_PP-OCRv3_rec')

# Text detection only
from imgutils.ocr import detect_text_with_ocr

# Detect text regions without recognition
detections = detect_text_with_ocr('your_image.jpg')
for bbox, label, confidence in detections:
    print(f"BBox: {bbox}, Confidence: {confidence:.4f}")
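
Downstream code often wants to discard low-confidence results. The sketch below works on (bbox, text, confidence) tuples shaped like the ones unpacked in the usage example above. The helper name `filter_by_confidence`, the 0.5 cut-off, and the sample data (including the bbox layout) are made up for illustration.

```python
def filter_by_confidence(results, min_confidence=0.5):
    """Keep only (bbox, text, confidence) tuples at or above min_confidence."""
    return [item for item in results if item[2] >= min_confidence]


# Hypothetical sample data shaped like the tuples unpacked in the loop above.
sample_results = [
    ((10, 10, 120, 40), "HELLO", 0.98),
    ((15, 60, 90, 85), "w0rld", 0.42),
]
for bbox, text, confidence in filter_by_confidence(sample_results):
    print(f"Text: {text}, Confidence: {confidence:.4f}, BBox: {bbox}")
```

Raising `min_confidence` favors precision; lowering it favors recall.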

Available Models

Text Detection Models

  • ch_PP-OCRv2_det - Chinese text detection v2
  • ch_PP-OCRv3_det - Chinese text detection v3
  • ch_PP-OCRv4_det - Chinese text detection v4
  • ch_PP-OCRv4_server_det - Server-optimized Chinese detection v4
  • ch_ppocr_mobile_slim_v2.0_det - Lightweight mobile detection
  • ch_ppocr_mobile_v2.0_det - Mobile-optimized detection
  • ch_ppocr_server_v2.0_det - Server-optimized detection
  • en_PP-OCRv3_det - English text detection

Text Recognition Models

  • arabic_PP-OCRv3_rec - Arabic text recognition
  • ch_PP-OCRv2_rec - Chinese text recognition v2
  • ch_PP-OCRv3_rec - Chinese text recognition v3
  • ch_PP-OCRv4_rec - Chinese text recognition v4
  • ch_PP-OCRv4_server_rec - Server-optimized Chinese recognition v4
  • ch_ppocr_mobile_v2.0_rec - Mobile-optimized Chinese recognition
  • ch_ppocr_server_v2.0_rec - Server-optimized Chinese recognition
  • chinese_cht_PP-OCRv3_rec - Traditional Chinese recognition
  • cyrillic_PP-OCRv3_rec - Cyrillic script recognition
  • devanagari_PP-OCRv3_rec - Devanagari script recognition
  • en_PP-OCRv3_rec - English text recognition v3
  • en_PP-OCRv4_rec - English text recognition v4
  • en_number_mobile_v2.0_rec - Mobile-optimized English and number recognition
  • japan_PP-OCRv3_rec - Japanese text recognition
  • ka_PP-OCRv3_rec - Kannada text recognition
  • korean_PP-OCRv3_rec - Korean text recognition
  • latin_PP-OCRv3_rec - Latin script recognition
  • ta_PP-OCRv3_rec - Tamil text recognition
  • te_PP-OCRv3_rec - Telugu text recognition

Model Configuration

The OCR pipeline supports several configurable parameters:

  • heat_threshold: Heat map threshold for text detection (default: 0.3)
  • box_threshold: Box confidence threshold (default: 0.7)
  • max_candidates: Maximum number of text candidates (default: 1000)
  • unclip_ratio: Expansion ratio for detected boxes (default: 2.0)
  • rotation_threshold: Aspect ratio threshold for rotation detection (default: 1.5)
  • is_remove_duplicate: Whether to remove duplicate characters (default: False)
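
To make `unclip_ratio` and `rotation_threshold` more concrete, here is a small sketch in pure Python. It assumes the pipeline follows the usual DB-style post-processing, where a detected region is grown by an offset of area × unclip_ratio / perimeter, and that a crop is treated as rotated when its height-to-width ratio reaches the threshold. Both function names are hypothetical, and the axis-aligned rectangle is a simplification of the polygon offsetting a real implementation would perform.

```python
def unclip_rectangle(box, unclip_ratio=2.0):
    """Expand an axis-aligned box (x0, y0, x1, y1) outward on every side.

    Assumption: the offset distance is area * unclip_ratio / perimeter,
    as in DB-style text detection post-processing.
    """
    x0, y0, x1, y1 = box
    width, height = x1 - x0, y1 - y0
    area = width * height
    perimeter = 2 * (width + height)
    distance = area * unclip_ratio / perimeter
    return (x0 - distance, y0 - distance, x1 + distance, y1 + distance)


def needs_rotation(crop_width, crop_height, rotation_threshold=1.5):
    """Flag a crop as likely vertical text when it is much taller than wide."""
    return crop_height / crop_width >= rotation_threshold


expanded = unclip_rectangle((20, 20, 80, 40), unclip_ratio=2.0)
print(expanded)
print(needs_rotation(20, 60), needs_rotation(60, 20))
```

A larger `unclip_ratio` grows the boxes further beyond the thresholded heat-map region, which can help when detections clip the edges of characters.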

Performance Notes

  • The default detection model, ch_PP-OCRv4_det, offers a good balance of accuracy and speed
  • The default recognition model, ch_PP-OCRv4_rec, supports both Chinese and English
  • For specific languages, choose the corresponding recognition model for optimal results
  • Server versions generally offer higher accuracy at the cost of increased computational requirements
  • Mobile versions are optimized for speed and resource efficiency
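
One way to apply the language advice above is a small lookup table from language to recognition model. The model names below are taken from the Text Recognition Models list; the dictionary keys and the helper `pick_recognition_model` are hypothetical and not part of dghs-imgutils.

```python
# Model names come from the list in this document; the keys are illustrative.
RECOGNITION_MODELS = {
    "arabic": "arabic_PP-OCRv3_rec",
    "chinese": "ch_PP-OCRv4_rec",
    "traditional chinese": "chinese_cht_PP-OCRv3_rec",
    "cyrillic": "cyrillic_PP-OCRv3_rec",
    "devanagari": "devanagari_PP-OCRv3_rec",
    "english": "en_PP-OCRv4_rec",
    "japanese": "japan_PP-OCRv3_rec",
    "kannada": "ka_PP-OCRv3_rec",
    "korean": "korean_PP-OCRv3_rec",
    "latin": "latin_PP-OCRv3_rec",
    "tamil": "ta_PP-OCRv3_rec",
    "telugu": "te_PP-OCRv3_rec",
}


def pick_recognition_model(language, default="ch_PP-OCRv4_rec"):
    """Return the recognition model for a language, or the default model."""
    return RECOGNITION_MODELS.get(language.strip().lower(), default)


print(pick_recognition_model("Japanese"))
print(pick_recognition_model("Klingon"))
```

The returned name can then be passed as `recognize_model` to `ocr`, as in the custom model selection example under Usage.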

Original Content

ONNX version of PaddleOCR.

Citation

@misc{paddleocr_onnx,
  title        = {{PaddleOCR ONNX Models}},
  author       = {PaddlePaddle and Repository Contributors},
  howpublished = {\url{https://huggingface.co/deepghs/paddleocr}},
  year         = {2023},
  note         = {ONNX-format implementations of PaddleOCR models for multilingual text detection and recognition},
  abstract     = {This repository provides ONNX-format implementations of PaddleOCR models, offering comprehensive optical character recognition capabilities for multilingual text detection and recognition. The models are exported from the original PaddleOCR framework and optimized for efficient inference across various deployment scenarios. The repository contains text detection models that identify text regions in images and text recognition models that convert detected text regions into actual text content, supporting multiple languages including Chinese, English, Japanese, Korean, Arabic, Cyrillic, Devanagari, and several other scripts.},
  keywords     = {OCR, text-detection, text-recognition, multilingual, ONNX}
}