--- language: - multilingual tags: - deepseek - vision-language - ocr - document-parse base_model: - deepseek-ai/DeepSeek-OCR --- # DeepSeek OCR > [!NOTE] > Note currently only [NexaSDK](https://github.com/NexaAI/nexa-sdk) supports this model's GGUF. ## Quickstart 1. **Install [NexaSDK](https://github.com/NexaAI/nexa-sdk)** 2. Run the model locally with one line of code: ```bash nexa infer NexaAI/DeepSeek-OCR-GGUF ``` 3. Then drag your image to terminal or type into the image path case 1 : extract text ```bash Free OCR. ``` case 2 : extract bounding box ```bash <|grounding|>Convert the document to markdown. ``` ## Model Description **DeepSeek OCR** is a high-accuracy optical character recognition model built for extracting text from complex visual inputs such as documents, screenshots, receipts, and natural scenes. It combines vision-language modeling with efficient visual encoders to achieve superior recognition of multi-language and multi-layout text while remaining lightweight enough for edge or on-device deployment. ## Features - **Multilingual OCR** — recognizes printed and handwritten text across major global languages. - **Document Layout Understanding** — preserves structure such as tables, paragraphs, and titles. - **Scene Text Recognition** — robust against lighting, distortion, and low-quality captures. - **Lightweight & Fast** — optimized for CPU and GPU acceleration. - **End-to-End Pipeline** — supports image-to-text and structured JSON output. ## Use Cases - Digitizing scanned documents or PDFs - Extracting text from mobile camera inputs or screenshots - Invoice and receipt parsing - OCR-based search and indexing systems - Visual question answering or document agents ## Inputs and Outputs **Input:** - Image file (JPEG, PNG, or tensor array) - Optional parameters for language hints or layout detection **Output:** - Extracted text (plain text or structured format with bounding boxes) - Confidence scores per word or region ## Integration DeepSeek OCR can be integrated through: - Python API (`pip install deepseek-ocr`) - REST or gRPC endpoints for server deployment ## License This model is released under the **Apache 2.0 License**, allowing commercial use, modification, and redistribution with attribution.