nimaic committed
Commit 86f7d50 · 1 Parent(s): d3fae04

initial commit

Files changed (4):
  1. Dockerfile +73 -0
  2. README.md +390 -6
  3. app.py +196 -0
  4. requirements.txt +10 -0
Dockerfile ADDED
@@ -0,0 +1,73 @@
# Use Python base image (CPU-only)
FROM python:3.10-slim

# Environment setup
ENV PYTHONUNBUFFERED=1
ENV PYTHONDONTWRITEBYTECODE=1
ENV DEBIAN_FRONTEND=noninteractive

# Install system dependencies
RUN apt-get update && apt-get install -y \
    git \
    wget \
    curl \
    build-essential \
    python3-dev \
    libffi-dev \
    libssl-dev \
    libjpeg-dev \
    libpng-dev \
    libfreetype6-dev \
    pkg-config \
    && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /app

# Preinstall Python base tools
RUN pip install --no-cache-dir --upgrade pip setuptools wheel

# Install base ML + tokenizer + NLP tools
# (version specifier quoted so the shell doesn't treat ">=" as a redirect)
RUN pip install --no-cache-dir \
    "transformers>=4.33.2" \
    sentencepiece \
    sacremoses \
    nltk \
    pandas \
    regex \
    mock \
    mosestokenizer \
    bitsandbytes \
    scipy \
    accelerate \
    datasets

# Download NLTK data to a shared path so the non-root user below can read it
# (a plain nltk.download() as root would land in /root/nltk_data)
RUN python3 -c "import nltk; nltk.download('punkt', download_dir='/usr/local/share/nltk_data')"

# Install FastAPI app dependencies
RUN pip install --no-cache-dir fastapi uvicorn pydantic psutil

# Clone IndicTransToolkit into the app directory and install it editable
RUN git clone https://github.com/VarunGumma/IndicTransToolkit.git /app/IndicTransToolkit && \
    pip install -e /app/IndicTransToolkit

# Copy app source
COPY . .

# Create non-root user
RUN useradd --create-home --shell /bin/bash app
RUN chown -R app:app /app
USER app

# Port for the API and the health route
EXPOSE 7860

# Healthcheck against the GET /health route defined in app.py
HEALTHCHECK --interval=30s --timeout=30s --start-period=60s --retries=3 \
    CMD curl -f http://localhost:7860/health || exit 1

# Run FastAPI
CMD ["python", "-m", "uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
README.md CHANGED
@@ -1,11 +1,395 @@
- title: Indictrans2 Translator
- emoji: 🌍
- colorFrom: blue
- colorTo: pink
- license: mit
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
---
title: IndicTrans2 Translator API
emoji: 🌐
colorFrom: indigo
colorTo: blue
sdk: docker
app_file: app.py
pinned: false
tags:
  - translation
  - fastapi
  - docker
  - indic
license: apache-2.0
---

# IndicTrans2 Translator API

A high-performance FastAPI translation backend powered by AI4Bharat's [IndicTrans2](https://github.com/AI4Bharat/IndicTrans2) models. IndicTrans2 covers all 22 scheduled Indian languages; this server exposes translation between English and 11 of them (see the table below).

## 🌟 Features

- **Multilingual Support**: Translate between English and 11 major Indian languages
- **High-Quality Translations**: Powered by state-of-the-art IndicTrans2 models
- **REST API**: Simple HTTP API for easy integration
- **GPU Acceleration**: CUDA support for faster inference, with automatic CPU fallback
- **Memory Optimization**: Efficient model loading and GPU memory cleanup after each request
- **Graceful Shutdown**: Models are released cleanly on server termination

## 🚀 Quick Start

### Prerequisites

- Python 3.8+
- CUDA-compatible GPU (recommended; CPU inference works but is slower)
- At least 8GB of GPU memory for optimal performance

### Installation

1. **Clone the repository**
   ```bash
   git clone https://github.com/AI4Bharat/IndicTrans2
   cd IndicTrans2
   ```

2. **Install dependencies**
   ```bash
   # Install IndicTrans2 dependencies
   source install.sh

   # Install additional requirements for the server
   pip install fastapi uvicorn torch transformers
   ```

3. **Install IndicTransToolkit**
   ```bash
   cd huggingface_interface/IndicTransToolkit
   pip install -e .
   cd ../..
   ```

4. **Run the server**
   ```bash
   python app.py
   ```

The server will start on `http://0.0.0.0:9000` (the Docker image instead serves on port 7860).

## 📋 Supported Languages

The server supports translation between English and the following 11 Indic languages, a subset of the 22 scheduled languages covered by IndicTrans2:

| Language | Code | Script |
|----------|------|--------|
| English | `eng_Latn` | Latin |
| Bengali | `ben_Beng` | Bengali |
| Punjabi | `pan_Guru` | Gurmukhi |
| Assamese | `asm_Beng` | Bengali |
| Konkani | `gom_Deva` | Devanagari |
| Gujarati | `guj_Gujr` | Gujarati |
| Hindi | `hin_Deva` | Devanagari |
| Kannada | `kan_Knda` | Kannada |
| Malayalam | `mal_Mlym` | Malayalam |
| Odia | `ory_Orya` | Odia |
| Tamil | `tam_Taml` | Tamil |
| Telugu | `tel_Telu` | Telugu |
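
If you would rather address languages by name than by FLORES code in client code, a small lookup like this keeps requests readable (a hypothetical client-side helper, not part of `app.py`; the codes mirror the table above):

```python
# Hypothetical client-side helper mapping names to the codes in lang_list.
LANG_CODES = {
    "english": "eng_Latn", "bengali": "ben_Beng", "punjabi": "pan_Guru",
    "assamese": "asm_Beng", "konkani": "gom_Deva", "gujarati": "guj_Gujr",
    "hindi": "hin_Deva", "kannada": "kan_Knda", "malayalam": "mal_Mlym",
    "odia": "ory_Orya", "tamil": "tam_Taml", "telugu": "tel_Telu",
}

def code_for(name: str) -> str:
    """Return the FLORES code for a language name, e.g. 'hindi' -> 'hin_Deva'."""
    try:
        return LANG_CODES[name.strip().lower()]
    except KeyError:
        raise ValueError(f"Unsupported language: {name!r}") from None
```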

## 🔧 API Usage

### Translation Endpoint

**POST** `/api/v1/translate`

#### Request Body

```json
{
  "input_sentence": "Hello, how are you?",
  "source_lan": "eng_Latn",
  "target_lang": "hin_Deva"
}
```

Note the field name `source_lan` (not `source_lang`); it matches the request model in `app.py`.

#### Response

```json
{
  "message": "translation processed successfully in 0.42 seconds",
  "translation": "नमस्ते, आप कैसे हैं?"
}
```

#### Error Response

```json
{
  "message": "Not a valid dialect",
  "translation": null
}
```

### Example Usage

#### cURL
```bash
curl -X POST "http://localhost:9000/api/v1/translate" \
  -H "Content-Type: application/json" \
  -d '{
    "input_sentence": "Good morning!",
    "source_lan": "eng_Latn",
    "target_lang": "hin_Deva"
  }'
```

#### Python
```python
import requests

url = "http://localhost:9000/api/v1/translate"
data = {
    "input_sentence": "Good morning!",
    "source_lan": "eng_Latn",
    "target_lang": "hin_Deva"
}

response = requests.post(url, json=data)
print(response.json())
```

#### JavaScript
```javascript
const response = await fetch('http://localhost:9000/api/v1/translate', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    input_sentence: 'Good morning!',
    source_lan: 'eng_Latn',
    target_lang: 'hin_Deva'
  })
});

const result = await response.json();
console.log(result);
```

## ⚡ Performance Optimization

The server is tuned for serving rather than research use:

### Model Configuration
- **Distilled Models**: Uses the 200M-parameter distilled checkpoints for faster inference
- **Memory Efficient**: GPU memory is freed (`torch.cuda.empty_cache()`) after each batch
- **Batch Processing**: `batch_translate` processes input in batches of `BATCH_SIZE`

### Recommended Settings for Speed
To trade a little quality for latency, you can modify the following in `app.py`:

```python
# Enable quantization for faster inference (requires bitsandbytes + CUDA)
quantization = "4-bit"  # or "8-bit"

# Reduce generation parameters for speed (defaults: max_length=256, num_beams=4)
max_length = 128
num_beams = 1  # greedy decoding, fastest
```
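
To see what a settings change actually buys you, a rough client-side timing loop is usually enough. This is illustrative only and assumes the server is already running on `localhost:9000`:

```python
# Rough latency check: send the same request N times and report the mean.
# The first request is excluded as warm-up (it may include CUDA kernel
# initialization and cache population).
import time
import requests

URL = "http://localhost:9000/api/v1/translate"
payload = {
    "input_sentence": "Good morning!",
    "source_lan": "eng_Latn",
    "target_lang": "hin_Deva",
}

requests.post(URL, json=payload)  # warm-up, not timed

N = 20
start = time.perf_counter()
for _ in range(N):
    requests.post(URL, json=payload).raise_for_status()
print(f"mean latency: {(time.perf_counter() - start) / N:.3f}s")
```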

## 🏗️ Architecture

The server uses a dual-model architecture:

1. **English → Indic Model**: `ai4bharat/indictrans2-en-indic-dist-200M`
2. **Indic → English Model**: `ai4bharat/indictrans2-indic-en-dist-200M`

The appropriate model is selected per request based on the target language (see the sketch below):
- If the target is English (`eng_Latn`), the Indic→English model is used
- If the target is any Indic language, the English→Indic model is used

Note that the source language is not consulted: Indic→Indic requests are not pivoted through English, so pairs where neither side is English should be avoided.
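
Condensed from the `/api/v1/translate` handler in `app.py`, the routing boils down to one comparison:

```python
# Condensed from app.py: the model pair is chosen purely by target
# language; the source language is never consulted.
def pick_pair(target_lang: str, en_indic, indic_en):
    """Return the (model, tokenizer) pair for this request."""
    return indic_en if target_lang == "eng_Latn" else en_indic

# Usage in app.py terms (these globals are created at module load time):
# model, tokenizer = pick_pair(req.target_lang,
#                              (en_indic_model, en_indic_tokenizer),
#                              (indic_en_model, indic_en_tokenizer))
```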

## 🔧 Configuration

### Runtime Settings

These are module-level constants in `app.py` (they are not read from the environment):

| Constant | Default | Description |
|----------|---------|-------------|
| `BATCH_SIZE` | `4` | Batch size for translation |
| `DEVICE` | `cuda` if available, else `cpu` | Device for model inference |

### Model Selection

You can switch between model variants by changing the checkpoint directories:

```python
# For base models (higher quality, slower)
en_indic_ckpt_dir = "ai4bharat/indictrans2-en-indic-1B"

# For distilled models (faster, good quality)
en_indic_ckpt_dir = "ai4bharat/indictrans2-en-indic-dist-200M"
```
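
If you want these settings configurable per deployment rather than edited in source, a small change along these lines makes them environment-driven (a sketch, not in the current `app.py`):

```python
# Sketch: environment-driven configuration (not in the current app.py).
import os

import torch

BATCH_SIZE = int(os.environ.get("BATCH_SIZE", "4"))
DEVICE = os.environ.get(
    "DEVICE", "cuda" if torch.cuda.is_available() else "cpu"
)
```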

## 🐳 Docker Deployment

The repository's own `Dockerfile` (shown above) builds a CPU image suitable for Hugging Face Spaces. For a GPU deployment, create a `Dockerfile` along these lines:

```dockerfile
FROM nvidia/cuda:11.8.0-devel-ubuntu20.04

# Install Python and dependencies
RUN apt-get update && apt-get install -y python3 python3-pip git
WORKDIR /app

# Clone and set up IndicTrans2
RUN git clone https://github.com/AI4Bharat/IndicTrans2 .
# RUN uses /bin/sh, which has no `source`; run the script with bash instead
RUN bash install.sh
RUN pip install fastapi uvicorn

# Install IndicTransToolkit
WORKDIR /app/huggingface_interface/IndicTransToolkit
RUN pip install -e .
WORKDIR /app

# Copy your server file
COPY app.py .

# Expose port
EXPOSE 9000

# Run the server
CMD ["python3", "app.py"]
```

Build and run:
```bash
docker build -t indictrans2-server .
docker run --gpus all -p 9000:9000 indictrans2-server
```

## 📊 Benchmarks

The IndicTrans2 models achieve state-of-the-art results on standard benchmarks:

- **FLORES-200**: evaluation on the 22 scheduled-language subset
- **IN22**: a new benchmark introduced with IndicTrans2, with 1024 sentences across multiple domains
- **chrF++**: the primary metric used for translation quality

For detailed benchmark results, refer to the [IndicTrans2 paper](https://arxiv.org/abs/2305.16307).

## 🛠️ Development

### Running in Development Mode

```bash
# Install development dependencies (quoted so shells like zsh don't expand the brackets)
pip install "fastapi[all]" "uvicorn[standard]"

# Run with auto-reload
uvicorn app:app --host 0.0.0.0 --port 9000 --reload
```

### Testing

```bash
# Test the translation endpoint
python -c "
import requests
response = requests.post('http://localhost:9000/api/v1/translate',
    json={'input_sentence': 'Hello', 'source_lan': 'eng_Latn', 'target_lang': 'hin_Deva'})
print(response.json())
"
```

## 🚦 Production Deployment

### Using Gunicorn

```bash
pip install gunicorn

# Run with multiple workers; note that each worker loads its own copy of
# both models, so size the worker count to the available GPU/CPU memory
gunicorn app:app -w 4 -k uvicorn.workers.UvicornWorker -b 0.0.0.0:9000
```

### Nginx Configuration

```nginx
server {
    listen 80;
    server_name your-domain.com;

    location / {
        proxy_pass http://127.0.0.1:9000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

## 🔍 Troubleshooting

### Common Issues

1. **CUDA Out of Memory**
   - Reduce `BATCH_SIZE` in `app.py`
   - Enable quantization: `quantization = "4-bit"`
   - Use a smaller model variant (see the memory check after this list)

2. **Slow Performance**
   - Confirm a GPU is visible and actually being used
   - Enable quantization for faster inference
   - Reduce the `max_length` and `num_beams` generation parameters

3. **Model Loading Issues**
   - Check the internet connection; checkpoints are downloaded on first start
   - Verify sufficient disk space for the model downloads (a few GB in total)
   - Ensure a working CUDA installation
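
When debugging memory errors, it helps to check what the process actually sees; PyTorch exposes this directly (a sketch, assuming a CUDA build of PyTorch):

```python
# Quick GPU visibility/memory check with PyTorch (CUDA builds only).
import torch

if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info()  # bytes, for the current device
    name = torch.cuda.get_device_name(0)
    print(f"{name}: {free / 1e9:.1f} GB free of {total / 1e9:.1f} GB")
else:
    print("No CUDA device visible; the server will fall back to CPU.")
```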
336
+
337
+ ### Monitoring
338
+
339
+ ```python
340
+ # Add to your server for monitoring
341
+ import psutil
342
+ import GPUtil
343
+
344
+ @app.get("/health")
345
+ def health_check():
346
+ gpu = GPUtil.getGPUs()[0] if GPUtil.getGPUs() else None
347
+ return {
348
+ "status": "healthy",
349
+ "gpu_memory": f"{gpu.memoryUsed}/{gpu.memoryTotal}MB" if gpu else "No GPU",
350
+ "cpu_percent": psutil.cpu_percent(),
351
+ "memory_percent": psutil.virtual_memory().percent
352
+ }
353
+ ```
354
+
355
+ ## 📄 License
356
+
357
+ This project uses the IndicTrans2 models which are released under the MIT License. See the [LICENSE](https://github.com/AI4Bharat/IndicTrans2/blob/main/LICENSE) file for details.
358
+
359
+ ## 🤝 Contributing
360
+
361
+ 1. Fork the repository
362
+ 2. Create your feature branch (`git checkout -b feature/amazing-feature`)
363
+ 3. Commit your changes (`git commit -m 'Add some amazing feature'`)
364
+ 4. Push to the branch (`git push origin feature/amazing-feature`)
365
+ 5. Open a Pull Request
366
+
367
+ ## 🙏 Acknowledgments
368
+
369
+ - [AI4Bharat](https://ai4bharat.iitm.ac.in/) for the IndicTrans2 models
370
+ - [Hugging Face](https://huggingface.co/) for model hosting and transformers library
371
+ - [FastAPI](https://fastapi.tiangolo.com/) for the excellent web framework
372
+
373
+ ## 📚 Citation
374
+
375
+ If you use this server in your research, please cite the IndicTrans2 paper:
376
+
377
+ ```bibtex
378
+ @article{gala2023indictrans,
379
+ title={IndicTrans2: Towards High-Quality and Accessible Machine Translation Models for all 22 Scheduled Indian Languages},
380
+ author={Jay Gala and Pranjal A Chitale and A K Raghavan and Varun Gumma and Sumanth Doddapaneni and Aswanth Kumar M and Janki Atul Nawale and Anupama Sujatha and Ratish Puduppully and Vivek Raghavan and Pratyush Kumar and Mitesh M Khapra and Raj Dabre and Anoop Kunchukuttan},
381
+ journal={Transactions on Machine Learning Research},
382
+ issn={2835-8856},
383
+ year={2023},
384
+ url={https://openreview.net/forum?id=vfT4YuzAYA},
385
+ }
386
+ ```
387
+
388
+ ## 🔗 Links
389
+
390
+ - [IndicTrans2 GitHub](https://github.com/AI4Bharat/IndicTrans2)
391
+ - [IndicTrans2 Paper](https://arxiv.org/abs/2305.16307)
392
+ - [AI4Bharat Website](https://ai4bharat.iitm.ac.in/)
393
+ - [Demo](https://models.ai4bharat.org/#/nmt/v2)
394
+ - [Colab Notebook](https://colab.research.google.com/github/AI4Bharat/IndicTrans2/blob/main/huggingface_interface/colab_inference.ipynb)
395
+
app.py ADDED
@@ -0,0 +1,196 @@
import signal
import sys
import time

import torch
import uvicorn
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, BitsAndBytesConfig

from IndicTransToolkit.processor import IndicProcessor

BATCH_SIZE = 4
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
quantization = None  # None, "4-bit", or "8-bit"


def initialize_model_and_tokenizer(ckpt_dir, quantization):
    if quantization == "4-bit":
        qconfig = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_use_double_quant=True,
            bnb_4bit_compute_dtype=torch.bfloat16,
        )
    elif quantization == "8-bit":
        qconfig = BitsAndBytesConfig(
            load_in_8bit=True,
            bnb_8bit_use_double_quant=True,
            bnb_8bit_compute_dtype=torch.bfloat16,
        )
    else:
        qconfig = None

    tokenizer = AutoTokenizer.from_pretrained(ckpt_dir, trust_remote_code=True)
    model = AutoModelForSeq2SeqLM.from_pretrained(
        ckpt_dir,
        trust_remote_code=True,
        low_cpu_mem_usage=True,
        quantization_config=qconfig,
    )

    if qconfig is None:
        model = model.to(DEVICE)
        # Half precision only makes sense on CUDA
        if DEVICE == "cuda":
            model.half()

    model.eval()

    return tokenizer, model


def batch_translate(input_sentences, src_lang, tgt_lang, model, tokenizer, ip):
    translations = []
    for i in range(0, len(input_sentences), BATCH_SIZE):
        batch = input_sentences[i : i + BATCH_SIZE]

        # Preprocess the batch and extract entity mappings
        batch = ip.preprocess_batch(batch, src_lang=src_lang, tgt_lang=tgt_lang)

        # Tokenize the batch and generate input encodings
        inputs = tokenizer(
            batch,
            truncation=True,
            padding="longest",
            return_tensors="pt",
            return_attention_mask=True,
        ).to(DEVICE)

        # Generate translations using the model
        with torch.no_grad():
            generated_tokens = model.generate(
                **inputs,
                use_cache=True,
                min_length=0,
                max_length=256,
                num_beams=4,
                num_return_sequences=1,
            )

        # Decode the generated tokens into text
        generated_tokens = tokenizer.batch_decode(
            generated_tokens,
            skip_special_tokens=True,
            clean_up_tokenization_spaces=True,
        )

        # Postprocess the translations, including entity replacement
        translations += ip.postprocess_batch(generated_tokens, lang=tgt_lang)

        del inputs
        if torch.cuda.is_available():
            torch.cuda.empty_cache()

    return translations


# Distilled 200M checkpoints; swap in the 1B variants
# (e.g. "ai4bharat/indictrans2-en-indic-1B") for higher quality
en_indic_ckpt_dir = "ai4bharat/indictrans2-en-indic-dist-200M"
en_indic_tokenizer, en_indic_model = initialize_model_and_tokenizer(en_indic_ckpt_dir, quantization)

indic_en_ckpt_dir = "ai4bharat/indictrans2-indic-en-dist-200M"
indic_en_tokenizer, indic_en_model = initialize_model_and_tokenizer(indic_en_ckpt_dir, quantization)

ip = IndicProcessor(inference=True)


app = FastAPI()


class Translate(BaseModel):
    input_sentence: str
    source_lan: str  # kept as-is: clients already send this field name
    target_lang: str


lang_list = [
    "eng_Latn",  # English (Latin script)
    "ben_Beng",  # Bengali
    "pan_Guru",  # Punjabi
    "asm_Beng",  # Assamese
    "gom_Deva",  # Konkani
    "guj_Gujr",  # Gujarati
    "hin_Deva",  # Hindi
    "kan_Knda",  # Kannada
    "mal_Mlym",  # Malayalam
    "ory_Orya",  # Odia
    "tam_Taml",  # Tamil
    "tel_Telu",  # Telugu
]


@app.post("/api/v1/translate")
def translate(req: Translate):
    start_time = time.time()
    if req.source_lan not in lang_list or req.target_lang not in lang_list:
        return {
            "message": "Not a valid dialect",
            "translation": None,
        }

    # Route by target language: Indic→English pair when the target is
    # English, English→Indic pair otherwise
    if req.target_lang == "eng_Latn":
        model, tokenizer = indic_en_model, indic_en_tokenizer
    else:
        model, tokenizer = en_indic_model, en_indic_tokenizer

    translation = batch_translate(
        [req.input_sentence],  # batch_translate expects a list
        src_lang=req.source_lan,
        tgt_lang=req.target_lang,
        model=model,
        tokenizer=tokenizer,
        ip=ip,
    )

    processing_time = round(time.time() - start_time, 2)
    return {
        "message": f"translation processed successfully in {processing_time} seconds",
        "translation": translation[0],
    }


@app.get("/health")
def health_check():
    return {
        "status": "healthy",
        "gpu_available": torch.cuda.is_available(),
        "gpu_count": torch.cuda.device_count() if torch.cuda.is_available() else 0,
    }


# Signal handler for graceful shutdown
def handle_sigterm(signum, frame):
    print("Received SIGTERM. Releasing models and exiting...")

    # Drop model references so GPU memory can be reclaimed
    global en_indic_tokenizer, en_indic_model, indic_en_tokenizer, indic_en_model
    del en_indic_tokenizer, en_indic_model
    del indic_en_tokenizer, indic_en_model

    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    sys.exit(0)


# Register the signal handler
signal.signal(signal.SIGTERM, handle_sigterm)


if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=9000)
requirements.txt ADDED
@@ -0,0 +1,10 @@
fastapi==0.104.1
uvicorn[standard]==0.24.0
pydantic==2.5.0
psutil==5.9.6
transformers==4.35.0
accelerate==0.24.1
tokenizers==0.15.0
sentencepiece==0.1.99
sacremoses==0.0.53
numpy==1.24.3
# app.py imports torch directly; it is pulled in transitively by accelerate,
# but pinning it here makes the dependency explicit
torch>=2.0.0