---
title: Sema Translation API
emoji: 🌍
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
license: mit
short_description: Enterprise-grade translation API with 200+ language support
---
# Sema Translation API 🌍
Enterprise-grade translation API supporting 200+ languages with automatic language detection, rate limiting, usage tracking, and comprehensive monitoring. Built with FastAPI and powered by the consolidated `sematech/sema-utils` model repository.
## 🚀 Features
### Core Translation
- **Automatic Language Detection**: Detects source language automatically if not provided
- **200+ Language Support**: Supports all FLORES-200 language codes
- **High-Performance Translation**: Uses CTranslate2 for optimized inference
- **Character Count Tracking**: Monitors usage for billing and analytics
### Enterprise Features
- **Rate Limiting**: 60 requests/minute, 1000 requests/hour per IP
- **Request Tracking**: Unique request IDs for debugging and monitoring
- **Usage Analytics**: Comprehensive metrics with Prometheus integration
- **Structured Logging**: JSON-formatted logs for easy parsing
- **Health Monitoring**: Detailed health checks for system monitoring
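
To make the two per-IP limits concrete, here is a minimal sliding-window sketch using only the standard library. It is an illustration of how a 60/minute and 1000/hour limit can combine, not the service's actual rate-limiting code; the class name `SlidingWindowLimiter` and its parameters are hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Illustrative per-client limiter; not the service's actual implementation."""

    def __init__(self, limits=((60, 60.0), (1000, 3600.0)), clock=time.monotonic):
        # Each limit is (max_requests, window_seconds).
        self.limits = limits
        self.clock = clock
        self.longest_window = max(window for _, window in limits)
        self.hits = defaultdict(deque)  # client IP -> timestamps of accepted requests

    def allow(self, client_ip: str) -> bool:
        now = self.clock()
        hits = self.hits[client_ip]
        # Forget requests that are older than the longest window.
        while hits and now - hits[0] >= self.longest_window:
            hits.popleft()
        # Reject if any window is already full.
        for max_requests, window in self.limits:
            recent = sum(1 for t in hits if now - t < window)
            if recent >= max_requests:
                return False  # a server would typically answer with HTTP 429 here
        hits.append(now)
        return True
```

A request is accepted only when both windows have room, so a client that sends 60 requests in a burst must wait for the minute window to clear even though the hourly budget is far from exhausted.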
### Security & Reliability
- **Input Validation**: Comprehensive request validation with Pydantic
- **Error Handling**: Graceful error handling with detailed error responses
- **CORS Support**: Configurable cross-origin resource sharing
- **Future-Ready Auth**: Designed for Supabase authentication integration
### API Quality
- **OpenAPI Documentation**: Auto-generated Swagger UI and ReDoc
- **Type Safety**: Full TypeScript-compatible API schemas
- **Production Ready**: Follows FastAPI production best practices
## πŸ“ Project Structure
```
app/
├── __init__.py
├── main.py                      # Application entry point
├── api/                         # API route definitions
│   ├── __init__.py
│   └── v1/                      # Versioned API routes
│       ├── __init__.py
│       └── endpoints.py         # Route handlers
├── core/                        # Core configuration
│   ├── __init__.py
│   ├── config.py                # Settings and configuration
│   ├── logging.py               # Logging configuration
│   └── metrics.py               # Prometheus metrics
├── middleware/                  # Custom middleware
│   ├── __init__.py
│   └── request_middleware.py    # Request tracking middleware
├── models/                      # Data models
│   ├── __init__.py
│   └── schemas.py               # Pydantic models
├── services/                    # Business logic
│   ├── __init__.py
│   └── translation.py           # Translation service
└── utils/                       # Utility functions
    ├── __init__.py
    └── helpers.py               # Helper functions
```
## 🔗 API Endpoints
### Health & Monitoring
- **`GET /`** - Interactive Swagger UI documentation
- **`GET /status`** - Basic health check
- **`GET /health`** - Detailed health monitoring
- **`GET /metrics`** - Prometheus metrics
- **`GET /redoc`** - ReDoc documentation
### Translation
- **`POST /translate`** - Main translation endpoint
- **`POST /api/v1/translate`** - Versioned translation endpoint
### Request/Response Examples
**Translation Request** (`source_language` is optional; when omitted, the source language is detected automatically):
```json
{
  "text": "Habari ya asubuhi",
  "target_language": "eng_Latn",
  "source_language": "swh_Latn"
}
```
**Translation Response:**
```json
{
  "translated_text": "Good morning",
  "source_language": "swh_Latn",
  "target_language": "eng_Latn",
  "inference_time": 0.234,
  "character_count": 17,
  "timestamp": "Friday | 2024-06-21 | 14:30:25",
  "request_id": "550e8400-e29b-41d4-a716-446655440000"
}
```
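
On the client side, the response fields shown above can be loaded into a typed structure. The following is a minimal sketch, assuming the response always carries exactly these seven fields; the `TranslationResult` class is hypothetical and not part of the API.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TranslationResult:
    """Hypothetical client-side container for a translation response."""

    translated_text: str
    source_language: str
    target_language: str
    inference_time: float
    character_count: int
    timestamp: str
    request_id: str

    @classmethod
    def from_json(cls, payload: dict) -> "TranslationResult":
        # Keep only the documented fields so that extra keys do not raise.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in payload.items() if k in known})


result = TranslationResult.from_json({
    "translated_text": "Good morning",
    "source_language": "swh_Latn",
    "target_language": "eng_Latn",
    "inference_time": 0.234,
    "character_count": 17,
    "timestamp": "Friday | 2024-06-21 | 14:30:25",
    "request_id": "550e8400-e29b-41d4-a716-446655440000",
})
print(result.translated_text)
```

A missing field raises `TypeError` at construction time, which surfaces a schema mismatch early instead of deep inside application code.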
## Language Codes
This API uses FLORES-200 language codes. Some common examples:
- `eng_Latn` - English
- `swh_Latn` - Swahili
- `kik_Latn` - Kikuyu
- `luo_Latn` - Luo
- `fra_Latn` - French
- `spa_Latn` - Spanish
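
Each code pairs a three-letter language code with a four-letter script tag, as in the examples above. A client can catch obvious typos before sending a request with a simple shape check. This is a sketch only: the helper name `looks_like_flores_code` is hypothetical, and a code that passes the check is not guaranteed to be supported by the model.

```python
import re

# Three lowercase letters, an underscore, then a capitalised four-letter script tag.
FLORES_CODE = re.compile(r"[a-z]{3}_[A-Z][a-z]{3}")


def looks_like_flores_code(code: str) -> bool:
    """Return True if `code` has the shape of a FLORES-200 code such as `eng_Latn`."""
    return FLORES_CODE.fullmatch(code) is not None


for candidate in ("eng_Latn", "swh_Latn", "en", "ENG_latn"):
    print(candidate, looks_like_flores_code(candidate))
```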
## Usage Examples
### Python
```python
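import requests

response = requests.post(
    "https://your-space-url/translate",  # replace with your Space URL
    json={
        "text": "Habari ya asubuhi",
        "target_language": "eng_Latn",  # source language is auto-detected
    },
    timeout=30,
)
response.raise_for_status()  # raise on 4xx/5xx responses
print(response.json())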
```
### cURL
```bash
curl -X POST "https://your-space-url/translate" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Wĩ mwega?",
    "source_language": "kik_Latn",
    "target_language": "eng_Latn"
  }'
```
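
### Python with retry on rate limiting
Because the API is rate limited, a client may want to back off and retry when it is throttled. The sketch below uses only the standard library and assumes the service answers HTTP 429 when a limit is exceeded, which this README does not state explicitly. The helper names `post_json` and `translate_with_retry`, the retry count, and the delays are illustrative choices, and the URL is the same placeholder as above.

```python
import json
import time
import urllib.error
import urllib.request

API_URL = "https://your-space-url/translate"  # placeholder; replace with your Space URL


def post_json(url: str, payload: dict) -> dict:
    """POST a JSON body and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def translate_with_retry(text, target_language, source_language=None,
                         send=post_json, max_attempts=4, base_delay=1.0,
                         sleep=time.sleep):
    """Translate `text`, retrying with exponential backoff on HTTP 429 (assumed)."""
    payload = {"text": text, "target_language": target_language}
    if source_language is not None:
        payload["source_language"] = source_language  # optional field
    for attempt in range(max_attempts):
        try:
            return send(API_URL, payload)
        except urllib.error.HTTPError as error:
            # Re-raise anything other than 429, and 429 on the final attempt.
            if error.code != 429 or attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
```

Calling `translate_with_retry("Habari ya asubuhi", "eng_Latn")` sends the same request as the Python example above. The `send` and `sleep` parameters exist so the retry logic can be exercised without network access.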
## Model Information
This API uses models from the consolidated `sematech/sema-utils` repository:
- **Translation Model**: `sematrans-3.3B` (CTranslate2 optimized)
- **Language Detection**: `lid218e.bin` (FastText)
- **Tokenization**: `spm.model` (SentencePiece)
## API Documentation
Once the Space is running, visit the root URL (`/`) for the interactive Swagger UI, or `/redoc` for the ReDoc documentation.
---
Created by Lewis Kamau Kimaru | Sema AI