ShivamPansuriya commited on
Commit
74708f4
Β·
1 Parent(s): c927402

Add application file

Browse files
DEPLOYMENT.md ADDED
@@ -0,0 +1,277 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Deployment Guide
2
+
3
+ This guide covers deploying the Video Transcription Service to Render.com's free tier.
4
+
5
+ ## Prerequisites
6
+
7
+ 1. **GitHub Account**: Your code needs to be in a GitHub repository
8
+ 2. **Render Account**: Sign up at [render.com](https://render.com) (free)
9
+ 3. **Git**: Installed on your local machine
10
+
11
+ ## Step-by-Step Deployment
12
+
13
+ ### 1. Prepare Your Repository
14
+
15
+ ```bash
16
+ # Initialize git repository (if not already done)
17
+ git init
18
+
19
+ # Add all files
20
+ git add .
21
+
22
+ # Commit changes
23
+ git commit -m "Initial commit - Video Transcription Service"
24
+
25
+ # Add your GitHub repository as remote
26
+ git remote add origin https://github.com/yourusername/your-repo-name.git
27
+
28
+ # Push to GitHub
29
+ git push -u origin main
30
+ ```
31
+
32
+ ### 2. Deploy to Render
33
+
34
+ 1. **Go to Render Dashboard**
35
+ - Visit [dashboard.render.com](https://dashboard.render.com)
36
+ - Sign in with your GitHub account
37
+
38
+ 2. **Create New Web Service**
39
+ - Click "New +" button
40
+ - Select "Web Service"
41
+ - Choose "Build and deploy from a Git repository"
42
+
43
+ 3. **Connect Repository**
44
+ - Select your GitHub repository
45
+ - Click "Connect"
46
+
47
+ 4. **Configure Service**
48
+ - **Name**: `video-transcription-service` (or your preferred name)
49
+ - **Environment**: `Docker`
50
+ - **Region**: Choose closest to your users
51
+ - **Branch**: `main`
52
+ - **Dockerfile Path**: `./Dockerfile`
53
+
54
+ 5. **Advanced Settings**
55
+ - **Plan**: Free (automatically selected)
56
+ - **Environment Variables**: None needed (auto-configured)
57
+ - **Health Check Path**: `/health`
58
+ - **Auto-Deploy**: Yes (recommended)
59
+
60
+ 6. **Deploy**
61
+ - Click "Create Web Service"
62
+ - Render will start building your service
63
+
64
+ ### 3. Monitor Deployment
65
+
66
+ 1. **Build Process**
67
+ - Watch the build logs in real-time
68
+ - First build takes 5-10 minutes (installing dependencies)
69
+ - Look for "Build successful" message
70
+
71
+ 2. **Deployment Status**
72
+ - Service will show "Live" when ready
73
+ - Initial startup may take 30-60 seconds (loading AI model)
74
+
75
+ 3. **Test Your Service**
76
+ - Your service URL: `https://your-service-name.onrender.com`
77
+ - API docs: `https://your-service-name.onrender.com/docs`
78
+ - Health check: `https://your-service-name.onrender.com/health`
79
+
80
+ ## Configuration Details
81
+
82
+ ### Automatic Configuration
83
+
84
+ The service is pre-configured for Render's free tier:
85
+
86
+ - **Port**: Automatically uses `$PORT` environment variable
87
+ - **Memory**: Optimized for 512MB limit
88
+ - **CPU**: Efficient processing for shared CPU
89
+ - **Storage**: No persistent storage (in-memory only)
90
+ - **Health Checks**: Configured at `/health` endpoint
91
+
92
+ ### Free Tier Limitations
93
+
94
+ **Resource Limits:**
95
+ - 512MB RAM
96
+ - Shared CPU
97
+ - 750 hours/month (service sleeps after 15min inactivity)
98
+ - No persistent storage
99
+
100
+ **Service Behavior:**
101
+ - **Cold Starts**: 30-60 seconds after sleep
102
+ - **File Size**: 100MB maximum per video
103
+ - **Processing**: Sequential (one video at a time)
104
+ - **Storage**: 3.5 hours maximum per transcription
105
+
106
+ ## Troubleshooting
107
+
108
+ ### Common Build Issues
109
+
110
+ 1. **Out of Memory During Build**
111
+ ```
112
+ Error: Process killed (out of memory)
113
+ ```
114
+ - This is rare but can happen with large dependencies
115
+ - Try pushing smaller commits
116
+ - Contact Render support if persistent
117
+
118
+ 2. **FFmpeg Installation Failed**
119
+ ```
120
+ E: Unable to locate package ffmpeg
121
+ ```
122
+ - Check Dockerfile has correct apt-get commands
123
+ - Ensure base image is correct (python:3.11-slim)
124
+
125
+ 3. **Python Package Installation Failed**
126
+ ```
127
+ ERROR: Could not install packages
128
+ ```
129
+ - Check requirements.txt syntax
130
+ - Ensure all package names are correct
131
+ - Try removing version pins if needed
132
+
133
+ ### Runtime Issues
134
+
135
+ 1. **Service Won't Start**
136
+ - Check runtime logs in Render dashboard
137
+ - Look for Python import errors
138
+ - Verify all dependencies are installed
139
+
140
+ 2. **Health Check Failing**
141
+ ```
142
+ Health check failed
143
+ ```
144
+ - Service might be taking too long to start
145
+ - Check if Whisper model is loading correctly
146
+ - Verify `/health` endpoint is accessible
147
+
148
+ 3. **Out of Memory at Runtime**
149
+ ```
150
+ Process killed (signal 9)
151
+ ```
152
+ - Large video files can cause this
153
+ - Reduce MAX_FILE_SIZE in config.py
154
+ - Use smaller Whisper model (tiny instead of base)
155
+
156
+ 4. **Slow Processing**
157
+ - First request loads AI model (30-60 seconds)
158
+ - Subsequent requests are faster
159
+ - Consider using smaller model for speed
160
+
161
+ ### Service Sleeping
162
+
163
+ **Free Tier Behavior:**
164
+ - Service sleeps after 15 minutes of inactivity
165
+ - First request after sleep takes 30-60 seconds
166
+ - This is normal for free tier
167
+
168
+ **Solutions:**
169
+ - Upgrade to paid plan for always-on service
170
+ - Use external monitoring to keep service awake
171
+ - Inform users about potential cold start delays
172
+
173
+ ## Monitoring and Maintenance
174
+
175
+ ### Logs
176
+
177
+ Access logs in Render dashboard:
178
+ 1. Go to your service
179
+ 2. Click "Logs" tab
180
+ 3. Monitor for errors and performance
181
+
182
+ ### Metrics
183
+
184
+ Monitor service health:
185
+ - Response times
186
+ - Error rates
187
+ - Memory usage
188
+ - Active transcriptions
189
+
190
+ ### Updates
191
+
192
+ Deploy updates automatically:
193
+ 1. Push changes to GitHub
194
+ 2. Render auto-deploys from main branch
195
+ 3. Monitor deployment in dashboard
196
+
197
+ ## Scaling Considerations
198
+
199
+ ### Free Tier Optimization
200
+
201
+ **Current Setup:**
202
+ - Single instance
203
+ - 512MB RAM
204
+ - Shared CPU
205
+ - In-memory storage
206
+
207
+ **Optimization Tips:**
208
+ - Use smaller Whisper model for speed
209
+ - Implement request queuing
210
+ - Add request size validation
211
+ - Monitor memory usage
212
+
213
+ ### Upgrade Path
214
+
215
+ **Paid Plans Offer:**
216
+ - More RAM (1GB+)
217
+ - Dedicated CPU
218
+ - Always-on service
219
+ - Multiple instances
220
+ - Persistent storage options
221
+
222
+ ## Security
223
+
224
+ ### Current Security Features
225
+
226
+ - Rate limiting (10 requests/minute)
227
+ - File size validation
228
+ - File type validation
229
+ - No persistent file storage
230
+ - Automatic cleanup
231
+
232
+ ### Additional Security (Optional)
233
+
234
+ - API key authentication
235
+ - HTTPS only (automatic on Render)
236
+ - Request logging
237
+ - IP whitelisting
238
+ - CORS configuration
239
+
240
+ ## Support
241
+
242
+ ### Getting Help
243
+
244
+ 1. **Render Support**
245
+ - Free tier includes community support
246
+ - Check Render documentation
247
+ - Use Render community forum
248
+
249
+ 2. **Service Issues**
250
+ - Check service logs first
251
+ - Verify configuration
252
+ - Test with smaller files
253
+
254
+ 3. **API Issues**
255
+ - Use `/docs` endpoint for testing
256
+ - Check request format
257
+ - Verify file types and sizes
258
+
259
+ ### Useful Commands
260
+
261
+ ```bash
262
+ # Test your deployed service
263
+ curl https://your-service.onrender.com/health
264
+
265
+ # Upload test video
266
+ curl -X POST "https://your-service.onrender.com/transcribe" \
267
268
+
269
+ # Check transcription status
270
+ curl "https://your-service.onrender.com/transcribe/1"
271
+ ```
272
+
273
+ ---
274
+
275
+ **Your service is now live and ready to transcribe videos! πŸŽ‰**
276
+
277
+ Share your service URL with users or integrate it into your applications.
Dockerfile ADDED
@@ -0,0 +1,41 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Use Python 3.11 slim image for better performance
2
+ FROM python:3.11-slim
3
+
4
+ # Set working directory
5
+ WORKDIR /app
6
+
7
+ # Install system dependencies
8
+ RUN apt-get update && apt-get install -y \
9
+ ffmpeg \
10
+ git \
11
+ && rm -rf /var/lib/apt/lists/*
12
+
13
+ # Copy requirements first for better caching
14
+ COPY requirements.txt .
15
+
16
+ # Install Python dependencies with NumPy compatibility fix
17
+ RUN pip install --no-cache-dir "numpy<2.0.0" && \
18
+ pip install --no-cache-dir -r requirements.txt
19
+
20
+ # Set environment variables for optimal performance
21
+ ENV WHISPER_MODEL=tiny
22
+ ENV MODEL_PRELOAD=true
23
+ ENV DEBUG=false
24
+ ENV PYTHONUNBUFFERED=1
25
+
26
+ # Copy application code
27
+ COPY . .
28
+
29
+ # Create non-root user for security
30
+ RUN useradd -m -u 1000 appuser && chown -R appuser:appuser /app
31
+ USER appuser
32
+
33
+ # Expose port
34
+ EXPOSE 8000
35
+
36
+ # Health check
37
+ HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \
38
+ CMD python -c "import requests; requests.get('http://localhost:8000/health')"
39
+
40
+ # Run the application with robust startup
41
+ CMD ["python", "start_robust.py"]
HF_DEPLOYMENT_SUMMARY.md ADDED
@@ -0,0 +1,221 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # πŸŽ‰ Hugging Face Spaces Deployment - Complete Solution
2
+
3
+ Your Video Transcription Service is now ready for deployment to Hugging Face Spaces with **full API compatibility** and enhanced features!
4
+
5
+ ## βœ… **What You Get**
6
+
7
+ ### **🌐 Dual Interface**
8
+ - **Beautiful Gradio Web UI** - User-friendly interface for manual uploads
9
+ - **Full REST API** - Programmatic access identical to your current FastAPI service
10
+ - **Simultaneous Access** - Both interfaces work at the same time
11
+
12
+ ### **πŸš€ Enhanced Features**
13
+ - **Higher Resource Limits** - 16GB RAM vs 512MB on Render
14
+ - **Better Performance** - Dedicated CPU cores
15
+ - **Larger File Support** - Up to 200MB videos
16
+ - **GPU Option Available** - For heavy workloads
17
+ - **Community Integration** - Easy sharing and discovery
18
+
19
+ ### **πŸ”§ Preserved Functionality**
20
+ - βœ… All existing API endpoints (`/api/transcribe`, `/api/transcribe/{id}`, `/api/health`)
21
+ - βœ… Multiple video format support
22
+ - βœ… Language detection/specification
23
+ - βœ… Progress tracking and logging
24
+ - βœ… Error handling
25
+ - βœ… Automatic cleanup after 3-4 hours
26
+ - βœ… Rate limiting and validation
27
+
28
+ ## πŸ“ **Deployment Package Ready**
29
+
30
+ All files are prepared in `hf_spaces_deploy/`:
31
+
32
+ ```
33
+ hf_spaces_deploy/
34
+ β”œβ”€β”€ app.py # Gradio + FastAPI hybrid interface
35
+ β”œβ”€β”€ requirements.txt # HF Spaces optimized dependencies
36
+ β”œβ”€β”€ README.md # HF Spaces documentation with API examples
37
+ β”œβ”€β”€ config.py # HF-optimized configuration
38
+ β”œβ”€β”€ models.py # Data models
39
+ β”œβ”€β”€ storage.py # Storage management
40
+ β”œβ”€β”€ transcription_service.py # Core transcription logic
41
+ β”œβ”€β”€ logging_config.py # Logging configuration
42
+ └── restart_handler.py # Performance optimization
43
+ ```
44
+
45
+ ## πŸš€ **Quick Deployment Steps**
46
+
47
+ ### **1. Create Hugging Face Space**
48
+ - Go to https://huggingface.co/spaces
49
+ - Click "Create new Space"
50
+ - Name: `video-transcription`
51
+ - SDK: **Gradio**
52
+ - Visibility: **Public** (for API access)
53
+
54
+ ### **2. Deploy via Git**
55
+ ```bash
56
+ cd hf_spaces_deploy
57
+ git init
58
+ git add .
59
+ git commit -m "Deploy Video Transcription Service"
60
+ git remote add origin https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
61
+ git push -u origin main
62
+ ```
63
+
64
+ ### **3. Wait for Build**
65
+ - Monitor logs in HF Spaces dashboard
66
+ - Build takes 5-10 minutes
67
+ - Model downloads automatically
68
+
69
+ ## 🌐 **API Compatibility Confirmed**
70
+
71
+ ### **Identical Endpoints**
72
+ Your existing API calls work unchanged:
73
+
74
+ ```python
75
+ # OLD (Render.com)
76
+ BASE_URL = "https://your-service.onrender.com"
77
+
78
+ # NEW (HF Spaces) - Just change the URL!
79
+ BASE_URL = "https://username-spacename.hf.space"
80
+
81
+ # All endpoints remain the same:
82
+ POST /api/transcribe
83
+ GET /api/transcribe/{id}
84
+ GET /api/health
85
+ ```
86
+
87
+ ### **Example API Usage**
88
+ ```python
89
+ import requests
90
+
91
+ # Upload video (same as before)
92
+ with open('video.mp4', 'rb') as f:
93
+ response = requests.post(
94
+ 'https://username-spacename.hf.space/api/transcribe',
95
+ files={'file': f},
96
+ data={'language': 'en'}
97
+ )
98
+
99
+ transcription_id = response.json()['id']
100
+
101
+ # Check status (same as before)
102
+ result = requests.get(f'https://username-spacename.hf.space/api/transcribe/{transcription_id}')
103
+ print(result.json())
104
+ ```
105
+
106
+ ### **Enhanced API Client**
107
+ Use the new HF-optimized client:
108
+
109
+ ```python
110
+ from hf_api_client import HFTranscriptionClient
111
+
112
+ client = HFTranscriptionClient("https://username-spacename.hf.space")
113
+ result = client.transcribe_and_wait("video.mp4")
114
+ print(result['text'])
115
+ ```
116
+
117
+ ## 🎯 **Key Advantages**
118
+
119
+ | Feature | Render.com | Hugging Face Spaces |
120
+ |---------|------------|-------------------|
121
+ | **Memory** | 512MB | 16GB (32GB upgrade) |
122
+ | **CPU** | Shared | 2-8 vCPU dedicated |
123
+ | **File Size** | 100MB | 200MB |
124
+ | **Interface** | API only | Gradio + API |
125
+ | **GPU** | None | T4 available |
126
+ | **Community** | Limited | Built-in sharing |
127
+ | **Reliability** | Cold starts | Better uptime |
128
+
129
+ ## πŸ“Š **Testing Your Deployment**
130
+
131
+ ### **Web Interface Test**
132
+ 1. Visit: `https://username-spacename.hf.space`
133
+ 2. Upload a test video
134
+ 3. Verify transcription works
135
+ 4. Check status updates
136
+
137
+ ### **API Test**
138
+ ```bash
139
+ # Health check
140
+ curl "https://username-spacename.hf.space/api/health"
141
+
142
+ # Upload test
143
+ curl -X POST "https://username-spacename.hf.space/api/transcribe" \
144
145
+ -F "language=en"
146
+
147
+ # Status check
148
+ curl "https://username-spacename.hf.space/api/transcribe/1"
149
+ ```
150
+
151
+ ### **Python Client Test**
152
+ ```bash
153
+ python hf_api_client.py https://username-spacename.hf.space test_video.mp4
154
+ ```
155
+
156
+ ## πŸ”§ **Performance Optimization**
157
+
158
+ ### **Hardware Options**
159
+ - **CPU basic** (free) - 2 vCPU, 16GB RAM
160
+ - **CPU upgrade** ($0.05/hour) - 8 vCPU, 32GB RAM
161
+ - **GPU T4** ($0.60/hour) - For heavy workloads
162
+
163
+ ### **Model Selection**
164
+ ```python
165
+ # Environment variables in Space settings:
166
+ WHISPER_MODEL=tiny # Fastest (39MB)
167
+ WHISPER_MODEL=base # Balanced (74MB) - Default
168
+ WHISPER_MODEL=small # Best quality (244MB)
169
+ ```
170
+
171
+ ## πŸŽ‰ **Migration Benefits**
172
+
173
+ ### **Immediate Improvements**
174
+ - βœ… **32x More Memory** (16GB vs 512MB)
175
+ - βœ… **Dedicated CPU** vs shared
176
+ - βœ… **2x Larger Files** (200MB vs 100MB)
177
+ - βœ… **Beautiful Web Interface** + API
178
+ - βœ… **Better Reliability** and uptime
179
+ - βœ… **Community Features** and sharing
180
+
181
+ ### **Future Possibilities**
182
+ - πŸš€ **GPU Acceleration** for faster processing
183
+ - πŸ“ˆ **Scaling Options** with better hardware
184
+ - 🌐 **Community Integration** and discovery
185
+ - πŸ”§ **Advanced Features** with HF ecosystem
186
+
187
+ ## πŸ“‹ **Next Steps**
188
+
189
+ 1. **Deploy to HF Spaces** using the prepared files
190
+ 2. **Test both interfaces** (web + API)
191
+ 3. **Update your applications** with new URLs
192
+ 4. **Monitor performance** and optimize as needed
193
+ 5. **Share with community** if desired
194
+
195
+ ## 🎯 **Success Criteria**
196
+
197
+ Your migration is successful when:
198
+ - [ ] βœ… Web interface loads and works
199
+ - [ ] βœ… API endpoints respond correctly
200
+ - [ ] βœ… Video transcription completes successfully
201
+ - [ ] βœ… Both small and large files process
202
+ - [ ] βœ… Multiple concurrent requests work
203
+ - [ ] βœ… Error handling functions properly
204
+ - [ ] βœ… Automatic cleanup operates
205
+ - [ ] βœ… Performance meets or exceeds Render.com
206
+
207
+ ---
208
+
209
+ ## 🎊 **Congratulations!**
210
+
211
+ You now have a **production-ready Video Transcription Service** on Hugging Face Spaces with:
212
+
213
+ - 🌐 **Beautiful Gradio interface** for users
214
+ - πŸ”— **Full API compatibility** for applications
215
+ - πŸš€ **Enhanced performance** and reliability
216
+ - πŸ“ˆ **Scalability options** for growth
217
+ - 🎯 **All existing features** preserved and improved
218
+
219
+ **Your service will be live at: `https://username-spacename.hf.space`**
220
+
221
+ **Ready to deploy? Follow the steps in `HF_MIGRATION_GUIDE.md`! πŸš€**
HF_MIGRATION_GUIDE.md ADDED
@@ -0,0 +1,301 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # πŸš€ Hugging Face Spaces Migration Guide
2
+
3
+ Complete guide to migrate your Video Transcription Service from Render.com to Hugging Face Spaces with enhanced features and API access.
4
+
5
+ ## 🎯 **Why Hugging Face Spaces?**
6
+
7
+ ### **Advantages over Render.com:**
8
+ - βœ… **Higher Resource Limits**: More memory and CPU
9
+ - βœ… **Better Performance**: Optimized for ML workloads
10
+ - βœ… **Free GPU Access**: Available for intensive tasks
11
+ - βœ… **Gradio Integration**: Beautiful web interface
12
+ - βœ… **Community Features**: Easy sharing and discovery
13
+ - βœ… **Persistent Storage**: Better file handling
14
+ - βœ… **API + Web Interface**: Both available simultaneously
15
+
16
+ ## πŸ“‹ **Pre-Migration Checklist**
17
+
18
+ - [ ] Hugging Face account created
19
+ - [ ] Git installed locally
20
+ - [ ] Python environment ready
21
+ - [ ] Test video files prepared
22
+ - [ ] Current service functionality documented
23
+
24
+ ## πŸ› οΈ **Step 1: Prepare Deployment Files**
25
+
26
+ Run the automated preparation script:
27
+
28
+ ```bash
29
+ python deploy_to_hf.py
30
+ ```
31
+
32
+ This creates a `hf_spaces_deploy/` directory with all necessary files:
33
+ - `app.py` - Gradio + FastAPI hybrid interface
34
+ - `requirements.txt` - HF Spaces optimized dependencies
35
+ - `README.md` - HF Spaces documentation
36
+ - `config.py` - HF-optimized configuration
37
+ - All supporting modules
38
+
39
+ ## 🌐 **Step 2: Create Hugging Face Space**
40
+
41
+ 1. **Go to Hugging Face Spaces**
42
+ - Visit: https://huggingface.co/spaces
43
+ - Click "Create new Space"
44
+
45
+ 2. **Configure Your Space**
46
+ - **Name**: `video-transcription` (or your choice)
47
+ - **SDK**: Select "Gradio"
48
+ - **Hardware**: Start with "CPU basic" (free)
49
+ - **Visibility**: Public (for API access) or Private
50
+
51
+ 3. **Create Space**
52
+ - Click "Create Space"
53
+ - Note your Space URL: `https://username-spacename.hf.space`
54
+
55
+ ## πŸ“€ **Step 3: Deploy to Hugging Face Spaces**
56
+
57
+ ### **Option A: Git Deployment (Recommended)**
58
+
59
+ ```bash
60
+ cd hf_spaces_deploy
61
+ git init
62
+ git add .
63
+ git commit -m "Initial deployment of Video Transcription Service"
64
+ git remote add origin https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
65
+ git push -u origin main
66
+ ```
67
+
68
+ ### **Option B: Web Upload**
69
+
70
+ 1. Go to your Space page
71
+ 2. Click "Files" tab
72
+ 3. Upload all files from `hf_spaces_deploy/`
73
+ 4. Ensure `app.py` is in the root directory
74
+
75
+ ## ⏳ **Step 4: Monitor Deployment**
76
+
77
+ 1. **Check Build Logs**
78
+ - Go to "Logs" tab in your Space
79
+ - Monitor the build process (5-10 minutes)
80
+ - Look for successful model download
81
+
82
+ 2. **Expected Log Output**
83
+ ```
84
+ πŸš€ Starting Video Transcription Service on Hugging Face Spaces
85
+ πŸ€– Loading Whisper model for Hugging Face Spaces...
86
+ βœ… Model 'base' preloaded in 45.2 seconds
87
+ πŸš€ Starting FastAPI service...
88
+ Running on local URL: http://0.0.0.0:7860
89
+ ```
90
+
91
+ 3. **Troubleshoot Issues**
92
+ - Build failures: Check requirements.txt
93
+ - Memory issues: Switch to "CPU upgrade" hardware
94
+ - Model loading issues: Try `WHISPER_MODEL=tiny`
95
+
96
+ ## βœ… **Step 5: Test Your Deployment**
97
+
98
+ ### **Test Web Interface**
99
+
100
+ 1. **Visit Your Space**
101
+ - URL: `https://username-spacename.hf.space`
102
+ - Should see Gradio interface
103
+
104
+ 2. **Upload Test Video**
105
+ - Use "Upload & Transcribe" tab
106
+ - Select a small test video (< 50MB)
107
+ - Choose language or use auto-detect
108
+ - Click "Start Transcription"
109
+
110
+ 3. **Check Results**
111
+ - Note the transcription ID
112
+ - Use "Check Status" tab to monitor progress
113
+ - Verify transcription completes successfully
114
+
115
+ ### **Test API Functionality**
116
+
117
+ 1. **Health Check**
118
+ ```bash
119
+ curl "https://username-spacename.hf.space/api/health"
120
+ ```
121
+
122
+ 2. **Upload Video via API**
123
+ ```bash
124
+ curl -X POST "https://username-spacename.hf.space/api/transcribe" \
125
+ -F "file=@test_video.mp4" \
126
+ -F "language=en"
127
+ ```
128
+
129
+ 3. **Check Status via API**
130
+ ```bash
131
+ curl "https://username-spacename.hf.space/api/transcribe/1"
132
+ ```
133
+
134
+ 4. **Use Python Client**
135
+ ```bash
136
+ python hf_api_client.py https://username-spacename.hf.space test_video.mp4
137
+ ```
138
+
139
+ ## πŸ”§ **Step 6: Optimize Performance**
140
+
141
+ ### **Hardware Upgrades**
142
+
143
+ If you experience performance issues:
144
+
145
+ 1. **Go to Space Settings**
146
+ 2. **Hardware β†’ Upgrade**
147
+ 3. **Options:**
148
+ - CPU basic (free) - 2 vCPU, 16GB RAM
149
+ - CPU upgrade ($0.05/hour) - 8 vCPU, 32GB RAM
150
+ - GPU T4 small ($0.60/hour) - For heavy workloads
151
+
152
+ ### **Model Optimization**
153
+
154
+ Adjust model size based on your needs:
155
+
156
+ ```python
157
+ # In Space settings, add environment variable:
158
+ WHISPER_MODEL=tiny # Fastest, good quality
159
+ WHISPER_MODEL=base # Balanced (default)
160
+ WHISPER_MODEL=small # Better quality, slower
161
+ ```
162
+
163
+ ## πŸ“Š **Step 7: Compare Features**
164
+
165
+ | Feature | Render.com | Hugging Face Spaces |
166
+ |---------|------------|-------------------|
167
+ | **Memory** | 512MB | 16GB (basic) / 32GB (upgrade) |
168
+ | **CPU** | Shared | 2-8 vCPU dedicated |
169
+ | **Storage** | Ephemeral | Persistent |
170
+ | **GPU** | None | T4 available |
171
+ | **Interface** | API only | Gradio + API |
172
+ | **Community** | Limited | Built-in sharing |
173
+ | **Cost** | Free tier limited | More generous free tier |
174
+
175
+ ## πŸ”„ **Step 8: Migration Validation**
176
+
177
+ ### **Functionality Checklist**
178
+
179
+ - [ ] βœ… Web interface loads correctly
180
+ - [ ] βœ… Video upload works (multiple formats)
181
+ - [ ] βœ… Language detection/selection works
182
+ - [ ] βœ… Transcription processing completes
183
+ - [ ] βœ… Results display correctly
184
+ - [ ] βœ… API endpoints respond correctly
185
+ - [ ] βœ… Status checking works
186
+ - [ ] βœ… Error handling functions
187
+ - [ ] βœ… Automatic cleanup operates
188
+ - [ ] βœ… Logging provides good visibility
189
+
190
+ ### **Performance Validation**
191
+
192
+ - [ ] βœ… Model loads within 2-3 minutes
193
+ - [ ] βœ… First transcription completes successfully
194
+ - [ ] βœ… Subsequent transcriptions are faster
195
+ - [ ] βœ… Large files (up to 200MB) process correctly
196
+ - [ ] βœ… Multiple concurrent requests handled
197
+ - [ ] βœ… Memory usage stays within limits
198
+
199
+ ## 🌐 **Step 9: Update Your Applications**
200
+
201
+ ### **Update API Endpoints**
202
+
203
+ Replace your Render.com URLs:
204
+
205
+ ```python
206
+ # Old Render.com URL
207
+ OLD_URL = "https://your-service.onrender.com"
208
+
209
+ # New HF Spaces URL
210
+ NEW_URL = "https://username-spacename.hf.space"
211
+
212
+ # API endpoints remain the same:
213
+ # POST /api/transcribe
214
+ # GET /api/transcribe/{id}
215
+ # GET /api/health
216
+ ```
217
+
218
+ ### **Update Client Code**
219
+
220
+ ```python
221
+ # Use the new HF API client
222
+ from hf_api_client import HFTranscriptionClient
223
+
224
+ client = HFTranscriptionClient("https://username-spacename.hf.space")
225
+ result = client.transcribe_and_wait("video.mp4")
226
+ ```
227
+
228
+ ## πŸŽ‰ **Step 10: Go Live**
229
+
230
+ ### **Share Your Space**
231
+
232
+ 1. **Make Public** (if desired)
233
+ - Space Settings β†’ Visibility β†’ Public
234
+
235
+ 2. **Add to Profile**
236
+ - Pin to your HF profile
237
+ - Add description and tags
238
+
239
+ 3. **Share URL**
240
+ - Web interface: `https://username-spacename.hf.space`
241
+ - API base: `https://username-spacename.hf.space/api`
242
+
243
+ ### **Monitor Usage**
244
+
245
+ - Check Space analytics
246
+ - Monitor resource usage
247
+ - Review user feedback
248
+ - Update documentation as needed
249
+
250
+ ## πŸ”§ **Troubleshooting**
251
+
252
+ ### **Common Issues**
253
+
254
+ 1. **Build Fails**
255
+ ```
256
+ Solution: Check requirements.txt, ensure all dependencies are compatible
257
+ ```
258
+
259
+ 2. **Model Loading Timeout**
260
+ ```
261
+ Solution: Upgrade to CPU upgrade hardware or use WHISPER_MODEL=tiny
262
+ ```
263
+
264
+ 3. **API Not Accessible**
265
+ ```
266
+ Solution: Ensure Space is Public and FastAPI is running on port 7860
267
+ ```
268
+
269
+ 4. **Memory Issues**
270
+ ```
271
+ Solution: Upgrade hardware or reduce MAX_FILE_SIZE in config
272
+ ```
273
+
274
+ ## πŸ“ž **Support Resources**
275
+
276
+ - **HF Spaces Documentation**: https://huggingface.co/docs/hub/spaces
277
+ - **Gradio Documentation**: https://gradio.app/docs/
278
+ - **Community Forum**: https://discuss.huggingface.co/
279
+ - **Your Space Logs**: Available in Space dashboard
280
+
281
+ ## 🎯 **Next Steps**
282
+
283
+ After successful migration:
284
+
285
+ 1. **Decommission Render.com** service
286
+ 2. **Update documentation** with new URLs
287
+ 3. **Notify users** of the migration
288
+ 4. **Monitor performance** and optimize as needed
289
+ 5. **Consider GPU upgrade** for heavy workloads
290
+
291
+ ---
292
+
293
+ **πŸŽ‰ Congratulations! Your Video Transcription Service is now running on Hugging Face Spaces with enhanced capabilities and better performance!**
294
+
295
+ **Key Benefits Achieved:**
296
+ - βœ… Higher resource limits
297
+ - βœ… Beautiful Gradio web interface
298
+ - βœ… Full API compatibility maintained
299
+ - βœ… Better community integration
300
+ - βœ… More reliable performance
301
+ - βœ… Future GPU upgrade path
LOGGING_GUIDE.md ADDED
@@ -0,0 +1,277 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Comprehensive Logging Guide
2
+
3
+ The Video Transcription Service now includes detailed step-by-step logging to help you monitor and debug transcription progress.
4
+
5
+ ## 🎯 **What You Can Track**
6
+
7
+ ### Complete Transcription Journey
8
+ - βœ… File upload and validation
9
+ - βœ… Video processing steps
10
+ - βœ… Whisper model loading
11
+ - βœ… Audio extraction progress
12
+ - βœ… Transcription inference
13
+ - βœ… Results and cleanup
14
+ - βœ… Error handling and debugging
15
+
16
+ ### Real-time Progress Monitoring
17
+ - πŸ“Š Processing times for each step
18
+ - πŸ“ File sizes and durations
19
+ - 🌐 Language detection
20
+ - πŸ“ Text length and previews
21
+ - ⚠️ Warnings and errors
22
+
23
+ ## πŸš€ **Quick Start**
24
+
25
+ ### Basic Logging (Default)
26
+ ```bash
27
+ python main.py
28
+ ```
29
+
30
+ ### Debug Mode (Detailed Logs)
31
+ ```bash
32
+ DEBUG=true python main.py
33
+ ```
34
+
35
+ ### Log to File
36
+ ```bash
37
+ LOG_TO_FILE=true python main.py
38
+ ```
39
+
40
+ ### Combined (Debug + File)
41
+ ```bash
42
+ DEBUG=true LOG_TO_FILE=true python main.py
43
+ ```
44
+
45
+ ## πŸ“Š **Real-time Monitoring**
46
+
47
+ ### Monitor Service Health
48
+ ```bash
49
+ python log_monitor.py test
50
+ ```
51
+
52
+ ### Upload and Monitor Video
53
+ ```bash
54
+ python log_monitor.py upload video.mp4
55
+ ```
56
+
57
+ ### Monitor Existing Transcription
58
+ ```bash
59
+ python log_monitor.py monitor 123
60
+ ```
61
+
62
+ ## πŸ“‹ **Sample Log Output**
63
+
64
+ ### Service Startup
65
+ ```
66
+ 2024-01-15 10:30:00 - main - INFO - πŸš€ Starting Video Transcription Service
67
+ 2024-01-15 10:30:00 - main - INFO - ==================================================
68
+ 2024-01-15 10:30:00 - main - INFO - πŸ“‹ Service Configuration:
69
+ 2024-01-15 10:30:00 - main - INFO - πŸ€– Whisper Model: base
70
+ 2024-01-15 10:30:00 - main - INFO - πŸ“ Max File Size: 100MB
71
+ 2024-01-15 10:30:00 - main - INFO - πŸ•’ Cleanup Interval: 3.5 hours
72
+ 2024-01-15 10:30:00 - main - INFO - 🚦 Rate Limit: 10 requests/minute
73
+ 2024-01-15 10:30:00 - main - INFO - 🌐 Host: 0.0.0.0:8000
74
+ 2024-01-15 10:30:00 - main - INFO - πŸ“ Supported Formats: .mp4, .avi, .mov, .mkv, .wmv, .flv, .webm, .m4v
75
+ 2024-01-15 10:30:00 - main - INFO - ==================================================
76
+ ```
77
+
78
+ ### File Upload Process
79
+ ```
80
+ 2024-01-15 10:30:15 - main - INFO - πŸš€ Starting transcription request for file: video.mp4
81
+ 2024-01-15 10:30:15 - main - INFO - 🌐 Language specified: auto-detect
82
+ 2024-01-15 10:30:15 - main - INFO - πŸ“ Validating file: video.mp4
83
+ 2024-01-15 10:30:15 - main - INFO - πŸ” File extension: .mp4
84
+ 2024-01-15 10:30:15 - main - INFO - βœ… File format validation passed: .mp4
85
+ 2024-01-15 10:30:15 - main - INFO - πŸ“Š Reading file content for size validation...
86
+ 2024-01-15 10:30:15 - main - INFO - πŸ“ File size: 25.34MB (max: 100MB)
87
+ 2024-01-15 10:30:15 - main - INFO - βœ… File size validation passed: 25.34MB
88
+ ```
89
+
90
+ ### Storage Operations
91
+ ```
92
+ 2024-01-15 10:30:15 - storage - INFO - πŸ“ Creating new transcription entry with ID: 1
93
+ 2024-01-15 10:30:15 - storage - INFO - 🌐 Language: auto-detect
94
+ 2024-01-15 10:30:15 - storage - INFO - βœ… Transcription 1 created successfully
95
+ 2024-01-15 10:30:15 - storage - INFO - πŸ“Š Total active transcriptions: 1
96
+ ```
97
+
98
+ ### Video Processing
99
+ ```
100
+ 2024-01-15 10:30:15 - transcription_service - INFO - 🎬 Starting video transcription for ID: 1
101
+ 2024-01-15 10:30:15 - transcription_service - INFO - πŸ“Š Video size: 25.34MB
102
+ 2024-01-15 10:30:15 - transcription_service - INFO - 🌐 Language: auto-detect
103
+ 2024-01-15 10:30:15 - transcription_service - INFO - πŸ“ Updating status to PROCESSING for ID: 1
104
+ ```
105
+
106
+ ### Model Loading (First Time)
107
+ ```
108
+ 2024-01-15 10:30:15 - transcription_service - INFO - πŸ€– Loading Whisper model: base
109
+ 2024-01-15 10:30:15 - transcription_service - INFO - πŸ“₯ This may take 30-60 seconds for first-time download...
110
+ 2024-01-15 10:30:45 - transcription_service - INFO - βœ… Whisper model loaded successfully in 30.2 seconds
111
+ ```
112
+
113
+ ### Audio Extraction
114
+ ```
115
+ 2024-01-15 10:30:45 - transcription_service - INFO - 🎡 Extracting audio from video for transcription 1
116
+ 2024-01-15 10:30:45 - transcription_service - INFO - πŸ“ Creating temporary video file...
117
+ 2024-01-15 10:30:45 - transcription_service - INFO - πŸ“ Temporary files created - Video: /tmp/xyz.tmp, Audio: /tmp/abc.wav
118
+ 2024-01-15 10:30:45 - transcription_service - INFO - 🎡 Running FFmpeg to extract audio...
119
+ 2024-01-15 10:30:45 - transcription_service - INFO - πŸ”§ Configuring FFmpeg for audio extraction...
120
+ 2024-01-15 10:30:45 - transcription_service - INFO - - Codec: PCM 16-bit
121
+ 2024-01-15 10:30:45 - transcription_service - INFO - - Channels: 1 (mono)
122
+ 2024-01-15 10:30:45 - transcription_service - INFO - - Sample rate: 16kHz
123
+ 2024-01-15 10:30:48 - transcription_service - INFO - βœ… FFmpeg audio extraction completed
124
+ 2024-01-15 10:30:48 - transcription_service - INFO - βœ… Audio extraction successful - Size: 8.45MB
125
+ 2024-01-15 10:30:48 - transcription_service - INFO - βœ… Audio extraction completed in 3.1 seconds
126
+ ```
127
+
128
+ ### Transcription Process
129
+ ```
130
+ 2024-01-15 10:30:48 - transcription_service - INFO - πŸ—£οΈ Starting audio transcription for ID 1
131
+ 2024-01-15 10:30:48 - transcription_service - INFO - πŸ—£οΈ Starting Whisper transcription...
132
+ 2024-01-15 10:30:48 - transcription_service - INFO - 🎡 Audio file: /tmp/abc.wav
133
+ 2024-01-15 10:30:48 - transcription_service - INFO - 🌐 Language: auto-detect
134
+ 2024-01-15 10:30:48 - transcription_service - INFO - ⚑ Running transcription in background thread...
135
+ 2024-01-15 10:30:48 - transcription_service - INFO - πŸ€– Preparing Whisper transcription options...
136
+ 2024-01-15 10:30:48 - transcription_service - INFO - 🌐 Language: auto-detect
137
+ 2024-01-15 10:30:48 - transcription_service - INFO - 🎯 Starting Whisper model inference...
138
+ 2024-01-15 10:31:15 - transcription_service - INFO - βœ… Whisper inference completed in 27.3 seconds
139
+ 2024-01-15 10:31:15 - transcription_service - INFO - πŸ“ Text length: 1247 characters
140
+ 2024-01-15 10:31:15 - transcription_service - INFO - 🌐 Detected language: en
141
+ 2024-01-15 10:31:15 - transcription_service - INFO - ⏱️ Audio duration: 180.50 seconds
142
+ 2024-01-15 10:31:15 - transcription_service - INFO - πŸ“„ Text preview: Hello, welcome to this video tutorial where we'll be discussing...
143
+ ```
144
+
145
+ ### Completion
146
+ ```
147
+ 2024-01-15 10:31:15 - transcription_service - INFO - βœ… Transcription completed in 27.3 seconds
148
+ 2024-01-15 10:31:15 - transcription_service - INFO - πŸ’Ύ Saving transcription results for ID 1
149
+ 2024-01-15 10:31:15 - storage - INFO - πŸ“ Updated transcription 1
150
+ 2024-01-15 10:31:15 - storage - INFO - πŸ”„ Status changed: processing β†’ completed
151
+ 2024-01-15 10:31:15 - storage - INFO - πŸ“„ Text updated: Hello, welcome to this video tutorial where we'll...
152
+ 2024-01-15 10:31:15 - transcription_service - INFO - 🧹 Cleaning up temporary audio file
153
+ 2024-01-15 10:31:15 - transcription_service - INFO - πŸŽ‰ Transcription 1 completed successfully in 60.2 seconds total
154
+ ```
155
+
156
+ ## πŸ”§ **Log Levels**
157
+
158
+ ### INFO (Default)
159
+ - Service startup/shutdown
160
+ - Request processing
161
+ - Status updates
162
+ - Completion messages
163
+
164
+ ### DEBUG (Detailed)
165
+ - File validation details
166
+ - Temporary file paths
167
+ - FFmpeg configuration
168
+ - Model loading progress
169
+ - Memory usage info
170
+
171
+ ### WARNING
172
+ - Large file warnings
173
+ - Performance issues
174
+ - Non-critical errors
175
+
176
+ ### ERROR
177
+ - Processing failures
178
+ - File format issues
179
+ - System errors
180
+ - Transcription failures
181
+
182
+ ## πŸ“ **Log Files**
183
+
184
+ When `LOG_TO_FILE=true`, logs are saved to:
185
+ ```
186
+ transcription_service_YYYYMMDD_HHMMSS.log
187
+ ```
188
+
189
+ Example: `transcription_service_20240115_103000.log`
190
+
191
+ ## πŸ› οΈ **Troubleshooting with Logs**
192
+
193
+ ### Common Issues and Log Patterns
194
+
195
+ **1. NumPy Compatibility Error**
196
+ ```
197
+ ERROR - A module that was compiled using NumPy 1.x cannot be run in NumPy 2.2.6
198
+ ```
199
+ **Solution:** Run `python fix_numpy.py`
200
+
201
+ **2. FFmpeg Not Found**
202
+ ```
203
+ ERROR - FFmpeg audio extraction failed: [Errno 2] No such file or directory: 'ffmpeg'
204
+ ```
205
+ **Solution:** Install FFmpeg for your OS
206
+
207
+ **3. File Too Large**
208
+ ```
209
+ ERROR - File too large: 150.5MB > 100MB
210
+ ```
211
+ **Solution:** Compress video or increase limit in config.py
212
+
213
+ **4. Model Loading Issues**
214
+ ```
215
+ ERROR - Failed to load Whisper model: [Errno 28] No space left on device
216
+ ```
217
+ **Solution:** Free up disk space or use smaller model
218
+
219
+ **5. Memory Issues**
220
+ ```
221
+ ERROR - Process killed (signal 9)
222
+ ```
223
+ **Solution:** Use smaller files or increase available memory
224
+
225
+ ## 🎯 **Performance Monitoring**
226
+
227
+ ### Key Metrics to Watch
228
+ - **Model Loading Time**: Should be 15-60 seconds (first time only)
229
+ - **Audio Extraction**: Usually 1-5 seconds per minute of video
230
+ - **Transcription Speed**: Varies by model and content (typically 0.1-0.5x real-time)
231
+ - **Memory Usage**: Monitor for large files
232
+ - **Active Transcriptions**: Track concurrent processing
233
+
234
+ ### Optimization Tips
235
+ - Use `tiny` model for faster processing
236
+ - Compress videos before upload
237
+ - Monitor memory usage with large files
238
+ - Use DEBUG mode to identify bottlenecks
239
+
240
+ ## πŸ“Š **Integration Examples**
241
+
242
+ ### Parse Logs Programmatically
243
+ ```python
244
+ import re
245
+ from datetime import datetime
246
+
247
+ def parse_transcription_logs(log_file):
248
+ with open(log_file, 'r') as f:
249
+ for line in f:
250
+ if 'Transcription' in line and 'completed successfully' in line:
251
+ # Extract transcription ID and time
252
+ match = re.search(r'Transcription (\d+) completed.*in ([\d.]+) seconds', line)
253
+ if match:
254
+ tid, duration = match.groups()
255
+ print(f"ID {tid}: {duration}s")
256
+ ```
257
+
258
+ ### Monitor API Programmatically
259
+ ```python
260
+ import requests
261
+ import time
262
+
263
+ def monitor_service():
264
+ while True:
265
+ try:
266
+ response = requests.get('http://localhost:8000/health')
267
+ health = response.json()
268
+ print(f"Active: {health.get('active_transcriptions', 0)}")
269
+ time.sleep(30)
270
+ except Exception as e:
271
+ print(f"Service down: {e}")
272
+ time.sleep(60)
273
+ ```
274
+
275
+ ---
276
+
277
+ **With comprehensive logging, you now have complete visibility into your transcription service! πŸŽ‰**
QUICKSTART.md ADDED
@@ -0,0 +1,168 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Quick Start Guide
2
+
3
+ Get your Video Transcription Service running in 5 minutes!
4
+
5
+ ## πŸš€ Option 1: Automated Setup (Recommended)
6
+
7
+ ```bash
8
+ # 1. Run the setup script
9
+ python setup.py
10
+
11
+ # 2. Activate virtual environment
12
+ # Windows:
13
+ venv\Scripts\activate
14
+ # macOS/Linux:
15
+ source venv/bin/activate
16
+
17
+ # 3. Start the service (robust startup prevents restarts)
18
+ python start_robust.py
19
+ ```
20
+
21
+ ## πŸ› οΈ Option 2: Manual Setup
22
+
23
+ ```bash
24
+ # 1. Create virtual environment
25
+ python -m venv venv
26
+
27
+ # 2. Activate virtual environment
28
+ # Windows:
29
+ venv\Scripts\activate
30
+ # macOS/Linux:
31
+ source venv/bin/activate
32
+
33
+ # 3. Install dependencies
34
+ pip install -r requirements.txt
35
+
36
+ # 4. Install FFmpeg
37
+ # Windows: Download from https://ffmpeg.org/download.html
38
+ # macOS: brew install ffmpeg
39
+ # Linux: sudo apt-get install ffmpeg
40
+
41
+ # 5. Start the service
42
+ python start_robust.py # Prevents restarts
43
+ # OR
44
+ python main.py # Standard startup
45
+ ```
46
+
47
+ ## πŸ§ͺ Test Your Service
48
+
49
+ ### Option A: Web Interface
50
+ 1. Open http://localhost:8000/docs
51
+ 2. Click "Try it out" on POST /transcribe
52
+ 3. Upload a video file
53
+ 4. Copy the returned ID
54
+ 5. Use GET /transcribe/{id} to check status
55
+
56
+ ### Option B: Command Line
57
+ ```bash
58
+ # Test with example client
59
+ python example_client.py your_video.mp4
60
+
61
+ # Or test the API directly
62
+ python test_api.py your_video.mp4
63
+
64
+ # Monitor transcription progress in real-time
65
+ python log_monitor.py upload your_video.mp4
66
+ ```
67
+
68
+ ### Option C: cURL
69
+ ```bash
70
+ # Upload video
71
+ curl -X POST "http://localhost:8000/transcribe" \
72
+ -F "file=@your_video.mp4" \
73
+ -F "language=en"
74
+
75
+ # Check status (replace 1 with your ID)
76
+ curl "http://localhost:8000/transcribe/1"
77
+ ```
78
+
79
+ ## 🌐 Deploy to Render.com
80
+
81
+ ```bash
82
+ # 1. Push to GitHub
83
+ git init
84
+ git add .
85
+ git commit -m "Initial commit"
86
+ git remote add origin https://github.com/yourusername/your-repo.git
87
+ git push -u origin main
88
+
89
+ # 2. Go to render.com
90
+ # 3. Create new Web Service
91
+ # 4. Connect your GitHub repo
92
+ # 5. Deploy!
93
+ ```
94
+
95
+ ## πŸ“‹ What You Get
96
+
97
+ - **Free transcription** using OpenAI Whisper
98
+ - **No API limits** - completely free
99
+ - **Multiple formats** - MP4, AVI, MOV, etc.
100
+ - **Auto language detection** or specify language
101
+ - **REST API** with automatic documentation
102
+ - **Rate limiting** and error handling
103
+ - **Ready for production** deployment
104
+
105
+ ## πŸ”§ Configuration
106
+
107
+ Edit `config.py` to customize:
108
+ - File size limits
109
+ - Supported formats
110
+ - Whisper model size
111
+ - Rate limiting
112
+ - Cleanup intervals
113
+
114
+ ## πŸ“Š Monitoring & Logging
115
+
116
+ **Enable detailed logging:**
117
+ ```bash
118
+ DEBUG=true python main.py
119
+ ```
120
+
121
+ **Monitor transcription progress:**
122
+ ```bash
123
+ # Test service
124
+ python log_monitor.py test
125
+
126
+ # Upload and monitor
127
+ python log_monitor.py upload video.mp4
128
+
129
+ # Monitor existing transcription
130
+ python log_monitor.py monitor 123
131
+ ```
132
+
133
+ **Log to file:**
134
+ ```bash
135
+ LOG_TO_FILE=true python main.py
136
+ ```
137
+
138
+ ## πŸ“– Need Help?
139
+
140
+ - **Full documentation**: See README.md
141
+ - **Deployment guide**: See DEPLOYMENT.md
142
+ - **API docs**: http://localhost:8000/docs (when running)
143
+ - **Health check**: http://localhost:8000/health
144
+
145
+ ## 🎯 Common Issues
146
+
147
+ **"Service keeps restarting"**
148
+ - Run: `python start_robust.py` for automatic optimization
149
+ - See: [RESTART_TROUBLESHOOTING.md](RESTART_TROUBLESHOOTING.md)
150
+
151
+ **"NumPy compatibility error"**
152
+ - Run: `python fix_numpy.py` to fix automatically
153
+
154
+ **"FFmpeg not found"**
155
+ - Install FFmpeg for your OS (see setup instructions)
156
+
157
+ **"File too large"**
158
+ - Default limit is 100MB (configurable in config.py)
159
+
160
+ **"Service sleeping on Render"**
161
+ - Free tier sleeps after 15min inactivity (normal behavior)
162
+
163
+ **"Slow first request"**
164
+ - AI model loads on first use (30-60 seconds)
165
+
166
+ ---
167
+
168
+ **Ready to transcribe? Your service is now running at http://localhost:8000! πŸŽ‰**
README.md CHANGED
@@ -1,12 +1,367 @@
1
- ---
2
- title: TubeMate
3
- emoji: πŸ’»
4
- colorFrom: green
5
- colorTo: indigo
6
- sdk: gradio
7
- sdk_version: 5.34.0
8
- app_file: app.py
9
- pinned: false
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
10
  ---
11
 
12
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
1
+ # Video Transcription Service
2
+
3
+ A free, production-ready video transcription service built with FastAPI and OpenAI Whisper. Designed for deployment on Render.com's free tier with no transcription limits.
4
+
5
+ ## Features
6
+
7
+ - πŸŽ₯ **Multiple Video Formats**: Supports MP4, AVI, MOV, MKV, WMV, FLV, WebM, M4V
8
+ - πŸ—£οΈ **Free Speech-to-Text**: Uses OpenAI Whisper (completely free, no API limits)
9
+ - 🌐 **REST API**: Simple endpoints for uploading and retrieving transcriptions
10
+ - ⚑ **Async Processing**: Non-blocking transcription for better performance
11
+ - πŸ›‘οΈ **Rate Limiting**: Built-in protection against abuse
12
+ - 🧹 **Auto Cleanup**: Automatic removal of old transcriptions (3.5 hours)
13
+ - πŸ“ **Auto Documentation**: Interactive API docs at `/docs`
14
+ - πŸš€ **Render Ready**: Optimized for Render.com free tier deployment
15
+
16
+ ## Quick Start
17
+
18
+ ### Local Development
19
+
20
+ 1. **Clone and Setup**
21
+ ```bash
22
+ git clone <your-repo-url>
23
+ cd transcriber
24
+ python -m venv venv
25
+ source venv/bin/activate # On Windows: venv\Scripts\activate
26
+ pip install -r requirements.txt
27
+ ```
28
+
29
+ 2. **Install FFmpeg**
30
+ - **Windows**: Download from https://ffmpeg.org/download.html
31
+ - **macOS**: `brew install ffmpeg`
32
+ - **Linux**: `sudo apt-get install ffmpeg`
33
+
34
+ 3. **Run the Service**
35
+ ```bash
36
+ # Robust startup (recommended - prevents restarts)
37
+ python start_robust.py
38
+
39
+ # Or standard startup
40
+ python main.py
41
+ ```
42
+
43
+ 4. **Access the API**
44
+ - Service: http://localhost:8000
45
+ - Documentation: http://localhost:8000/docs
46
+ - Health Check: http://localhost:8000/health
47
+
48
+ ### Logging and Monitoring
49
+
50
+ The service provides comprehensive step-by-step logging to track transcription progress:
51
+
52
+ **Enable Debug Logging:**
53
+ ```bash
54
+ DEBUG=true python main.py
55
+ ```
56
+
57
+ **Enable File Logging:**
58
+ ```bash
59
+ LOG_TO_FILE=true python main.py
60
+ ```
61
+
62
+ **Sample Log Output:**
63
+ ```
64
+ 2024-01-15 10:30:00 - main - INFO - πŸš€ Starting transcription request for file: video.mp4
65
+ 2024-01-15 10:30:00 - main - INFO - 🌐 Language specified: auto-detect
66
+ 2024-01-15 10:30:00 - main - INFO - πŸ“ Validating file: video.mp4
67
+ 2024-01-15 10:30:00 - main - INFO - πŸ” File extension: .mp4
68
+ 2024-01-15 10:30:00 - main - INFO - βœ… File format validation passed: .mp4
69
+ 2024-01-15 10:30:00 - main - INFO - πŸ“Š Reading file content for size validation...
70
+ 2024-01-15 10:30:00 - main - INFO - πŸ“ File size: 25.3MB (max: 100MB)
71
+ 2024-01-15 10:30:00 - main - INFO - βœ… File size validation passed: 25.3MB
72
+ 2024-01-15 10:30:00 - storage - INFO - πŸ“ Creating new transcription entry with ID: 1
73
+ 2024-01-15 10:30:00 - transcription_service - INFO - 🎬 Starting video transcription for ID: 1
74
+ 2024-01-15 10:30:00 - transcription_service - INFO - πŸ€– Loading Whisper model: base
75
+ 2024-01-15 10:30:15 - transcription_service - INFO - βœ… Whisper model loaded successfully in 15.2 seconds
76
+ 2024-01-15 10:30:15 - transcription_service - INFO - 🎡 Extracting audio from video for transcription 1
77
+ 2024-01-15 10:30:18 - transcription_service - INFO - βœ… Audio extraction completed in 3.1 seconds
78
+ 2024-01-15 10:30:18 - transcription_service - INFO - πŸ—£οΈ Starting audio transcription for ID 1
79
+ 2024-01-15 10:30:45 - transcription_service - INFO - βœ… Transcription completed in 27.3 seconds
80
+ 2024-01-15 10:30:45 - transcription_service - INFO - πŸ“ Transcribed text length: 1247 characters
81
+ 2024-01-15 10:30:45 - transcription_service - INFO - 🌐 Detected language: en
82
+ 2024-01-15 10:30:45 - transcription_service - INFO - πŸŽ‰ Transcription 1 completed successfully in 45.6 seconds total
83
+ ```
84
+
85
+ ### Deploy to Render.com
86
+
87
+ 1. **Push to GitHub**
88
+ ```bash
89
+ git init
90
+ git add .
91
+ git commit -m "Initial commit"
92
+ git remote add origin <your-github-repo-url>
93
+ git push -u origin main
94
+ ```
95
+
96
+ 2. **Deploy on Render**
97
+ - Go to [Render.com](https://render.com)
98
+ - Click "New +" β†’ "Web Service"
99
+ - Connect your GitHub repository
100
+ - Render will automatically detect the `render.yaml` configuration
101
+ - Click "Deploy"
102
+
103
+ 3. **Configuration**
104
+ - The service will automatically use the free tier
105
+ - No environment variables needed (all configured automatically)
106
+ - Health checks are configured at `/health`
107
+
108
+ ## API Documentation
109
+
110
+ ### Base URL
111
+ - Local: `http://localhost:8000`
112
+ - Render: `https://your-service-name.onrender.com`
113
+
114
+ ### Endpoints
115
+
116
+ #### 1. Upload Video for Transcription
117
+
118
+ **POST** `/transcribe`
119
+
120
+ Upload a video file and get a transcription ID.
121
+
122
+ **Request:**
123
+ - **Content-Type**: `multipart/form-data`
124
+ - **file**: Video file (required) - Max 100MB
125
+ - **language**: Language code (optional) - e.g., 'en', 'es', 'fr'
126
+
127
+ **Response:**
128
+ ```json
129
+ {
130
+ "id": 123,
131
+ "status": "pending",
132
+ "message": "Transcription started. Use the ID to check status.",
133
+ "created_at": "2024-01-15T10:30:00Z"
134
+ }
135
+ ```
136
+
137
+ **Example using curl:**
138
+ ```bash
139
+ curl -X POST "http://localhost:8000/transcribe" \
140
141
+ -F "language=en"
142
+ ```
143
+
144
+ **Example using Python:**
145
+ ```python
146
+ import requests
147
+
148
+ with open('video.mp4', 'rb') as f:
149
+ response = requests.post(
150
+ 'http://localhost:8000/transcribe',
151
+ files={'file': f},
152
+ data={'language': 'en'} # optional
153
+ )
154
+
155
+ result = response.json()
156
+ transcription_id = result['id']
157
+ ```
158
+
159
+ #### 2. Get Transcription Status/Results
160
+
161
+ **GET** `/transcribe/{id}`
162
+
163
+ Check transcription status and retrieve results.
164
+
165
+ **Response:**
166
+ ```json
167
+ {
168
+ "id": 123,
169
+ "status": "completed",
170
+ "text": "Hello, this is the transcribed text from your video...",
171
+ "language": "en",
172
+ "duration": 45.6,
173
+ "created_at": "2024-01-15T10:30:00Z",
174
+ "completed_at": "2024-01-15T10:32:15Z",
175
+ "error_message": null
176
+ }
177
+ ```
178
+
179
+ **Status Values:**
180
+ - `pending`: Transcription queued
181
+ - `processing`: Currently transcribing
182
+ - `completed`: Transcription finished successfully
183
+ - `failed`: Transcription failed (check error_message)
184
+
185
+ **Example:**
186
+ ```bash
187
+ curl "http://localhost:8000/transcribe/123"
188
+ ```
189
+
190
+ #### 3. Health Check
191
+
192
+ **GET** `/health`
193
+
194
+ Check service health and get statistics.
195
+
196
+ **Response:**
197
+ ```json
198
+ {
199
+ "status": "healthy",
200
+ "timestamp": 5,
201
+ "active_transcriptions": 2
202
+ }
203
+ ```
204
+
205
+ ### Error Handling
206
+
207
+ All errors return a consistent format:
208
+ ```json
209
+ {
210
+ "id": 0,
211
+ "error": "error_type",
212
+ "message": "Human readable error message"
213
+ }
214
+ ```
215
+
216
+ **Common Error Codes:**
217
+ - `400`: Bad request (invalid file, unsupported format)
218
+ - `413`: File too large (>100MB)
219
+ - `404`: Transcription not found or expired
220
+ - `429`: Rate limit exceeded (>10 requests/minute)
221
+ - `500`: Internal server error
222
+
223
+ ## Supported Languages
224
+
225
+ Whisper supports 99+ languages including:
226
+ - English (en)
227
+ - Spanish (es)
228
+ - French (fr)
229
+ - German (de)
230
+ - Italian (it)
231
+ - Portuguese (pt)
232
+ - Russian (ru)
233
+ - Japanese (ja)
234
+ - Korean (ko)
235
+ - Chinese (zh)
236
+ - Arabic (ar)
237
+ - Hindi (hi)
238
+
239
+ Leave `language` empty for automatic detection.
240
+
241
+ ## Limitations
242
+
243
+ ### Free Tier Constraints
244
+ - **File Size**: 100MB maximum per video
245
+ - **Rate Limiting**: 10 requests per minute per IP
246
+ - **Storage**: Results stored for 3.5 hours only
247
+ - **Processing**: Sequential processing (one video at a time)
248
+ - **Cold Starts**: First request may take 30-60 seconds
249
+
250
+ ### Technical Limitations
251
+ - **Video Length**: Longer videos take more time to process
252
+ - **Memory**: Large videos may fail on free tier (512MB RAM limit)
253
+ - **CPU**: Processing speed limited by free tier CPU allocation
254
+
255
+ ## Troubleshooting
256
+
257
+ ### Common Issues
258
+
259
+ 1. **Service Restarts/Memory Issues**
260
+ ```
261
+ Process killed (signal 9) or frequent restarts
262
+ ```
263
+ **Solution:**
264
+ ```bash
265
+ # Use robust startup (automatically optimizes settings)
266
+ python start_robust.py
267
+
268
+ # Or manually use tiny model
269
+ WHISPER_MODEL=tiny MODEL_PRELOAD=true python main.py
270
+ ```
271
+ **See:** [RESTART_TROUBLESHOOTING.md](RESTART_TROUBLESHOOTING.md)
272
+
273
+ 2. **NumPy Compatibility Error**
274
+ ```
275
+ A module that was compiled using NumPy 1.x cannot be run in NumPy 2.2.6
276
+ ```
277
+ **Solution:**
278
+ ```bash
279
+ python fix_numpy.py
280
+ ```
281
+ Or manually:
282
+ ```bash
283
+ pip uninstall numpy
284
+ pip install 'numpy<2.0.0'
285
+ pip install --force-reinstall torch torchaudio openai-whisper
286
+ ```
287
+
288
+ 2. **"File too large" Error**
289
+ - Compress your video or use a shorter clip
290
+ - Maximum file size is 100MB
291
+
292
+ 3. **"Unsupported file format" Error**
293
+ - Convert to supported format: MP4, AVI, MOV, MKV, WMV, FLV, WebM, M4V
294
+
295
+ 4. **Slow Processing**
296
+ - First request loads the AI model (30-60 seconds)
297
+ - Subsequent requests are faster
298
+ - Longer videos take more time
299
+
300
+ 5. **"Transcription not found" Error**
301
+ - Transcriptions expire after 3.5 hours
302
+ - Check if the ID is correct
303
+
304
+ 6. **Rate Limit Exceeded**
305
+ - Wait 1 minute before making more requests
306
+ - Maximum 10 requests per minute per IP
307
+
308
+ ### Render.com Specific
309
+
310
+ 1. **Service Sleeping**
311
+ - Free tier services sleep after 15 minutes of inactivity
312
+ - First request after sleep takes 30-60 seconds
313
+
314
+ 2. **Build Failures**
315
+ - Check build logs in Render dashboard
316
+ - Ensure all dependencies are in requirements.txt
317
+
318
+ 3. **Memory Issues**
319
+ - Free tier has 512MB RAM limit
320
+ - Large videos may cause out-of-memory errors
321
+
322
+ ## Development
323
+
324
+ ### Project Structure
325
+ ```
326
+ transcriber/
327
+ β”œβ”€β”€ main.py # FastAPI application
328
+ β”œβ”€β”€ transcription_service.py # Core transcription logic
329
+ β”œβ”€β”€ storage.py # In-memory storage manager
330
+ β”œβ”€β”€ models.py # Pydantic data models
331
+ β”œβ”€β”€ config.py # Configuration settings
332
+ β”œβ”€β”€ requirements.txt # Python dependencies
333
+ β”œβ”€β”€ Dockerfile # Container configuration
334
+ β”œβ”€β”€ render.yaml # Render deployment config
335
+ └── README.md # This file
336
+ ```
337
+
338
+ ### Adding Features
339
+
340
+ 1. **New Video Formats**: Add to `ALLOWED_EXTENSIONS` in `config.py`
341
+ 2. **Different Models**: Change `WHISPER_MODEL` in `config.py`
342
+ 3. **Longer Storage**: Modify `CLEANUP_INTERVAL_HOURS` in `config.py`
343
+ 4. **Rate Limits**: Adjust `RATE_LIMIT_REQUESTS` in `config.py`
344
+
345
+ ### Testing
346
+
347
+ ```bash
348
+ # Install test dependencies
349
+ pip install pytest httpx
350
+
351
+ # Run tests (create test files as needed)
352
+ pytest
353
+ ```
354
+
355
+ ## License
356
+
357
+ MIT License - feel free to use for any purpose.
358
+
359
+ ## Support
360
+
361
+ - πŸ“– **Documentation**: Visit `/docs` endpoint for interactive API docs
362
+ - πŸ› **Issues**: Report bugs via GitHub issues
363
+ - πŸ’‘ **Features**: Suggest improvements via GitHub discussions
364
+
365
  ---
366
 
367
+ **Ready to transcribe? Upload your first video at `/docs` or use the API endpoints above!**
README_HF.md ADDED
@@ -0,0 +1,154 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ title: Video Transcription Service
3
+ emoji: 🎬
4
+ colorFrom: blue
5
+ colorTo: purple
6
+ sdk: gradio
7
+ sdk_version: 4.44.0
8
+ app_file: app.py
9
+ pinned: false
10
+ license: mit
11
+ ---
12
+
13
+ # 🎬 Video Transcription Service
14
+
15
+ A powerful video transcription service using OpenAI Whisper, deployed on Hugging Face Spaces with both web interface and API access.
16
+
17
+ ## ✨ Features
18
+
19
+ - πŸŽ₯ **Multiple Video Formats**: MP4, AVI, MOV, MKV, WMV, FLV, WebM, M4V
20
+ - πŸ—£οΈ **Free Speech-to-Text**: OpenAI Whisper (no API limits)
21
+ - 🌐 **Language Support**: 99+ languages with auto-detection
22
+ - πŸ“± **Dual Interface**: Web UI + REST API
23
+ - ⚑ **Fast Processing**: Optimized for Hugging Face Spaces
24
+ - 🧹 **Auto Cleanup**: Results stored for 3.5 hours
25
+
26
+ ## πŸš€ Quick Start
27
+
28
+ ### Web Interface
29
+ 1. Upload your video file
30
+ 2. Select language (or use auto-detect)
31
+ 3. Click "Start Transcription"
32
+ 4. Use the transcription ID to check status
33
+
34
+ ### API Access
35
+
36
+ **Upload Video:**
37
+ ```bash
38
+ curl -X POST "https://your-space-name.hf.space/api/transcribe" \
39
40
+ -F "language=en"
41
+ ```
42
+
43
+ **Check Status:**
44
+ ```bash
45
+ curl "https://your-space-name.hf.space/api/transcribe/123"
46
+ ```
47
+
48
+ **Python Example:**
49
+ ```python
50
+ import requests
51
+
52
+ # Upload video
53
+ with open('video.mp4', 'rb') as f:
54
+ response = requests.post(
55
+ 'https://your-space-name.hf.space/api/transcribe',
56
+ files={'file': f},
57
+ data={'language': 'en'}
58
+ )
59
+
60
+ result = response.json()
61
+ transcription_id = result['id']
62
+
63
+ # Check status
64
+ import time
65
+ while True:
66
+ status_response = requests.get(
67
+ f'https://your-space-name.hf.space/api/transcribe/{transcription_id}'
68
+ )
69
+ status = status_response.json()
70
+
71
+ if status['status'] == 'completed':
72
+ print("Transcription:", status['text'])
73
+ break
74
+ elif status['status'] == 'failed':
75
+ print("Error:", status['error_message'])
76
+ break
77
+ else:
78
+ print("Status:", status['status'])
79
+ time.sleep(10)
80
+ ```
81
+
82
+ ## πŸ“‹ API Endpoints
83
+
84
+ | Endpoint | Method | Description |
85
+ |----------|--------|-------------|
86
+ | `/api/transcribe` | POST | Upload video for transcription |
87
+ | `/api/transcribe/{id}` | GET | Get transcription status/results |
88
+ | `/api/health` | GET | Service health check |
89
+
90
+ ## 🌐 Supported Languages
91
+
92
+ Auto-detection or specify: English, Spanish, French, German, Italian, Portuguese, Russian, Japanese, Korean, Chinese, Arabic, Hindi, and 87+ more languages.
93
+
94
+ ## πŸ“ Limitations
95
+
96
+ - **File Size**: 100MB maximum per video
97
+ - **Processing**: Sequential (one video at a time)
98
+ - **Storage**: Results expire after 3.5 hours
99
+ - **Rate Limiting**: Built-in protection against abuse
100
+
101
+ ## πŸ”§ Technical Details
102
+
103
+ - **Model**: OpenAI Whisper (base model for accuracy)
104
+ - **Backend**: FastAPI + Gradio
105
+ - **Processing**: Async with real-time status updates
106
+ - **Storage**: In-memory with automatic cleanup
107
+ - **Deployment**: Optimized for Hugging Face Spaces
108
+
109
+ ## πŸ“Š Response Format
110
+
111
+ **Upload Response:**
112
+ ```json
113
+ {
114
+ "id": 123,
115
+ "status": "pending",
116
+ "message": "Transcription started",
117
+ "created_at": "2024-01-15T10:30:00Z"
118
+ }
119
+ ```
120
+
121
+ **Status Response:**
122
+ ```json
123
+ {
124
+ "id": 123,
125
+ "status": "completed",
126
+ "text": "Hello, this is the transcribed text...",
127
+ "language": "en",
128
+ "duration": 45.6,
129
+ "created_at": "2024-01-15T10:30:00Z",
130
+ "completed_at": "2024-01-15T10:32:15Z"
131
+ }
132
+ ```
133
+
134
+ ## πŸ› οΈ Development
135
+
136
+ This service combines:
137
+ - **Gradio**: Beautiful web interface
138
+ - **FastAPI**: Robust API endpoints
139
+ - **OpenAI Whisper**: State-of-the-art transcription
140
+ - **Async Processing**: Non-blocking operations
141
+
142
+ ## πŸ“ž Support
143
+
144
+ - πŸ“– **Documentation**: Available in the API tab
145
+ - πŸ› **Issues**: Report via GitHub
146
+ - πŸ’‘ **Features**: Suggest improvements
147
+
148
+ ## πŸ“„ License
149
+
150
+ MIT License - free for any use.
151
+
152
+ ---
153
+
154
+ **Ready to transcribe? Upload your video or use the API endpoints above! πŸŽ‰**
RESTART_TROUBLESHOOTING.md ADDED
@@ -0,0 +1,295 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Restart Troubleshooting Guide
2
+
3
+ If your Video Transcription Service is getting restarted frequently, this guide will help you identify and fix the issue.
4
+
5
+ ## πŸ” **Common Restart Causes**
6
+
7
+ ### 1. **Memory Exhaustion (Most Common)**
8
+ **Symptoms:**
9
+ - Service restarts during model loading
10
+ - Restarts when processing large videos
11
+ - "Process killed (signal 9)" in logs
12
+
13
+ **Solutions:**
14
+ ```bash
15
+ # Use tiny model (uses less memory)
16
+ WHISPER_MODEL=tiny python main.py
17
+
18
+ # Or use the robust startup script
19
+ python start_robust.py
20
+ ```
21
+
22
+ ### 2. **Request Timeouts**
23
+ **Symptoms:**
24
+ - Restarts during first transcription request
25
+ - Long delays before restart
26
+ - No error messages, just restart
27
+
28
+ **Solutions:**
29
+ ```bash
30
+ # Enable model preloading
31
+ MODEL_PRELOAD=true python main.py
32
+
33
+ # Use robust startup (preloads automatically)
34
+ python start_robust.py
35
+ ```
36
+
37
+ ### 3. **Dependency Issues**
38
+ **Symptoms:**
39
+ - Restarts immediately after startup
40
+ - Import errors in logs
41
+ - NumPy compatibility errors
42
+
43
+ **Solutions:**
44
+ ```bash
45
+ # Fix NumPy compatibility
46
+ python fix_numpy.py
47
+
48
+ # Reinstall dependencies
49
+ pip install -r requirements.txt
50
+ ```
51
+
52
+ ## πŸ› οΈ **Quick Fixes**
53
+
54
+ ### **Option 1: Use Robust Startup (Recommended)**
55
+ ```bash
56
+ python start_robust.py
57
+ ```
58
+ This script automatically:
59
+ - Detects your environment (local/cloud/Render)
60
+ - Sets optimal configuration
61
+ - Preloads the model
62
+ - Uses memory-efficient settings
63
+
64
+ ### **Option 2: Manual Configuration**
65
+ ```bash
66
+ # For free tier / limited memory
67
+ WHISPER_MODEL=tiny MODEL_PRELOAD=true DEBUG=false python main.py
68
+
69
+ # For local development
70
+ WHISPER_MODEL=base MODEL_PRELOAD=true python main.py
71
+ ```
72
+
73
+ ### **Option 3: Environment Variables**
74
+ Create a `.env` file:
75
+ ```env
76
+ WHISPER_MODEL=tiny
77
+ MODEL_PRELOAD=true
78
+ DEBUG=false
79
+ MAX_FILE_SIZE=52428800
80
+ ```
81
+
82
+ ## πŸ“Š **Memory Optimization**
83
+
84
+ ### **Model Size Comparison**
85
+ | Model | Memory Usage | Speed | Accuracy |
86
+ |-------|-------------|-------|----------|
87
+ | tiny | ~39MB | Fast | Good |
88
+ | base | ~74MB | Medium| Better |
89
+ | small | ~244MB | Slow | Best |
90
+
91
+ **For free tier (512MB RAM limit): Use `tiny`**
92
+
93
+ ### **File Size Limits**
94
+ ```bash
95
+ # Conservative (recommended for free tier)
96
+ MAX_FILE_SIZE=50MB
97
+
98
+ # Standard (for paid tiers)
99
+ MAX_FILE_SIZE=100MB
100
+ ```
101
+
102
+ ## πŸ”§ **Render.com Specific Fixes**
103
+
104
+ ### **Update render.yaml**
105
+ ```yaml
106
+ services:
107
+ - type: web
108
+ name: video-transcription-service
109
+ env: docker
110
+ plan: free
111
+ dockerfilePath: ./Dockerfile
112
+ envVars:
113
+ - key: WHISPER_MODEL
114
+ value: tiny
115
+ - key: MODEL_PRELOAD
116
+ value: true
117
+ - key: DEBUG
118
+ value: false
119
+ healthCheckPath: /health
120
+ autoDeploy: true
121
+ ```
122
+
123
+ ### **Dockerfile Optimization**
124
+ The updated Dockerfile now includes:
125
+ - Memory-efficient settings
126
+ - Model preloading
127
+ - Robust startup script
128
+
129
+ ## πŸ“‹ **Diagnostic Commands**
130
+
131
+ ### **Check Service Health**
132
+ ```bash
133
+ curl http://localhost:8000/health
134
+ ```
135
+
136
+ **Healthy Response:**
137
+ ```json
138
+ {
139
+ "status": "healthy",
140
+ "model_status": "loaded",
141
+ "model_name": "tiny",
142
+ "active_transcriptions": 0
143
+ }
144
+ ```
145
+
146
+ ### **Monitor Memory Usage**
147
+ ```bash
148
+ # Local monitoring
149
+ python -c "
150
+ import psutil
151
+ p = psutil.Process()
152
+ print(f'Memory: {p.memory_info().rss / 1024**2:.1f}MB')
153
+ "
154
+ ```
155
+
156
+ ### **Test Model Loading**
157
+ ```bash
158
+ python -c "
159
+ import whisper
160
+ import time
161
+ start = time.time()
162
+ model = whisper.load_model('tiny')
163
+ print(f'Loaded in {time.time()-start:.1f}s')
164
+ "
165
+ ```
166
+
167
+ ## 🚨 **Emergency Fixes**
168
+
169
+ ### **If Service Won't Start**
170
+ 1. **Check dependencies:**
171
+ ```bash
172
+ python -c "import fastapi, whisper, torch; print('OK')"
173
+ ```
174
+
175
+ 2. **Fix NumPy issues:**
176
+ ```bash
177
+ python fix_numpy.py
178
+ ```
179
+
180
+ 3. **Use minimal configuration:**
181
+ ```bash
182
+ WHISPER_MODEL=tiny DEBUG=false python main.py
183
+ ```
184
+
185
+ ### **If Restarts During Requests**
186
+ 1. **Enable model preloading:**
187
+ ```bash
188
+ MODEL_PRELOAD=true python start_robust.py
189
+ ```
190
+
191
+ 2. **Reduce file size limit:**
192
+ ```bash
193
+ # Edit config.py
194
+ MAX_FILE_SIZE = 25 * 1024 * 1024 # 25MB
195
+ ```
196
+
197
+ 3. **Use tiny model:**
198
+ ```bash
199
+ WHISPER_MODEL=tiny python main.py
200
+ ```
201
+
202
+ ## πŸ“ˆ **Performance Monitoring**
203
+
204
+ ### **Log Analysis**
205
+ Look for these patterns in logs:
206
+
207
+ **Memory Issues:**
208
+ ```
209
+ ⚠️ High memory usage: 450.1MB (limit: 512MB)
210
+ ```
211
+
212
+ **Model Loading:**
213
+ ```
214
+ βœ… Whisper model preloaded successfully in 15.2 seconds
215
+ ```
216
+
217
+ **Successful Transcription:**
218
+ ```
219
+ πŸŽ‰ Transcription 1 completed successfully in 45.6 seconds total
220
+ ```
221
+
222
+ ### **Health Check Monitoring**
223
+ ```bash
224
+ # Continuous monitoring
225
+ while true; do
226
+ curl -s http://localhost:8000/health | jq '.model_status'
227
+ sleep 30
228
+ done
229
+ ```
230
+
231
+ ## 🎯 **Best Practices**
232
+
233
+ ### **For Free Tier Hosting**
234
+ 1. Use `WHISPER_MODEL=tiny`
235
+ 2. Enable `MODEL_PRELOAD=true`
236
+ 3. Set `DEBUG=false`
237
+ 4. Limit file sizes to 25-50MB
238
+ 5. Process one video at a time
239
+
240
+ ### **For Local Development**
241
+ 1. Use `WHISPER_MODEL=base` or `small`
242
+ 2. Enable `DEBUG=true` for detailed logs
243
+ 3. Use `LOG_TO_FILE=true` for persistent logs
244
+ 4. Monitor memory usage
245
+
246
+ ### **For Production**
247
+ 1. Use paid hosting with more memory
248
+ 2. Enable model preloading
249
+ 3. Set up proper monitoring
250
+ 4. Use load balancing for multiple instances
251
+
252
+ ## πŸ”„ **Restart Recovery**
253
+
254
+ ### **Automatic Recovery**
255
+ The service includes automatic recovery features:
256
+ - Graceful shutdown handling
257
+ - Model preloading on startup
258
+ - Memory usage monitoring
259
+ - Optimal settings detection
260
+
261
+ ### **Manual Recovery**
262
+ If the service keeps restarting:
263
+
264
+ 1. **Check logs for error patterns**
265
+ 2. **Reduce resource usage**
266
+ 3. **Use robust startup script**
267
+ 4. **Contact hosting support if needed**
268
+
269
+ ## πŸ“ž **Getting Help**
270
+
271
+ ### **Log Collection**
272
+ When reporting issues, include:
273
+ ```bash
274
+ # System info
275
+ python -c "import sys, platform; print(f'Python: {sys.version}'); print(f'Platform: {platform.platform()}')"
276
+
277
+ # Memory info
278
+ python -c "import psutil; m=psutil.virtual_memory(); print(f'Memory: {m.total/1024**3:.1f}GB total, {m.available/1024**3:.1f}GB available')"
279
+
280
+ # Service health
281
+ curl http://localhost:8000/health
282
+ ```
283
+
284
+ ### **Common Solutions Summary**
285
+ | Problem | Solution |
286
+ |---------|----------|
287
+ | Memory exhaustion | Use `WHISPER_MODEL=tiny` |
288
+ | Request timeouts | Enable `MODEL_PRELOAD=true` |
289
+ | NumPy errors | Run `python fix_numpy.py` |
290
+ | Frequent restarts | Use `python start_robust.py` |
291
+ | Large file issues | Reduce `MAX_FILE_SIZE` |
292
+
293
+ ---
294
+
295
+ **With these fixes, your service should run stably without restarts! πŸŽ‰**
app.py ADDED
@@ -0,0 +1,343 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Hugging Face Spaces app.py - Video Transcription Service
4
+ Combines Gradio interface with FastAPI for full functionality
5
+ """
6
+
7
+ import gradio as gr
8
+ import asyncio
9
+ import threading
10
+ import time
11
+ import os
12
+ import logging
13
+ from datetime import datetime
14
+ from typing import Optional, Tuple
15
+ import uvicorn
16
+ from fastapi import FastAPI, File, UploadFile, HTTPException
17
+ from fastapi.responses import JSONResponse
18
+ import tempfile
19
+
20
+ # Import our existing modules
21
+ from config import settings
22
+ from models import TranscriptionStatus, TranscriptionResponse, TranscriptionResult
23
+ from storage import storage
24
+ from transcription_service import transcription_service
25
+ from logging_config import setup_logging, log_step, log_success, log_error
26
+
27
+ # Setup logging for Hugging Face Spaces
28
+ setup_logging(level=logging.INFO, log_to_file=False)
29
+ logger = logging.getLogger(__name__)
30
+
31
+ # Configure for Hugging Face Spaces
32
+ os.environ.setdefault("WHISPER_MODEL", "base") # HF Spaces can handle base model
33
+ os.environ.setdefault("MODEL_PRELOAD", "true")
34
+ os.environ.setdefault("DEBUG", "false")
35
+
36
+ # FastAPI app for API functionality
37
+ api_app = FastAPI(
38
+ title="Video Transcription API",
39
+ description="API endpoints for video transcription",
40
+ version="1.0.0"
41
+ )
42
+
43
+ class TranscriptionManager:
44
+ def __init__(self):
45
+ self.model_loaded = False
46
+ self.model_loading = False
47
+
48
+ async def ensure_model_loaded(self):
49
+ """Ensure Whisper model is loaded"""
50
+ if self.model_loaded:
51
+ return True
52
+
53
+ if self.model_loading:
54
+ while self.model_loading:
55
+ await asyncio.sleep(0.1)
56
+ return self.model_loaded
57
+
58
+ self.model_loading = True
59
+ try:
60
+ logger.info("πŸ€– Loading Whisper model for Hugging Face Spaces...")
61
+ success = await transcription_service.preload_model()
62
+ self.model_loaded = success
63
+ return success
64
+ finally:
65
+ self.model_loading = False
66
+
67
+ # Global transcription manager
68
+ transcription_manager = TranscriptionManager()
69
+
70
+ # FastAPI endpoints (preserve existing API functionality)
71
+ @api_app.post("/transcribe")
72
+ async def api_transcribe(file: UploadFile = File(...), language: str = None):
73
+ """API endpoint for video transcription"""
74
+ try:
75
+ # Ensure model is loaded
76
+ if not await transcription_manager.ensure_model_loaded():
77
+ raise HTTPException(status_code=503, detail="Model not available")
78
+
79
+ # Validate file
80
+ if not file.filename:
81
+ raise HTTPException(status_code=400, detail="No file provided")
82
+
83
+ # Read file content
84
+ content = await file.read()
85
+ if len(content) > settings.MAX_FILE_SIZE:
86
+ raise HTTPException(status_code=413, detail="File too large")
87
+
88
+ # Create transcription
89
+ transcription_id = storage.create_transcription(language=language)
90
+
91
+ # Start transcription in background
92
+ asyncio.create_task(
93
+ transcription_service.transcribe_video(content, transcription_id, language)
94
+ )
95
+
96
+ return TranscriptionResponse(
97
+ id=transcription_id,
98
+ status=TranscriptionStatus.PENDING,
99
+ message="Transcription started",
100
+ created_at=storage.get_transcription(transcription_id).created_at
101
+ )
102
+
103
+ except HTTPException:
104
+ raise
105
+ except Exception as e:
106
+ logger.error(f"API transcription error: {e}")
107
+ raise HTTPException(status_code=500, detail=str(e))
108
+
109
+ @api_app.get("/transcribe/{transcription_id}")
110
+ async def api_get_transcription(transcription_id: int):
111
+ """API endpoint to get transcription status/results"""
112
+ result = storage.get_transcription(transcription_id)
113
+ if not result:
114
+ raise HTTPException(status_code=404, detail="Transcription not found")
115
+ return result
116
+
117
+ @api_app.get("/health")
118
+ async def api_health():
119
+ """API health check"""
120
+ return {
121
+ "status": "healthy",
122
+ "model_loaded": transcription_manager.model_loaded,
123
+ "active_transcriptions": len([
124
+ t for t in storage._storage.values()
125
+ if t.status in [TranscriptionStatus.PENDING, TranscriptionStatus.PROCESSING]
126
+ ]) if hasattr(storage, '_storage') else 0
127
+ }
128
+
129
+ # Gradio interface functions (sync versions for Gradio compatibility)
130
+ def gradio_transcribe(video_file, language):
131
+ """Gradio transcription function"""
132
+ if video_file is None:
133
+ return "❌ Please upload a video file", "", ""
134
+
135
+ try:
136
+ # Check if model is loaded (sync check)
137
+ if not transcription_manager.model_loaded:
138
+ return "❌ Model not loaded yet. Please wait and try again.", "", ""
139
+
140
+ # Read file
141
+ with open(video_file, 'rb') as f:
142
+ content = f.read()
143
+
144
+ if len(content) > settings.MAX_FILE_SIZE:
145
+ return f"❌ File too large. Maximum size: {settings.MAX_FILE_SIZE // (1024*1024)}MB", "", ""
146
+
147
+ # Create transcription
148
+ transcription_id = storage.create_transcription(language=language if language != "auto" else None)
149
+
150
+ # Start transcription in background
151
+ loop = asyncio.new_event_loop()
152
+ asyncio.set_event_loop(loop)
153
+ loop.run_in_executor(
154
+ None,
155
+ lambda: asyncio.run(transcription_service.transcribe_video(
156
+ content, transcription_id, language if language != "auto" else None
157
+ ))
158
+ )
159
+
160
+ return f"βœ… Transcription started with ID: {transcription_id}", str(transcription_id), "⏳ Processing..."
161
+
162
+ except Exception as e:
163
+ logger.error(f"Gradio transcription error: {e}")
164
+ return f"❌ Error: {str(e)}", "", ""
165
+
166
+ def gradio_check_status(transcription_id_str):
167
+ """Check transcription status for Gradio"""
168
+ if not transcription_id_str:
169
+ return "❌ Please provide a transcription ID"
170
+
171
+ try:
172
+ transcription_id = int(transcription_id_str)
173
+ result = storage.get_transcription(transcription_id)
174
+
175
+ if not result:
176
+ return "❌ Transcription not found or expired"
177
+
178
+ if result.status == TranscriptionStatus.COMPLETED:
179
+ return f"βœ… Completed!\n\nLanguage: {result.language}\nDuration: {result.duration}s\n\nText:\n{result.text}"
180
+ elif result.status == TranscriptionStatus.FAILED:
181
+ return f"❌ Failed: {result.error_message}"
182
+ elif result.status == TranscriptionStatus.PROCESSING:
183
+ return "⏳ Still processing... Please wait and check again."
184
+ else:
185
+ return "⏳ Pending... Please wait and check again."
186
+
187
+ except ValueError:
188
+ return "❌ Invalid transcription ID (must be a number)"
189
+ except Exception as e:
190
+ return f"❌ Error: {str(e)}"
191
+
192
+ # Create Gradio interface
193
+ def create_gradio_interface():
194
+ """Create the Gradio interface"""
195
+
196
+ with gr.Blocks(
197
+ title="Video Transcription Service",
198
+ theme=gr.themes.Soft(),
199
+ css="""
200
+ .gradio-container {
201
+ max-width: 1000px !important;
202
+ }
203
+ """
204
+ ) as interface:
205
+
206
+ gr.Markdown("""
207
+ # 🎬 Video Transcription Service
208
+
209
+ Upload your video files and get accurate transcriptions using OpenAI Whisper.
210
+
211
+ **Features:**
212
+ - πŸŽ₯ Multiple video formats (MP4, AVI, MOV, etc.)
213
+ - 🌐 Automatic language detection or manual selection
214
+ - πŸš€ Fast processing with OpenAI Whisper
215
+ - πŸ“± Both web interface and API access
216
+ """)
217
+
218
+ with gr.Tab("πŸ“€ Upload & Transcribe"):
219
+ with gr.Row():
220
+ with gr.Column():
221
+ video_input = gr.File(
222
+ label="Upload Video File",
223
+ file_types=["video"],
224
+ type="filepath"
225
+ )
226
+ language_input = gr.Dropdown(
227
+ choices=["auto", "en", "es", "fr", "de", "it", "pt", "ru", "ja", "ko", "zh", "ar", "hi"],
228
+ value="auto",
229
+ label="Language (auto-detect or specify)"
230
+ )
231
+ transcribe_btn = gr.Button("πŸš€ Start Transcription", variant="primary")
232
+
233
+ with gr.Column():
234
+ status_output = gr.Textbox(label="Status", lines=3)
235
+ transcription_id_output = gr.Textbox(label="Transcription ID", visible=True)
236
+ result_output = gr.Textbox(label="Progress", lines=2)
237
+
238
+ with gr.Tab("πŸ” Check Status"):
239
+ with gr.Row():
240
+ with gr.Column():
241
+ id_input = gr.Textbox(label="Transcription ID", placeholder="Enter transcription ID...")
242
+ check_btn = gr.Button("πŸ“Š Check Status", variant="secondary")
243
+
244
+ with gr.Column():
245
+ status_result = gr.Textbox(label="Result", lines=10)
246
+
247
+ with gr.Tab("πŸ”§ API Documentation"):
248
+ gr.Markdown("""
249
+ ## 🌐 API Endpoints
250
+
251
+ You can also use this service programmatically via API calls:
252
+
253
+ ### Upload Video for Transcription
254
+ ```bash
255
+ curl -X POST "https://your-space-name.hf.space/api/transcribe" \\
256
+ -F "[email protected]" \\
257
+ -F "language=en"
258
+ ```
259
+
260
+ ### Check Transcription Status
261
+ ```bash
262
+ curl "https://your-space-name.hf.space/api/transcribe/123"
263
+ ```
264
+
265
+ ### Health Check
266
+ ```bash
267
+ curl "https://your-space-name.hf.space/api/health"
268
+ ```
269
+
270
+ ### Python Example
271
+ ```python
272
+ import requests
273
+
274
+ # Upload video
275
+ with open('video.mp4', 'rb') as f:
276
+ response = requests.post(
277
+ 'https://your-space-name.hf.space/api/transcribe',
278
+ files={'file': f},
279
+ data={'language': 'en'}
280
+ )
281
+ transcription_id = response.json()['id']
282
+
283
+ # Check status
284
+ result = requests.get(f'https://your-space-name.hf.space/api/transcribe/{transcription_id}')
285
+ print(result.json())
286
+ ```
287
+ """)
288
+
289
+ # Event handlers
290
+ transcribe_btn.click(
291
+ fn=gradio_transcribe,
292
+ inputs=[video_input, language_input],
293
+ outputs=[status_output, transcription_id_output, result_output]
294
+ )
295
+
296
+ check_btn.click(
297
+ fn=gradio_check_status,
298
+ inputs=[id_input],
299
+ outputs=[status_result]
300
+ )
301
+
302
+ return interface
303
+
304
+ # Startup function
305
+ async def startup():
306
+ """Initialize services"""
307
+ logger.info("πŸš€ Starting Video Transcription Service on Hugging Face Spaces")
308
+
309
+ # Start storage cleanup
310
+ await storage.start_cleanup_task()
311
+
312
+ # Preload model
313
+ log_step("Preloading Whisper model")
314
+ success = await transcription_manager.ensure_model_loaded()
315
+ if success:
316
+ log_success("Model preloaded successfully")
317
+ else:
318
+ log_error("Model preload failed")
319
+
320
+ def run_fastapi():
321
+ """Run FastAPI in a separate thread"""
322
+ uvicorn.run(api_app, host="0.0.0.0", port=7860, log_level="info")
323
+
324
+ # Main execution
325
+ if __name__ == "__main__":
326
+ # Run startup
327
+ asyncio.run(startup())
328
+
329
+ # Start FastAPI in background thread for API access
330
+ api_thread = threading.Thread(target=run_fastapi, daemon=True)
331
+ api_thread.start()
332
+
333
+ # Create and launch Gradio interface
334
+ interface = create_gradio_interface()
335
+
336
+ # Launch with API access enabled
337
+ interface.launch(
338
+ server_name="0.0.0.0",
339
+ server_port=7860,
340
+ share=False, # HF Spaces handles sharing
341
+ show_api=True, # Enable API documentation
342
+ show_error=True
343
+ )
config.py ADDED
@@ -0,0 +1,34 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import os
2
+ from typing import List
3
+
4
+ class HuggingFaceSettings:
5
+ # File upload settings (HF Spaces can handle larger files)
6
+ MAX_FILE_SIZE = 200 * 1024 * 1024 # 200MB for HF Spaces
7
+ ALLOWED_EXTENSIONS = ['.mp4', '.avi', '.mov', '.mkv', '.wmv', '.flv', '.webm', '.m4v']
8
+
9
+ # Transcription settings (optimized for HF Spaces)
10
+ WHISPER_MODEL = os.getenv("WHISPER_MODEL", "base") # HF Spaces can handle base model
11
+ CLEANUP_INTERVAL_HOURS = 3.5 # Clean up after 3.5 hours
12
+
13
+ # Performance settings for HF Spaces
14
+ MODEL_PRELOAD = True # Always preload on HF Spaces
15
+ MAX_CONCURRENT_TRANSCRIPTIONS = 2 # HF Spaces can handle more
16
+ REQUEST_TIMEOUT_SECONDS = 600 # 10 minutes max per request
17
+
18
+ # Rate limiting (more generous on HF Spaces)
19
+ RATE_LIMIT_REQUESTS = 20 # requests per minute per IP
20
+
21
+ # Server settings
22
+ HOST = "0.0.0.0"
23
+ PORT = 7860 # Standard HF Spaces port
24
+
25
+ # Logging settings
26
+ DEBUG_MODE = os.getenv("DEBUG", "false").lower() == "true"
27
+ LOG_TO_FILE = False # No file logging on HF Spaces
28
+
29
+ # Hugging Face Spaces specific
30
+ HF_SPACE_ID = os.getenv("SPACE_ID", "your-username/video-transcription")
31
+ HF_SPACE_URL = f"https://{HF_SPACE_ID.replace('/', '-')}.hf.space" if "SPACE_ID" in os.environ else "http://localhost:7860"
32
+
33
+ # Use HF-optimized settings
34
+ settings = HuggingFaceSettings()
deploy_to_hf.py ADDED
@@ -0,0 +1,190 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Deployment script for Hugging Face Spaces
4
+ Prepares files and provides deployment instructions
5
+ """
6
+
7
+ import os
8
+ import shutil
9
+ import logging
10
+
11
+ logging.basicConfig(level=logging.INFO)
12
+ logger = logging.getLogger(__name__)
13
+
14
+ def prepare_hf_deployment():
15
+ """Prepare files for Hugging Face Spaces deployment"""
16
+
17
+ logger.info("πŸš€ Preparing Video Transcription Service for Hugging Face Spaces")
18
+ logger.info("=" * 60)
19
+
20
+ # Create deployment directory
21
+ deploy_dir = "hf_spaces_deploy"
22
+ if os.path.exists(deploy_dir):
23
+ shutil.rmtree(deploy_dir)
24
+ os.makedirs(deploy_dir)
25
+
26
+ # Files to copy/create for HF Spaces
27
+ files_to_copy = [
28
+ "app.py", # Main Gradio app
29
+ "config.py", # Configuration
30
+ "models.py", # Data models
31
+ "storage.py", # Storage management
32
+ "transcription_service.py", # Core transcription logic
33
+ "logging_config.py", # Logging configuration
34
+ "restart_handler.py" # Restart prevention
35
+ ]
36
+
37
+ # Copy core files
38
+ for file in files_to_copy:
39
+ if os.path.exists(file):
40
+ shutil.copy2(file, deploy_dir)
41
+ logger.info(f"βœ… Copied {file}")
42
+ else:
43
+ logger.warning(f"⚠️ File not found: {file}")
44
+
45
+ # Copy and rename HF-specific files
46
+ if os.path.exists("requirements_hf.txt"):
47
+ shutil.copy2("requirements_hf.txt", os.path.join(deploy_dir, "requirements.txt"))
48
+ logger.info("βœ… Copied requirements_hf.txt -> requirements.txt")
49
+
50
+ if os.path.exists("README_HF.md"):
51
+ shutil.copy2("README_HF.md", os.path.join(deploy_dir, "README.md"))
52
+ logger.info("βœ… Copied README_HF.md -> README.md")
53
+
54
+ if os.path.exists("config_hf.py"):
55
+ # Replace config.py with HF-optimized version
56
+ shutil.copy2("config_hf.py", os.path.join(deploy_dir, "config.py"))
57
+ logger.info("βœ… Using HF-optimized config.py")
58
+
59
+ # Create .gitignore for HF Spaces
60
+ gitignore_content = """
61
+ __pycache__/
62
+ *.py[cod]
63
+ *$py.class
64
+ *.so
65
+ .Python
66
+ *.log
67
+ .env
68
+ .venv
69
+ env/
70
+ venv/
71
+ .DS_Store
72
+ *.tmp
73
+ *.temp
74
+ flagged/
75
+ """
76
+
77
+ with open(os.path.join(deploy_dir, ".gitignore"), "w") as f:
78
+ f.write(gitignore_content.strip())
79
+ logger.info("βœ… Created .gitignore")
80
+
81
+ logger.info("\nπŸŽ‰ Deployment files prepared successfully!")
82
+ logger.info(f"πŸ“ Files are ready in: {deploy_dir}/")
83
+
84
+ return deploy_dir
85
+
86
+ def print_deployment_instructions(deploy_dir):
87
+ """Print step-by-step deployment instructions"""
88
+
89
+ instructions = f"""
90
+ πŸš€ HUGGING FACE SPACES DEPLOYMENT INSTRUCTIONS
91
+ {'=' * 50}
92
+
93
+ 1. πŸ“ PREPARE YOUR HUGGING FACE ACCOUNT
94
+ - Go to https://huggingface.co
95
+ - Sign up/login to your account
96
+ - Go to "Spaces" tab
97
+
98
+ 2. πŸ†• CREATE NEW SPACE
99
+ - Click "Create new Space"
100
+ - Choose a name: e.g., "video-transcription"
101
+ - Select "Gradio" as SDK
102
+ - Choose "Public" or "Private"
103
+ - Click "Create Space"
104
+
105
+ 3. πŸ“€ UPLOAD FILES
106
+ Option A - Web Interface:
107
+ - Upload all files from {deploy_dir}/ to your Space
108
+ - Make sure app.py is in the root directory
109
+
110
+ Option B - Git (Recommended):
111
+ ```bash
112
+ cd {deploy_dir}
113
+ git init
114
+ git add .
115
+ git commit -m "Initial commit"
116
+ git remote add origin https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
117
+ git push -u origin main
118
+ ```
119
+
120
+ 4. βš™οΈ CONFIGURE SPACE SETTINGS
121
+ - Go to your Space settings
122
+ - Set "Hardware" to "CPU basic" (free) or "CPU upgrade" (better performance)
123
+ - Enable "Public" if you want API access from external applications
124
+
125
+ 5. πŸš€ DEPLOY
126
+ - Your Space will automatically build and deploy
127
+ - Wait for the build to complete (5-10 minutes)
128
+ - Check logs for any errors
129
+
130
+ 6. βœ… TEST YOUR DEPLOYMENT
131
+ Web Interface:
132
+ - Visit: https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space
133
+ - Upload a test video file
134
+ - Verify transcription works
135
+
136
+ API Access:
137
+ ```bash
138
+ # Test health endpoint
139
+ curl "https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space/api/health"
140
+
141
+ # Test transcription
142
+ curl -X POST "https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space/api/transcribe" \\
143
+ -F "file=@test_video.mp4" \\
144
+ -F "language=en"
145
+ ```
146
+
147
+ 7. πŸ“Š MONITOR PERFORMANCE
148
+ - Check Space logs for any issues
149
+ - Monitor memory usage
150
+ - Test with different video formats
151
+
152
+ 🎯 IMPORTANT NOTES:
153
+ - First model load takes 2-3 minutes (downloads Whisper model)
154
+ - Subsequent requests are much faster
155
+ - API endpoints work exactly like your local FastAPI
156
+ - Both web interface and API are available simultaneously
157
+
158
+ πŸ”§ TROUBLESHOOTING:
159
+ - If build fails: Check requirements.txt and logs
160
+ - If model loading fails: Try WHISPER_MODEL=tiny in Space settings
161
+ - If memory issues: Upgrade to CPU upgrade hardware
162
+
163
+ πŸ“ž NEED HELP?
164
+ - Check Space logs in the "Logs" tab
165
+ - Visit Hugging Face Spaces documentation
166
+ - Test locally first: python app.py
167
+
168
+ πŸŽ‰ Your Video Transcription Service will be live at:
169
+ https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space
170
+ """
171
+
172
+ print(instructions)
173
+
174
+ def main():
175
+ """Main deployment preparation function"""
176
+ try:
177
+ deploy_dir = prepare_hf_deployment()
178
+ print_deployment_instructions(deploy_dir)
179
+
180
+ logger.info("\nβœ… Ready for Hugging Face Spaces deployment!")
181
+ logger.info(f"πŸ“ Next step: Upload files from {deploy_dir}/ to your HF Space")
182
+
183
+ except Exception as e:
184
+ logger.error(f"❌ Deployment preparation failed: {e}")
185
+ return False
186
+
187
+ return True
188
+
189
+ if __name__ == "__main__":
190
+ main()
example_client.py ADDED
@@ -0,0 +1,166 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Example client for the Video Transcription Service
4
+ Usage: python example_client.py <video_file> [language]
5
+ """
6
+
7
+ import requests
8
+ import time
9
+ import sys
10
+ import os
11
+
12
+ class TranscriptionClient:
13
+ def __init__(self, base_url="http://localhost:8000"):
14
+ self.base_url = base_url.rstrip('/')
15
+
16
+ def transcribe_video(self, video_path, language=None, poll_interval=10, max_wait_minutes=10):
17
+ """
18
+ Transcribe a video file and wait for results
19
+
20
+ Args:
21
+ video_path: Path to video file
22
+ language: Optional language code (e.g., 'en', 'es')
23
+ poll_interval: Seconds between status checks
24
+ max_wait_minutes: Maximum minutes to wait for completion
25
+
26
+ Returns:
27
+ dict: Transcription result or None if failed
28
+ """
29
+
30
+ if not os.path.exists(video_path):
31
+ print(f"Error: Video file '{video_path}' not found")
32
+ return None
33
+
34
+ file_size = os.path.getsize(video_path)
35
+ print(f"Uploading video: {video_path} ({file_size / (1024*1024):.1f} MB)")
36
+
37
+ # Upload video
38
+ try:
39
+ with open(video_path, 'rb') as f:
40
+ files = {'file': f}
41
+ data = {}
42
+ if language:
43
+ data['language'] = language
44
+
45
+ print("Uploading...")
46
+ response = requests.post(f"{self.base_url}/transcribe", files=files, data=data)
47
+
48
+ if response.status_code != 200:
49
+ print(f"Upload failed: {response.status_code}")
50
+ print(response.text)
51
+ return None
52
+
53
+ result = response.json()
54
+ transcription_id = result['id']
55
+ print(f"Upload successful! Transcription ID: {transcription_id}")
56
+ print(f"Status: {result['status']}")
57
+
58
+ except Exception as e:
59
+ print(f"Upload error: {e}")
60
+ return None
61
+
62
+ # Poll for results
63
+ print(f"Waiting for transcription (checking every {poll_interval} seconds)...")
64
+ max_attempts = (max_wait_minutes * 60) // poll_interval
65
+
66
+ for attempt in range(max_attempts):
67
+ try:
68
+ response = requests.get(f"{self.base_url}/transcribe/{transcription_id}")
69
+
70
+ if response.status_code != 200:
71
+ print(f"Status check failed: {response.status_code}")
72
+ return None
73
+
74
+ result = response.json()
75
+ status = result['status']
76
+
77
+ if status == 'completed':
78
+ print("βœ… Transcription completed!")
79
+ return result
80
+ elif status == 'failed':
81
+ print(f"❌ Transcription failed: {result.get('error_message', 'Unknown error')}")
82
+ return None
83
+ elif status in ['pending', 'processing']:
84
+ print(f"⏳ Status: {status} (attempt {attempt + 1}/{max_attempts})")
85
+ time.sleep(poll_interval)
86
+ else:
87
+ print(f"❌ Unknown status: {status}")
88
+ return None
89
+
90
+ except Exception as e:
91
+ print(f"Status check error: {e}")
92
+ return None
93
+
94
+ print(f"⏰ Transcription timed out after {max_wait_minutes} minutes")
95
+ return None
96
+
97
+ def get_transcription(self, transcription_id):
98
+ """Get transcription by ID"""
99
+ try:
100
+ response = requests.get(f"{self.base_url}/transcribe/{transcription_id}")
101
+ if response.status_code == 200:
102
+ return response.json()
103
+ else:
104
+ print(f"Error: {response.status_code}")
105
+ print(response.text)
106
+ return None
107
+ except Exception as e:
108
+ print(f"Error: {e}")
109
+ return None
110
+
111
+ def main():
112
+ if len(sys.argv) < 2:
113
+ print("Usage: python example_client.py <video_file> [language] [api_url]")
114
+ print("Examples:")
115
+ print(" python example_client.py video.mp4")
116
+ print(" python example_client.py video.mp4 en")
117
+ print(" python example_client.py video.mp4 es https://your-service.onrender.com")
118
+ sys.exit(1)
119
+
120
+ video_file = sys.argv[1]
121
+ language = sys.argv[2] if len(sys.argv) > 2 and not sys.argv[2].startswith('http') else None
122
+ api_url = sys.argv[3] if len(sys.argv) > 3 else sys.argv[2] if len(sys.argv) > 2 and sys.argv[2].startswith('http') else "http://localhost:8000"
123
+
124
+ print("Video Transcription Client")
125
+ print("=" * 30)
126
+ print(f"API URL: {api_url}")
127
+ print(f"Video: {video_file}")
128
+ print(f"Language: {language or 'auto-detect'}")
129
+ print()
130
+
131
+ client = TranscriptionClient(api_url)
132
+ result = client.transcribe_video(video_file, language)
133
+
134
+ if result:
135
+ print("\n" + "=" * 50)
136
+ print("TRANSCRIPTION RESULT")
137
+ print("=" * 50)
138
+ print(f"ID: {result['id']}")
139
+ print(f"Language: {result.get('language', 'N/A')}")
140
+ print(f"Duration: {result.get('duration', 'N/A')} seconds")
141
+ print(f"Created: {result['created_at']}")
142
+ print(f"Completed: {result.get('completed_at', 'N/A')}")
143
+ print()
144
+ print("TEXT:")
145
+ print("-" * 20)
146
+ print(result['text'])
147
+ print()
148
+
149
+ # Save to file
150
+ output_file = f"{os.path.splitext(video_file)[0]}_transcription.txt"
151
+ with open(output_file, 'w', encoding='utf-8') as f:
152
+ f.write(f"Transcription of: {video_file}\n")
153
+ f.write(f"Language: {result.get('language', 'N/A')}\n")
154
+ f.write(f"Duration: {result.get('duration', 'N/A')} seconds\n")
155
+ f.write(f"Created: {result['created_at']}\n")
156
+ f.write(f"Completed: {result.get('completed_at', 'N/A')}\n")
157
+ f.write("\n" + "=" * 50 + "\n")
158
+ f.write(result['text'])
159
+
160
+ print(f"πŸ’Ύ Transcription saved to: {output_file}")
161
+ else:
162
+ print("❌ Transcription failed")
163
+ sys.exit(1)
164
+
165
+ if __name__ == "__main__":
166
+ main()
fix_numpy.py ADDED
@@ -0,0 +1,130 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Fix NumPy compatibility issue for Video Transcription Service
4
+ """
5
+
6
+ import subprocess
7
+ import sys
8
+ import os
9
+
10
+ def run_command(command, description):
11
+ """Run a command and handle errors"""
12
+ print(f"πŸ”§ {description}...")
13
+ try:
14
+ result = subprocess.run(command, shell=True, check=True, capture_output=True, text=True)
15
+ print(f"βœ… {description} completed")
16
+ if result.stdout.strip():
17
+ print(f" Output: {result.stdout.strip()}")
18
+ return True
19
+ except subprocess.CalledProcessError as e:
20
+ print(f"❌ {description} failed:")
21
+ print(f" Command: {command}")
22
+ print(f" Error: {e.stderr}")
23
+ return False
24
+
25
+ def check_numpy_version():
26
+ """Check current NumPy version"""
27
+ try:
28
+ import numpy as np
29
+ version = np.__version__
30
+ print(f"πŸ“Š Current NumPy version: {version}")
31
+
32
+ # Check if version is 2.x
33
+ major_version = int(version.split('.')[0])
34
+ if major_version >= 2:
35
+ print("⚠️ NumPy 2.x detected - this causes compatibility issues with PyTorch/Whisper")
36
+ return False
37
+ else:
38
+ print("βœ… NumPy version is compatible")
39
+ return True
40
+ except ImportError:
41
+ print("❌ NumPy not installed")
42
+ return False
43
+
44
+ def fix_numpy_compatibility():
45
+ """Fix NumPy compatibility by downgrading to 1.x"""
46
+ commands = [
47
+ ("pip uninstall -y numpy", "Uninstalling current NumPy"),
48
+ ("pip install 'numpy<2.0.0'", "Installing compatible NumPy version"),
49
+ ("pip install --force-reinstall torch==2.1.0 torchaudio==2.1.0", "Reinstalling PyTorch with compatible NumPy"),
50
+ ("pip install --force-reinstall openai-whisper==20231117", "Reinstalling Whisper with compatible NumPy")
51
+ ]
52
+
53
+ for command, description in commands:
54
+ if not run_command(command, description):
55
+ return False
56
+ return True
57
+
58
+ def verify_installation():
59
+ """Verify that everything works after the fix"""
60
+ print("\nπŸ§ͺ Testing installation...")
61
+
62
+ try:
63
+ # Test NumPy
64
+ import numpy as np
65
+ print(f"βœ… NumPy {np.__version__} imported successfully")
66
+
67
+ # Test PyTorch
68
+ import torch
69
+ print(f"βœ… PyTorch {torch.__version__} imported successfully")
70
+
71
+ # Test Whisper
72
+ import whisper
73
+ print("βœ… Whisper imported successfully")
74
+
75
+ # Test basic functionality
76
+ print("πŸ” Testing Whisper model loading...")
77
+ try:
78
+ # This will download the tiny model if not present (much faster than base)
79
+ model = whisper.load_model("tiny")
80
+ print("βœ… Whisper model loaded successfully")
81
+ return True
82
+ except Exception as e:
83
+ print(f"⚠️ Whisper model loading failed: {e}")
84
+ print(" This might be due to network issues - try running the service anyway")
85
+ return True
86
+
87
+ except Exception as e:
88
+ print(f"❌ Installation verification failed: {e}")
89
+ return False
90
+
91
+ def main():
92
+ print("πŸ”§ NumPy Compatibility Fix for Video Transcription Service")
93
+ print("=" * 60)
94
+
95
+ # Check current NumPy version
96
+ if check_numpy_version():
97
+ print("\nβœ… NumPy version is already compatible!")
98
+ print("If you're still getting errors, try restarting your service.")
99
+ return
100
+
101
+ print("\nπŸ”§ Fixing NumPy compatibility...")
102
+
103
+ # Fix NumPy compatibility
104
+ if not fix_numpy_compatibility():
105
+ print("\n❌ Failed to fix NumPy compatibility")
106
+ print("\nπŸ’‘ Manual fix:")
107
+ print("1. pip uninstall numpy")
108
+ print("2. pip install 'numpy<2.0.0'")
109
+ print("3. pip install --force-reinstall torch torchaudio openai-whisper")
110
+ sys.exit(1)
111
+
112
+ # Verify installation
113
+ if not verify_installation():
114
+ print("\n⚠️ Installation verification had issues")
115
+ print("Try running the service - it might still work")
116
+
117
+ print("\nπŸŽ‰ NumPy compatibility fix completed!")
118
+ print("=" * 40)
119
+ print("\nπŸ“‹ Next steps:")
120
+ print("1. Restart your transcription service:")
121
+ print(" python main.py")
122
+ print(" OR")
123
+ print(" python start.py")
124
+ print("2. Test with a video file")
125
+ print("\nπŸ’‘ If you still get errors, try:")
126
+ print("- Restart your terminal/command prompt")
127
+ print("- Deactivate and reactivate your virtual environment")
128
+
129
+ if __name__ == "__main__":
130
+ main()
hf_api_client.py ADDED
@@ -0,0 +1,255 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ API Client for Hugging Face Spaces Video Transcription Service
4
+ Test both web interface and API functionality
5
+ """
6
+
7
+ import requests
8
+ import time
9
+ import sys
10
+ import os
11
+ from datetime import datetime
12
+
13
+ class HFTranscriptionClient:
14
+ def __init__(self, space_url):
15
+ """
16
+ Initialize client for HF Spaces transcription service
17
+
18
+ Args:
19
+ space_url: Your HF Space URL (e.g., "https://username-spacename.hf.space")
20
+ """
21
+ self.base_url = space_url.rstrip('/')
22
+ self.api_base = f"{self.base_url}/api"
23
+
24
+ def health_check(self):
25
+ """Check if the service is healthy"""
26
+ try:
27
+ response = requests.get(f"{self.api_base}/health", timeout=10)
28
+ if response.status_code == 200:
29
+ health = response.json()
30
+ print("βœ… Service is healthy")
31
+ print(f" Model loaded: {health.get('model_loaded', False)}")
32
+ print(f" Active transcriptions: {health.get('active_transcriptions', 0)}")
33
+ return True
34
+ else:
35
+ print(f"❌ Health check failed: {response.status_code}")
36
+ return False
37
+ except requests.exceptions.RequestException as e:
38
+ print(f"❌ Cannot connect to service: {e}")
39
+ return False
40
+
41
+ def transcribe_video(self, video_path, language=None):
42
+ """
43
+ Upload video for transcription
44
+
45
+ Args:
46
+ video_path: Path to video file
47
+ language: Language code (e.g., 'en', 'es') or None for auto-detect
48
+
49
+ Returns:
50
+ dict: Response with transcription ID or error
51
+ """
52
+ if not os.path.exists(video_path):
53
+ return {"error": f"Video file not found: {video_path}"}
54
+
55
+ try:
56
+ print(f"πŸ“€ Uploading video: {video_path}")
57
+
58
+ with open(video_path, 'rb') as f:
59
+ files = {'file': f}
60
+ data = {}
61
+ if language:
62
+ data['language'] = language
63
+
64
+ response = requests.post(
65
+ f"{self.api_base}/transcribe",
66
+ files=files,
67
+ data=data,
68
+ timeout=60
69
+ )
70
+
71
+ if response.status_code == 200:
72
+ result = response.json()
73
+ print(f"βœ… Upload successful! Transcription ID: {result['id']}")
74
+ return result
75
+ else:
76
+ error_msg = f"Upload failed: {response.status_code}"
77
+ if response.text:
78
+ error_msg += f" - {response.text}"
79
+ print(f"❌ {error_msg}")
80
+ return {"error": error_msg}
81
+
82
+ except requests.exceptions.RequestException as e:
83
+ error_msg = f"Upload error: {e}"
84
+ print(f"❌ {error_msg}")
85
+ return {"error": error_msg}
86
+
87
+ def get_transcription_status(self, transcription_id):
88
+ """
89
+ Get transcription status and results
90
+
91
+ Args:
92
+ transcription_id: ID returned from transcribe_video
93
+
94
+ Returns:
95
+ dict: Transcription status and results
96
+ """
97
+ try:
98
+ response = requests.get(
99
+ f"{self.api_base}/transcribe/{transcription_id}",
100
+ timeout=10
101
+ )
102
+
103
+ if response.status_code == 200:
104
+ return response.json()
105
+ elif response.status_code == 404:
106
+ return {"error": "Transcription not found or expired"}
107
+ else:
108
+ return {"error": f"Status check failed: {response.status_code}"}
109
+
110
+ except requests.exceptions.RequestException as e:
111
+ return {"error": f"Status check error: {e}"}
112
+
113
+ def wait_for_completion(self, transcription_id, max_wait_minutes=15, poll_interval=10):
114
+ """
115
+ Wait for transcription to complete
116
+
117
+ Args:
118
+ transcription_id: ID to monitor
119
+ max_wait_minutes: Maximum time to wait
120
+ poll_interval: Seconds between status checks
121
+
122
+ Returns:
123
+ dict: Final transcription result
124
+ """
125
+ print(f"⏳ Waiting for transcription {transcription_id} to complete...")
126
+ print(f" Max wait time: {max_wait_minutes} minutes")
127
+ print(f" Checking every {poll_interval} seconds")
128
+
129
+ start_time = time.time()
130
+ max_wait_seconds = max_wait_minutes * 60
131
+
132
+ while time.time() - start_time < max_wait_seconds:
133
+ result = self.get_transcription_status(transcription_id)
134
+
135
+ if "error" in result:
136
+ print(f"❌ Error checking status: {result['error']}")
137
+ return result
138
+
139
+ status = result.get('status', 'unknown')
140
+ print(f" Status: {status}")
141
+
142
+ if status == 'completed':
143
+ print("πŸŽ‰ Transcription completed!")
144
+ return result
145
+ elif status == 'failed':
146
+ error_msg = result.get('error_message', 'Unknown error')
147
+ print(f"❌ Transcription failed: {error_msg}")
148
+ return result
149
+ elif status in ['pending', 'processing']:
150
+ time.sleep(poll_interval)
151
+ else:
152
+ print(f"❌ Unknown status: {status}")
153
+ return result
154
+
155
+ print(f"⏰ Transcription timed out after {max_wait_minutes} minutes")
156
+ return {"error": "Timeout waiting for completion"}
157
+
158
+ def transcribe_and_wait(self, video_path, language=None, max_wait_minutes=15):
159
+ """
160
+ Upload video and wait for transcription to complete
161
+
162
+ Args:
163
+ video_path: Path to video file
164
+ language: Language code or None for auto-detect
165
+ max_wait_minutes: Maximum time to wait
166
+
167
+ Returns:
168
+ dict: Complete transcription result
169
+ """
170
+ # Upload video
171
+ upload_result = self.transcribe_video(video_path, language)
172
+ if "error" in upload_result:
173
+ return upload_result
174
+
175
+ transcription_id = upload_result['id']
176
+
177
+ # Wait for completion
178
+ return self.wait_for_completion(transcription_id, max_wait_minutes)
179
+
180
+ def main():
181
+ """Main function for testing the HF Spaces API"""
182
+ if len(sys.argv) < 2:
183
+ print("Hugging Face Spaces Video Transcription API Client")
184
+ print("=" * 50)
185
+ print("Usage:")
186
+ print(" python hf_api_client.py <space_url> [video_file] [language]")
187
+ print()
188
+ print("Examples:")
189
+ print(" python hf_api_client.py https://username-spacename.hf.space")
190
+ print(" python hf_api_client.py https://username-spacename.hf.space video.mp4")
191
+ print(" python hf_api_client.py https://username-spacename.hf.space video.mp4 en")
192
+ print()
193
+ print("Commands:")
194
+ print(" health - Check service health")
195
+ print(" test - Run basic functionality test")
196
+ sys.exit(1)
197
+
198
+ space_url = sys.argv[1]
199
+ client = HFTranscriptionClient(space_url)
200
+
201
+ print(f"🌐 Connecting to: {space_url}")
202
+ print("=" * 50)
203
+
204
+ # Health check
205
+ if not client.health_check():
206
+ print("❌ Service is not available. Please check your Space URL and try again.")
207
+ sys.exit(1)
208
+
209
+ # If video file provided, transcribe it
210
+ if len(sys.argv) >= 3:
211
+ video_file = sys.argv[2]
212
+ language = sys.argv[3] if len(sys.argv) > 3 else None
213
+
214
+ print(f"\n🎬 Transcribing video: {video_file}")
215
+ if language:
216
+ print(f"🌐 Language: {language}")
217
+ else:
218
+ print("🌐 Language: auto-detect")
219
+
220
+ result = client.transcribe_and_wait(video_file, language)
221
+
222
+ if "error" in result:
223
+ print(f"❌ Transcription failed: {result['error']}")
224
+ else:
225
+ print("\nπŸŽ‰ Transcription Results:")
226
+ print("=" * 30)
227
+ print(f"ID: {result.get('id', 'N/A')}")
228
+ print(f"Language: {result.get('language', 'N/A')}")
229
+ print(f"Duration: {result.get('duration', 'N/A')} seconds")
230
+ print(f"Status: {result.get('status', 'N/A')}")
231
+ print("\nTranscribed Text:")
232
+ print("-" * 20)
233
+ print(result.get('text', 'No text available'))
234
+
235
+ # Save to file
236
+ if result.get('text'):
237
+ output_file = f"{os.path.splitext(video_file)[0]}_transcription.txt"
238
+ with open(output_file, 'w', encoding='utf-8') as f:
239
+ f.write(f"Transcription of: {video_file}\n")
240
+ f.write(f"Language: {result.get('language', 'N/A')}\n")
241
+ f.write(f"Duration: {result.get('duration', 'N/A')} seconds\n")
242
+ f.write(f"Completed: {datetime.now().isoformat()}\n")
243
+ f.write("\n" + "=" * 50 + "\n")
244
+ f.write(result['text'])
245
+ print(f"\nπŸ’Ύ Transcription saved to: {output_file}")
246
+
247
+ else:
248
+ print("\nβœ… Service is ready!")
249
+ print("🌐 Web interface:", space_url)
250
+ print("πŸ”— API base URL:", client.api_base)
251
+ print("\nπŸ“‹ To transcribe a video:")
252
+ print(f" python {sys.argv[0]} {space_url} your_video.mp4")
253
+
254
+ if __name__ == "__main__":
255
+ main()
hf_spaces_deploy/.gitignore ADDED
@@ -0,0 +1,14 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ __pycache__/
2
+ *.py[cod]
3
+ *$py.class
4
+ *.so
5
+ .Python
6
+ *.log
7
+ .env
8
+ .venv
9
+ env/
10
+ venv/
11
+ .DS_Store
12
+ *.tmp
13
+ *.temp
14
+ flagged/
hf_spaces_deploy/README.md ADDED
@@ -0,0 +1,154 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ title: Video Transcription Service
3
+ emoji: 🎬
4
+ colorFrom: blue
5
+ colorTo: purple
6
+ sdk: gradio
7
+ sdk_version: 4.44.0
8
+ app_file: app.py
9
+ pinned: false
10
+ license: mit
11
+ ---
12
+
13
+ # 🎬 Video Transcription Service
14
+
15
+ A powerful video transcription service using OpenAI Whisper, deployed on Hugging Face Spaces with both web interface and API access.
16
+
17
+ ## ✨ Features
18
+
19
+ - πŸŽ₯ **Multiple Video Formats**: MP4, AVI, MOV, MKV, WMV, FLV, WebM, M4V
20
+ - πŸ—£οΈ **Free Speech-to-Text**: OpenAI Whisper (no API limits)
21
+ - 🌐 **Language Support**: 99+ languages with auto-detection
22
+ - πŸ“± **Dual Interface**: Web UI + REST API
23
+ - ⚑ **Fast Processing**: Optimized for Hugging Face Spaces
24
+ - 🧹 **Auto Cleanup**: Results stored for 3.5 hours
25
+
26
+ ## πŸš€ Quick Start
27
+
28
+ ### Web Interface
29
+ 1. Upload your video file
30
+ 2. Select language (or use auto-detect)
31
+ 3. Click "Start Transcription"
32
+ 4. Use the transcription ID to check status
33
+
34
+ ### API Access
35
+
36
+ **Upload Video:**
37
+ ```bash
38
+ curl -X POST "https://your-space-name.hf.space/api/transcribe" \
39
40
+ -F "language=en"
41
+ ```
42
+
43
+ **Check Status:**
44
+ ```bash
45
+ curl "https://your-space-name.hf.space/api/transcribe/123"
46
+ ```
47
+
48
+ **Python Example:**
49
+ ```python
50
+ import requests
51
+
52
+ # Upload video
53
+ with open('video.mp4', 'rb') as f:
54
+ response = requests.post(
55
+ 'https://your-space-name.hf.space/api/transcribe',
56
+ files={'file': f},
57
+ data={'language': 'en'}
58
+ )
59
+
60
+ result = response.json()
61
+ transcription_id = result['id']
62
+
63
+ # Check status
64
+ import time
65
+ while True:
66
+ status_response = requests.get(
67
+ f'https://your-space-name.hf.space/api/transcribe/{transcription_id}'
68
+ )
69
+ status = status_response.json()
70
+
71
+ if status['status'] == 'completed':
72
+ print("Transcription:", status['text'])
73
+ break
74
+ elif status['status'] == 'failed':
75
+ print("Error:", status['error_message'])
76
+ break
77
+ else:
78
+ print("Status:", status['status'])
79
+ time.sleep(10)
80
+ ```
81
+
82
+ ## πŸ“‹ API Endpoints
83
+
84
+ | Endpoint | Method | Description |
85
+ |----------|--------|-------------|
86
+ | `/api/transcribe` | POST | Upload video for transcription |
87
+ | `/api/transcribe/{id}` | GET | Get transcription status/results |
88
+ | `/api/health` | GET | Service health check |
89
+
90
+ ## 🌐 Supported Languages
91
+
92
+ Auto-detection or specify: English, Spanish, French, German, Italian, Portuguese, Russian, Japanese, Korean, Chinese, Arabic, Hindi, and 87+ more languages.
93
+
94
+ ## πŸ“ Limitations
95
+
96
+ - **File Size**: 100MB maximum per video
97
+ - **Processing**: Sequential (one video at a time)
98
+ - **Storage**: Results expire after 3.5 hours
99
+ - **Rate Limiting**: Built-in protection against abuse
100
+
101
+ ## πŸ”§ Technical Details
102
+
103
+ - **Model**: OpenAI Whisper (base model for accuracy)
104
+ - **Backend**: FastAPI + Gradio
105
+ - **Processing**: Async with real-time status updates
106
+ - **Storage**: In-memory with automatic cleanup
107
+ - **Deployment**: Optimized for Hugging Face Spaces
108
+
109
+ ## πŸ“Š Response Format
110
+
111
+ **Upload Response:**
112
+ ```json
113
+ {
114
+ "id": 123,
115
+ "status": "pending",
116
+ "message": "Transcription started",
117
+ "created_at": "2024-01-15T10:30:00Z"
118
+ }
119
+ ```
120
+
121
+ **Status Response:**
122
+ ```json
123
+ {
124
+ "id": 123,
125
+ "status": "completed",
126
+ "text": "Hello, this is the transcribed text...",
127
+ "language": "en",
128
+ "duration": 45.6,
129
+ "created_at": "2024-01-15T10:30:00Z",
130
+ "completed_at": "2024-01-15T10:32:15Z"
131
+ }
132
+ ```
133
+
134
+ ## πŸ› οΈ Development
135
+
136
+ This service combines:
137
+ - **Gradio**: Beautiful web interface
138
+ - **FastAPI**: Robust API endpoints
139
+ - **OpenAI Whisper**: State-of-the-art transcription
140
+ - **Async Processing**: Non-blocking operations
141
+
142
+ ## πŸ“ž Support
143
+
144
+ - πŸ“– **Documentation**: Available in the API tab
145
+ - πŸ› **Issues**: Report via GitHub
146
+ - πŸ’‘ **Features**: Suggest improvements
147
+
148
+ ## πŸ“„ License
149
+
150
+ MIT License - free for any use.
151
+
152
+ ---
153
+
154
+ **Ready to transcribe? Upload your video or use the API endpoints above! πŸŽ‰**
hf_spaces_deploy/app.py ADDED
@@ -0,0 +1,343 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Hugging Face Spaces app.py - Video Transcription Service
4
+ Combines Gradio interface with FastAPI for full functionality
5
+ """
6
+
7
+ import gradio as gr
8
+ import asyncio
9
+ import threading
10
+ import time
11
+ import os
12
+ import logging
13
+ from datetime import datetime
14
+ from typing import Optional, Tuple
15
+ import uvicorn
16
+ from fastapi import FastAPI, File, UploadFile, HTTPException
17
+ from fastapi.responses import JSONResponse
18
+ import tempfile
19
+
20
+ # Import our existing modules
21
+ from config import settings
22
+ from models import TranscriptionStatus, TranscriptionResponse, TranscriptionResult
23
+ from storage import storage
24
+ from transcription_service import transcription_service
25
+ from logging_config import setup_logging, log_step, log_success, log_error
26
+
27
+ # Setup logging for Hugging Face Spaces
28
+ setup_logging(level=logging.INFO, log_to_file=False)
29
+ logger = logging.getLogger(__name__)
30
+
31
+ # Configure for Hugging Face Spaces
32
+ os.environ.setdefault("WHISPER_MODEL", "base") # HF Spaces can handle base model
33
+ os.environ.setdefault("MODEL_PRELOAD", "true")
34
+ os.environ.setdefault("DEBUG", "false")
35
+
36
+ # FastAPI app for API functionality
37
+ api_app = FastAPI(
38
+ title="Video Transcription API",
39
+ description="API endpoints for video transcription",
40
+ version="1.0.0"
41
+ )
42
+
43
+ class TranscriptionManager:
44
+ def __init__(self):
45
+ self.model_loaded = False
46
+ self.model_loading = False
47
+
48
+ async def ensure_model_loaded(self):
49
+ """Ensure Whisper model is loaded"""
50
+ if self.model_loaded:
51
+ return True
52
+
53
+ if self.model_loading:
54
+ while self.model_loading:
55
+ await asyncio.sleep(0.1)
56
+ return self.model_loaded
57
+
58
+ self.model_loading = True
59
+ try:
60
+ logger.info("πŸ€– Loading Whisper model for Hugging Face Spaces...")
61
+ success = await transcription_service.preload_model()
62
+ self.model_loaded = success
63
+ return success
64
+ finally:
65
+ self.model_loading = False
66
+
67
+ # Global transcription manager
68
+ transcription_manager = TranscriptionManager()
69
+
70
+ # FastAPI endpoints (preserve existing API functionality)
71
+ @api_app.post("/transcribe")
72
+ async def api_transcribe(file: UploadFile = File(...), language: str = None):
73
+ """API endpoint for video transcription"""
74
+ try:
75
+ # Ensure model is loaded
76
+ if not await transcription_manager.ensure_model_loaded():
77
+ raise HTTPException(status_code=503, detail="Model not available")
78
+
79
+ # Validate file
80
+ if not file.filename:
81
+ raise HTTPException(status_code=400, detail="No file provided")
82
+
83
+ # Read file content
84
+ content = await file.read()
85
+ if len(content) > settings.MAX_FILE_SIZE:
86
+ raise HTTPException(status_code=413, detail="File too large")
87
+
88
+ # Create transcription
89
+ transcription_id = storage.create_transcription(language=language)
90
+
91
+ # Start transcription in background
92
+ asyncio.create_task(
93
+ transcription_service.transcribe_video(content, transcription_id, language)
94
+ )
95
+
96
+ return TranscriptionResponse(
97
+ id=transcription_id,
98
+ status=TranscriptionStatus.PENDING,
99
+ message="Transcription started",
100
+ created_at=storage.get_transcription(transcription_id).created_at
101
+ )
102
+
103
+ except HTTPException:
104
+ raise
105
+ except Exception as e:
106
+ logger.error(f"API transcription error: {e}")
107
+ raise HTTPException(status_code=500, detail=str(e))
108
+
109
+ @api_app.get("/transcribe/{transcription_id}")
110
+ async def api_get_transcription(transcription_id: int):
111
+ """API endpoint to get transcription status/results"""
112
+ result = storage.get_transcription(transcription_id)
113
+ if not result:
114
+ raise HTTPException(status_code=404, detail="Transcription not found")
115
+ return result
116
+
117
+ @api_app.get("/health")
118
+ async def api_health():
119
+ """API health check"""
120
+ return {
121
+ "status": "healthy",
122
+ "model_loaded": transcription_manager.model_loaded,
123
+ "active_transcriptions": len([
124
+ t for t in storage._storage.values()
125
+ if t.status in [TranscriptionStatus.PENDING, TranscriptionStatus.PROCESSING]
126
+ ]) if hasattr(storage, '_storage') else 0
127
+ }
128
+
129
+ # Gradio interface functions (sync versions for Gradio compatibility)
130
+ def gradio_transcribe(video_file, language):
131
+ """Gradio transcription function"""
132
+ if video_file is None:
133
+ return "❌ Please upload a video file", "", ""
134
+
135
+ try:
136
+ # Check if model is loaded (sync check)
137
+ if not transcription_manager.model_loaded:
138
+ return "❌ Model not loaded yet. Please wait and try again.", "", ""
139
+
140
+ # Read file
141
+ with open(video_file, 'rb') as f:
142
+ content = f.read()
143
+
144
+ if len(content) > settings.MAX_FILE_SIZE:
145
+ return f"❌ File too large. Maximum size: {settings.MAX_FILE_SIZE // (1024*1024)}MB", "", ""
146
+
147
+ # Create transcription
148
+ transcription_id = storage.create_transcription(language=language if language != "auto" else None)
149
+
150
+ # Start transcription in background
151
+ loop = asyncio.new_event_loop()
152
+ asyncio.set_event_loop(loop)
153
+ loop.run_in_executor(
154
+ None,
155
+ lambda: asyncio.run(transcription_service.transcribe_video(
156
+ content, transcription_id, language if language != "auto" else None
157
+ ))
158
+ )
159
+
160
+ return f"βœ… Transcription started with ID: {transcription_id}", str(transcription_id), "⏳ Processing..."
161
+
162
+ except Exception as e:
163
+ logger.error(f"Gradio transcription error: {e}")
164
+ return f"❌ Error: {str(e)}", "", ""
165
+
166
+ def gradio_check_status(transcription_id_str):
167
+ """Check transcription status for Gradio"""
168
+ if not transcription_id_str:
169
+ return "❌ Please provide a transcription ID"
170
+
171
+ try:
172
+ transcription_id = int(transcription_id_str)
173
+ result = storage.get_transcription(transcription_id)
174
+
175
+ if not result:
176
+ return "❌ Transcription not found or expired"
177
+
178
+ if result.status == TranscriptionStatus.COMPLETED:
179
+ return f"βœ… Completed!\n\nLanguage: {result.language}\nDuration: {result.duration}s\n\nText:\n{result.text}"
180
+ elif result.status == TranscriptionStatus.FAILED:
181
+ return f"❌ Failed: {result.error_message}"
182
+ elif result.status == TranscriptionStatus.PROCESSING:
183
+ return "⏳ Still processing... Please wait and check again."
184
+ else:
185
+ return "⏳ Pending... Please wait and check again."
186
+
187
+ except ValueError:
188
+ return "❌ Invalid transcription ID (must be a number)"
189
+ except Exception as e:
190
+ return f"❌ Error: {str(e)}"
191
+
192
+ # Create Gradio interface
193
+ def create_gradio_interface():
194
+ """Create the Gradio interface"""
195
+
196
+ with gr.Blocks(
197
+ title="Video Transcription Service",
198
+ theme=gr.themes.Soft(),
199
+ css="""
200
+ .gradio-container {
201
+ max-width: 1000px !important;
202
+ }
203
+ """
204
+ ) as interface:
205
+
206
+ gr.Markdown("""
207
+ # 🎬 Video Transcription Service
208
+
209
+ Upload your video files and get accurate transcriptions using OpenAI Whisper.
210
+
211
+ **Features:**
212
+ - πŸŽ₯ Multiple video formats (MP4, AVI, MOV, etc.)
213
+ - 🌐 Automatic language detection or manual selection
214
+ - πŸš€ Fast processing with OpenAI Whisper
215
+ - πŸ“± Both web interface and API access
216
+ """)
217
+
218
+ with gr.Tab("πŸ“€ Upload & Transcribe"):
219
+ with gr.Row():
220
+ with gr.Column():
221
+ video_input = gr.File(
222
+ label="Upload Video File",
223
+ file_types=["video"],
224
+ type="filepath"
225
+ )
226
+ language_input = gr.Dropdown(
227
+ choices=["auto", "en", "es", "fr", "de", "it", "pt", "ru", "ja", "ko", "zh", "ar", "hi"],
228
+ value="auto",
229
+ label="Language (auto-detect or specify)"
230
+ )
231
+ transcribe_btn = gr.Button("πŸš€ Start Transcription", variant="primary")
232
+
233
+ with gr.Column():
234
+ status_output = gr.Textbox(label="Status", lines=3)
235
+ transcription_id_output = gr.Textbox(label="Transcription ID", visible=True)
236
+ result_output = gr.Textbox(label="Progress", lines=2)
237
+
238
+ with gr.Tab("πŸ” Check Status"):
239
+ with gr.Row():
240
+ with gr.Column():
241
+ id_input = gr.Textbox(label="Transcription ID", placeholder="Enter transcription ID...")
242
+ check_btn = gr.Button("πŸ“Š Check Status", variant="secondary")
243
+
244
+ with gr.Column():
245
+ status_result = gr.Textbox(label="Result", lines=10)
246
+
247
+ with gr.Tab("πŸ”§ API Documentation"):
248
+ gr.Markdown("""
249
+ ## 🌐 API Endpoints
250
+
251
+ You can also use this service programmatically via API calls:
252
+
253
+ ### Upload Video for Transcription
254
+ ```bash
255
+ curl -X POST "https://your-space-name.hf.space/api/transcribe" \\
256
+ -F "[email protected]" \\
257
+ -F "language=en"
258
+ ```
259
+
260
+ ### Check Transcription Status
261
+ ```bash
262
+ curl "https://your-space-name.hf.space/api/transcribe/123"
263
+ ```
264
+
265
+ ### Health Check
266
+ ```bash
267
+ curl "https://your-space-name.hf.space/api/health"
268
+ ```
269
+
270
+ ### Python Example
271
+ ```python
272
+ import requests
273
+
274
+ # Upload video
275
+ with open('video.mp4', 'rb') as f:
276
+ response = requests.post(
277
+ 'https://your-space-name.hf.space/api/transcribe',
278
+ files={'file': f},
279
+ data={'language': 'en'}
280
+ )
281
+ transcription_id = response.json()['id']
282
+
283
+ # Check status
284
+ result = requests.get(f'https://your-space-name.hf.space/api/transcribe/{transcription_id}')
285
+ print(result.json())
286
+ ```
287
+ """)
288
+
289
+ # Event handlers
290
+ transcribe_btn.click(
291
+ fn=gradio_transcribe,
292
+ inputs=[video_input, language_input],
293
+ outputs=[status_output, transcription_id_output, result_output]
294
+ )
295
+
296
+ check_btn.click(
297
+ fn=gradio_check_status,
298
+ inputs=[id_input],
299
+ outputs=[status_result]
300
+ )
301
+
302
+ return interface
303
+
304
+ # Startup function
305
+ async def startup():
306
+ """Initialize services"""
307
+ logger.info("πŸš€ Starting Video Transcription Service on Hugging Face Spaces")
308
+
309
+ # Start storage cleanup
310
+ await storage.start_cleanup_task()
311
+
312
+ # Preload model
313
+ log_step("Preloading Whisper model")
314
+ success = await transcription_manager.ensure_model_loaded()
315
+ if success:
316
+ log_success("Model preloaded successfully")
317
+ else:
318
+ log_error("Model preload failed")
319
+
320
+ def run_fastapi():
321
+ """Run FastAPI in a separate thread"""
322
+ uvicorn.run(api_app, host="0.0.0.0", port=7860, log_level="info")
323
+
324
+ # Main execution
325
+ if __name__ == "__main__":
326
+ # Run startup
327
+ asyncio.run(startup())
328
+
329
+ # Start FastAPI in background thread for API access
330
+ api_thread = threading.Thread(target=run_fastapi, daemon=True)
331
+ api_thread.start()
332
+
333
+ # Create and launch Gradio interface
334
+ interface = create_gradio_interface()
335
+
336
+ # Launch with API access enabled
337
+ interface.launch(
338
+ server_name="0.0.0.0",
339
+ server_port=7860,
340
+ share=False, # HF Spaces handles sharing
341
+ show_api=True, # Enable API documentation
342
+ show_error=True
343
+ )
hf_spaces_deploy/config.py ADDED
@@ -0,0 +1,34 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import os
2
+ from typing import List
3
+
4
+ class HuggingFaceSettings:
5
+ # File upload settings (HF Spaces can handle larger files)
6
+ MAX_FILE_SIZE = 200 * 1024 * 1024 # 200MB for HF Spaces
7
+ ALLOWED_EXTENSIONS = ['.mp4', '.avi', '.mov', '.mkv', '.wmv', '.flv', '.webm', '.m4v']
8
+
9
+ # Transcription settings (optimized for HF Spaces)
10
+ WHISPER_MODEL = os.getenv("WHISPER_MODEL", "base") # HF Spaces can handle base model
11
+ CLEANUP_INTERVAL_HOURS = 3.5 # Clean up after 3.5 hours
12
+
13
+ # Performance settings for HF Spaces
14
+ MODEL_PRELOAD = True # Always preload on HF Spaces
15
+ MAX_CONCURRENT_TRANSCRIPTIONS = 2 # HF Spaces can handle more
16
+ REQUEST_TIMEOUT_SECONDS = 600 # 10 minutes max per request
17
+
18
+ # Rate limiting (more generous on HF Spaces)
19
+ RATE_LIMIT_REQUESTS = 20 # requests per minute per IP
20
+
21
+ # Server settings
22
+ HOST = "0.0.0.0"
23
+ PORT = 7860 # Standard HF Spaces port
24
+
25
+ # Logging settings
26
+ DEBUG_MODE = os.getenv("DEBUG", "false").lower() == "true"
27
+ LOG_TO_FILE = False # No file logging on HF Spaces
28
+
29
+ # Hugging Face Spaces specific
30
+ HF_SPACE_ID = os.getenv("SPACE_ID", "your-username/video-transcription")
31
+ HF_SPACE_URL = f"https://{HF_SPACE_ID.replace('/', '-')}.hf.space" if "SPACE_ID" in os.environ else "http://localhost:7860"
32
+
33
+ # Use HF-optimized settings
34
+ settings = HuggingFaceSettings()
hf_spaces_deploy/logging_config.py ADDED
@@ -0,0 +1,136 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Logging configuration for Video Transcription Service
3
+ """
4
+
5
+ import logging
6
+ import sys
7
+ from datetime import datetime
8
+
9
+ def setup_logging(level=logging.INFO, log_to_file=False):
10
+ """
11
+ Setup comprehensive logging for the application
12
+
13
+ Args:
14
+ level: Logging level (DEBUG, INFO, WARNING, ERROR)
15
+ log_to_file: Whether to also log to a file
16
+ """
17
+
18
+ # Create formatter with emojis and detailed info
19
+ formatter = logging.Formatter(
20
+ '%(asctime)s - %(name)s - %(levelname)s - %(message)s',
21
+ datefmt='%Y-%m-%d %H:%M:%S'
22
+ )
23
+
24
+ # Setup console handler
25
+ console_handler = logging.StreamHandler(sys.stdout)
26
+ console_handler.setFormatter(formatter)
27
+ console_handler.setLevel(level)
28
+
29
+ handlers = [console_handler]
30
+
31
+ # Setup file handler if requested
32
+ if log_to_file:
33
+ log_filename = f"transcription_service_{datetime.now().strftime('%Y%m%d_%H%M%S')}.log"
34
+ file_handler = logging.FileHandler(log_filename)
35
+ file_handler.setFormatter(formatter)
36
+ file_handler.setLevel(level)
37
+ handlers.append(file_handler)
38
+
39
+ # Configure root logger
40
+ logging.basicConfig(
41
+ level=level,
42
+ handlers=handlers,
43
+ force=True # Override any existing configuration
44
+ )
45
+
46
+ # Set specific logger levels
47
+ loggers = [
48
+ 'main',
49
+ 'transcription_service',
50
+ 'storage',
51
+ 'uvicorn.access',
52
+ 'uvicorn.error'
53
+ ]
54
+
55
+ for logger_name in loggers:
56
+ logger = logging.getLogger(logger_name)
57
+ logger.setLevel(level)
58
+
59
+ # Reduce noise from some third-party libraries
60
+ logging.getLogger('httpx').setLevel(logging.WARNING)
61
+ logging.getLogger('httpcore').setLevel(logging.WARNING)
62
+
63
+ return logging.getLogger(__name__)
64
+
65
+ def get_progress_logger():
66
+ """Get a logger specifically for progress tracking"""
67
+ logger = logging.getLogger('progress')
68
+ logger.setLevel(logging.INFO)
69
+ return logger
70
+
71
+ # Progress tracking functions
72
+ def log_step(step_name: str, transcription_id: int = None):
73
+ """Log a processing step"""
74
+ logger = get_progress_logger()
75
+ if transcription_id:
76
+ logger.info(f"πŸ”„ [{transcription_id}] {step_name}")
77
+ else:
78
+ logger.info(f"πŸ”„ {step_name}")
79
+
80
+ def log_success(message: str, transcription_id: int = None):
81
+ """Log a success message"""
82
+ logger = get_progress_logger()
83
+ if transcription_id:
84
+ logger.info(f"βœ… [{transcription_id}] {message}")
85
+ else:
86
+ logger.info(f"βœ… {message}")
87
+
88
+ def log_error(message: str, transcription_id: int = None):
89
+ """Log an error message"""
90
+ logger = get_progress_logger()
91
+ if transcription_id:
92
+ logger.error(f"❌ [{transcription_id}] {message}")
93
+ else:
94
+ logger.error(f"❌ {message}")
95
+
96
+ def log_warning(message: str, transcription_id: int = None):
97
+ """Log a warning message"""
98
+ logger = get_progress_logger()
99
+ if transcription_id:
100
+ logger.warning(f"⚠️ [{transcription_id}] {message}")
101
+ else:
102
+ logger.warning(f"⚠️ {message}")
103
+
104
+ def log_info(message: str, transcription_id: int = None):
105
+ """Log an info message"""
106
+ logger = get_progress_logger()
107
+ if transcription_id:
108
+ logger.info(f"ℹ️ [{transcription_id}] {message}")
109
+ else:
110
+ logger.info(f"ℹ️ {message}")
111
+
112
+ def log_progress_summary(transcription_id: int, total_time: float, status: str):
113
+ """Log a summary of transcription progress"""
114
+ logger = get_progress_logger()
115
+ logger.info(f"πŸ“Š [{transcription_id}] SUMMARY:")
116
+ logger.info(f" Status: {status}")
117
+ logger.info(f" Total Time: {total_time:.2f} seconds")
118
+ logger.info(f" Timestamp: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
119
+
120
+ # Example usage and testing
121
+ if __name__ == "__main__":
122
+ # Test the logging configuration
123
+ setup_logging(level=logging.INFO)
124
+
125
+ logger = logging.getLogger(__name__)
126
+ logger.info("πŸ§ͺ Testing logging configuration...")
127
+
128
+ # Test progress logging
129
+ log_step("Starting test transcription", 123)
130
+ log_info("Processing video file", 123)
131
+ log_success("Audio extraction completed", 123)
132
+ log_warning("Large file detected", 123)
133
+ log_error("Test error message", 123)
134
+ log_progress_summary(123, 45.6, "completed")
135
+
136
+ logger.info("βœ… Logging test completed")
hf_spaces_deploy/models.py ADDED
@@ -0,0 +1,34 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from pydantic import BaseModel
2
+ from typing import Optional
3
+ from enum import Enum
4
+ from datetime import datetime
5
+
6
+ class TranscriptionStatus(str, Enum):
7
+ PENDING = "pending"
8
+ PROCESSING = "processing"
9
+ COMPLETED = "completed"
10
+ FAILED = "failed"
11
+
12
+ class TranscriptionRequest(BaseModel):
13
+ language: Optional[str] = None # Auto-detect if None
14
+
15
+ class TranscriptionResponse(BaseModel):
16
+ id: int
17
+ status: TranscriptionStatus
18
+ message: str
19
+ created_at: datetime
20
+
21
+ class TranscriptionResult(BaseModel):
22
+ id: int
23
+ status: TranscriptionStatus
24
+ text: Optional[str] = None
25
+ language: Optional[str] = None
26
+ duration: Optional[float] = None
27
+ created_at: datetime
28
+ completed_at: Optional[datetime] = None
29
+ error_message: Optional[str] = None
30
+
31
+ class ErrorResponse(BaseModel):
32
+ id: int = 0
33
+ error: str
34
+ message: str
hf_spaces_deploy/requirements.txt ADDED
@@ -0,0 +1,14 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ gradio==4.44.0
2
+ fastapi==0.104.1
3
+ uvicorn[standard]==0.24.0
4
+ python-multipart==0.0.6
5
+ openai-whisper==20231117
6
+ torch==2.1.0
7
+ torchaudio==2.1.0
8
+ ffmpeg-python==0.2.0
9
+ pydantic==2.5.0
10
+ slowapi==0.1.9
11
+ aiofiles==23.2.1
12
+ httpx==0.25.2
13
+ numpy<2.0.0
14
+ psutil==5.9.6
hf_spaces_deploy/restart_handler.py ADDED
@@ -0,0 +1,165 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Restart handler for Video Transcription Service
4
+ Helps prevent restarts due to memory/timeout issues
5
+ """
6
+
7
+ import os
8
+ import signal
9
+ import sys
10
+ import time
11
+ import logging
12
+ import psutil
13
+ from datetime import datetime
14
+
15
+ logger = logging.getLogger(__name__)
16
+
17
+ class RestartHandler:
18
+ def __init__(self):
19
+ self.start_time = time.time()
20
+ self.restart_count = 0
21
+ self.memory_warnings = 0
22
+
23
+ def setup_signal_handlers(self):
24
+ """Setup signal handlers for graceful shutdown"""
25
+ signal.signal(signal.SIGTERM, self._signal_handler)
26
+ signal.signal(signal.SIGINT, self._signal_handler)
27
+
28
+ def _signal_handler(self, signum, frame):
29
+ """Handle shutdown signals gracefully"""
30
+ logger.info(f"πŸ›‘ Received signal {signum}, shutting down gracefully...")
31
+
32
+ # Log service statistics
33
+ uptime = time.time() - self.start_time
34
+ logger.info(f"πŸ“Š Service uptime: {uptime:.1f} seconds")
35
+ logger.info(f"πŸ”„ Restart count: {self.restart_count}")
36
+ logger.info(f"⚠️ Memory warnings: {self.memory_warnings}")
37
+
38
+ sys.exit(0)
39
+
40
+ def check_memory_usage(self):
41
+ """Check memory usage and warn if high"""
42
+ try:
43
+ process = psutil.Process()
44
+ memory_info = process.memory_info()
45
+ memory_mb = memory_info.rss / (1024 * 1024)
46
+
47
+ # Warn if using more than 400MB (80% of 512MB limit)
48
+ if memory_mb > 400:
49
+ self.memory_warnings += 1
50
+ logger.warning(f"⚠️ High memory usage: {memory_mb:.1f}MB (limit: 512MB)")
51
+ logger.warning("πŸ’‘ Consider using 'tiny' model or smaller files")
52
+ return True
53
+ elif memory_mb > 300:
54
+ logger.info(f"πŸ“Š Memory usage: {memory_mb:.1f}MB")
55
+
56
+ return False
57
+ except Exception as e:
58
+ logger.error(f"❌ Error checking memory: {e}")
59
+ return False
60
+
61
+ def log_system_info(self):
62
+ """Log system information for debugging"""
63
+ try:
64
+ logger.info("πŸ–₯️ System Information:")
65
+ logger.info(f" Python: {sys.version.split()[0]}")
66
+ logger.info(f" Platform: {sys.platform}")
67
+
68
+ if hasattr(psutil, 'virtual_memory'):
69
+ memory = psutil.virtual_memory()
70
+ logger.info(f" Total Memory: {memory.total / (1024**3):.1f}GB")
71
+ logger.info(f" Available Memory: {memory.available / (1024**3):.1f}GB")
72
+
73
+ if hasattr(psutil, 'cpu_count'):
74
+ logger.info(f" CPU Cores: {psutil.cpu_count()}")
75
+
76
+ except Exception as e:
77
+ logger.warning(f"⚠️ Could not get system info: {e}")
78
+
79
+ def create_restart_prevention_tips(self):
80
+ """Create tips to prevent restarts"""
81
+ tips = [
82
+ "πŸ”§ Restart Prevention Tips:",
83
+ "1. Use WHISPER_MODEL=tiny for faster loading and less memory",
84
+ "2. Keep video files under 50MB for free tier",
85
+ "3. Process one video at a time",
86
+ "4. Enable model preloading: MODEL_PRELOAD=true",
87
+ "5. Monitor memory usage in logs",
88
+ "6. Use DEBUG=false in production to reduce log overhead"
89
+ ]
90
+
91
+ for tip in tips:
92
+ logger.info(tip)
93
+
94
+ # Global restart handler instance
95
+ restart_handler = RestartHandler()
96
+
97
+ def setup_restart_prevention():
98
+ """Setup restart prevention measures"""
99
+ restart_handler.setup_signal_handlers()
100
+ restart_handler.log_system_info()
101
+ restart_handler.create_restart_prevention_tips()
102
+
103
+ def check_service_health():
104
+ """Check service health and log warnings"""
105
+ return restart_handler.check_memory_usage()
106
+
107
+ # Environment variable helpers
108
+ def get_optimal_settings():
109
+ """Get optimal settings for the current environment"""
110
+ settings = {}
111
+
112
+ # Detect if running on free tier (limited memory)
113
+ try:
114
+ memory = psutil.virtual_memory()
115
+ total_gb = memory.total / (1024**3)
116
+
117
+ if total_gb < 1: # Less than 1GB = likely free tier
118
+ logger.info("πŸ” Detected limited memory environment")
119
+ settings.update({
120
+ "WHISPER_MODEL": "tiny",
121
+ "MAX_FILE_SIZE": 50 * 1024 * 1024, # 50MB
122
+ "MODEL_PRELOAD": "true",
123
+ "DEBUG": "false"
124
+ })
125
+ else:
126
+ logger.info("πŸ” Detected standard memory environment")
127
+ settings.update({
128
+ "WHISPER_MODEL": "base",
129
+ "MAX_FILE_SIZE": 100 * 1024 * 1024, # 100MB
130
+ "MODEL_PRELOAD": "true"
131
+ })
132
+
133
+ except Exception:
134
+ # Fallback to conservative settings
135
+ settings.update({
136
+ "WHISPER_MODEL": "tiny",
137
+ "MAX_FILE_SIZE": 50 * 1024 * 1024,
138
+ "MODEL_PRELOAD": "true"
139
+ })
140
+
141
+ return settings
142
+
143
+ def apply_optimal_settings():
144
+ """Apply optimal settings if not already set"""
145
+ optimal = get_optimal_settings()
146
+ applied = []
147
+
148
+ for key, value in optimal.items():
149
+ if not os.getenv(key):
150
+ os.environ[key] = str(value)
151
+ applied.append(f"{key}={value}")
152
+
153
+ if applied:
154
+ logger.info("βš™οΈ Applied optimal settings:")
155
+ for setting in applied:
156
+ logger.info(f" {setting}")
157
+
158
+ if __name__ == "__main__":
159
+ # Test the restart handler
160
+ logging.basicConfig(level=logging.INFO)
161
+
162
+ setup_restart_prevention()
163
+ apply_optimal_settings()
164
+
165
+ logger.info("βœ… Restart handler test completed")
hf_spaces_deploy/storage.py ADDED
@@ -0,0 +1,158 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import asyncio
2
+ from datetime import datetime, timedelta, timezone
3
+ from typing import Dict, Optional
4
+ from models import TranscriptionResult, TranscriptionStatus
5
+ from config import settings
6
+ import logging
7
+
8
+ # Configure logging for this module
9
+ logging.basicConfig(
10
+ level=logging.INFO,
11
+ format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
12
+ )
13
+ logger = logging.getLogger(__name__)
14
+
15
+ class InMemoryStorage:
16
+ def __init__(self):
17
+ self._storage: Dict[int, TranscriptionResult] = {}
18
+ self._next_id = 1
19
+ self._cleanup_task = None
20
+
21
+ async def start_cleanup_task(self):
22
+ """Start the background cleanup task"""
23
+ if self._cleanup_task is None:
24
+ logger.info("🧹 Starting automatic cleanup task")
25
+ logger.info(f"⏰ Cleanup interval: {settings.CLEANUP_INTERVAL_HOURS} hours")
26
+ self._cleanup_task = asyncio.create_task(self._cleanup_loop())
27
+ else:
28
+ logger.info("🧹 Cleanup task already running")
29
+
30
+ async def stop_cleanup_task(self):
31
+ """Stop the background cleanup task"""
32
+ if self._cleanup_task:
33
+ logger.info("πŸ›‘ Stopping cleanup task")
34
+ self._cleanup_task.cancel()
35
+ try:
36
+ await self._cleanup_task
37
+ except asyncio.CancelledError:
38
+ pass
39
+ self._cleanup_task = None
40
+ logger.info("βœ… Cleanup task stopped")
41
+ else:
42
+ logger.info("🧹 No cleanup task to stop")
43
+
44
+ def create_transcription(self, language: Optional[str] = None) -> int:
45
+ """Create a new transcription entry and return its ID"""
46
+ transcription_id = self._next_id
47
+ self._next_id += 1
48
+
49
+ logger.info(f"πŸ“ Creating new transcription entry with ID: {transcription_id}")
50
+ logger.info(f"🌐 Language: {language or 'auto-detect'}")
51
+
52
+ result = TranscriptionResult(
53
+ id=transcription_id,
54
+ status=TranscriptionStatus.PENDING,
55
+ language=language,
56
+ created_at=datetime.now(timezone.utc)
57
+ )
58
+
59
+ self._storage[transcription_id] = result
60
+ logger.info(f"βœ… Transcription {transcription_id} created successfully")
61
+ logger.info(f"πŸ“Š Total active transcriptions: {len(self._storage)}")
62
+ return transcription_id
63
+
64
+ def get_transcription(self, transcription_id: int) -> Optional[TranscriptionResult]:
65
+ """Get transcription by ID"""
66
+ logger.info(f"πŸ” Looking up transcription ID: {transcription_id}")
67
+ result = self._storage.get(transcription_id)
68
+ if result:
69
+ logger.info(f"βœ… Found transcription {transcription_id} with status: {result.status}")
70
+ else:
71
+ logger.warning(f"❌ Transcription {transcription_id} not found")
72
+ return result
73
+
74
+ def update_transcription(self, transcription_id: int, **kwargs) -> bool:
75
+ """Update transcription fields"""
76
+ if transcription_id not in self._storage:
77
+ logger.warning(f"❌ Cannot update transcription {transcription_id} - not found")
78
+ return False
79
+
80
+ result = self._storage[transcription_id]
81
+ old_status = result.status if hasattr(result, 'status') else 'unknown'
82
+
83
+ for key, value in kwargs.items():
84
+ if hasattr(result, key):
85
+ setattr(result, key, value)
86
+
87
+ new_status = result.status if hasattr(result, 'status') else 'unknown'
88
+ logger.info(f"πŸ“ Updated transcription {transcription_id}")
89
+
90
+ if 'status' in kwargs:
91
+ logger.info(f"πŸ”„ Status changed: {old_status} β†’ {new_status}")
92
+
93
+ # Log specific updates
94
+ for key, value in kwargs.items():
95
+ if key == 'text' and value:
96
+ text_preview = value[:50] + "..." if len(value) > 50 else value
97
+ logger.info(f"πŸ“„ Text updated: {text_preview}")
98
+ elif key == 'error_message' and value:
99
+ logger.error(f"❌ Error recorded: {value}")
100
+ elif key not in ['status', 'text', 'error_message']:
101
+ logger.info(f"πŸ“Š {key}: {value}")
102
+
103
+ return True
104
+
105
+ def delete_transcription(self, transcription_id: int) -> bool:
106
+ """Delete transcription by ID"""
107
+ if transcription_id in self._storage:
108
+ result = self._storage[transcription_id]
109
+ del self._storage[transcription_id]
110
+ logger.info(f"πŸ—‘οΈ Deleted transcription {transcription_id} (status: {result.status})")
111
+ logger.info(f"πŸ“Š Remaining transcriptions: {len(self._storage)}")
112
+ return True
113
+ else:
114
+ logger.warning(f"❌ Cannot delete transcription {transcription_id} - not found")
115
+ return False
116
+
117
+ async def _cleanup_loop(self):
118
+ """Background task to clean up old transcriptions"""
119
+ logger.info("🧹 Cleanup loop started")
120
+ while True:
121
+ try:
122
+ logger.info("😴 Cleanup sleeping for 1 hour...")
123
+ await asyncio.sleep(3600) # Check every hour
124
+ logger.info("⏰ Running scheduled cleanup...")
125
+ await self._cleanup_old_transcriptions()
126
+ except asyncio.CancelledError:
127
+ logger.info("πŸ›‘ Cleanup loop cancelled")
128
+ break
129
+ except Exception as e:
130
+ logger.error(f"❌ Error in cleanup loop: {e}")
131
+
132
+ async def _cleanup_old_transcriptions(self):
133
+ """Remove transcriptions older than the configured time"""
134
+ logger.info("🧹 Starting cleanup of old transcriptions...")
135
+ cutoff_time = datetime.now(timezone.utc) - timedelta(hours=settings.CLEANUP_INTERVAL_HOURS)
136
+ logger.info(f"⏰ Cutoff time: {cutoff_time} (older than {settings.CLEANUP_INTERVAL_HOURS} hours)")
137
+
138
+ to_delete = []
139
+
140
+ for transcription_id, result in self._storage.items():
141
+ age_hours = (datetime.now(timezone.utc) - result.created_at).total_seconds() / 3600
142
+ if result.created_at < cutoff_time:
143
+ logger.info(f"πŸ—‘οΈ Marking transcription {transcription_id} for deletion (age: {age_hours:.1f} hours)")
144
+ to_delete.append(transcription_id)
145
+
146
+ if not to_delete:
147
+ logger.info("βœ… No old transcriptions to clean up")
148
+ return
149
+
150
+ logger.info(f"🧹 Deleting {len(to_delete)} old transcriptions...")
151
+ for transcription_id in to_delete:
152
+ self.delete_transcription(transcription_id)
153
+
154
+ logger.info(f"βœ… Cleanup completed - removed {len(to_delete)} transcriptions")
155
+ logger.info(f"πŸ“Š Active transcriptions remaining: {len(self._storage)}")
156
+
157
+ # Global storage instance
158
+ storage = InMemoryStorage()
hf_spaces_deploy/transcription_service.py ADDED
@@ -0,0 +1,304 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import whisper
2
+ import ffmpeg
3
+ import tempfile
4
+ import os
5
+ import asyncio
6
+ import logging
7
+ import time
8
+ from typing import Optional
9
+ from datetime import datetime, timezone
10
+ from storage import storage
11
+ from models import TranscriptionStatus
12
+ from config import settings
13
+
14
+ # Configure logging for this module
15
+ logging.basicConfig(
16
+ level=logging.INFO,
17
+ format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
18
+ )
19
+ logger = logging.getLogger(__name__)
20
+
21
+ class TranscriptionService:
22
+ def __init__(self):
23
+ self._model = None
24
+ self._model_loading = False
25
+ self._model_load_error = None
26
+
27
+ async def preload_model(self):
28
+ """Preload Whisper model during startup to avoid request timeouts"""
29
+ if self._model is not None:
30
+ logger.info("πŸ€– Whisper model already loaded")
31
+ return True
32
+
33
+ if self._model_load_error:
34
+ logger.error(f"❌ Previous model load failed: {self._model_load_error}")
35
+ return False
36
+
37
+ try:
38
+ logger.info(f"πŸš€ Preloading Whisper model: {settings.WHISPER_MODEL}")
39
+ logger.info("πŸ“₯ This may take 30-60 seconds for first-time download...")
40
+ logger.info("⚑ Preloading during startup to avoid request timeouts...")
41
+
42
+ start_time = time.time()
43
+
44
+ # Run in thread pool to avoid blocking startup
45
+ loop = asyncio.get_event_loop()
46
+ self._model = await loop.run_in_executor(
47
+ None,
48
+ whisper.load_model,
49
+ settings.WHISPER_MODEL
50
+ )
51
+
52
+ load_time = time.time() - start_time
53
+ logger.info(f"βœ… Whisper model preloaded successfully in {load_time:.2f} seconds")
54
+ logger.info("🎯 Service ready for transcription requests!")
55
+ return True
56
+
57
+ except Exception as e:
58
+ error_msg = f"Failed to preload Whisper model: {str(e)}"
59
+ logger.error(f"❌ {error_msg}")
60
+ self._model_load_error = error_msg
61
+ return False
62
+
63
+ async def _load_model(self):
64
+ """Load Whisper model asynchronously (fallback if not preloaded)"""
65
+ if self._model is not None:
66
+ logger.info("πŸ€– Whisper model already loaded")
67
+ return
68
+
69
+ if self._model_load_error:
70
+ logger.error(f"❌ Model load error: {self._model_load_error}")
71
+ raise Exception(self._model_load_error)
72
+
73
+ if self._model_loading:
74
+ logger.info("⏳ Whisper model is currently loading, waiting...")
75
+ # Wait for model to load
76
+ while self._model_loading:
77
+ await asyncio.sleep(0.1)
78
+ if self._model is None:
79
+ raise Exception("Model loading failed")
80
+ logger.info("βœ… Whisper model loading completed (waited)")
81
+ return
82
+
83
+ # If we get here, model wasn't preloaded - try to load it now
84
+ logger.warning("⚠️ Model not preloaded, loading during request (may cause timeout)")
85
+ self._model_loading = True
86
+ try:
87
+ logger.info(f"πŸ€– Loading Whisper model: {settings.WHISPER_MODEL}")
88
+ start_time = time.time()
89
+
90
+ # Run in thread pool to avoid blocking
91
+ loop = asyncio.get_event_loop()
92
+ self._model = await loop.run_in_executor(
93
+ None,
94
+ whisper.load_model,
95
+ settings.WHISPER_MODEL
96
+ )
97
+
98
+ load_time = time.time() - start_time
99
+ logger.info(f"βœ… Whisper model loaded successfully in {load_time:.2f} seconds")
100
+ except Exception as e:
101
+ error_msg = f"Failed to load Whisper model: {str(e)}"
102
+ logger.error(f"❌ {error_msg}")
103
+ self._model_load_error = error_msg
104
+ raise Exception(error_msg)
105
+ finally:
106
+ self._model_loading = False
107
+
108
+ async def transcribe_video(self, video_content: bytes, transcription_id: int, language: Optional[str] = None):
109
+ """Transcribe video content asynchronously"""
110
+ start_time = time.time()
111
+ logger.info(f"🎬 Starting video transcription for ID: {transcription_id}")
112
+ logger.info(f"πŸ“Š Video size: {len(video_content) / (1024*1024):.2f}MB")
113
+ logger.info(f"🌐 Language: {language or 'auto-detect'}")
114
+
115
+ # Check memory before starting
116
+ from restart_handler import check_service_health
117
+ if check_service_health():
118
+ logger.warning(f"⚠️ High memory usage detected before transcription {transcription_id}")
119
+
120
+ try:
121
+ # Update status to processing
122
+ logger.info(f"πŸ“ Updating status to PROCESSING for ID: {transcription_id}")
123
+ storage.update_transcription(
124
+ transcription_id,
125
+ status=TranscriptionStatus.PROCESSING
126
+ )
127
+
128
+ # Load model if needed
129
+ logger.info(f"πŸ€– Loading Whisper model for transcription {transcription_id}")
130
+ await self._load_model()
131
+
132
+ # Extract audio from video
133
+ logger.info(f"🎡 Extracting audio from video for transcription {transcription_id}")
134
+ audio_start = time.time()
135
+ audio_path = await self._extract_audio(video_content)
136
+ audio_time = time.time() - audio_start
137
+ logger.info(f"βœ… Audio extraction completed in {audio_time:.2f} seconds")
138
+
139
+ try:
140
+ # Transcribe audio
141
+ logger.info(f"πŸ—£οΈ Starting audio transcription for ID {transcription_id}")
142
+ transcribe_start = time.time()
143
+ result = await self._transcribe_audio(audio_path, language)
144
+ transcribe_time = time.time() - transcribe_start
145
+
146
+ # Log transcription results
147
+ text_length = len(result["text"]) if result["text"] else 0
148
+ logger.info(f"βœ… Transcription completed in {transcribe_time:.2f} seconds")
149
+ logger.info(f"πŸ“ Transcribed text length: {text_length} characters")
150
+ logger.info(f"🌐 Detected language: {result.get('language', 'unknown')}")
151
+ logger.info(f"⏱️ Audio duration: {result.get('duration', 'unknown')} seconds")
152
+
153
+ # Update storage with results
154
+ logger.info(f"πŸ’Ύ Saving transcription results for ID {transcription_id}")
155
+ storage.update_transcription(
156
+ transcription_id,
157
+ status=TranscriptionStatus.COMPLETED,
158
+ text=result["text"],
159
+ language=result["language"],
160
+ duration=result.get("duration"),
161
+ completed_at=datetime.now(timezone.utc)
162
+ )
163
+
164
+ total_time = time.time() - start_time
165
+ logger.info(f"πŸŽ‰ Transcription {transcription_id} completed successfully in {total_time:.2f} seconds total")
166
+
167
+ finally:
168
+ # Clean up audio file
169
+ if os.path.exists(audio_path):
170
+ logger.info(f"🧹 Cleaning up temporary audio file")
171
+ os.unlink(audio_path)
172
+
173
+ except Exception as e:
174
+ total_time = time.time() - start_time
175
+ logger.error(f"❌ Transcription {transcription_id} failed after {total_time:.2f} seconds: {str(e)}")
176
+ logger.error(f"πŸ” Error type: {type(e).__name__}")
177
+ storage.update_transcription(
178
+ transcription_id,
179
+ status=TranscriptionStatus.FAILED,
180
+ error_message=str(e),
181
+ completed_at=datetime.now(timezone.utc)
182
+ )
183
+
184
+ async def _extract_audio(self, video_content: bytes) -> str:
185
+ """Extract audio from video content"""
186
+ logger.info("πŸ“ Creating temporary video file...")
187
+
188
+ # Create temporary files
189
+ with tempfile.NamedTemporaryFile(delete=False, suffix='.tmp') as video_file:
190
+ video_file.write(video_content)
191
+ video_path = video_file.name
192
+
193
+ audio_path = tempfile.mktemp(suffix='.wav')
194
+ logger.info(f"πŸ“ Temporary files created - Video: {video_path}, Audio: {audio_path}")
195
+
196
+ try:
197
+ # Extract audio using ffmpeg
198
+ logger.info("🎡 Running FFmpeg to extract audio...")
199
+ loop = asyncio.get_event_loop()
200
+ await loop.run_in_executor(
201
+ None,
202
+ self._extract_audio_sync,
203
+ video_path,
204
+ audio_path
205
+ )
206
+
207
+ # Check if audio file was created successfully
208
+ if os.path.exists(audio_path):
209
+ audio_size = os.path.getsize(audio_path)
210
+ logger.info(f"βœ… Audio extraction successful - Size: {audio_size / (1024*1024):.2f}MB")
211
+ else:
212
+ logger.error("❌ Audio file was not created")
213
+ raise Exception("Audio extraction failed - no output file")
214
+
215
+ return audio_path
216
+ finally:
217
+ # Clean up video file
218
+ if os.path.exists(video_path):
219
+ logger.info("🧹 Cleaning up temporary video file")
220
+ os.unlink(video_path)
221
+
222
+ def _extract_audio_sync(self, video_path: str, audio_path: str):
223
+ """Synchronous audio extraction"""
224
+ try:
225
+ logger.info("πŸ”§ Configuring FFmpeg for audio extraction...")
226
+ logger.info(" - Codec: PCM 16-bit")
227
+ logger.info(" - Channels: 1 (mono)")
228
+ logger.info(" - Sample rate: 16kHz")
229
+
230
+ (
231
+ ffmpeg
232
+ .input(video_path)
233
+ .output(audio_path, acodec='pcm_s16le', ac=1, ar='16000')
234
+ .overwrite_output()
235
+ .run(quiet=True)
236
+ )
237
+ logger.info("βœ… FFmpeg audio extraction completed")
238
+ except Exception as e:
239
+ logger.error(f"❌ FFmpeg audio extraction failed: {str(e)}")
240
+ raise
241
+
242
+ async def _transcribe_audio(self, audio_path: str, language: Optional[str] = None) -> dict:
243
+ """Transcribe audio file"""
244
+ logger.info(f"πŸ—£οΈ Starting Whisper transcription...")
245
+ logger.info(f"🎡 Audio file: {audio_path}")
246
+ logger.info(f"🌐 Language: {language or 'auto-detect'}")
247
+
248
+ loop = asyncio.get_event_loop()
249
+
250
+ # Run transcription in thread pool
251
+ logger.info("⚑ Running transcription in background thread...")
252
+ result = await loop.run_in_executor(
253
+ None,
254
+ self._transcribe_audio_sync,
255
+ audio_path,
256
+ language
257
+ )
258
+
259
+ logger.info("βœ… Whisper transcription completed")
260
+ return result
261
+
262
+ def _transcribe_audio_sync(self, audio_path: str, language: Optional[str] = None) -> dict:
263
+ """Synchronous audio transcription"""
264
+ try:
265
+ logger.info("πŸ€– Preparing Whisper transcription options...")
266
+ options = {}
267
+ if language:
268
+ options['language'] = language
269
+ logger.info(f"🌐 Language specified: {language}")
270
+ else:
271
+ logger.info("🌐 Language: auto-detect")
272
+
273
+ logger.info("🎯 Starting Whisper model inference...")
274
+ start_time = time.time()
275
+ result = self._model.transcribe(audio_path, **options)
276
+ inference_time = time.time() - start_time
277
+
278
+ # Log detailed results
279
+ text = result["text"].strip()
280
+ detected_language = result.get("language", "unknown")
281
+ duration = result.get("duration", 0)
282
+
283
+ logger.info(f"βœ… Whisper inference completed in {inference_time:.2f} seconds")
284
+ logger.info(f"πŸ“ Text length: {len(text)} characters")
285
+ logger.info(f"🌐 Detected language: {detected_language}")
286
+ logger.info(f"⏱️ Audio duration: {duration:.2f} seconds")
287
+
288
+ if len(text) > 100:
289
+ logger.info(f"πŸ“„ Text preview: {text[:100]}...")
290
+ else:
291
+ logger.info(f"πŸ“„ Full text: {text}")
292
+
293
+ return {
294
+ "text": text,
295
+ "language": detected_language,
296
+ "duration": duration
297
+ }
298
+ except Exception as e:
299
+ logger.error(f"❌ Whisper transcription failed: {str(e)}")
300
+ logger.error(f"πŸ” Error type: {type(e).__name__}")
301
+ raise
302
+
303
+ # Global service instance
304
+ transcription_service = TranscriptionService()
log_monitor.py ADDED
@@ -0,0 +1,195 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Real-time log monitor for Video Transcription Service
4
+ """
5
+
6
+ import requests
7
+ import time
8
+ import sys
9
+ import json
10
+ from datetime import datetime
11
+
12
+ class TranscriptionMonitor:
13
+ def __init__(self, base_url="http://localhost:8000"):
14
+ self.base_url = base_url.rstrip('/')
15
+ self.active_transcriptions = {}
16
+
17
+ def monitor_transcription(self, transcription_id, poll_interval=5):
18
+ """Monitor a specific transcription with real-time updates"""
19
+ print(f"πŸ” Monitoring transcription ID: {transcription_id}")
20
+ print(f"⏱️ Poll interval: {poll_interval} seconds")
21
+ print("=" * 50)
22
+
23
+ start_time = time.time()
24
+ last_status = None
25
+
26
+ while True:
27
+ try:
28
+ response = requests.get(f"{self.base_url}/transcribe/{transcription_id}")
29
+
30
+ if response.status_code == 404:
31
+ print(f"❌ Transcription {transcription_id} not found or expired")
32
+ break
33
+ elif response.status_code != 200:
34
+ print(f"❌ Error checking status: {response.status_code}")
35
+ break
36
+
37
+ result = response.json()
38
+ status = result['status']
39
+ elapsed = time.time() - start_time
40
+
41
+ # Only print updates when status changes or every 30 seconds
42
+ if status != last_status or elapsed % 30 < poll_interval:
43
+ timestamp = datetime.now().strftime("%H:%M:%S")
44
+ print(f"[{timestamp}] πŸ“Š Status: {status.upper()} (elapsed: {elapsed:.1f}s)")
45
+
46
+ if status == 'completed':
47
+ print("πŸŽ‰ Transcription completed!")
48
+ print(f"🌐 Language: {result.get('language', 'N/A')}")
49
+ print(f"⏱️ Duration: {result.get('duration', 'N/A')} seconds")
50
+ text = result.get('text', '')
51
+ if text:
52
+ preview = text[:100] + "..." if len(text) > 100 else text
53
+ print(f"πŸ“ Text preview: {preview}")
54
+ break
55
+ elif status == 'failed':
56
+ print(f"❌ Transcription failed: {result.get('error_message', 'Unknown error')}")
57
+ break
58
+
59
+ last_status = status
60
+ time.sleep(poll_interval)
61
+
62
+ except KeyboardInterrupt:
63
+ print("\nπŸ›‘ Monitoring stopped by user")
64
+ break
65
+ except Exception as e:
66
+ print(f"❌ Error: {e}")
67
+ time.sleep(poll_interval)
68
+
69
+ def list_active_transcriptions(self):
70
+ """List all active transcriptions by checking health endpoint"""
71
+ try:
72
+ response = requests.get(f"{self.base_url}/health")
73
+ if response.status_code == 200:
74
+ health = response.json()
75
+ active = health.get('active_transcriptions', 0)
76
+ print(f"πŸ“Š Active transcriptions: {active}")
77
+ return active
78
+ else:
79
+ print(f"❌ Cannot get health status: {response.status_code}")
80
+ return 0
81
+ except Exception as e:
82
+ print(f"❌ Error checking health: {e}")
83
+ return 0
84
+
85
+ def test_service(self):
86
+ """Test if the service is running"""
87
+ try:
88
+ response = requests.get(f"{self.base_url}/health", timeout=5)
89
+ if response.status_code == 200:
90
+ health = response.json()
91
+ print("βœ… Service is running")
92
+ print(f"πŸ“Š Status: {health.get('status', 'unknown')}")
93
+ print(f"πŸ“Š Active transcriptions: {health.get('active_transcriptions', 0)}")
94
+ return True
95
+ else:
96
+ print(f"❌ Service returned status: {response.status_code}")
97
+ return False
98
+ except requests.exceptions.ConnectionError:
99
+ print(f"❌ Cannot connect to service at {self.base_url}")
100
+ print(" Make sure the service is running with: python main.py")
101
+ return False
102
+ except Exception as e:
103
+ print(f"❌ Error testing service: {e}")
104
+ return False
105
+
106
+ def upload_and_monitor(self, video_file, language=None):
107
+ """Upload a video and monitor its transcription"""
108
+ if not self.test_service():
109
+ return
110
+
111
+ print(f"πŸ“€ Uploading video: {video_file}")
112
+
113
+ try:
114
+ with open(video_file, 'rb') as f:
115
+ files = {'file': f}
116
+ data = {}
117
+ if language:
118
+ data['language'] = language
119
+
120
+ response = requests.post(f"{self.base_url}/transcribe", files=files, data=data)
121
+
122
+ if response.status_code == 200:
123
+ result = response.json()
124
+ transcription_id = result['id']
125
+ print(f"βœ… Upload successful! ID: {transcription_id}")
126
+ print()
127
+ self.monitor_transcription(transcription_id)
128
+ else:
129
+ print(f"❌ Upload failed: {response.status_code}")
130
+ print(response.text)
131
+
132
+ except FileNotFoundError:
133
+ print(f"❌ Video file not found: {video_file}")
134
+ except Exception as e:
135
+ print(f"❌ Upload error: {e}")
136
+
137
+ def main():
138
+ if len(sys.argv) < 2:
139
+ print("Video Transcription Service - Log Monitor")
140
+ print("=" * 40)
141
+ print("Usage:")
142
+ print(" python log_monitor.py test # Test service")
143
+ print(" python log_monitor.py monitor <id> # Monitor transcription")
144
+ print(" python log_monitor.py upload <video_file> # Upload and monitor")
145
+ print(" python log_monitor.py upload <video_file> <lang> # Upload with language")
146
+ print()
147
+ print("Examples:")
148
+ print(" python log_monitor.py test")
149
+ print(" python log_monitor.py monitor 123")
150
+ print(" python log_monitor.py upload video.mp4")
151
+ print(" python log_monitor.py upload video.mp4 en")
152
+ sys.exit(1)
153
+
154
+ # Get API URL from environment or use default
155
+ api_url = sys.argv[-1] if sys.argv[-1].startswith('http') else "http://localhost:8000"
156
+ if api_url != "http://localhost:8000":
157
+ sys.argv = sys.argv[:-1] # Remove URL from args
158
+
159
+ monitor = TranscriptionMonitor(api_url)
160
+ command = sys.argv[1].lower()
161
+
162
+ if command == "test":
163
+ monitor.test_service()
164
+ monitor.list_active_transcriptions()
165
+
166
+ elif command == "monitor":
167
+ if len(sys.argv) < 3:
168
+ print("❌ Please provide transcription ID")
169
+ print("Usage: python log_monitor.py monitor <id>")
170
+ sys.exit(1)
171
+
172
+ try:
173
+ transcription_id = int(sys.argv[2])
174
+ monitor.monitor_transcription(transcription_id)
175
+ except ValueError:
176
+ print("❌ Invalid transcription ID (must be a number)")
177
+ sys.exit(1)
178
+
179
+ elif command == "upload":
180
+ if len(sys.argv) < 3:
181
+ print("❌ Please provide video file")
182
+ print("Usage: python log_monitor.py upload <video_file> [language]")
183
+ sys.exit(1)
184
+
185
+ video_file = sys.argv[2]
186
+ language = sys.argv[3] if len(sys.argv) > 3 else None
187
+ monitor.upload_and_monitor(video_file, language)
188
+
189
+ else:
190
+ print(f"❌ Unknown command: {command}")
191
+ print("Available commands: test, monitor, upload")
192
+ sys.exit(1)
193
+
194
+ if __name__ == "__main__":
195
+ main()
logging_config.py ADDED
@@ -0,0 +1,136 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Logging configuration for Video Transcription Service
3
+ """
4
+
5
+ import logging
6
+ import sys
7
+ from datetime import datetime
8
+
9
+ def setup_logging(level=logging.INFO, log_to_file=False):
10
+ """
11
+ Setup comprehensive logging for the application
12
+
13
+ Args:
14
+ level: Logging level (DEBUG, INFO, WARNING, ERROR)
15
+ log_to_file: Whether to also log to a file
16
+ """
17
+
18
+ # Create formatter with emojis and detailed info
19
+ formatter = logging.Formatter(
20
+ '%(asctime)s - %(name)s - %(levelname)s - %(message)s',
21
+ datefmt='%Y-%m-%d %H:%M:%S'
22
+ )
23
+
24
+ # Setup console handler
25
+ console_handler = logging.StreamHandler(sys.stdout)
26
+ console_handler.setFormatter(formatter)
27
+ console_handler.setLevel(level)
28
+
29
+ handlers = [console_handler]
30
+
31
+ # Setup file handler if requested
32
+ if log_to_file:
33
+ log_filename = f"transcription_service_{datetime.now().strftime('%Y%m%d_%H%M%S')}.log"
34
+ file_handler = logging.FileHandler(log_filename)
35
+ file_handler.setFormatter(formatter)
36
+ file_handler.setLevel(level)
37
+ handlers.append(file_handler)
38
+
39
+ # Configure root logger
40
+ logging.basicConfig(
41
+ level=level,
42
+ handlers=handlers,
43
+ force=True # Override any existing configuration
44
+ )
45
+
46
+ # Set specific logger levels
47
+ loggers = [
48
+ 'main',
49
+ 'transcription_service',
50
+ 'storage',
51
+ 'uvicorn.access',
52
+ 'uvicorn.error'
53
+ ]
54
+
55
+ for logger_name in loggers:
56
+ logger = logging.getLogger(logger_name)
57
+ logger.setLevel(level)
58
+
59
+ # Reduce noise from some third-party libraries
60
+ logging.getLogger('httpx').setLevel(logging.WARNING)
61
+ logging.getLogger('httpcore').setLevel(logging.WARNING)
62
+
63
+ return logging.getLogger(__name__)
64
+
65
+ def get_progress_logger():
66
+ """Get a logger specifically for progress tracking"""
67
+ logger = logging.getLogger('progress')
68
+ logger.setLevel(logging.INFO)
69
+ return logger
70
+
71
+ # Progress tracking functions
72
+ def log_step(step_name: str, transcription_id: int = None):
73
+ """Log a processing step"""
74
+ logger = get_progress_logger()
75
+ if transcription_id:
76
+ logger.info(f"πŸ”„ [{transcription_id}] {step_name}")
77
+ else:
78
+ logger.info(f"πŸ”„ {step_name}")
79
+
80
+ def log_success(message: str, transcription_id: int = None):
81
+ """Log a success message"""
82
+ logger = get_progress_logger()
83
+ if transcription_id:
84
+ logger.info(f"βœ… [{transcription_id}] {message}")
85
+ else:
86
+ logger.info(f"βœ… {message}")
87
+
88
+ def log_error(message: str, transcription_id: int = None):
89
+ """Log an error message"""
90
+ logger = get_progress_logger()
91
+ if transcription_id:
92
+ logger.error(f"❌ [{transcription_id}] {message}")
93
+ else:
94
+ logger.error(f"❌ {message}")
95
+
96
+ def log_warning(message: str, transcription_id: int = None):
97
+ """Log a warning message"""
98
+ logger = get_progress_logger()
99
+ if transcription_id:
100
+ logger.warning(f"⚠️ [{transcription_id}] {message}")
101
+ else:
102
+ logger.warning(f"⚠️ {message}")
103
+
104
+ def log_info(message: str, transcription_id: int = None):
105
+ """Log an info message"""
106
+ logger = get_progress_logger()
107
+ if transcription_id:
108
+ logger.info(f"ℹ️ [{transcription_id}] {message}")
109
+ else:
110
+ logger.info(f"ℹ️ {message}")
111
+
112
+ def log_progress_summary(transcription_id: int, total_time: float, status: str):
113
+ """Log a summary of transcription progress"""
114
+ logger = get_progress_logger()
115
+ logger.info(f"πŸ“Š [{transcription_id}] SUMMARY:")
116
+ logger.info(f" Status: {status}")
117
+ logger.info(f" Total Time: {total_time:.2f} seconds")
118
+ logger.info(f" Timestamp: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
119
+
120
+ # Example usage and testing
121
+ if __name__ == "__main__":
122
+ # Test the logging configuration
123
+ setup_logging(level=logging.INFO)
124
+
125
+ logger = logging.getLogger(__name__)
126
+ logger.info("πŸ§ͺ Testing logging configuration...")
127
+
128
+ # Test progress logging
129
+ log_step("Starting test transcription", 123)
130
+ log_info("Processing video file", 123)
131
+ log_success("Audio extraction completed", 123)
132
+ log_warning("Large file detected", 123)
133
+ log_error("Test error message", 123)
134
+ log_progress_summary(123, 45.6, "completed")
135
+
136
+ logger.info("βœ… Logging test completed")
main.py ADDED
@@ -0,0 +1,295 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from fastapi import FastAPI, File, UploadFile, HTTPException, Request, Depends
2
+ from fastapi.responses import JSONResponse
3
+ from fastapi.middleware.cors import CORSMiddleware
4
+ import asyncio
5
+ import logging
6
+ import os
7
+ from pathlib import Path
8
+ from slowapi import Limiter, _rate_limit_exceeded_handler
9
+ from slowapi.util import get_remote_address
10
+ from slowapi.errors import RateLimitExceeded
11
+
12
+ from config import settings
13
+ from models import (
14
+ TranscriptionRequest, TranscriptionResponse, TranscriptionResult,
15
+ ErrorResponse, TranscriptionStatus
16
+ )
17
+ from storage import storage
18
+ from transcription_service import transcription_service
19
+
20
+ # Configure logging and restart prevention
21
+ from logging_config import setup_logging, log_step, log_success, log_error, log_info, log_progress_summary
22
+ from restart_handler import setup_restart_prevention, apply_optimal_settings, check_service_health
23
+
24
+ # Apply optimal settings for the environment
25
+ apply_optimal_settings()
26
+
27
+ # Setup logging (can be controlled via environment variable)
28
+ log_level = logging.DEBUG if os.getenv("DEBUG", "false").lower() == "true" else logging.INFO
29
+ setup_logging(level=log_level, log_to_file=os.getenv("LOG_TO_FILE", "false").lower() == "true")
30
+ logger = logging.getLogger(__name__)
31
+
32
+ # Setup restart prevention
33
+ setup_restart_prevention()
34
+
35
+ # Initialize rate limiter
36
+ limiter = Limiter(key_func=get_remote_address)
37
+
38
+ # Create FastAPI app
39
+ app = FastAPI(
40
+ title="Video Transcription Service",
41
+ description="A free video transcription service using OpenAI Whisper",
42
+ version="1.0.0",
43
+ docs_url="/docs",
44
+ redoc_url="/redoc"
45
+ )
46
+
47
+ # Add rate limiting
48
+ app.state.limiter = limiter
49
+ app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)
50
+
51
+ # Add CORS middleware
52
+ app.add_middleware(
53
+ CORSMiddleware,
54
+ allow_origins=["*"],
55
+ allow_credentials=True,
56
+ allow_methods=["*"],
57
+ allow_headers=["*"],
58
+ )
59
+
60
+ @app.on_event("startup")
61
+ async def startup_event():
62
+ """Initialize services on startup"""
63
+ logger.info("πŸš€ Starting Video Transcription Service")
64
+ logger.info("=" * 50)
65
+ logger.info("πŸ“‹ Service Configuration:")
66
+ logger.info(f" πŸ€– Whisper Model: {settings.WHISPER_MODEL}")
67
+ logger.info(f" πŸ“ Max File Size: {settings.MAX_FILE_SIZE // (1024*1024)}MB")
68
+ logger.info(f" πŸ•’ Cleanup Interval: {settings.CLEANUP_INTERVAL_HOURS} hours")
69
+ logger.info(f" 🚦 Rate Limit: {settings.RATE_LIMIT_REQUESTS} requests/minute")
70
+ logger.info(f" 🌐 Host: {settings.HOST}:{settings.PORT}")
71
+ logger.info(f" πŸ“ Supported Formats: {', '.join(settings.ALLOWED_EXTENSIONS)}")
72
+ logger.info(f" ⚑ Model Preload: {settings.MODEL_PRELOAD}")
73
+ logger.info("=" * 50)
74
+
75
+ log_step("Initializing storage cleanup task")
76
+ await storage.start_cleanup_task()
77
+
78
+ # Preload Whisper model to avoid request timeouts
79
+ if settings.MODEL_PRELOAD:
80
+ log_step("Preloading Whisper model (prevents request timeouts)")
81
+ model_loaded = await transcription_service.preload_model()
82
+ if model_loaded:
83
+ log_success("Whisper model preloaded successfully")
84
+ else:
85
+ logger.warning("⚠️ Model preload failed - will try to load during requests")
86
+ else:
87
+ logger.info("⚠️ Model preload disabled - will load during first request")
88
+
89
+ log_success("Service startup completed")
90
+
91
+ @app.on_event("shutdown")
92
+ async def shutdown_event():
93
+ """Cleanup on shutdown"""
94
+ logger.info("πŸ›‘ Shutting down Video Transcription Service")
95
+ log_step("Stopping cleanup task")
96
+ await storage.stop_cleanup_task()
97
+ log_success("Service shutdown completed")
98
+
99
+ def validate_file(file: UploadFile) -> None:
100
+ """Validate uploaded file"""
101
+ logger.info(f"πŸ“ Validating file: {file.filename}")
102
+
103
+ if not file.filename:
104
+ logger.error("❌ No filename provided")
105
+ raise HTTPException(status_code=400, detail="No file provided")
106
+
107
+ # Check file extension
108
+ file_ext = Path(file.filename).suffix.lower()
109
+ logger.info(f"πŸ” File extension: {file_ext}")
110
+
111
+ if file_ext not in settings.ALLOWED_EXTENSIONS:
112
+ logger.error(f"❌ Unsupported file format: {file_ext}")
113
+ raise HTTPException(
114
+ status_code=400,
115
+ detail=f"Unsupported file format. Allowed: {', '.join(settings.ALLOWED_EXTENSIONS)}"
116
+ )
117
+
118
+ logger.info(f"βœ… File format validation passed: {file_ext}")
119
+
120
+ async def validate_file_size(file: UploadFile) -> bytes:
121
+ """Validate file size and return content"""
122
+ logger.info(f"πŸ“Š Reading file content for size validation...")
123
+ content = await file.read()
124
+ file_size_mb = len(content) / (1024 * 1024)
125
+ max_size_mb = settings.MAX_FILE_SIZE // (1024 * 1024)
126
+
127
+ logger.info(f"πŸ“ File size: {file_size_mb:.2f}MB (max: {max_size_mb}MB)")
128
+
129
+ if len(content) > settings.MAX_FILE_SIZE:
130
+ logger.error(f"❌ File too large: {file_size_mb:.2f}MB > {max_size_mb}MB")
131
+ raise HTTPException(
132
+ status_code=413,
133
+ detail=f"File too large. Maximum size: {max_size_mb}MB"
134
+ )
135
+
136
+ if len(content) == 0:
137
+ logger.error("❌ Empty file detected")
138
+ raise HTTPException(status_code=400, detail="Empty file")
139
+
140
+ logger.info(f"βœ… File size validation passed: {file_size_mb:.2f}MB")
141
+ return content
142
+
143
+ @app.get("/")
144
+ async def root():
145
+ """Health check endpoint"""
146
+ return {
147
+ "service": "Video Transcription Service",
148
+ "status": "running",
149
+ "version": "1.0.0",
150
+ "docs": "/docs"
151
+ }
152
+
153
+ @app.post("/transcribe", response_model=TranscriptionResponse)
154
+ @limiter.limit(f"{settings.RATE_LIMIT_REQUESTS}/minute")
155
+ async def transcribe_video(
156
+ request: Request,
157
+ file: UploadFile = File(...),
158
+ language: str = None
159
+ ):
160
+ """
161
+ Upload a video file for transcription
162
+
163
+ - **file**: Video file (MP4, AVI, MOV, etc.) - Max 100MB
164
+ - **language**: Optional language code (e.g., 'en', 'es', 'fr') - Auto-detect if not provided
165
+
166
+ Returns transcription ID for status checking
167
+ """
168
+ try:
169
+ logger.info(f"πŸš€ Starting transcription request for file: {file.filename}")
170
+ logger.info(f"🌐 Language specified: {language or 'auto-detect'}")
171
+
172
+ # Validate file
173
+ validate_file(file)
174
+
175
+ # Read and validate file content
176
+ content = await validate_file_size(file)
177
+
178
+ # Create transcription entry
179
+ logger.info("πŸ“ Creating transcription entry in storage...")
180
+ transcription_id = storage.create_transcription(language=language)
181
+ logger.info(f"πŸ†” Transcription ID created: {transcription_id}")
182
+
183
+ # Start transcription in background
184
+ logger.info(f"⚑ Starting background transcription task for ID: {transcription_id}")
185
+ asyncio.create_task(
186
+ transcription_service.transcribe_video(content, transcription_id, language)
187
+ )
188
+
189
+ logger.info(f"βœ… Transcription request accepted successfully - ID: {transcription_id}")
190
+ return TranscriptionResponse(
191
+ id=transcription_id,
192
+ status=TranscriptionStatus.PENDING,
193
+ message="Transcription started. Use the ID to check status.",
194
+ created_at=storage.get_transcription(transcription_id).created_at
195
+ )
196
+
197
+ except HTTPException:
198
+ raise
199
+ except Exception as e:
200
+ logger.error(f"Error in transcribe endpoint: {str(e)}")
201
+ return JSONResponse(
202
+ status_code=500,
203
+ content=ErrorResponse(
204
+ id=0,
205
+ error="internal_error",
206
+ message="An internal error occurred"
207
+ ).dict()
208
+ )
209
+
210
+ @app.get("/transcribe/{transcription_id}", response_model=TranscriptionResult)
211
+ async def get_transcription(transcription_id: int):
212
+ """
213
+ Get transcription status and results
214
+
215
+ - **transcription_id**: ID returned from the transcribe endpoint
216
+
217
+ Returns transcription status and text (if completed)
218
+ """
219
+ try:
220
+ logger.info(f"πŸ” Looking up transcription ID: {transcription_id}")
221
+ result = storage.get_transcription(transcription_id)
222
+
223
+ if not result:
224
+ logger.warning(f"❌ Transcription not found: {transcription_id}")
225
+ return JSONResponse(
226
+ status_code=404,
227
+ content=ErrorResponse(
228
+ id=0,
229
+ error="not_found",
230
+ message="Transcription not found or expired"
231
+ ).dict()
232
+ )
233
+
234
+ logger.info(f"πŸ“Š Transcription status for ID {transcription_id}: {result.status}")
235
+ if result.status == TranscriptionStatus.COMPLETED:
236
+ text_preview = result.text[:100] + "..." if result.text and len(result.text) > 100 else result.text
237
+ logger.info(f"βœ… Transcription completed - Preview: {text_preview}")
238
+ elif result.status == TranscriptionStatus.FAILED:
239
+ logger.error(f"❌ Transcription failed for ID {transcription_id}: {result.error_message}")
240
+
241
+ return result
242
+
243
+ except Exception as e:
244
+ logger.error(f"Error in get_transcription endpoint: {str(e)}")
245
+ return JSONResponse(
246
+ status_code=500,
247
+ content=ErrorResponse(
248
+ id=0,
249
+ error="internal_error",
250
+ message="An internal error occurred"
251
+ ).dict()
252
+ )
253
+
254
+ @app.get("/health")
255
+ async def health_check():
256
+ """Detailed health check"""
257
+ # Check model status
258
+ model_status = "not_loaded"
259
+ if transcription_service._model is not None:
260
+ model_status = "loaded"
261
+ elif transcription_service._model_loading:
262
+ model_status = "loading"
263
+ elif transcription_service._model_load_error:
264
+ model_status = "error"
265
+
266
+ active_transcriptions = 0
267
+ total_transcriptions = 0
268
+
269
+ if hasattr(storage, '_storage'):
270
+ total_transcriptions = len(storage._storage)
271
+ active_transcriptions = len([
272
+ t for t in storage._storage.values()
273
+ if t.status in [TranscriptionStatus.PENDING, TranscriptionStatus.PROCESSING]
274
+ ])
275
+
276
+ return {
277
+ "status": "healthy" if model_status in ["loaded", "loading"] else "degraded",
278
+ "model_status": model_status,
279
+ "model_name": settings.WHISPER_MODEL,
280
+ "model_error": transcription_service._model_load_error,
281
+ "total_transcriptions": total_transcriptions,
282
+ "active_transcriptions": active_transcriptions,
283
+ "max_file_size_mb": settings.MAX_FILE_SIZE // (1024*1024),
284
+ "supported_formats": settings.ALLOWED_EXTENSIONS,
285
+ "uptime_check": datetime.now().isoformat()
286
+ }
287
+
288
+ if __name__ == "__main__":
289
+ import uvicorn
290
+ uvicorn.run(
291
+ "main:app",
292
+ host=settings.HOST,
293
+ port=settings.PORT,
294
+ reload=False
295
+ )
models.py ADDED
@@ -0,0 +1,34 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from pydantic import BaseModel
2
+ from typing import Optional
3
+ from enum import Enum
4
+ from datetime import datetime
5
+
6
+ class TranscriptionStatus(str, Enum):
7
+ PENDING = "pending"
8
+ PROCESSING = "processing"
9
+ COMPLETED = "completed"
10
+ FAILED = "failed"
11
+
12
+ class TranscriptionRequest(BaseModel):
13
+ language: Optional[str] = None # Auto-detect if None
14
+
15
+ class TranscriptionResponse(BaseModel):
16
+ id: int
17
+ status: TranscriptionStatus
18
+ message: str
19
+ created_at: datetime
20
+
21
+ class TranscriptionResult(BaseModel):
22
+ id: int
23
+ status: TranscriptionStatus
24
+ text: Optional[str] = None
25
+ language: Optional[str] = None
26
+ duration: Optional[float] = None
27
+ created_at: datetime
28
+ completed_at: Optional[datetime] = None
29
+ error_message: Optional[str] = None
30
+
31
+ class ErrorResponse(BaseModel):
32
+ id: int = 0
33
+ error: str
34
+ message: str
requirements.txt ADDED
@@ -0,0 +1,14 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ gradio==4.44.0
2
+ fastapi==0.104.1
3
+ uvicorn[standard]==0.24.0
4
+ python-multipart==0.0.6
5
+ openai-whisper==20231117
6
+ torch==2.1.0
7
+ torchaudio==2.1.0
8
+ ffmpeg-python==0.2.0
9
+ pydantic==2.5.0
10
+ slowapi==0.1.9
11
+ aiofiles==23.2.1
12
+ httpx==0.25.2
13
+ numpy<2.0.0
14
+ psutil==5.9.6
restart_handler.py ADDED
@@ -0,0 +1,165 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Restart handler for Video Transcription Service
4
+ Helps prevent restarts due to memory/timeout issues
5
+ """
6
+
7
+ import os
8
+ import signal
9
+ import sys
10
+ import time
11
+ import logging
12
+ import psutil
13
+ from datetime import datetime
14
+
15
+ logger = logging.getLogger(__name__)
16
+
17
+ class RestartHandler:
18
+ def __init__(self):
19
+ self.start_time = time.time()
20
+ self.restart_count = 0
21
+ self.memory_warnings = 0
22
+
23
+ def setup_signal_handlers(self):
24
+ """Setup signal handlers for graceful shutdown"""
25
+ signal.signal(signal.SIGTERM, self._signal_handler)
26
+ signal.signal(signal.SIGINT, self._signal_handler)
27
+
28
+ def _signal_handler(self, signum, frame):
29
+ """Handle shutdown signals gracefully"""
30
+ logger.info(f"πŸ›‘ Received signal {signum}, shutting down gracefully...")
31
+
32
+ # Log service statistics
33
+ uptime = time.time() - self.start_time
34
+ logger.info(f"πŸ“Š Service uptime: {uptime:.1f} seconds")
35
+ logger.info(f"πŸ”„ Restart count: {self.restart_count}")
36
+ logger.info(f"⚠️ Memory warnings: {self.memory_warnings}")
37
+
38
+ sys.exit(0)
39
+
40
+ def check_memory_usage(self):
41
+ """Check memory usage and warn if high"""
42
+ try:
43
+ process = psutil.Process()
44
+ memory_info = process.memory_info()
45
+ memory_mb = memory_info.rss / (1024 * 1024)
46
+
47
+ # Warn if using more than 400MB (80% of 512MB limit)
48
+ if memory_mb > 400:
49
+ self.memory_warnings += 1
50
+ logger.warning(f"⚠️ High memory usage: {memory_mb:.1f}MB (limit: 512MB)")
51
+ logger.warning("πŸ’‘ Consider using 'tiny' model or smaller files")
52
+ return True
53
+ elif memory_mb > 300:
54
+ logger.info(f"πŸ“Š Memory usage: {memory_mb:.1f}MB")
55
+
56
+ return False
57
+ except Exception as e:
58
+ logger.error(f"❌ Error checking memory: {e}")
59
+ return False
60
+
61
+ def log_system_info(self):
62
+ """Log system information for debugging"""
63
+ try:
64
+ logger.info("πŸ–₯️ System Information:")
65
+ logger.info(f" Python: {sys.version.split()[0]}")
66
+ logger.info(f" Platform: {sys.platform}")
67
+
68
+ if hasattr(psutil, 'virtual_memory'):
69
+ memory = psutil.virtual_memory()
70
+ logger.info(f" Total Memory: {memory.total / (1024**3):.1f}GB")
71
+ logger.info(f" Available Memory: {memory.available / (1024**3):.1f}GB")
72
+
73
+ if hasattr(psutil, 'cpu_count'):
74
+ logger.info(f" CPU Cores: {psutil.cpu_count()}")
75
+
76
+ except Exception as e:
77
+ logger.warning(f"⚠️ Could not get system info: {e}")
78
+
79
+ def create_restart_prevention_tips(self):
80
+ """Create tips to prevent restarts"""
81
+ tips = [
82
+ "πŸ”§ Restart Prevention Tips:",
83
+ "1. Use WHISPER_MODEL=tiny for faster loading and less memory",
84
+ "2. Keep video files under 50MB for free tier",
85
+ "3. Process one video at a time",
86
+ "4. Enable model preloading: MODEL_PRELOAD=true",
87
+ "5. Monitor memory usage in logs",
88
+ "6. Use DEBUG=false in production to reduce log overhead"
89
+ ]
90
+
91
+ for tip in tips:
92
+ logger.info(tip)
93
+
94
+ # Global restart handler instance
95
+ restart_handler = RestartHandler()
96
+
97
+ def setup_restart_prevention():
98
+ """Setup restart prevention measures"""
99
+ restart_handler.setup_signal_handlers()
100
+ restart_handler.log_system_info()
101
+ restart_handler.create_restart_prevention_tips()
102
+
103
+ def check_service_health():
104
+ """Check service health and log warnings"""
105
+ return restart_handler.check_memory_usage()
106
+
107
+ # Environment variable helpers
108
+ def get_optimal_settings():
109
+ """Get optimal settings for the current environment"""
110
+ settings = {}
111
+
112
+ # Detect if running on free tier (limited memory)
113
+ try:
114
+ memory = psutil.virtual_memory()
115
+ total_gb = memory.total / (1024**3)
116
+
117
+ if total_gb < 1: # Less than 1GB = likely free tier
118
+ logger.info("πŸ” Detected limited memory environment")
119
+ settings.update({
120
+ "WHISPER_MODEL": "tiny",
121
+ "MAX_FILE_SIZE": 50 * 1024 * 1024, # 50MB
122
+ "MODEL_PRELOAD": "true",
123
+ "DEBUG": "false"
124
+ })
125
+ else:
126
+ logger.info("πŸ” Detected standard memory environment")
127
+ settings.update({
128
+ "WHISPER_MODEL": "base",
129
+ "MAX_FILE_SIZE": 100 * 1024 * 1024, # 100MB
130
+ "MODEL_PRELOAD": "true"
131
+ })
132
+
133
+ except Exception:
134
+ # Fallback to conservative settings
135
+ settings.update({
136
+ "WHISPER_MODEL": "tiny",
137
+ "MAX_FILE_SIZE": 50 * 1024 * 1024,
138
+ "MODEL_PRELOAD": "true"
139
+ })
140
+
141
+ return settings
142
+
143
+ def apply_optimal_settings():
144
+ """Apply optimal settings if not already set"""
145
+ optimal = get_optimal_settings()
146
+ applied = []
147
+
148
+ for key, value in optimal.items():
149
+ if not os.getenv(key):
150
+ os.environ[key] = str(value)
151
+ applied.append(f"{key}={value}")
152
+
153
+ if applied:
154
+ logger.info("βš™οΈ Applied optimal settings:")
155
+ for setting in applied:
156
+ logger.info(f" {setting}")
157
+
158
+ if __name__ == "__main__":
159
+ # Test the restart handler
160
+ logging.basicConfig(level=logging.INFO)
161
+
162
+ setup_restart_prevention()
163
+ apply_optimal_settings()
164
+
165
+ logger.info("βœ… Restart handler test completed")
setup.py ADDED
@@ -0,0 +1,148 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Setup script for Video Transcription Service
4
+ """
5
+
6
+ import subprocess
7
+ import sys
8
+ import os
9
+ import platform
10
+
11
+ def run_command(command, description):
12
+ """Run a command and handle errors"""
13
+ print(f"πŸ“¦ {description}...")
14
+ try:
15
+ result = subprocess.run(command, shell=True, check=True, capture_output=True, text=True)
16
+ print(f"βœ… {description} completed")
17
+ return True
18
+ except subprocess.CalledProcessError as e:
19
+ print(f"❌ {description} failed:")
20
+ print(f" Command: {command}")
21
+ print(f" Error: {e.stderr}")
22
+ return False
23
+
24
+ def check_python_version():
25
+ """Check if Python version is compatible"""
26
+ version = sys.version_info
27
+ if version.major < 3 or (version.major == 3 and version.minor < 8):
28
+ print(f"❌ Python 3.8+ required, found {version.major}.{version.minor}")
29
+ return False
30
+ print(f"βœ… Python {version.major}.{version.minor}.{version.micro} is compatible")
31
+ return True
32
+
33
+ def install_python_dependencies():
34
+ """Install Python dependencies"""
35
+ commands = [
36
+ ("pip install --upgrade pip", "Upgrading pip"),
37
+ ("pip install 'numpy<2.0.0'", "Installing compatible NumPy version"),
38
+ ("pip install -r requirements.txt", "Installing Python packages")
39
+ ]
40
+
41
+ for command, description in commands:
42
+ if not run_command(command, description):
43
+ return False
44
+ return True
45
+
46
+ def check_ffmpeg():
47
+ """Check if FFmpeg is installed"""
48
+ try:
49
+ subprocess.run(['ffmpeg', '-version'], capture_output=True, check=True)
50
+ print("βœ… FFmpeg is installed")
51
+ return True
52
+ except (subprocess.CalledProcessError, FileNotFoundError):
53
+ print("❌ FFmpeg not found")
54
+ return False
55
+
56
+ def install_ffmpeg_instructions():
57
+ """Show FFmpeg installation instructions"""
58
+ system = platform.system().lower()
59
+
60
+ print("\nπŸ“‹ FFmpeg Installation Instructions:")
61
+ print("=" * 40)
62
+
63
+ if system == "windows":
64
+ print("Windows:")
65
+ print("1. Download FFmpeg from: https://ffmpeg.org/download.html")
66
+ print("2. Extract to C:\\ffmpeg")
67
+ print("3. Add C:\\ffmpeg\\bin to your PATH environment variable")
68
+ print("4. Restart your terminal/command prompt")
69
+ elif system == "darwin": # macOS
70
+ print("macOS:")
71
+ print("1. Install Homebrew if not already installed:")
72
+ print(" /bin/bash -c \"$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)\"")
73
+ print("2. Install FFmpeg:")
74
+ print(" brew install ffmpeg")
75
+ else: # Linux
76
+ print("Linux (Ubuntu/Debian):")
77
+ print(" sudo apt-get update && sudo apt-get install ffmpeg")
78
+ print("\nLinux (CentOS/RHEL):")
79
+ print(" sudo yum install ffmpeg")
80
+ print("\nLinux (Arch):")
81
+ print(" sudo pacman -S ffmpeg")
82
+
83
+ def create_virtual_environment():
84
+ """Create and activate virtual environment"""
85
+ if os.path.exists('venv'):
86
+ print("βœ… Virtual environment already exists")
87
+ return True
88
+
89
+ if not run_command(f"{sys.executable} -m venv venv", "Creating virtual environment"):
90
+ return False
91
+
92
+ print("\nπŸ“ To activate the virtual environment:")
93
+ if platform.system().lower() == "windows":
94
+ print(" venv\\Scripts\\activate")
95
+ else:
96
+ print(" source venv/bin/activate")
97
+
98
+ return True
99
+
100
+ def main():
101
+ print("πŸš€ Video Transcription Service Setup")
102
+ print("=" * 40)
103
+
104
+ # Check Python version
105
+ if not check_python_version():
106
+ sys.exit(1)
107
+
108
+ # Create virtual environment
109
+ print("\n1. Setting up virtual environment...")
110
+ if not create_virtual_environment():
111
+ print("❌ Failed to create virtual environment")
112
+ sys.exit(1)
113
+
114
+ # Install Python dependencies
115
+ print("\n2. Installing Python dependencies...")
116
+ if not install_python_dependencies():
117
+ print("❌ Failed to install Python dependencies")
118
+ print("\nπŸ’‘ Try running these commands manually:")
119
+ print(" pip install --upgrade pip")
120
+ print(" pip install -r requirements.txt")
121
+ sys.exit(1)
122
+
123
+ # Check FFmpeg
124
+ print("\n3. Checking FFmpeg...")
125
+ if not check_ffmpeg():
126
+ install_ffmpeg_instructions()
127
+ print("\n⚠️ Please install FFmpeg and run this setup again")
128
+ sys.exit(1)
129
+
130
+ # Success
131
+ print("\nπŸŽ‰ Setup completed successfully!")
132
+ print("=" * 40)
133
+ print("\nπŸ“‹ Next steps:")
134
+ print("1. Activate virtual environment (if not already active)")
135
+ if platform.system().lower() == "windows":
136
+ print(" venv\\Scripts\\activate")
137
+ else:
138
+ print(" source venv/bin/activate")
139
+ print("2. Start the service:")
140
+ print(" python start.py")
141
+ print(" OR")
142
+ print(" python main.py")
143
+ print("3. Open your browser to:")
144
+ print(" http://localhost:8000/docs")
145
+ print("\nπŸ“– For deployment instructions, see DEPLOYMENT.md")
146
+
147
+ if __name__ == "__main__":
148
+ main()
start.py ADDED
@@ -0,0 +1,113 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Development startup script for Video Transcription Service
4
+ """
5
+
6
+ import subprocess
7
+ import sys
8
+ import os
9
+ import time
10
+ import requests
11
+
12
+ def check_dependencies():
13
+ """Check if required dependencies are installed"""
14
+ print("Checking dependencies...")
15
+
16
+ # Check Python packages
17
+ try:
18
+ import fastapi
19
+ import whisper
20
+ import ffmpeg
21
+ print("βœ… Python packages installed")
22
+ except ImportError as e:
23
+ print(f"❌ Missing Python package: {e}")
24
+ print("Run: pip install -r requirements.txt")
25
+ return False
26
+
27
+ # Check FFmpeg
28
+ try:
29
+ subprocess.run(['ffmpeg', '-version'], capture_output=True, check=True)
30
+ print("βœ… FFmpeg installed")
31
+ except (subprocess.CalledProcessError, FileNotFoundError):
32
+ print("❌ FFmpeg not found")
33
+ print("Install FFmpeg:")
34
+ print(" Windows: Download from https://ffmpeg.org/download.html")
35
+ print(" macOS: brew install ffmpeg")
36
+ print(" Linux: sudo apt-get install ffmpeg")
37
+ return False
38
+
39
+ return True
40
+
41
+ def start_server():
42
+ """Start the development server"""
43
+ print("\nStarting Video Transcription Service...")
44
+ print("=" * 50)
45
+
46
+ try:
47
+ # Start the server
48
+ process = subprocess.Popen([
49
+ sys.executable, '-m', 'uvicorn',
50
+ 'main:app',
51
+ '--host', '0.0.0.0',
52
+ '--port', '8000',
53
+ '--reload'
54
+ ])
55
+
56
+ # Wait for server to start
57
+ print("Waiting for server to start...")
58
+ for i in range(30): # Wait up to 30 seconds
59
+ try:
60
+ response = requests.get('http://localhost:8000/health', timeout=1)
61
+ if response.status_code == 200:
62
+ break
63
+ except:
64
+ pass
65
+ time.sleep(1)
66
+ print(f" Attempt {i+1}/30...")
67
+ else:
68
+ print("❌ Server failed to start within 30 seconds")
69
+ process.terminate()
70
+ return False
71
+
72
+ print("\nπŸš€ Server started successfully!")
73
+ print("=" * 50)
74
+ print("πŸ“ Service URL: http://localhost:8000")
75
+ print("πŸ“– API Docs: http://localhost:8000/docs")
76
+ print("πŸ” Health Check: http://localhost:8000/health")
77
+ print("=" * 50)
78
+ print("\nPress Ctrl+C to stop the server")
79
+
80
+ # Wait for user to stop
81
+ try:
82
+ process.wait()
83
+ except KeyboardInterrupt:
84
+ print("\n\nStopping server...")
85
+ process.terminate()
86
+ process.wait()
87
+ print("βœ… Server stopped")
88
+
89
+ return True
90
+
91
+ except Exception as e:
92
+ print(f"❌ Failed to start server: {e}")
93
+ return False
94
+
95
+ def main():
96
+ print("Video Transcription Service - Development Startup")
97
+ print("=" * 50)
98
+
99
+ # Check if we're in the right directory
100
+ if not os.path.exists('main.py'):
101
+ print("❌ main.py not found. Make sure you're in the project directory.")
102
+ sys.exit(1)
103
+
104
+ # Check dependencies
105
+ if not check_dependencies():
106
+ sys.exit(1)
107
+
108
+ # Start server
109
+ if not start_server():
110
+ sys.exit(1)
111
+
112
+ if __name__ == "__main__":
113
+ main()
start_robust.py ADDED
@@ -0,0 +1,155 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Robust startup script for Video Transcription Service
4
+ Handles restarts and optimizes for free tier hosting
5
+ """
6
+
7
+ import os
8
+ import sys
9
+ import time
10
+ import subprocess
11
+ import logging
12
+ from datetime import datetime
13
+
14
+ # Configure logging
15
+ logging.basicConfig(
16
+ level=logging.INFO,
17
+ format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
18
+ )
19
+ logger = logging.getLogger(__name__)
20
+
21
+ def detect_environment():
22
+ """Detect if running on Render.com or locally"""
23
+ if os.getenv("RENDER"):
24
+ return "render"
25
+ elif os.getenv("PORT"):
26
+ return "cloud"
27
+ else:
28
+ return "local"
29
+
30
+ def get_optimal_env_vars():
31
+ """Get optimal environment variables for the detected environment"""
32
+ env = detect_environment()
33
+
34
+ base_vars = {
35
+ "PYTHONUNBUFFERED": "1",
36
+ "MODEL_PRELOAD": "true"
37
+ }
38
+
39
+ if env == "render":
40
+ logger.info("🌐 Detected Render.com environment")
41
+ base_vars.update({
42
+ "WHISPER_MODEL": "tiny", # Faster loading, less memory
43
+ "DEBUG": "false", # Reduce log overhead
44
+ "LOG_TO_FILE": "false" # No file logging on Render
45
+ })
46
+ elif env == "cloud":
47
+ logger.info("☁️ Detected cloud environment")
48
+ base_vars.update({
49
+ "WHISPER_MODEL": "tiny",
50
+ "DEBUG": "false"
51
+ })
52
+ else:
53
+ logger.info("πŸ’» Detected local environment")
54
+ base_vars.update({
55
+ "WHISPER_MODEL": os.getenv("WHISPER_MODEL", "base"),
56
+ "DEBUG": os.getenv("DEBUG", "true")
57
+ })
58
+
59
+ return base_vars
60
+
61
+ def preload_model():
62
+ """Preload the Whisper model to avoid request timeouts"""
63
+ try:
64
+ logger.info("πŸ€– Preloading Whisper model...")
65
+
66
+ # Import and load model
67
+ import whisper
68
+ model_name = os.getenv("WHISPER_MODEL", "tiny")
69
+
70
+ start_time = time.time()
71
+ model = whisper.load_model(model_name)
72
+ load_time = time.time() - start_time
73
+
74
+ logger.info(f"βœ… Model '{model_name}' preloaded in {load_time:.2f} seconds")
75
+ return True
76
+
77
+ except Exception as e:
78
+ logger.error(f"❌ Model preload failed: {e}")
79
+ return False
80
+
81
+ def start_service():
82
+ """Start the FastAPI service with optimal settings"""
83
+ env_vars = get_optimal_env_vars()
84
+
85
+ # Set environment variables
86
+ for key, value in env_vars.items():
87
+ if not os.getenv(key):
88
+ os.environ[key] = value
89
+ logger.info(f"βš™οΈ Set {key}={value}")
90
+
91
+ # Log configuration
92
+ logger.info("πŸ“‹ Service Configuration:")
93
+ logger.info(f" πŸ€– Whisper Model: {os.getenv('WHISPER_MODEL', 'base')}")
94
+ logger.info(f" πŸ”§ Debug Mode: {os.getenv('DEBUG', 'false')}")
95
+ logger.info(f" πŸ“₯ Model Preload: {os.getenv('MODEL_PRELOAD', 'true')}")
96
+ logger.info(f" 🌐 Port: {os.getenv('PORT', '8000')}")
97
+
98
+ # Preload model if enabled
99
+ if os.getenv("MODEL_PRELOAD", "true").lower() == "true":
100
+ if not preload_model():
101
+ logger.warning("⚠️ Continuing without model preload...")
102
+
103
+ # Start the service
104
+ try:
105
+ logger.info("πŸš€ Starting FastAPI service...")
106
+
107
+ # Use uvicorn directly
108
+ import uvicorn
109
+ from main import app
110
+
111
+ port = int(os.getenv("PORT", 8000))
112
+ host = os.getenv("HOST", "0.0.0.0")
113
+
114
+ uvicorn.run(
115
+ app,
116
+ host=host,
117
+ port=port,
118
+ log_level="info",
119
+ access_log=True,
120
+ timeout_keep_alive=30,
121
+ timeout_graceful_shutdown=30
122
+ )
123
+
124
+ except KeyboardInterrupt:
125
+ logger.info("πŸ›‘ Service stopped by user")
126
+ except Exception as e:
127
+ logger.error(f"❌ Service failed: {e}")
128
+ sys.exit(1)
129
+
130
+ def check_dependencies():
131
+ """Check if all dependencies are installed"""
132
+ try:
133
+ import fastapi
134
+ import whisper
135
+ import torch
136
+ logger.info("βœ… Core dependencies available")
137
+ return True
138
+ except ImportError as e:
139
+ logger.error(f"❌ Missing dependency: {e}")
140
+ logger.error("Run: pip install -r requirements.txt")
141
+ return False
142
+
143
+ def main():
144
+ logger.info("πŸš€ Video Transcription Service - Robust Startup")
145
+ logger.info("=" * 50)
146
+
147
+ # Check dependencies
148
+ if not check_dependencies():
149
+ sys.exit(1)
150
+
151
+ # Start service
152
+ start_service()
153
+
154
+ if __name__ == "__main__":
155
+ main()
storage.py ADDED
@@ -0,0 +1,158 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import asyncio
2
+ from datetime import datetime, timedelta, timezone
3
+ from typing import Dict, Optional
4
+ from models import TranscriptionResult, TranscriptionStatus
5
+ from config import settings
6
+ import logging
7
+
8
+ # Configure logging for this module
9
+ logging.basicConfig(
10
+ level=logging.INFO,
11
+ format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
12
+ )
13
+ logger = logging.getLogger(__name__)
14
+
15
+ class InMemoryStorage:
16
+ def __init__(self):
17
+ self._storage: Dict[int, TranscriptionResult] = {}
18
+ self._next_id = 1
19
+ self._cleanup_task = None
20
+
21
+ async def start_cleanup_task(self):
22
+ """Start the background cleanup task"""
23
+ if self._cleanup_task is None:
24
+ logger.info("🧹 Starting automatic cleanup task")
25
+ logger.info(f"⏰ Cleanup interval: {settings.CLEANUP_INTERVAL_HOURS} hours")
26
+ self._cleanup_task = asyncio.create_task(self._cleanup_loop())
27
+ else:
28
+ logger.info("🧹 Cleanup task already running")
29
+
30
+ async def stop_cleanup_task(self):
31
+ """Stop the background cleanup task"""
32
+ if self._cleanup_task:
33
+ logger.info("πŸ›‘ Stopping cleanup task")
34
+ self._cleanup_task.cancel()
35
+ try:
36
+ await self._cleanup_task
37
+ except asyncio.CancelledError:
38
+ pass
39
+ self._cleanup_task = None
40
+ logger.info("βœ… Cleanup task stopped")
41
+ else:
42
+ logger.info("🧹 No cleanup task to stop")
43
+
44
+ def create_transcription(self, language: Optional[str] = None) -> int:
45
+ """Create a new transcription entry and return its ID"""
46
+ transcription_id = self._next_id
47
+ self._next_id += 1
48
+
49
+ logger.info(f"πŸ“ Creating new transcription entry with ID: {transcription_id}")
50
+ logger.info(f"🌐 Language: {language or 'auto-detect'}")
51
+
52
+ result = TranscriptionResult(
53
+ id=transcription_id,
54
+ status=TranscriptionStatus.PENDING,
55
+ language=language,
56
+ created_at=datetime.now(timezone.utc)
57
+ )
58
+
59
+ self._storage[transcription_id] = result
60
+ logger.info(f"βœ… Transcription {transcription_id} created successfully")
61
+ logger.info(f"πŸ“Š Total active transcriptions: {len(self._storage)}")
62
+ return transcription_id
63
+
64
+ def get_transcription(self, transcription_id: int) -> Optional[TranscriptionResult]:
65
+ """Get transcription by ID"""
66
+ logger.info(f"πŸ” Looking up transcription ID: {transcription_id}")
67
+ result = self._storage.get(transcription_id)
68
+ if result:
69
+ logger.info(f"βœ… Found transcription {transcription_id} with status: {result.status}")
70
+ else:
71
+ logger.warning(f"❌ Transcription {transcription_id} not found")
72
+ return result
73
+
74
+ def update_transcription(self, transcription_id: int, **kwargs) -> bool:
75
+ """Update transcription fields"""
76
+ if transcription_id not in self._storage:
77
+ logger.warning(f"❌ Cannot update transcription {transcription_id} - not found")
78
+ return False
79
+
80
+ result = self._storage[transcription_id]
81
+ old_status = result.status if hasattr(result, 'status') else 'unknown'
82
+
83
+ for key, value in kwargs.items():
84
+ if hasattr(result, key):
85
+ setattr(result, key, value)
86
+
87
+ new_status = result.status if hasattr(result, 'status') else 'unknown'
88
+ logger.info(f"πŸ“ Updated transcription {transcription_id}")
89
+
90
+ if 'status' in kwargs:
91
+ logger.info(f"πŸ”„ Status changed: {old_status} β†’ {new_status}")
92
+
93
+ # Log specific updates
94
+ for key, value in kwargs.items():
95
+ if key == 'text' and value:
96
+ text_preview = value[:50] + "..." if len(value) > 50 else value
97
+ logger.info(f"πŸ“„ Text updated: {text_preview}")
98
+ elif key == 'error_message' and value:
99
+ logger.error(f"❌ Error recorded: {value}")
100
+ elif key not in ['status', 'text', 'error_message']:
101
+ logger.info(f"πŸ“Š {key}: {value}")
102
+
103
+ return True
104
+
105
+ def delete_transcription(self, transcription_id: int) -> bool:
106
+ """Delete transcription by ID"""
107
+ if transcription_id in self._storage:
108
+ result = self._storage[transcription_id]
109
+ del self._storage[transcription_id]
110
+ logger.info(f"πŸ—‘οΈ Deleted transcription {transcription_id} (status: {result.status})")
111
+ logger.info(f"πŸ“Š Remaining transcriptions: {len(self._storage)}")
112
+ return True
113
+ else:
114
+ logger.warning(f"❌ Cannot delete transcription {transcription_id} - not found")
115
+ return False
116
+
117
+ async def _cleanup_loop(self):
118
+ """Background task to clean up old transcriptions"""
119
+ logger.info("🧹 Cleanup loop started")
120
+ while True:
121
+ try:
122
+ logger.info("😴 Cleanup sleeping for 1 hour...")
123
+ await asyncio.sleep(3600) # Check every hour
124
+ logger.info("⏰ Running scheduled cleanup...")
125
+ await self._cleanup_old_transcriptions()
126
+ except asyncio.CancelledError:
127
+ logger.info("πŸ›‘ Cleanup loop cancelled")
128
+ break
129
+ except Exception as e:
130
+ logger.error(f"❌ Error in cleanup loop: {e}")
131
+
132
+ async def _cleanup_old_transcriptions(self):
133
+ """Remove transcriptions older than the configured time"""
134
+ logger.info("🧹 Starting cleanup of old transcriptions...")
135
+ cutoff_time = datetime.now(timezone.utc) - timedelta(hours=settings.CLEANUP_INTERVAL_HOURS)
136
+ logger.info(f"⏰ Cutoff time: {cutoff_time} (older than {settings.CLEANUP_INTERVAL_HOURS} hours)")
137
+
138
+ to_delete = []
139
+
140
+ for transcription_id, result in self._storage.items():
141
+ age_hours = (datetime.now(timezone.utc) - result.created_at).total_seconds() / 3600
142
+ if result.created_at < cutoff_time:
143
+ logger.info(f"πŸ—‘οΈ Marking transcription {transcription_id} for deletion (age: {age_hours:.1f} hours)")
144
+ to_delete.append(transcription_id)
145
+
146
+ if not to_delete:
147
+ logger.info("βœ… No old transcriptions to clean up")
148
+ return
149
+
150
+ logger.info(f"🧹 Deleting {len(to_delete)} old transcriptions...")
151
+ for transcription_id in to_delete:
152
+ self.delete_transcription(transcription_id)
153
+
154
+ logger.info(f"βœ… Cleanup completed - removed {len(to_delete)} transcriptions")
155
+ logger.info(f"πŸ“Š Active transcriptions remaining: {len(self._storage)}")
156
+
157
+ # Global storage instance
158
+ storage = InMemoryStorage()
test_api.py ADDED
@@ -0,0 +1,130 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Simple test script for the Video Transcription Service
4
+ """
5
+
6
+ import requests
7
+ import time
8
+ import sys
9
+ import os
10
+
11
+ def test_transcription_service(base_url="http://localhost:8000", video_file=None):
12
+ """Test the transcription service with a video file"""
13
+
14
+ print(f"Testing Video Transcription Service at {base_url}")
15
+ print("=" * 50)
16
+
17
+ # Test 1: Health check
18
+ print("1. Testing health check...")
19
+ try:
20
+ response = requests.get(f"{base_url}/health")
21
+ if response.status_code == 200:
22
+ print("βœ… Health check passed")
23
+ print(f" Response: {response.json()}")
24
+ else:
25
+ print(f"❌ Health check failed: {response.status_code}")
26
+ return False
27
+ except Exception as e:
28
+ print(f"❌ Health check error: {e}")
29
+ return False
30
+
31
+ # Test 2: Root endpoint
32
+ print("\n2. Testing root endpoint...")
33
+ try:
34
+ response = requests.get(f"{base_url}/")
35
+ if response.status_code == 200:
36
+ print("βœ… Root endpoint passed")
37
+ print(f" Response: {response.json()}")
38
+ else:
39
+ print(f"❌ Root endpoint failed: {response.status_code}")
40
+ except Exception as e:
41
+ print(f"❌ Root endpoint error: {e}")
42
+
43
+ # Test 3: File upload (if video file provided)
44
+ if video_file and os.path.exists(video_file):
45
+ print(f"\n3. Testing video upload with {video_file}...")
46
+ try:
47
+ with open(video_file, 'rb') as f:
48
+ files = {'file': f}
49
+ data = {'language': 'en'}
50
+ response = requests.post(f"{base_url}/transcribe", files=files, data=data)
51
+
52
+ if response.status_code == 200:
53
+ result = response.json()
54
+ transcription_id = result['id']
55
+ print("βœ… Video upload successful")
56
+ print(f" Transcription ID: {transcription_id}")
57
+ print(f" Status: {result['status']}")
58
+
59
+ # Test 4: Check transcription status
60
+ print(f"\n4. Checking transcription status...")
61
+ max_attempts = 30 # Wait up to 5 minutes
62
+ for attempt in range(max_attempts):
63
+ try:
64
+ response = requests.get(f"{base_url}/transcribe/{transcription_id}")
65
+ if response.status_code == 200:
66
+ result = response.json()
67
+ status = result['status']
68
+ print(f" Attempt {attempt + 1}: Status = {status}")
69
+
70
+ if status == 'completed':
71
+ print("βœ… Transcription completed!")
72
+ print(f" Text: {result['text'][:100]}...")
73
+ print(f" Language: {result.get('language', 'N/A')}")
74
+ print(f" Duration: {result.get('duration', 'N/A')} seconds")
75
+ break
76
+ elif status == 'failed':
77
+ print(f"❌ Transcription failed: {result.get('error_message', 'Unknown error')}")
78
+ break
79
+ elif status in ['pending', 'processing']:
80
+ time.sleep(10) # Wait 10 seconds before next check
81
+ else:
82
+ print(f"❌ Unknown status: {status}")
83
+ break
84
+ else:
85
+ print(f"❌ Status check failed: {response.status_code}")
86
+ break
87
+ except Exception as e:
88
+ print(f"❌ Status check error: {e}")
89
+ break
90
+ else:
91
+ print("⏰ Transcription timed out (5 minutes)")
92
+
93
+ else:
94
+ print(f"❌ Video upload failed: {response.status_code}")
95
+ print(f" Response: {response.text}")
96
+
97
+ except Exception as e:
98
+ print(f"❌ Video upload error: {e}")
99
+ else:
100
+ print(f"\n3. Skipping video upload test (no video file provided)")
101
+ print(f" To test with a video file, run: python test_api.py <video_file>")
102
+
103
+ # Test 5: Invalid transcription ID
104
+ print(f"\n5. Testing invalid transcription ID...")
105
+ try:
106
+ response = requests.get(f"{base_url}/transcribe/99999")
107
+ if response.status_code == 404:
108
+ print("βœ… Invalid ID handling works correctly")
109
+ else:
110
+ print(f"❌ Invalid ID test failed: {response.status_code}")
111
+ except Exception as e:
112
+ print(f"❌ Invalid ID test error: {e}")
113
+
114
+ print("\n" + "=" * 50)
115
+ print("Test completed!")
116
+ return True
117
+
118
+ if __name__ == "__main__":
119
+ # Get base URL from environment or use default
120
+ base_url = os.getenv("API_URL", "http://localhost:8000")
121
+
122
+ # Get video file from command line argument
123
+ video_file = sys.argv[1] if len(sys.argv) > 1 else None
124
+
125
+ if video_file and not os.path.exists(video_file):
126
+ print(f"Error: Video file '{video_file}' not found")
127
+ sys.exit(1)
128
+
129
+ success = test_transcription_service(base_url, video_file)
130
+ sys.exit(0 if success else 1)
transcription_service.py ADDED
@@ -0,0 +1,304 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import whisper
2
+ import ffmpeg
3
+ import tempfile
4
+ import os
5
+ import asyncio
6
+ import logging
7
+ import time
8
+ from typing import Optional
9
+ from datetime import datetime, timezone
10
+ from storage import storage
11
+ from models import TranscriptionStatus
12
+ from config import settings
13
+
14
+ # Configure logging for this module
15
+ logging.basicConfig(
16
+ level=logging.INFO,
17
+ format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
18
+ )
19
+ logger = logging.getLogger(__name__)
20
+
21
+ class TranscriptionService:
22
+ def __init__(self):
23
+ self._model = None
24
+ self._model_loading = False
25
+ self._model_load_error = None
26
+
27
+ async def preload_model(self):
28
+ """Preload Whisper model during startup to avoid request timeouts"""
29
+ if self._model is not None:
30
+ logger.info("πŸ€– Whisper model already loaded")
31
+ return True
32
+
33
+ if self._model_load_error:
34
+ logger.error(f"❌ Previous model load failed: {self._model_load_error}")
35
+ return False
36
+
37
+ try:
38
+ logger.info(f"πŸš€ Preloading Whisper model: {settings.WHISPER_MODEL}")
39
+ logger.info("πŸ“₯ This may take 30-60 seconds for first-time download...")
40
+ logger.info("⚑ Preloading during startup to avoid request timeouts...")
41
+
42
+ start_time = time.time()
43
+
44
+ # Run in thread pool to avoid blocking startup
45
+ loop = asyncio.get_event_loop()
46
+ self._model = await loop.run_in_executor(
47
+ None,
48
+ whisper.load_model,
49
+ settings.WHISPER_MODEL
50
+ )
51
+
52
+ load_time = time.time() - start_time
53
+ logger.info(f"βœ… Whisper model preloaded successfully in {load_time:.2f} seconds")
54
+ logger.info("🎯 Service ready for transcription requests!")
55
+ return True
56
+
57
+ except Exception as e:
58
+ error_msg = f"Failed to preload Whisper model: {str(e)}"
59
+ logger.error(f"❌ {error_msg}")
60
+ self._model_load_error = error_msg
61
+ return False
62
+
63
+ async def _load_model(self):
64
+ """Load Whisper model asynchronously (fallback if not preloaded)"""
65
+ if self._model is not None:
66
+ logger.info("πŸ€– Whisper model already loaded")
67
+ return
68
+
69
+ if self._model_load_error:
70
+ logger.error(f"❌ Model load error: {self._model_load_error}")
71
+ raise Exception(self._model_load_error)
72
+
73
+ if self._model_loading:
74
+ logger.info("⏳ Whisper model is currently loading, waiting...")
75
+ # Wait for model to load
76
+ while self._model_loading:
77
+ await asyncio.sleep(0.1)
78
+ if self._model is None:
79
+ raise Exception("Model loading failed")
80
+ logger.info("βœ… Whisper model loading completed (waited)")
81
+ return
82
+
83
+ # If we get here, model wasn't preloaded - try to load it now
84
+ logger.warning("⚠️ Model not preloaded, loading during request (may cause timeout)")
85
+ self._model_loading = True
86
+ try:
87
+ logger.info(f"πŸ€– Loading Whisper model: {settings.WHISPER_MODEL}")
88
+ start_time = time.time()
89
+
90
+ # Run in thread pool to avoid blocking
91
+ loop = asyncio.get_event_loop()
92
+ self._model = await loop.run_in_executor(
93
+ None,
94
+ whisper.load_model,
95
+ settings.WHISPER_MODEL
96
+ )
97
+
98
+ load_time = time.time() - start_time
99
+ logger.info(f"βœ… Whisper model loaded successfully in {load_time:.2f} seconds")
100
+ except Exception as e:
101
+ error_msg = f"Failed to load Whisper model: {str(e)}"
102
+ logger.error(f"❌ {error_msg}")
103
+ self._model_load_error = error_msg
104
+ raise Exception(error_msg)
105
+ finally:
106
+ self._model_loading = False
107
+
108
+ async def transcribe_video(self, video_content: bytes, transcription_id: int, language: Optional[str] = None):
109
+ """Transcribe video content asynchronously"""
110
+ start_time = time.time()
111
+ logger.info(f"🎬 Starting video transcription for ID: {transcription_id}")
112
+ logger.info(f"πŸ“Š Video size: {len(video_content) / (1024*1024):.2f}MB")
113
+ logger.info(f"🌐 Language: {language or 'auto-detect'}")
114
+
115
+ # Check memory before starting
116
+ from restart_handler import check_service_health
117
+ if check_service_health():
118
+ logger.warning(f"⚠️ High memory usage detected before transcription {transcription_id}")
119
+
120
+ try:
121
+ # Update status to processing
122
+ logger.info(f"πŸ“ Updating status to PROCESSING for ID: {transcription_id}")
123
+ storage.update_transcription(
124
+ transcription_id,
125
+ status=TranscriptionStatus.PROCESSING
126
+ )
127
+
128
+ # Load model if needed
129
+ logger.info(f"πŸ€– Loading Whisper model for transcription {transcription_id}")
130
+ await self._load_model()
131
+
132
+ # Extract audio from video
133
+ logger.info(f"🎡 Extracting audio from video for transcription {transcription_id}")
134
+ audio_start = time.time()
135
+ audio_path = await self._extract_audio(video_content)
136
+ audio_time = time.time() - audio_start
137
+ logger.info(f"βœ… Audio extraction completed in {audio_time:.2f} seconds")
138
+
139
+ try:
140
+ # Transcribe audio
141
+ logger.info(f"πŸ—£οΈ Starting audio transcription for ID {transcription_id}")
142
+ transcribe_start = time.time()
143
+ result = await self._transcribe_audio(audio_path, language)
144
+ transcribe_time = time.time() - transcribe_start
145
+
146
+ # Log transcription results
147
+ text_length = len(result["text"]) if result["text"] else 0
148
+ logger.info(f"βœ… Transcription completed in {transcribe_time:.2f} seconds")
149
+ logger.info(f"πŸ“ Transcribed text length: {text_length} characters")
150
+ logger.info(f"🌐 Detected language: {result.get('language', 'unknown')}")
151
+ logger.info(f"⏱️ Audio duration: {result.get('duration', 'unknown')} seconds")
152
+
153
+ # Update storage with results
154
+ logger.info(f"πŸ’Ύ Saving transcription results for ID {transcription_id}")
155
+ storage.update_transcription(
156
+ transcription_id,
157
+ status=TranscriptionStatus.COMPLETED,
158
+ text=result["text"],
159
+ language=result["language"],
160
+ duration=result.get("duration"),
161
+ completed_at=datetime.now(timezone.utc)
162
+ )
163
+
164
+ total_time = time.time() - start_time
165
+ logger.info(f"πŸŽ‰ Transcription {transcription_id} completed successfully in {total_time:.2f} seconds total")
166
+
167
+ finally:
168
+ # Clean up audio file
169
+ if os.path.exists(audio_path):
170
+ logger.info(f"🧹 Cleaning up temporary audio file")
171
+ os.unlink(audio_path)
172
+
173
+ except Exception as e:
174
+ total_time = time.time() - start_time
175
+ logger.error(f"❌ Transcription {transcription_id} failed after {total_time:.2f} seconds: {str(e)}")
176
+ logger.error(f"πŸ” Error type: {type(e).__name__}")
177
+ storage.update_transcription(
178
+ transcription_id,
179
+ status=TranscriptionStatus.FAILED,
180
+ error_message=str(e),
181
+ completed_at=datetime.now(timezone.utc)
182
+ )
183
+
184
+ async def _extract_audio(self, video_content: bytes) -> str:
185
+ """Extract audio from video content"""
186
+ logger.info("πŸ“ Creating temporary video file...")
187
+
188
+ # Create temporary files
189
+ with tempfile.NamedTemporaryFile(delete=False, suffix='.tmp') as video_file:
190
+ video_file.write(video_content)
191
+ video_path = video_file.name
192
+
193
+ audio_path = tempfile.mktemp(suffix='.wav')
194
+ logger.info(f"πŸ“ Temporary files created - Video: {video_path}, Audio: {audio_path}")
195
+
196
+ try:
197
+ # Extract audio using ffmpeg
198
+ logger.info("🎡 Running FFmpeg to extract audio...")
199
+ loop = asyncio.get_event_loop()
200
+ await loop.run_in_executor(
201
+ None,
202
+ self._extract_audio_sync,
203
+ video_path,
204
+ audio_path
205
+ )
206
+
207
+ # Check if audio file was created successfully
208
+ if os.path.exists(audio_path):
209
+ audio_size = os.path.getsize(audio_path)
210
+ logger.info(f"βœ… Audio extraction successful - Size: {audio_size / (1024*1024):.2f}MB")
211
+ else:
212
+ logger.error("❌ Audio file was not created")
213
+ raise Exception("Audio extraction failed - no output file")
214
+
215
+ return audio_path
216
+ finally:
217
+ # Clean up video file
218
+ if os.path.exists(video_path):
219
+ logger.info("🧹 Cleaning up temporary video file")
220
+ os.unlink(video_path)
221
+
222
+ def _extract_audio_sync(self, video_path: str, audio_path: str):
223
+ """Synchronous audio extraction"""
224
+ try:
225
+ logger.info("πŸ”§ Configuring FFmpeg for audio extraction...")
226
+ logger.info(" - Codec: PCM 16-bit")
227
+ logger.info(" - Channels: 1 (mono)")
228
+ logger.info(" - Sample rate: 16kHz")
229
+
230
+ (
231
+ ffmpeg
232
+ .input(video_path)
233
+ .output(audio_path, acodec='pcm_s16le', ac=1, ar='16000')
234
+ .overwrite_output()
235
+ .run(quiet=True)
236
+ )
237
+ logger.info("βœ… FFmpeg audio extraction completed")
238
+ except Exception as e:
239
+ logger.error(f"❌ FFmpeg audio extraction failed: {str(e)}")
240
+ raise
241
+
242
+ async def _transcribe_audio(self, audio_path: str, language: Optional[str] = None) -> dict:
243
+ """Transcribe audio file"""
244
+ logger.info(f"πŸ—£οΈ Starting Whisper transcription...")
245
+ logger.info(f"🎡 Audio file: {audio_path}")
246
+ logger.info(f"🌐 Language: {language or 'auto-detect'}")
247
+
248
+ loop = asyncio.get_event_loop()
249
+
250
+ # Run transcription in thread pool
251
+ logger.info("⚑ Running transcription in background thread...")
252
+ result = await loop.run_in_executor(
253
+ None,
254
+ self._transcribe_audio_sync,
255
+ audio_path,
256
+ language
257
+ )
258
+
259
+ logger.info("βœ… Whisper transcription completed")
260
+ return result
261
+
262
+ def _transcribe_audio_sync(self, audio_path: str, language: Optional[str] = None) -> dict:
263
+ """Synchronous audio transcription"""
264
+ try:
265
+ logger.info("πŸ€– Preparing Whisper transcription options...")
266
+ options = {}
267
+ if language:
268
+ options['language'] = language
269
+ logger.info(f"🌐 Language specified: {language}")
270
+ else:
271
+ logger.info("🌐 Language: auto-detect")
272
+
273
+ logger.info("🎯 Starting Whisper model inference...")
274
+ start_time = time.time()
275
+ result = self._model.transcribe(audio_path, **options)
276
+ inference_time = time.time() - start_time
277
+
278
+ # Log detailed results
279
+ text = result["text"].strip()
280
+ detected_language = result.get("language", "unknown")
281
+ duration = result.get("duration", 0)
282
+
283
+ logger.info(f"βœ… Whisper inference completed in {inference_time:.2f} seconds")
284
+ logger.info(f"πŸ“ Text length: {len(text)} characters")
285
+ logger.info(f"🌐 Detected language: {detected_language}")
286
+ logger.info(f"⏱️ Audio duration: {duration:.2f} seconds")
287
+
288
+ if len(text) > 100:
289
+ logger.info(f"πŸ“„ Text preview: {text[:100]}...")
290
+ else:
291
+ logger.info(f"πŸ“„ Full text: {text}")
292
+
293
+ return {
294
+ "text": text,
295
+ "language": detected_language,
296
+ "duration": duration
297
+ }
298
+ except Exception as e:
299
+ logger.error(f"❌ Whisper transcription failed: {str(e)}")
300
+ logger.error(f"πŸ” Error type: {type(e).__name__}")
301
+ raise
302
+
303
+ # Global service instance
304
+ transcription_service = TranscriptionService()