# Comprehensive Logging Guide

The Video Transcription Service now includes detailed step-by-step logging to help you monitor and debug transcription progress.

## 🎯 **What You Can Track**

### Complete Transcription Journey
- ✅ File upload and validation
- ✅ Video processing steps
- ✅ Whisper model loading
- ✅ Audio extraction progress
- ✅ Transcription inference
- ✅ Results and cleanup
- ✅ Error handling and debugging

### Real-time Progress Monitoring
- 📊 Processing times for each step
- 📏 File sizes and durations
- 🌐 Language detection
- 📝 Text length and previews
- ⚠️ Warnings and errors

## 🚀 **Quick Start**

### Basic Logging (Default)
```bash
python main.py
```

### Debug Mode (Detailed Logs)
```bash
DEBUG=true python main.py
```

### Log to File
```bash
LOG_TO_FILE=true python main.py
```

### Combined (Debug + File)
```bash
DEBUG=true LOG_TO_FILE=true python main.py
```
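
This guide does not show the service's own logging setup. As a minimal sketch of how the `DEBUG` and `LOG_TO_FILE` variables could map onto Python's standard `logging` module, assuming the line format seen in the sample output further down, the following may help. The helper names `env_flag`, `log_file_name`, and `configure_logging` are hypothetical and not taken from the service.

```python
import logging
import os
from datetime import datetime

# Matches the sample lines: "2024-01-15 10:30:00 - main - INFO - message"
LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def env_flag(name):
    """Hypothetical helper: treat 'true', '1', or 'yes' (any case) as enabled."""
    return os.environ.get(name, "").strip().lower() in {"true", "1", "yes"}


def log_file_name(now):
    """Build a name of the form transcription_service_YYYYMMDD_HHMMSS.log."""
    return f"transcription_service_{now.strftime('%Y%m%d_%H%M%S')}.log"


def configure_logging():
    """Configure the root logger from the environment and return the chosen level."""
    level = logging.DEBUG if env_flag("DEBUG") else logging.INFO
    handlers = [logging.StreamHandler()]  # console output is always enabled
    if env_flag("LOG_TO_FILE"):
        # UTF-8 so that emoji in log messages are written correctly
        handlers.append(
            logging.FileHandler(log_file_name(datetime.now()), encoding="utf-8")
        )
    # force=True replaces any handlers installed earlier (Python 3.8+)
    logging.basicConfig(
        level=level,
        format=LOG_FORMAT,
        datefmt=DATE_FORMAT,
        handlers=handlers,
        force=True,
    )
    return level
```

Calling `configure_logging()` once at startup, before any logger is used, would be enough in this sketch; which values the real service accepts for these variables is an assumption here.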

## 📊 **Real-time Monitoring**

### Monitor Service Health
```bash
python log_monitor.py test
```

### Upload and Monitor Video
```bash
python log_monitor.py upload video.mp4
```

### Monitor Existing Transcription
```bash
python log_monitor.py monitor 123
```

## 📋 **Sample Log Output**

### Service Startup
```
2024-01-15 10:30:00 - main - INFO - 🚀 Starting Video Transcription Service
2024-01-15 10:30:00 - main - INFO - ==================================================
2024-01-15 10:30:00 - main - INFO - 📋 Service Configuration:
2024-01-15 10:30:00 - main - INFO -    🤖 Whisper Model: base
2024-01-15 10:30:00 - main - INFO -    📏 Max File Size: 100MB
2024-01-15 10:30:00 - main - INFO -    🕒 Cleanup Interval: 3.5 hours
2024-01-15 10:30:00 - main - INFO -    🚦 Rate Limit: 10 requests/minute
2024-01-15 10:30:00 - main - INFO -    🌐 Host: 0.0.0.0:8000
2024-01-15 10:30:00 - main - INFO -    📁 Supported Formats: .mp4, .avi, .mov, .mkv, .wmv, .flv, .webm, .m4v
2024-01-15 10:30:00 - main - INFO - ==================================================
```

### File Upload Process
```
2024-01-15 10:30:15 - main - INFO - 🚀 Starting transcription request for file: video.mp4
2024-01-15 10:30:15 - main - INFO - 🌐 Language specified: auto-detect
2024-01-15 10:30:15 - main - INFO - 📁 Validating file: video.mp4
2024-01-15 10:30:15 - main - INFO - 🔍 File extension: .mp4
2024-01-15 10:30:15 - main - INFO - ✅ File format validation passed: .mp4
2024-01-15 10:30:15 - main - INFO - 📊 Reading file content for size validation...
2024-01-15 10:30:15 - main - INFO - 📏 File size: 25.34MB (max: 100MB)
2024-01-15 10:30:15 - main - INFO - ✅ File size validation passed: 25.34MB
```

### Storage Operations
```
2024-01-15 10:30:15 - storage - INFO - 📝 Creating new transcription entry with ID: 1
2024-01-15 10:30:15 - storage - INFO - 🌐 Language: auto-detect
2024-01-15 10:30:15 - storage - INFO - ✅ Transcription 1 created successfully
2024-01-15 10:30:15 - storage - INFO - 📊 Total active transcriptions: 1
```

### Video Processing
```
2024-01-15 10:30:15 - transcription_service - INFO - 🎬 Starting video transcription for ID: 1
2024-01-15 10:30:15 - transcription_service - INFO - 📊 Video size: 25.34MB
2024-01-15 10:30:15 - transcription_service - INFO - 🌐 Language: auto-detect
2024-01-15 10:30:15 - transcription_service - INFO - 📝 Updating status to PROCESSING for ID: 1
```

### Model Loading (First Time)
```
2024-01-15 10:30:15 - transcription_service - INFO - 🤖 Loading Whisper model: base
2024-01-15 10:30:15 - transcription_service - INFO - 📥 This may take 30-60 seconds for first-time download...
2024-01-15 10:30:45 - transcription_service - INFO - ✅ Whisper model loaded successfully in 30.2 seconds
```

### Audio Extraction
```
2024-01-15 10:30:45 - transcription_service - INFO - 🎵 Extracting audio from video for transcription 1
2024-01-15 10:30:45 - transcription_service - INFO - 📁 Creating temporary video file...
2024-01-15 10:30:45 - transcription_service - INFO - 📁 Temporary files created - Video: /tmp/xyz.tmp, Audio: /tmp/abc.wav
2024-01-15 10:30:45 - transcription_service - INFO - 🎵 Running FFmpeg to extract audio...
2024-01-15 10:30:45 - transcription_service - INFO - 🔧 Configuring FFmpeg for audio extraction...
2024-01-15 10:30:45 - transcription_service - INFO -    - Codec: PCM 16-bit
2024-01-15 10:30:45 - transcription_service - INFO -    - Channels: 1 (mono)
2024-01-15 10:30:45 - transcription_service - INFO -    - Sample rate: 16kHz
2024-01-15 10:30:48 - transcription_service - INFO - ✅ FFmpeg audio extraction completed
2024-01-15 10:30:48 - transcription_service - INFO - ✅ Audio extraction successful - Size: 8.45MB
2024-01-15 10:30:48 - transcription_service - INFO - ✅ Audio extraction completed in 3.1 seconds
```

### Transcription Process
```
2024-01-15 10:30:48 - transcription_service - INFO - 🗣️ Starting audio transcription for ID 1
2024-01-15 10:30:48 - transcription_service - INFO - 🗣️ Starting Whisper transcription...
2024-01-15 10:30:48 - transcription_service - INFO - 🎵 Audio file: /tmp/abc.wav
2024-01-15 10:30:48 - transcription_service - INFO - 🌐 Language: auto-detect
2024-01-15 10:30:48 - transcription_service - INFO - ⚡ Running transcription in background thread...
2024-01-15 10:30:48 - transcription_service - INFO - 🤖 Preparing Whisper transcription options...
2024-01-15 10:30:48 - transcription_service - INFO - 🌐 Language: auto-detect
2024-01-15 10:30:48 - transcription_service - INFO - 🎯 Starting Whisper model inference...
2024-01-15 10:31:15 - transcription_service - INFO - ✅ Whisper inference completed in 27.3 seconds
2024-01-15 10:31:15 - transcription_service - INFO - 📝 Text length: 1247 characters
2024-01-15 10:31:15 - transcription_service - INFO - 🌐 Detected language: en
2024-01-15 10:31:15 - transcription_service - INFO - ⏱️ Audio duration: 180.50 seconds
2024-01-15 10:31:15 - transcription_service - INFO - 📄 Text preview: Hello, welcome to this video tutorial where we'll be discussing...
```

### Completion
```
2024-01-15 10:31:15 - transcription_service - INFO - ✅ Transcription completed in 27.3 seconds
2024-01-15 10:31:15 - transcription_service - INFO - 💾 Saving transcription results for ID 1
2024-01-15 10:31:15 - storage - INFO - 📝 Updated transcription 1
2024-01-15 10:31:15 - storage - INFO - 🔄 Status changed: processing → completed
2024-01-15 10:31:15 - storage - INFO - 📄 Text updated: Hello, welcome to this video tutorial where we'll...
2024-01-15 10:31:15 - transcription_service - INFO - 🧹 Cleaning up temporary audio file
2024-01-15 10:31:15 - transcription_service - INFO - 🎉 Transcription 1 completed successfully in 60.2 seconds total
```

## 🔧 **Log Levels**

### INFO (Default)
- Service startup/shutdown
- Request processing
- Status updates
- Completion messages

### DEBUG (Detailed)
- File validation details
- Temporary file paths
- FFmpeg configuration
- Model loading progress
- Memory usage info

### WARNING
- Large file warnings
- Performance issues
- Non-critical errors

### ERROR
- Processing failures
- File format issues
- System errors
- Transcription failures
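
To see how many lines a run produced at each of these levels, a log file can be split into its fields. The sketch below assumes every line follows the `timestamp - logger - LEVEL - message` layout shown in the sample output; `parse_line` and `count_levels` are illustrative names, not part of the service.

```python
import re
from collections import Counter

# Layout assumed from the sample output:
# "2024-01-15 10:30:15 - main - INFO - message"
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) - "
    r"(?P<logger>\S+) - (?P<level>[A-Z]+) - (?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.rstrip("\n"))
    return match.groupdict() if match else None


def count_levels(lines):
    """Count log records per level, skipping lines that do not match the layout."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record:
            counts[record["level"]] += 1
    return counts
```

Passing an open file object to `count_levels` works because files iterate line by line; continuation lines such as tracebacks are skipped rather than counted.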

## 📁 **Log Files**

When `LOG_TO_FILE=true`, logs are saved to:
```
transcription_service_YYYYMMDD_HHMMSS.log
```

Example: `transcription_service_20240115_103000.log`
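
Because each run writes a new timestamped file, it can be handy to pick the most recent one automatically. This is a small sketch that relies only on the naming pattern above; `log_start_time` and `latest_log_name` are hypothetical helper names.

```python
import re
from datetime import datetime

# Naming pattern from this guide: transcription_service_YYYYMMDD_HHMMSS.log
NAME_RE = re.compile(r"^transcription_service_(\d{8}_\d{6})\.log$")


def log_start_time(filename):
    """Return the datetime encoded in a log file name, or None if it does not match."""
    match = NAME_RE.match(filename)
    if not match:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")
    except ValueError:
        # Digits in the right shape but not a real date, e.g. month 13
        return None


def latest_log_name(filenames):
    """Return the file name with the newest embedded timestamp, or None."""
    dated = [(log_start_time(name), name) for name in filenames]
    dated = [(stamp, name) for stamp, name in dated if stamp is not None]
    return max(dated)[1] if dated else None
```

For example, `latest_log_name(os.listdir("."))` (after `import os`) would return the newest log in the current directory, assuming the logs are written there.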

## 🛠️ **Troubleshooting with Logs**

### Common Issues and Log Patterns

**1. NumPy Compatibility Error**
```
ERROR - A module that was compiled using NumPy 1.x cannot be run in NumPy 2.2.6
```
**Solution:** Run `python fix_numpy.py`

**2. FFmpeg Not Found**
```
ERROR - FFmpeg audio extraction failed: [Errno 2] No such file or directory: 'ffmpeg'
```
**Solution:** Install FFmpeg for your OS

**3. File Too Large**
```
ERROR - File too large: 150.5MB > 100MB
```
**Solution:** Compress video or increase limit in config.py

**4. Model Loading Issues**
```
ERROR - Failed to load Whisper model: [Errno 28] No space left on device
```
**Solution:** Free up disk space or use smaller model

**5. Memory Issues**
```
ERROR - Process killed (signal 9)
```
**Solution:** Use smaller files or increase available memory
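
The five patterns above can also be checked mechanically. The following sketch scans log lines for those error fragments and returns the matching suggestion; the pattern table mirrors the list in this section, and the name `diagnose` is illustrative rather than part of the service.

```python
import re

# (pattern, suggestion) pairs taken from the troubleshooting list above
KNOWN_ISSUES = [
    (re.compile(r"compiled using NumPy 1\.x cannot be run in NumPy"),
     "Run `python fix_numpy.py`"),
    (re.compile(r"No such file or directory: 'ffmpeg'"),
     "Install FFmpeg for your OS"),
    (re.compile(r"File too large"),
     "Compress video or increase limit in config.py"),
    (re.compile(r"No space left on device"),
     "Free up disk space or use smaller model"),
    (re.compile(r"Process killed \(signal 9\)"),
     "Use smaller files or increase available memory"),
]


def diagnose(lines):
    """Return (line_number, suggestion) for each ERROR line matching a known issue."""
    findings = []
    for number, line in enumerate(lines, start=1):
        if "ERROR" not in line:
            continue
        for pattern, suggestion in KNOWN_ISSUES:
            if pattern.search(line):
                findings.append((number, suggestion))
                break  # report at most one suggestion per line
    return findings
```

Errors that match none of the patterns are ignored by this sketch, so an empty result does not mean the log is clean.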

## 🎯 **Performance Monitoring**

### Key Metrics to Watch
- **Model Loading Time**: Should be 15-60 seconds (first time only)
- **Audio Extraction**: Usually 1-5 seconds per minute of video
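- **Transcription Speed**: Varies by model and content (processing typically takes 0.1-0.5x the audio duration; in the sample above, 27.3 seconds for 180.5 seconds of audio, about 0.15x)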
- **Memory Usage**: Monitor for large files
- **Active Transcriptions**: Track concurrent processing

### Optimization Tips
- Use `tiny` model for faster processing
- Compress videos before upload
- Monitor memory usage with large files
- Use DEBUG mode to identify bottlenecks
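
The transcription speed metric can be derived from the logs themselves. The sketch below pairs each "Whisper inference completed" line with the "Audio duration" line that follows it and returns processing time divided by audio duration. It assumes the message wording from the sample output and that transcriptions are logged one at a time; with concurrent jobs the lines may interleave and the pairing would need the transcription ID.

```python
import re

# Message wording assumed from the sample output in this guide
INFERENCE_RE = re.compile(r"Whisper inference completed in (\d+(?:\.\d+)?) seconds")
DURATION_RE = re.compile(r"Audio duration: (\d+(?:\.\d+)?) seconds")


def real_time_factors(lines):
    """Return inference_time / audio_duration for each transcription in the log."""
    factors = []
    inference = None  # inference time waiting for its audio duration line
    for line in lines:
        match = INFERENCE_RE.search(line)
        if match:
            inference = float(match.group(1))
            continue
        match = DURATION_RE.search(line)
        if match and inference is not None:
            duration = float(match.group(1))
            if duration > 0:  # avoid dividing by zero on empty audio
                factors.append(inference / duration)
            inference = None
    return factors
```

A value below 1.0 means the audio was transcribed faster than it would take to play back.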

## 📊 **Integration Examples**

### Parse Logs Programmatically
```python
import re

def parse_transcription_logs(log_file):
    """Print the ID and total duration of each successfully completed transcription."""
    pattern = re.compile(r'Transcription (\d+) completed successfully in ([\d.]+) seconds')
    # Log messages contain emoji, so decode as UTF-8 and tolerate stray bytes
    with open(log_file, 'r', encoding='utf-8', errors='replace') as f:
        for line in f:
            # Extract transcription ID and time
            match = pattern.search(line)
            if match:
                tid, duration = match.groups()
                print(f"ID {tid}: {duration}s")
```

### Monitor API Programmatically
```python
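import requests
import time

def monitor_service():
    """Poll the health endpoint and print the number of active transcriptions."""
    while True:
        try:
            # A timeout keeps the loop from hanging if the service stops responding
            response = requests.get('http://localhost:8000/health', timeout=5)
            response.raise_for_status()
            health = response.json()
            print(f"Active: {health.get('active_transcriptions', 0)}")
            time.sleep(30)
        except (requests.RequestException, ValueError) as e:
            # Connection errors, HTTP error statuses, and invalid JSON all land here
            print(f"Service down: {e}")
            time.sleep(60)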
```

---

**With comprehensive logging, you now have complete visibility into your transcription service! 🎉**