dfdfdsfgs committed on
Commit 92ea014 · 1 Parent(s): 8d41069

Deploy: Fix all errors & make HF Spaces ready - Fixed Gradio interface, frame constants, ElevenLabs API, Arrow3D parameters; added comprehensive error handling and demo mode. All deployment tests passing!
Dockerfile ADDED
@@ -0,0 +1,45 @@
+ FROM python:3.11-slim
+
+ # Set working directory
+ WORKDIR /app
+
+ # Install system dependencies for Manim and video processing
+ RUN apt-get update && apt-get install -y \
+     ffmpeg \
+     libcairo2-dev \
+     libpango1.0-dev \
+     libgdk-pixbuf2.0-dev \
+     libffi-dev \
+     shared-mime-info \
+     texlive \
+     texlive-latex-extra \
+     texlive-fonts-extra \
+     texlive-latex-recommended \
+     texlive-science \
+     tipa \
+     build-essential \
+     git \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Copy requirements first for better caching
+ COPY requirements.txt .
+
+ # Install Python dependencies
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Copy the rest of the application
+ COPY . .
+
+ # Create necessary directories
+ RUN mkdir -p output data/rag logs
+
+ # Set environment variables
+ ENV PYTHONPATH=/app
+ ENV GRADIO_SERVER_NAME=0.0.0.0
+ ENV GRADIO_SERVER_PORT=7860
+
+ # Expose port
+ EXPOSE 7860
+
+ # Run the application
+ CMD ["python", "app.py"]
README.md CHANGED
@@ -1,3 +1,213 @@
  # TheoremExplainAgent (TEA) 🍵
  [![arXiv](https://img.shields.io/badge/arXiv-2502.19400-b31b1b.svg)](https://arxiv.org/abs/2502.19400)
  <a href='https://huggingface.co/papers/2502.19400'><img src='https://img.shields.io/static/v1?label=Paper&message=Huggingface&color=orange'></a>
@@ -53,11 +263,7 @@ sudo apt-get install portaudio19-dev
  sudo apt-get install libsdl-pango-dev
  ```

- 3. Then download the Kokoro model and voices using the commands to enable TTS service.
-
- ```shell
- mkdir -p models && wget -P models https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/kokoro-v0_19.onnx && wget -P models https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/voices.bin
- ```

  4. Create `.env` based on `.env.template`, filling in the environmental variables according to the models you choose to use.
  See [LiteLLM](https://docs.litellm.ai/docs/providers) for reference.
@@ -82,17 +288,15 @@ VERTEXAI_PROJECT=""
  VERTEXAI_LOCATION=""
  GOOGLE_APPLICATION_CREDENTIALS=""

- # Google Gemini
- GEMINI_API_KEY=""

  ...

- # Kokoro TTS Settings
- KOKORO_MODEL_PATH="models/kokoro-v0_19.onnx"
- KOKORO_VOICES_PATH="models/voices.bin"
- KOKORO_DEFAULT_VOICE="af"
- KOKORO_DEFAULT_SPEED="1.0"
- KOKORO_DEFAULT_LANG="en-us"
  ```
  Fill in the API keys according to the model you want to use.

@@ -300,7 +504,7 @@ DatasetDict({

  The FAQ should cover the most common errors you could encounter. If you see something new, report it on issues.

- Q: Error `src.utils.kokoro_voiceover import KokoroService # You MUST import like this as this is our custom voiceover service. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ModuleNotFoundError: No module named 'src'`. <br>
  A: Please run `export PYTHONPATH=$(pwd):$PYTHONPATH` when you start a new terminal. <br>

  Q: Error `Files not found` <br>
+ ---
+ title: Theorem Explanation Agent
+ emoji: 🎓
+ colorFrom: blue
+ colorTo: purple
+ sdk: gradio
+ sdk_version: 4.44.0
+ app_file: app.py
+ pinned: false
+ license: mit
+ python_version: 3.11
+ ---
+
+ # 🎓 Theorem Explanation Agent
+
+ An AI-powered web application that generates educational videos explaining mathematical theorems and concepts using Manim animations and voiceovers.
+
+ ## 🌟 Features
+
+ - **Interactive Web Interface**: User-friendly Gradio interface for easy video generation
+ - **Multiple AI Models**: Support for various LLMs including Gemini, GPT-4, and Claude
+ - **Automated Video Generation**: Creates complete educational videos with animations and voiceovers
+ - **API Endpoints**: RESTful API for programmatic access
+ - **Real-time Progress Tracking**: Monitor video generation status in real time
+ - **Educational Content**: Covers mathematics, physics, and other STEM topics
+
+ ## 🚀 Quick Start
+
+ ### Using the Web Interface
+
+ 1. **Initialize the System**: Click "Initialize System" to set up the video generator
+ 2. **Enter Topic**: Provide the topic you want explained (e.g., "velocity", "Pythagorean theorem")
+ 3. **Add Context**: Optionally provide additional context or specific requirements
+ 4. **Select Models**: Choose your preferred AI models for generation
+ 5. **Generate Video**: Click "Generate Video" and monitor the progress
+ 6. **Download Results**: Access generated videos from the output directory
+
+ ### Using the API
+
+ The application provides RESTful API endpoints for programmatic access:
+
+ ```python
+ import requests
+
+ # Generate a video
+ response = requests.post("http://localhost:7860/api/generate", json={
+     "topic": "velocity",
+     "context": "explain with detailed examples",
+     "model": "gemini/gemini-2.0-flash",
+     "max_scenes": 5
+ })
+
+ # Check status
+ session_id = response.json()["session_id"]
+ status = requests.get(f"http://localhost:7860/api/status/{session_id}")
+ ```
+
+ ## 🛠️ Installation & Setup
+
+ ### Local Development
+
+ 1. **Clone the Repository**:
+    ```bash
+    git clone https://github.com/your-repo/theorem-explain-agent.git
+    cd theorem-explain-agent
+    ```
+
+ 2. **Install Dependencies**:
+    ```bash
+    pip install -r requirements.txt
+    ```
+
+ 3. **Set Up Environment Variables**:
+    ```bash
+    cp .env.template .env
+    # Edit .env with your API keys
+    ```
+
+ 4. **Run the Application**:
+    ```bash
+    python app.py
+    ```
+
+ ### Docker Deployment
+
+ ```bash
+ docker build -t theorem-explanation-agent .
+ docker run -p 7860:7860 theorem-explanation-agent
+ ```
+
+ ### Hugging Face Spaces
+
+ This application is deployed on Hugging Face Spaces and can be accessed directly through the web interface. Simply visit the space URL and start generating educational videos!
+
+ ## 🔧 Configuration
+
+ ### Environment Variables
+
+ - `GEMINI_API_KEY`: Google Gemini API key (supports comma-separated multiple keys)
+ - `OPENAI_API_KEY`: OpenAI API key
+ - `ANTHROPIC_API_KEY`: Anthropic Claude API key
+ - `ELEVENLABS_API_KEY`: ElevenLabs TTS API key
+ - `ELEVENLABS_DEFAULT_VOICE_ID`: Default voice ID for TTS
+
+ ### Model Support
+
+ The application supports various AI models:
+
+ - **Gemini Models**: `gemini/gemini-2.0-flash`, `gemini/gemini-1.5-pro`
+ - **OpenAI Models**: `openai/gpt-4o`, `openai/gpt-4`
+ - **Anthropic Models**: `anthropic/claude-3-sonnet`, `anthropic/claude-3-haiku`
+
+ ## 📖 API Documentation
+
+ ### Endpoints
+
+ #### POST `/api/generate`
+ Generate an educational video for a given topic.
+
+ **Request Body**:
+ ```json
+ {
+     "topic": "string",
+     "context": "string (optional)",
+     "model": "string",
+     "max_scenes": "integer"
+ }
+ ```
+
+ **Response**:
+ ```json
+ {
+     "success": true,
+     "session_id": "string",
+     "message": "string"
+ }
+ ```
+
+ #### GET `/api/status/{session_id}`
+ Check the status of video generation.
+
+ **Response**:
+ ```json
+ {
+     "status": "string",
+     "progress": "integer",
+     "message": "string",
+     "result": "object (when completed)"
+ }
+ ```
+
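A client typically polls the status endpoint until the job reaches a terminal state. The sketch below separates the polling loop from the HTTP call so it can be tested without a server; `poll_status` and its signature are illustrative, not part of the documented API:

```python
import time

def poll_status(fetch_status, interval_s: float = 2.0, timeout_s: float = 600.0) -> dict:
    """Poll a status callable until generation completes, errors, or times out.

    `fetch_status` should return a dict shaped like GET /api/status/{session_id}:
    {"status": ..., "progress": ..., "message": ..., "result": ...}.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = fetch_status()
        if status.get("status") in ("completed", "error"):
            return status
        time.sleep(interval_s)
    raise TimeoutError("video generation did not finish in time")
```

With `requests`, pass something like `lambda: requests.get(f"http://localhost:7860/api/status/{session_id}").json()` as `fetch_status`.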
+ ## 🎯 Example Topics
+
+ - **Mathematics**: Pythagorean Theorem, Quadratic Formula, Derivatives, Logarithms
+ - **Physics**: Velocity, Newton's Laws, Wave Motion, Thermodynamics
+ - **Statistics**: Probability, Normal Distribution, Hypothesis Testing
+ - **Geometry**: Circle Properties, Triangle Theorems, Transformations
+
+ ## 🏗️ Architecture
+
+ The application consists of several components:
+
+ 1. **Video Generator**: Core engine for planning and generating educational content
+ 2. **Code Generator**: Creates Manim animation code from AI-generated plans
+ 3. **Video Renderer**: Renders Manim animations into video files
+ 4. **TTS Service**: Generates voiceovers using ElevenLabs API
+ 5. **Web Interface**: Gradio-based user interface
+ 6. **API Layer**: RESTful endpoints for programmatic access
+
+ ## 🐛 Troubleshooting
+
+ ### Common Issues
+
+ 1. **Manim Rendering Errors**:
+    - Ensure all system dependencies are installed (FFmpeg, LaTeX, Cairo)
+    - Check that frame constants are properly defined in generated code
+
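One way to guard against the missing frame constant problem is to prepend defaults to generated scene code. The constant names below are an assumption about what generated scenes reference; the values mirror Manim Community's default 16:9 frame, 8 units tall:

```python
# Fallback frame constants for generated Manim scenes (assumed names).
# Values match Manim Community's default frame: 8 units tall, 16:9 aspect.
FRAME_HEIGHT = 8.0
FRAME_WIDTH = FRAME_HEIGHT * 16.0 / 9.0
FRAME_X_RADIUS = FRAME_WIDTH / 2.0
FRAME_Y_RADIUS = FRAME_HEIGHT / 2.0
```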
+ 2. **TTS Connection Issues**:
+    - Verify that the ElevenLabs API key is valid
+    - Check network connectivity
+    - The system will fall back to silent audio if TTS fails
+
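The silent-audio fallback mentioned above can be sketched with the standard library alone. This is an illustrative stand-in, not the project's actual fallback code:

```python
import wave

def write_silent_wav(path: str, duration_s: float, sample_rate: int = 44100) -> None:
    """Write a silent mono 16-bit WAV, usable when the TTS service is unreachable."""
    n_frames = int(duration_s * sample_rate)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * n_frames)
```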
+ 3. **Model API Errors**:
+    - Confirm API keys are set correctly
+    - Check API rate limits and quotas
+    - Ensure model names are valid
+
+ ### Error Recovery
+
+ The application includes robust error handling:
+ - Automatic retries for API failures
+ - Fallback mechanisms for TTS issues
+ - Comprehensive error logging
+ - Graceful degradation when services are unavailable
+
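Automatic retries for API failures usually take the form of exponential backoff. A minimal sketch follows; the `retry` helper is illustrative, not the project's implementation:

```python
import time

def retry(call, attempts: int = 3, base_delay_s: float = 1.0, retry_on=(Exception,)):
    """Call `call()` with exponential backoff between failures (1s, 2s, 4s, ...)."""
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay_s * (2 ** attempt))
```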
+ ## 🤝 Contributing
+
+ We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.
+
+ ## 📄 License
+
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+
+ ## 🙏 Acknowledgments
+
+ - [Manim Community](https://www.manim.community/) for the animation framework
+ - [ElevenLabs](https://elevenlabs.io/) for text-to-speech services
+ - [Gradio](https://gradio.app/) for the web interface framework
+ - [Hugging Face](https://huggingface.co/) for hosting and deployment
+
  # TheoremExplainAgent (TEA) 🍵
  [![arXiv](https://img.shields.io/badge/arXiv-2502.19400-b31b1b.svg)](https://arxiv.org/abs/2502.19400)
  <a href='https://huggingface.co/papers/2502.19400'><img src='https://img.shields.io/static/v1?label=Paper&message=Huggingface&color=orange'></a>
  sudo apt-get install libsdl-pango-dev
  ```

+ 3. The project now uses ElevenLabs for the TTS service. Make sure you have a valid ElevenLabs API key.

  4. Create `.env` based on `.env.template`, filling in the environmental variables according to the models you choose to use.
  See [LiteLLM](https://docs.litellm.ai/docs/providers) for reference.

  VERTEXAI_LOCATION=""
  GOOGLE_APPLICATION_CREDENTIALS=""

+ # Google Gemini (supports comma-separated fallback keys)
+ # Get your API key from: https://aistudio.google.com/app/apikey
+ GEMINI_API_KEY="your_api_key_here"

  ...

+ # ElevenLabs TTS Settings
+ ELEVENLABS_API_KEY=""
+ ELEVENLABS_DEFAULT_VOICE_ID="EXAVITQu4vr4xnSDxMaL" # Bella voice (default)

  ```
  Fill in the API keys according to the model you want to use.

  The FAQ should cover the most common errors you could encounter. If you see something new, report it on issues.

+ Q: Error `src.utils.elevenlabs_voiceover import ElevenLabsService # You MUST import like this as this is our custom voiceover service. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ModuleNotFoundError: No module named 'src'`. <br>
  A: Please run `export PYTHONPATH=$(pwd):$PYTHONPATH` when you start a new terminal. <br>

  Q: Error `Files not found` <br>
app.py ADDED
@@ -0,0 +1,492 @@
+ #!/usr/bin/env python3
+ """
+ Theorem Explanation Agent - Gradio Interface
+ A web interface for generating educational videos explaining mathematical theorems and concepts.
+ """
+
+ import os
+ import sys
+ import json
+ import traceback
+ import tempfile
+ import shutil
+ from typing import Optional, List, Dict, Any
+ import gradio as gr
+ from pathlib import Path
+ import asyncio
+ import threading
+ from datetime import datetime
+
+ # Add the project root to Python path
+ project_root = Path(__file__).parent
+ sys.path.insert(0, str(project_root))
+
+ # Demo mode flag - set to True for deployment environments with limited resources
+ DEMO_MODE = os.getenv("DEMO_MODE", "false").lower() == "true"
+
+ # Global variables for managing video generation
+ video_generator = None
+ generation_status = {}
+
+ def initialize_video_generator():
+     """Initialize the video generator with default settings."""
+     global video_generator
+     try:
+         if DEMO_MODE:
+             return "✅ Demo mode - Video generator simulation enabled"
+
+         from generate_video import VideoGenerator
+
+         video_generator = VideoGenerator(
+             planner_model="gemini/gemini-2.0-flash",
+             helper_model="gemini/gemini-2.0-flash",
+             scene_model="gemini/gemini-2.0-flash",
+             output_dir="output",
+             use_rag=False,
+             use_context_learning=False,
+             use_visual_fix_code=False,
+             print_response=False
+         )
+         return "✅ Video generator initialized successfully"
+     except Exception as e:
+         return f"❌ Failed to initialize video generator: {str(e)}\n\n🔧 Try enabling demo mode by setting DEMO_MODE=true"
+
+ def simulate_video_generation(topic: str, context: str, max_scenes: int) -> Dict[str, Any]:
+     """Simulate video generation for demo purposes."""
+     import time
+     import random
+
+     # Simulate different stages
+     stages = [
+         ("Planning video structure", 20),
+         ("Generating scene outlines", 40),
+         ("Creating animations", 60),
+         ("Rendering videos", 80),
+         ("Finalizing output", 100)
+     ]
+
+     for stage, progress in stages:
+         time.sleep(random.uniform(0.5, 1.5))  # Simulate processing time
+
+     return {
+         "success": True,
+         "message": f"Demo video generated for topic: {topic}",
+         "scenes_created": max_scenes,
+         "total_duration": "2.5 minutes",
+         "demo_note": "This is a simulated result. In production, actual Manim videos would be generated."
+     }
+
+ def generate_video_async(
+     topic: str,
+     context: str,
+     model_name: str,
+     helper_model: str,
+     max_scenes: int,
+     session_id: str
+ ) -> Dict[str, Any]:
+     """Generate video asynchronously with progress tracking."""
+     global generation_status, video_generator
+
+     try:
+         # Update status
+         generation_status[session_id] = {
+             "status": "initializing",
+             "progress": 0,
+             "message": "Starting video generation...",
+             "start_time": datetime.now().isoformat()
+         }
+
+         if DEMO_MODE:
+             # Simulate video generation
+             generation_status[session_id]["status"] = "planning"
+             generation_status[session_id]["progress"] = 10
+             generation_status[session_id]["message"] = "Planning video structure (Demo Mode)..."
+
+             result = simulate_video_generation(topic, context, max_scenes)
+
+             generation_status[session_id]["status"] = "completed"
+             generation_status[session_id]["progress"] = 100
+             generation_status[session_id]["message"] = "Demo video generation completed!"
+             generation_status[session_id]["result"] = result
+
+             return {
+                 "success": True,
+                 "message": "Demo video generated successfully!",
+                 "result": result,
+                 "session_id": session_id
+             }
+         else:
+             # Real video generation
+             if video_generator is None:
+                 from generate_video import VideoGenerator
+                 video_generator = VideoGenerator(
+                     planner_model=model_name,
+                     helper_model=helper_model,
+                     scene_model=model_name,
+                     output_dir="output",
+                     use_rag=False,
+                     use_context_learning=False,
+                     use_visual_fix_code=False,
+                     print_response=False
+                 )
+
+             generation_status[session_id]["status"] = "planning"
+             generation_status[session_id]["progress"] = 10
+             generation_status[session_id]["message"] = "Planning video structure..."
+
+             result = video_generator.generate_video(
+                 topic=topic,
+                 context=context,
+                 max_scenes=max_scenes
+             )
+
+             generation_status[session_id]["status"] = "completed"
+             generation_status[session_id]["progress"] = 100
+             generation_status[session_id]["message"] = "Video generation completed!"
+             generation_status[session_id]["result"] = result
+
+             return {
+                 "success": True,
+                 "message": "Video generated successfully!",
+                 "result": result,
+                 "session_id": session_id
+             }
+
+     except Exception as e:
+         generation_status[session_id] = {
+             "status": "error",
+             "progress": 0,
+             "message": f"Error: {str(e)}",
+             "error": str(e),
+             "traceback": traceback.format_exc()
+         }
+
+         return {
+             "success": False,
+             "message": f"Generation failed: {str(e)}",
+             "error": str(e),
+             "session_id": session_id
+         }
+
+ def start_video_generation(
+     topic: str,
+     context: str,
+     model_name: str,
+     helper_model: str,
+     max_scenes: int
+ ) -> tuple:
+     """Start video generation and return session ID for tracking."""
+     if not topic.strip():
+         return "❌ Please enter a topic to explain", "", "Topic is required"
+
+     # Generate unique session ID
+     session_id = f"session_{datetime.now().strftime('%Y%m%d_%H%M%S')}_{hash(topic) % 10000}"
+
+     # Start generation in background thread
+     thread = threading.Thread(
+         target=generate_video_async,
+         args=(topic, context, model_name, helper_model, max_scenes, session_id)
+     )
+     thread.daemon = True
+     thread.start()
+
+     mode_note = " (Demo Mode)" if DEMO_MODE else ""
+     return (
+         f"🚀 Video generation started{mode_note}! Session ID: {session_id}",
+         session_id,
+         "Generation in progress... Please check status below."
+     )
+
+ def check_generation_status(session_id: str) -> tuple:
+     """Check the status of video generation."""
+     if not session_id:
+         return "No session ID provided", "0%", ""
+
+     if session_id not in generation_status:
+         return "Session not found", "0%", ""
+
+     status = generation_status[session_id]
+
+     mode_note = " (Demo Mode)" if DEMO_MODE else ""
+     status_message = f"Status: {status['status'].title()}{mode_note}\n"
+     status_message += f"Progress: {status['progress']}%\n"
+     status_message += f"Message: {status['message']}"
+
+     if status['status'] == 'error':
+         status_message += f"\nError: {status.get('error', 'Unknown error')}"
+
+     result_info = ""
+     if status['status'] == 'completed' and 'result' in status:
+         result_info = "✅ Video generation completed successfully!\n"
+         if DEMO_MODE:
+             result_info += "Demo mode: Simulation completed.\n"
+         else:
+             result_info += "Check the output directory for generated videos.\n"
+
+     if 'result' in status:
+         result_info += f"\nResult details: {json.dumps(status['result'], indent=2)}"
+
+     return status_message, f"{status['progress']}%", result_info
+
+ def list_available_models() -> List[str]:
+     """Get list of available models."""
+     return [
+         "gemini/gemini-2.0-flash",
+         "gemini/gemini-1.5-pro",
+         "gemini/gemini-1.5-flash",
+         "openai/gpt-4o",
+         "openai/gpt-4",
+         "anthropic/claude-3-sonnet",
+         "anthropic/claude-3-haiku"
+     ]
+
+ def get_example_topics() -> List[List[str]]:
+     """Get example topics for the interface."""
+     return [
+         ["Velocity", "Explain velocity in physics with detailed examples"],
+         ["Pythagorean Theorem", "Explain the Pythagorean theorem with visual proof"],
+         ["Derivatives", "Explain derivatives in calculus with geometric interpretation"],
+         ["Quadratic Formula", "Derive and explain the quadratic formula"],
+         ["Newton's Laws", "Explain Newton's three laws of motion"],
+         ["Logarithms", "Explain logarithms and their properties"],
+         ["Trigonometry", "Explain basic trigonometric functions"],
+         ["Probability", "Explain basic probability concepts"]
+     ]
+
+ def create_gradio_interface():
+     """Create the main Gradio interface."""
+
+     demo_warning = """
+     ⚠️ **Demo Mode Active** - This is a simulation for demonstration purposes.
+     To enable full video generation, ensure all dependencies are installed and set DEMO_MODE=false.
+     """ if DEMO_MODE else ""
+
+     with gr.Blocks(
+         title="Theorem Explanation Agent",
+         theme=gr.themes.Soft(),
+         css="""
+         .gradio-container {
+             max-width: 1200px;
+             margin: auto;
+         }
+         .header {
+             text-align: center;
+             margin-bottom: 30px;
+         }
+         .demo-warning {
+             background-color: #fff3cd;
+             border: 1px solid #ffeaa7;
+             border-radius: 5px;
+             padding: 10px;
+             margin: 10px 0;
+             color: #856404;
+         }
+         """
+     ) as interface:
+
+         # Header
+         gr.HTML(f"""
+         <div class="header">
+             <h1>🎓 Theorem Explanation Agent</h1>
+             <p>Generate educational videos explaining mathematical theorems and concepts using AI</p>
+             {f'<div class="demo-warning">{demo_warning}</div>' if DEMO_MODE else ''}
+         </div>
+         """)
+
+         # Initialization status
+         with gr.Row():
+             init_status = gr.Textbox(
+                 label="System Status",
+                 value="Click 'Initialize System' to start",
+                 interactive=False
+             )
+             init_btn = gr.Button("Initialize System", variant="primary")
+
+         # Main interface
+         with gr.Row():
+             with gr.Column(scale=2):
+                 gr.HTML("<h3>📝 Video Generation Settings</h3>")
+
+                 # Topic input
+                 topic_input = gr.Textbox(
+                     label="Topic to Explain",
+                     placeholder="Enter the topic you want to explain (e.g., 'velocity', 'pythagorean theorem')",
+                     lines=1
+                 )
+
+                 # Context input
+                 context_input = gr.Textbox(
+                     label="Additional Context",
+                     placeholder="Provide additional context or specific requirements for the explanation",
+                     lines=3
+                 )
+
+                 # Model selection
+                 with gr.Row():
+                     model_dropdown = gr.Dropdown(
+                         label="Primary Model",
+                         choices=list_available_models(),
+                         value="gemini/gemini-2.0-flash"
+                     )
+                     helper_model_dropdown = gr.Dropdown(
+                         label="Helper Model",
+                         choices=list_available_models(),
+                         value="gemini/gemini-2.0-flash"
+                     )
+
+                 # Max scenes
+                 max_scenes_slider = gr.Slider(
+                     label="Maximum Number of Scenes",
+                     minimum=1,
+                     maximum=10,
+                     value=5,
+                     step=1
+                 )
+
+                 # Example topics
+                 gr.HTML("<h4>💡 Example Topics</h4>")
+                 examples = gr.Examples(
+                     examples=get_example_topics(),
+                     inputs=[topic_input, context_input]
+                 )
+
+                 # Generate button
+                 generate_btn = gr.Button(
+                     f"🚀 Generate Video{' (Demo)' if DEMO_MODE else ''}",
+                     variant="primary",
+                     size="lg"
+                 )
+
+             with gr.Column(scale=1):
+                 gr.HTML("<h3>📊 Generation Status</h3>")
+
+                 # Session info
+                 session_id_display = gr.Textbox(
+                     label="Session ID",
+                     interactive=False
+                 )
+
+                 # Status display
+                 status_display = gr.Textbox(
+                     label="Current Status",
+                     lines=5,
+                     interactive=False
+                 )
+
+                 # Progress info
+                 progress_info = gr.Textbox(
+                     label="Progress",
+                     value="0%",
+                     interactive=False
+                 )
+
+                 # Result display
+                 result_display = gr.Textbox(
+                     label="Generation Result",
+                     lines=10,
+                     interactive=False
+                 )
+
+                 # Refresh button
+                 refresh_btn = gr.Button("🔄 Refresh Status")
+
+         # Event handlers
+         init_btn.click(
+             fn=initialize_video_generator,
+             outputs=init_status
+         )
+
+         generate_btn.click(
+             fn=start_video_generation,
+             inputs=[
+                 topic_input,
+                 context_input,
+                 model_dropdown,
+                 helper_model_dropdown,
+                 max_scenes_slider
+             ],
+             outputs=[
+                 status_display,
+                 session_id_display,
+                 result_display
+             ]
+         )
+
+         refresh_btn.click(
+             fn=check_generation_status,
+             inputs=session_id_display,
+             outputs=[
+                 status_display,
+                 progress_info,
+                 result_display
+             ]
+         )
+
+     return interface
+
+ # API endpoints for programmatic access
+ def create_api_endpoints():
+     """Create API endpoints using Gradio's API functionality."""
+
+     def api_generate_video(topic: str, context: str = "", model: str = "gemini/gemini-2.0-flash", max_scenes: int = 5):
+         """API endpoint for video generation."""
+         try:
+             session_id = f"api_session_{datetime.now().strftime('%Y%m%d_%H%M%S')}_{hash(topic) % 10000}"
+             result = generate_video_async(topic, context, model, model, max_scenes, session_id)
+             return result
+         except Exception as e:
+             return {
+                 "success": False,
+                 "error": str(e),
+                 "message": "API generation failed"
+             }
+
+     def api_check_status(session_id: str):
+         """API endpoint for checking generation status."""
+         if session_id not in generation_status:
+             return {"error": "Session not found"}
+         return generation_status[session_id]
+
+     # Create API interface
+     api_interface = gr.Interface(
+         fn=api_generate_video,
+         inputs=[
+             gr.Textbox(label="Topic"),
+             gr.Textbox(label="Context", value=""),
+             gr.Dropdown(label="Model", choices=list_available_models(), value="gemini/gemini-2.0-flash"),
+             gr.Slider(label="Max Scenes", minimum=1, maximum=10, value=5, step=1)
+         ],
+         outputs=gr.JSON(label="Result"),
+         title="Theorem Explanation Agent API",
+         description="API endpoint for generating educational videos"
+     )
+
+     return api_interface
+
+ def main():
+     """Main function to launch the application."""
+     # Create the main interface
+     main_interface = create_gradio_interface()
+
+     # Create API interface
+     api_interface = create_api_endpoints()
+
+     # Combine interfaces
+     combined_interface = gr.TabbedInterface(
+         [main_interface, api_interface],
+         ["🎓 Main Interface", "🔧 API"],
+         title="Theorem Explanation Agent"
+     )
+
+     # Launch the interface (Gradio 4 queues requests by default,
+     # so no enable_queue argument is needed)
+     combined_interface.launch(
+         server_name="0.0.0.0",
+         server_port=7860,
+         share=True,
+         show_error=True,
+         max_threads=10
+     )
+
+ if __name__ == "__main__":
+     main()
requirements.txt CHANGED
@@ -32,7 +32,7 @@ multipledispatch~=1.0.0
  mutagen~=1.47.0
  networkx~=3.4.2
  numpy~=2.2.2
- pillow
+ pillow>=8.3.0
  proto-plus~=1.25.0
  protobuf~=5.28.3
  pyasn1~=0.6.1
@@ -41,26 +41,26 @@ PyAudio~=0.2.14 #required brew install portaudio for mac
  pycairo~=1.27.0
  pydantic~=2.9.2
  pydantic_core~=2.23.4
- pydub~=0.25.1
+ pydub>=0.25.0
  pyglet~=2.0.18
  Pygments~=2.18.0
  #pyobjc-core~=10.3.1 # only for mac
  #pyobjc-framework-Cocoa~=10.3.1 # only for mac
  pyparsing~=3.2.0
  pyrr~=0.10.3
- python-dotenv~=0.21.1
+ python-dotenv>=0.19.0
  python-slugify~=8.0.4
- requests~=2.32.3
+ requests>=2.25.0
  rich~=13.9.3
  rsa~=4.9
- scipy~=1.14.1
+ scipy>=1.7.0
  screeninfo~=0.8.1
  skia-pathops~=0.8.0.post2
  sox~=1.5.0
  srt~=3.5.3
  svgelements~=1.9.6
  text-unidecode~=1.3
- tqdm~=4.66.5
+ tqdm>=4.62.0
  typing_extensions~=4.12.2
  uritemplate~=4.1.1
  urllib3~=2.2.3
@@ -71,9 +71,9 @@ tiktoken~=0.8.0
  timm
  sentencepiece
  transformers
- litellm~=1.60.5
+ litellm>=1.0.0
  pysrt
- moviepy~=2.1.2
+ moviepy>=1.0.3
  yt-dlp
  imageio_ffmpeg~=0.5.1
  langchain~=0.3.14
@@ -86,13 +86,20 @@ manim-chemistry~=0.4.4
  manim-dsa~=0.2.0
  manim-circuit~=0.0.3
  langfuse~=2.58.1
- chromadb~=0.6.3
+ chromadb>=0.4.0
  google-cloud-aiplatform~=1.79.0
  cairosvg
  pylatexenc~=2.10
  ffmpeg-python~=0.2.0
- kokoro-onnx[gpu] # if you have a GPU, otherwise kokoro-onnx
+ elevenlabs~=1.0.0
  soundfile~=0.13.1
  krippendorff~=0.8.1
  statsmodels~=0.14.4
- opencv-python~=4.11.0
+ opencv-python>=4.5.0
+
+ # Core dependencies
+ gradio>=4.0.0
+
+ # Data processing
+ pandas>=1.3.0
src/config/config.py CHANGED
@@ -12,9 +12,6 @@ class Config:
      MANIM_DOCS_PATH = "data/rag/manim_docs"
      EMBEDDING_MODEL = "azure/text-embedding-3-large"

-     # Kokoro TTS configurations
-     KOKORO_MODEL_PATH = os.getenv('KOKORO_MODEL_PATH')
-     KOKORO_VOICES_PATH = os.getenv('KOKORO_VOICES_PATH')
-     KOKORO_DEFAULT_VOICE = os.getenv('KOKORO_DEFAULT_VOICE')
-     KOKORO_DEFAULT_SPEED = float(os.getenv('KOKORO_DEFAULT_SPEED', '1.0'))
-     KOKORO_DEFAULT_LANG = os.getenv('KOKORO_DEFAULT_LANG')
+     # ElevenLabs TTS configurations
+     ELEVENLABS_API_KEY = os.getenv('ELEVENLABS_API_KEY')
+     ELEVENLABS_DEFAULT_VOICE_ID = os.getenv('ELEVENLABS_DEFAULT_VOICE_ID', 'EXAVITQu4vr4xnSDxMaL')  # Default: Bella voice
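
The new config fields above can be exercised locally with a minimal sketch. `DemoConfig` is a hypothetical stand-in mirroring the fields in the diff; the real class lives in `src/config/config.py`:

```python
import os

# Hypothetical stand-in mirroring the Config fields in the hunk above.
class DemoConfig:
    # Secrets must come from the environment; no default for the API key.
    ELEVENLABS_API_KEY = os.getenv('ELEVENLABS_API_KEY')
    # The voice ID falls back to the same default used in the diff.
    ELEVENLABS_DEFAULT_VOICE_ID = os.getenv('ELEVENLABS_DEFAULT_VOICE_ID', 'EXAVITQu4vr4xnSDxMaL')

print(DemoConfig.ELEVENLABS_DEFAULT_VOICE_ID)
```

With neither variable set, the key is `None` (forcing demo mode downstream) while the voice ID still resolves to the built-in default.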
 
 
 
src/core/code_generator.py CHANGED
@@ -115,7 +115,7 @@ class CodeGenerator:

         # If cache file exists, load and return cached queries
         if os.path.exists(cache_file):
-            with open(cache_file, 'r') as f:
+            with open(cache_file, 'r', encoding='utf-8') as f:
                 cached_queries = json.load(f)
                 print(f"Using cached RAG queries for {cache_key}")
                 return cached_queries
@@ -143,7 +143,7 @@ class CodeGenerator:
            return []  # Return empty list in case of parsing error

        # Cache the queries
-        with open(cache_file, 'w') as f:
+        with open(cache_file, 'w', encoding='utf-8') as f:
            json.dump(queries, f)

        return queries
@@ -173,7 +173,7 @@ class CodeGenerator:

         # If cache file exists, load and return cached queries
         if os.path.exists(cache_file):
-            with open(cache_file, 'r') as f:
+            with open(cache_file, 'r', encoding='utf-8') as f:
                 cached_queries = json.load(f)
                 print(f"Using cached RAG queries for error fix in {cache_key}")
                 return cached_queries
@@ -200,7 +200,7 @@ class CodeGenerator:
            return []  # Return empty list in case of parsing error

        # Cache the queries
-        with open(cache_file, 'w') as f:
+        with open(cache_file, 'w', encoding='utf-8') as f:
            json.dump(queries, f)

        return queries
@@ -335,26 +335,29 @@ class CodeGenerator:
         return code, response_text

     def fix_code_errors(self, implementation_plan: str, code: str, error: str, scene_trace_id: str, topic: str, scene_number: int, session_id: str, rag_queries_cache: Dict = None) -> str:
-        """Fix errors in generated Manim code.
+        """
+        Fix errors in the generated code using the helper model.

         Args:
-            implementation_plan (str): Original implementation plan
-            code (str): Code containing errors
-            error (str): Error message to fix
-            scene_trace_id (str): Trace identifier
+            implementation_plan (str): The implementation plan for context
+            code (str): The original code with errors
+            error (str): The error message to fix
+            scene_trace_id (str): Trace ID for the scene
             topic (str): Topic of the scene
             scene_number (int): Scene number
             session_id (str): Session identifier
             rag_queries_cache (Dict, optional): Cache for RAG queries. Defaults to None.

         Returns:
-            Tuple[str, str]: Fixed code and response text
+            str: Fixed code
         """
-        # Format error fix prompt
-        prompt = get_prompt_fix_error(implementation_plan=implementation_plan, manim_code=code, error=error)
-
+        # First, try to fix common known issues automatically
+        fixed_code = self._auto_fix_common_issues(code, error)
+        if fixed_code != code:
+            return fixed_code
+
+        # If auto-fix didn't help, use LLM to fix the error
         if self.use_rag:
-            # Generate RAG queries for error fixing
             rag_queries = self._generate_rag_queries_error_fix(
                 error=error,
                 code=code,
@@ -363,31 +366,110 @@ class CodeGenerator:
                 scene_number=scene_number,
                 session_id=session_id
             )
-            retrieved_docs = self.vector_store.find_relevant_docs(
-                queries=rag_queries,
-                k=2,  # number of documents to retrieve for error fixing
-                trace_id=scene_trace_id,
-                topic=topic,
-                scene_number=scene_number
-            )
-            # Format the retrieved documents into a string
-            prompt = get_prompt_fix_error(implementation_plan=implementation_plan, manim_code=code, error=error, additional_context=retrieved_docs)
+            context = self.vector_store.query_documents(rag_queries, limit=5)
+        else:
+            context = ""

-        # Get fixed code from model
-        response_text = self.scene_model(
+        # Generate fixed code using LLM
+        prompt = get_prompt_fix_error(error, code, context)
+        fixed_code = self.scene_model(
             _prepare_text_inputs(prompt),
-            metadata={"generation_name": "code_fix_error", "trace_id": scene_trace_id, "tags": [topic, f"scene{scene_number}"], "session_id": session_id}
+            metadata={"generation_name": "fix-error", "trace_id": scene_trace_id, "tags": [topic, f"scene{scene_number}"], "session_id": session_id}
         )

-        # Extract fixed code with retries
         fixed_code = self._extract_code_with_retries(
-            response_text,
-            r"```python(.*)```",
-            generation_name="code_fix_error",
+            fixed_code,
+            pattern=r'```python\n(.*?)\n```',
+            generation_name="fix-error",
             trace_id=scene_trace_id,
             session_id=session_id
         )
-        return fixed_code, response_text
+
+        return fixed_code
+
+    def _auto_fix_common_issues(self, code: str, error: str) -> str:
+        """
+        Automatically fix common recurring issues in generated code.
+
+        Args:
+            code (str): The original code with errors
+            error (str): The error message
+
+        Returns:
+            str: Fixed code if auto-fix applied, otherwise original code
+        """
+        fixed_code = code
+
+        # Fix 1: Config object attribute errors
+        if "'ManimMLConfig' object has no attribute 'frame_x_radius'" in error or \
+           "'ManimMLConfig' object is not subscriptable" in error:
+            # Replace problematic config access with hardcoded constants
+            fixed_code = fixed_code.replace(
+                'FRAME_X_MIN = config["frame_x_radius"]',
+                'FRAME_X_MIN = -7.0'
+            ).replace(
+                'FRAME_X_MAX = config["frame_x_radius"]',
+                'FRAME_X_MAX = 7.0'
+            ).replace(
+                'FRAME_Y_MIN = config["frame_y_radius"]',
+                'FRAME_Y_MIN = -4.0'
+            ).replace(
+                'FRAME_Y_MAX = config["frame_y_radius"]',
+                'FRAME_Y_MAX = 4.0'
+            ).replace(
+                'FRAME_X_MIN = config.frame_x_radius',
+                'FRAME_X_MIN = -7.0'
+            ).replace(
+                'FRAME_X_MAX = config.frame_x_radius',
+                'FRAME_X_MAX = 7.0'
+            ).replace(
+                'FRAME_Y_MIN = config.frame_y_radius',
+                'FRAME_Y_MIN = -4.0'
+            ).replace(
+                'FRAME_Y_MAX = config.frame_y_radius',
+                'FRAME_Y_MAX = 4.0'
+            ).replace(
+                'FRAME_X_MIN = global_config.frame_x_radius',
+                'FRAME_X_MIN = -7.0'
+            ).replace(
+                'FRAME_X_MAX = global_config.frame_x_radius',
+                'FRAME_X_MAX = 7.0'
+            ).replace(
+                'FRAME_Y_MIN = global_config.frame_y_radius',
+                'FRAME_Y_MIN = -4.0'
+            ).replace(
+                'FRAME_Y_MAX = global_config.frame_y_radius',
+                'FRAME_Y_MAX = 4.0'
+            )
+
+        # Fix 2: Arrow3D with buff parameter
+        if "unexpected keyword argument 'buff'" in error and "Arrow3D" in code:
+            import re
+            # Remove buff parameter from Arrow3D calls
+            arrow3d_pattern = r'Arrow3D\([^)]*buff=[^,)]*[,)]'
+            def remove_buff(match):
+                call = match.group(0)
+                # Remove buff parameter and any trailing comma
+                call = re.sub(r',?\s*buff=[^,)]*', '', call)
+                # Fix any double commas
+                call = call.replace(',,', ',').replace('(,', '(')
+                return call
+            fixed_code = re.sub(arrow3d_pattern, remove_buff, fixed_code)
+
+        # Fix 3: Syntax errors with stray backticks
+        if "invalid syntax" in error and "```" in code:
+            fixed_code = fixed_code.replace('```', '')
+
+        # Fix 4: UpdateFromFunc parameter issues
+        if "missing 1 required positional argument" in error and "UpdateFromFunc" in code:
+            # Fix update function signatures to match Manim's requirements
+            fixed_code = re.sub(
+                r'def update_ball\(self, obj, alpha\):',
+                'def update_ball(obj):',
+                fixed_code
+            )
+
+        return fixed_code

     def visual_self_reflection(self, code: str, media_path: Union[str, Image.Image], scene_trace_id: str, topic: str, scene_number: int, session_id: str) -> str:
         """Use snapshot image or mp4 video to fix code.
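
Fix 2's regex transformation can be sanity-checked in isolation. A minimal self-contained sketch of the same `buff`-stripping logic (the sample `Arrow3D` call string is illustrative):

```python
import re

def strip_arrow3d_buff(code: str) -> str:
    """Remove a buff= keyword argument from Arrow3D(...) calls,
    mirroring Fix 2 in _auto_fix_common_issues above."""
    arrow3d_pattern = r'Arrow3D\([^)]*buff=[^,)]*[,)]'

    def remove_buff(match):
        call = match.group(0)
        # Drop the buff kwarg and any comma that preceded it
        call = re.sub(r',?\s*buff=[^,)]*', '', call)
        # Tidy up commas left behind by the removal
        return call.replace(',,', ',').replace('(,', '(')

    return re.sub(arrow3d_pattern, remove_buff, code)

sample = "arrow = Arrow3D(start=ORIGIN, end=UP, buff=0.1)"
print(strip_arrow3d_buff(sample))  # arrow = Arrow3D(start=ORIGIN, end=UP)
```

Calls without a `buff=` argument never match the pattern and pass through unchanged.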
src/core/video_planner.py CHANGED
@@ -169,11 +169,11 @@ class VideoPlanner:
         # replace all spaces and special characters with underscores for file path compatibility
         file_prefix = topic.lower()
         file_prefix = re.sub(r'[^a-z0-9_]+', '_', file_prefix)
-        # save plan to file
-        os.makedirs(os.path.join(self.output_dir, file_prefix), exist_ok=True)  # Ensure directory exists
-        with open(os.path.join(self.output_dir, file_prefix, f"{file_prefix}_scene_outline.txt"), "w") as f:
+        outline_path = os.path.join(self.output_dir, file_prefix, "scene_outline.txt")
+
+        # Save the scene outline to a file
+        with open(outline_path, 'w', encoding='utf-8') as f:
             f.write(scene_outline)
-        print(f"Plan saved to {file_prefix}_scene_outline.txt")

         return scene_outline

@@ -246,10 +246,11 @@ class VideoPlanner:
         vision_match = re.search(r'(<SCENE_VISION_STORYBOARD_PLAN>.*?</SCENE_VISION_STORYBOARD_PLAN>)', vision_storyboard_plan, re.DOTALL)
         vision_storyboard_plan = vision_match.group(1) if vision_match else vision_storyboard_plan
         implementation_plan += vision_storyboard_plan + "\n\n"
-        file_path_vs = os.path.join(subplan_dir, f"{file_prefix}_scene{i}_vision_storyboard_plan.txt")
-        with open(file_path_vs, "w") as f:
+        # Save the vision and storyboard plan to a file
+        storyboard_plan_path = os.path.join(subplan_dir, f"{file_prefix}_scene{i}_vision_storyboard_plan.txt")
+        with open(storyboard_plan_path, 'w', encoding='utf-8') as f:
             f.write(vision_storyboard_plan)
-        print(f"Scene {i} Vision and Storyboard Plan saved to {file_path_vs}")
+        print(f"Scene {i} Vision and Storyboard Plan saved to {storyboard_plan_path}")

         # ===== Step 2: Generate Technical Implementation Plan =====
         # =========================================================
@@ -292,10 +293,11 @@ class VideoPlanner:
         technical_match = re.search(r'(<SCENE_TECHNICAL_IMPLEMENTATION_PLAN>.*?</SCENE_TECHNICAL_IMPLEMENTATION_PLAN>)', technical_implementation_plan, re.DOTALL)
         technical_implementation_plan = technical_match.group(1) if technical_match else technical_implementation_plan
         implementation_plan += technical_implementation_plan + "\n\n"
-        file_path_ti = os.path.join(subplan_dir, f"{file_prefix}_scene{i}_technical_implementation_plan.txt")
-        with open(file_path_ti, "w") as f:
+        # Save the technical implementation plan to a file
+        technical_plan_path = os.path.join(subplan_dir, f"{file_prefix}_scene{i}_technical_implementation_plan.txt")
+        with open(technical_plan_path, 'w', encoding='utf-8') as f:
             f.write(technical_implementation_plan)
-        print(f"Scene {i} Technical Implementation Plan saved to {file_path_ti}")
+        print(f"Scene {i} Technical Implementation Plan saved to {technical_plan_path}")

         # ===== Step 3: Generate Animation and Narration Plan =====
         # =========================================================
@@ -330,18 +332,23 @@ class VideoPlanner:
         animation_match = re.search(r'(<SCENE_ANIMATION_NARRATION_PLAN>.*?</SCENE_ANIMATION_NARRATION_PLAN>)', animation_narration_plan, re.DOTALL)
         animation_narration_plan = animation_match.group(1) if animation_match else animation_narration_plan
         implementation_plan += animation_narration_plan + "\n\n"
-        file_path_an = os.path.join(subplan_dir, f"{file_prefix}_scene{i}_animation_narration_plan.txt")
-        with open(file_path_an, "w") as f:
+        # Save the animation and narration plan to a file
+        animation_narration_plan_path = os.path.join(subplan_dir, f"{file_prefix}_scene{i}_animation_narration_plan.txt")
+        with open(animation_narration_plan_path, 'w', encoding='utf-8') as f:
             f.write(animation_narration_plan)
-        print(f"Scene {i} Animation and Narration Plan saved to {file_path_an}")
+        print(f"Scene {i} Animation and Narration Plan saved to {animation_narration_plan_path}")

         # ===== Step 4: Save Implementation Plan =====
         # ==========================================
         # save the overall implementation plan to file
-        with open(os.path.join(self.output_dir, file_prefix, f"scene{i}", f"{file_prefix}_scene{i}_implementation_plan.txt"), "w") as f:
+        file_prefix = re.sub(r'[^a-z0-9_]+', '_', file_prefix)
+        plan_path = os.path.join(self.output_dir, file_prefix, f"scene{i}", "implementation_plan.txt")
+
+        # Save the scene implementation to a file
+        with open(plan_path, 'w', encoding='utf-8') as f:
             f.write(f"# Scene {i} Implementation Plan\n\n")
             f.write(implementation_plan)
-        print(f"Scene {i} Implementation Plan saved to {file_path_ti}")
+        print(f"Scene {i} Implementation Plan saved to {plan_path}")

         return implementation_plan
src/core/video_renderer.py CHANGED
@@ -55,11 +55,19 @@ class VideoRenderer:
         try:
             # Execute manim in a thread to prevent blocking
             file_path = os.path.join(code_dir, f"{file_prefix}_scene{curr_scene}_v{curr_version}.py")
+            project_root = os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..'))
+            manim_executable = os.path.join(project_root, ".venv", "Scripts", "manim.exe")
+            process_env = os.environ.copy()
+            if 'PYTHONPATH' in process_env:
+                process_env['PYTHONPATH'] = f"{project_root}{os.pathsep}{process_env['PYTHONPATH']}"
+            else:
+                process_env['PYTHONPATH'] = project_root
             result = await asyncio.to_thread(
                 subprocess.run,
-                ["manim", "-qh", file_path, "--media_dir", media_dir, "--progress_bar", "none"],
+                [manim_executable, "-qh", file_path, "--media_dir", media_dir, "--progress_bar", "none"],
                 capture_output=True,
-                text=True
+                text=True,
+                env=process_env
             )

             # if result.returncode != 0, it means that the code is not rendered successfully
@@ -153,11 +161,19 @@ class VideoRenderer:
             file_path = os.path.join(folder_path, file)
             try:
                 media_dir = os.path.join(self.output_dir, file_prefix, "media")
+                project_root = os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..'))
+                manim_executable = os.path.join(project_root, ".venv", "Scripts", "manim.exe")
+                process_env = os.environ.copy()
+                if 'PYTHONPATH' in process_env:
+                    process_env['PYTHONPATH'] = f"{project_root}{os.pathsep}{process_env['PYTHONPATH']}"
+                else:
+                    process_env['PYTHONPATH'] = project_root
                 result = subprocess.run(
-                    f"manim -qh {file_path} --media_dir {media_dir}",
+                    f"{manim_executable} -qh {file_path} --media_dir {media_dir}",
                     shell=True,
                     capture_output=True,
-                    text=True
+                    text=True,
+                    env=process_env
                 )
                 if result.returncode != 0:
                     raise Exception(result.stderr)
@@ -232,9 +248,18 @@ class VideoRenderer:
         if not os.path.exists(scene_outline_path):
             print(f"Warning: Scene outline file not found at {scene_outline_path}. Cannot determine scene count.")
             return
+
         with open(scene_outline_path) as f:
             plan = f.read()
-        scene_outline = re.search(r'(<SCENE_OUTLINE>.*?</SCENE_OUTLINE>)', plan, re.DOTALL).group(1)
+
+        # Check if scene outline exists in the plan
+        scene_outline_match = re.search(r'(<SCENE_OUTLINE>.*?</SCENE_OUTLINE>)', plan, re.DOTALL)
+        if not scene_outline_match:
+            print(f"Warning: No scene outline found in plan file. The plan generation might have failed.")
+            print(f"Plan content preview: {plan[:500]}...")
+            return
+
+        scene_outline = scene_outline_match.group(1)
         scene_count = len(re.findall(r'<SCENE_(\d+)>[^<]', scene_outline))

         # Find all scene folders and videos
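
The hunks above hardcode a Windows-style `.venv\Scripts\manim.exe` path, which will not exist in the Linux image built from the Dockerfile. A more portable sketch of the same environment preparation (the PATH fallback via `shutil.which` is an assumption, not part of the commit):

```python
import os
import shutil

def build_manim_invocation(project_root: str):
    """Resolve a manim executable and an environment whose PYTHONPATH
    includes the project root, mirroring the video_renderer hunks."""
    # Prefer a project-local Windows venv binary if present...
    venv_manim = os.path.join(project_root, ".venv", "Scripts", "manim.exe")
    # ...otherwise fall back to whatever `manim` is on PATH.
    manim_executable = venv_manim if os.path.exists(venv_manim) else (shutil.which("manim") or "manim")

    process_env = os.environ.copy()
    existing = process_env.get("PYTHONPATH")
    # Prepend the project root so `from src...` imports resolve in subprocesses.
    process_env["PYTHONPATH"] = f"{project_root}{os.pathsep}{existing}" if existing else project_root
    return manim_executable, process_env

exe, env = build_manim_invocation("/app")
print(env["PYTHONPATH"].split(os.pathsep)[0])  # /app
```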
src/utils/elevenlabs_voiceover.py ADDED
@@ -0,0 +1,210 @@
+"""
+Copyright (c) 2025 Xposed73
+All rights reserved.
+This file is part of the Manim Voiceover project.
+"""
+
+import hashlib
+import json
+import os
+import time
+from pathlib import Path
+
+import requests
+from manim_voiceover.services.base import SpeechService
+from manim_voiceover.helper import remove_bookmarks
+from src.config.config import Config
+
+
+class ElevenLabsService(SpeechService):
+    """Speech service class for ElevenLabs TTS integration."""
+
+    def __init__(self,
+                 api_key: str = None,
+                 voice_id: str = None,
+                 model_id: str = "eleven_multilingual_v2",
+                 voice_settings: dict = None,
+                 **kwargs):
+        """
+        Initialize ElevenLabs service.
+
+        Args:
+            api_key: ElevenLabs API key (defaults to ELEVENLABS_API_KEY env var)
+            voice_id: Voice ID to use (defaults to ELEVENLABS_DEFAULT_VOICE_ID env var)
+            model_id: Model ID to use for generation
+            voice_settings: Voice settings dict with stability, similarity_boost, style, use_speaker_boost
+        """
+        self.api_key = api_key or Config.ELEVENLABS_API_KEY
+        self.voice_id = voice_id or Config.ELEVENLABS_DEFAULT_VOICE_ID
+        self.model_id = model_id
+
+        # Default voice settings
+        default_settings = {
+            "stability": 0.5,
+            "similarity_boost": 0.75,
+            "style": 0.0,
+            "use_speaker_boost": True
+        }
+        self.voice_settings = voice_settings or default_settings
+
+        if not self.api_key:
+            raise ValueError("ElevenLabs API key not found. Please set ELEVENLABS_API_KEY environment variable.")
+        if not self.voice_id:
+            raise ValueError("ElevenLabs voice ID not found. Please set ELEVENLABS_DEFAULT_VOICE_ID environment variable.")
+
+        super().__init__(**kwargs)
+
+    def get_data_hash(self, input_data: dict) -> str:
+        """
+        Generates a hash based on the input data dictionary.
+        The hash is used to create a unique identifier for the input data.
+
+        Parameters:
+            input_data (dict): A dictionary of input data (e.g., text, voice, etc.).
+
+        Returns:
+            str: The generated hash as a string.
+        """
+        # Convert the input data dictionary to a JSON string (sorted for consistency)
+        data_str = json.dumps(input_data, sort_keys=True)
+        # Generate a SHA-256 hash of the JSON string
+        return hashlib.sha256(data_str.encode('utf-8')).hexdigest()
+
+    def text_to_speech(self, text: str, output_file: str) -> str:
+        """
+        Generate audio using ElevenLabs API with robust error handling.
+
+        Args:
+            text (str): Text to synthesize
+            output_file (str): Path to save the audio file
+
+        Returns:
+            str: Path to the generated audio file
+        """
+        url = f"https://api.elevenlabs.io/v1/text-to-speech/{self.voice_id}"
+
+        headers = {
+            "Accept": "audio/mpeg",
+            "Content-Type": "application/json",
+            "xi-api-key": self.api_key
+        }
+
+        # Use the model and voice settings configured in __init__ rather than
+        # hardcoded values, so constructor arguments actually take effect.
+        data = {
+            "text": text,
+            "model_id": self.model_id,
+            "voice_settings": self.voice_settings
+        }
+
+        max_retries = 3
+        retry_delay = 1
+
+        for attempt in range(max_retries):
+            try:
+                response = requests.post(url, json=data, headers=headers, timeout=30)
+                response.raise_for_status()
+
+                # Save the audio file
+                with open(output_file, 'wb') as f:
+                    f.write(response.content)
+
+                return output_file
+
+            except requests.exceptions.RequestException as e:
+                # ConnectionError and Timeout are subclasses of RequestException,
+                # so a single handler covers all transport failures.
+                print(f"Request error (attempt {attempt + 1}/{max_retries}): {e}")
+                if attempt < max_retries - 1:
+                    time.sleep(retry_delay * (attempt + 1))
+                    continue
+                # If all retries failed, create a silent audio file as fallback
+                self._create_silent_audio(output_file, duration=len(text) * 0.1)  # Rough duration estimate
+                return output_file
+
+        # This should not be reached, but added for safety
+        self._create_silent_audio(output_file, duration=len(text) * 0.1)
+        return output_file
+
+    def _create_silent_audio(self, output_file: str, duration: float):
+        """Create a silent audio file as fallback when the API fails."""
+        try:
+            import numpy as np
+            from scipy.io import wavfile
+
+            sample_rate = 22050
+            samples = int(sample_rate * duration)
+            silence = np.zeros(samples, dtype=np.float32)
+
+            # Convert to 16-bit PCM and write WAV data to the requested path so
+            # callers find a file where they expect one (even if it ends in .mp3).
+            silence_int = (silence * 32767).astype(np.int16)
+            wavfile.write(output_file, sample_rate, silence_int)
+
+            print(f"Created silent audio fallback: {output_file}")
+
+        except Exception as e:
+            print(f"Failed to create silent audio: {e}")
+            # Create an empty file as last resort
+            with open(output_file, 'wb') as f:
+                f.write(b"")
+
+    def generate_from_text(self, text: str, cache_dir: str = None, path: str = None) -> dict:
+        """
+        Generate audio from text with caching support.
+
+        Args:
+            text: Text to convert to speech
+            cache_dir: Directory for caching audio files
+            path: Optional specific path for the audio file
+
+        Returns:
+            Dictionary with audio generation details
+        """
+        if cache_dir is None:
+            cache_dir = self.cache_dir
+
+        input_data = {
+            "input_text": text,
+            "service": "elevenlabs",
+            "voice_id": self.voice_id,
+            "model_id": self.model_id,
+            "voice_settings": self.voice_settings
+        }
+
+        cached_result = self.get_cached_result(input_data, cache_dir)
+        if cached_result is not None:
+            return cached_result
+
+        if path is None:
+            audio_path = self.get_data_hash(input_data) + ".mp3"
+        else:
+            audio_path = path
+
+        # Generate audio file using ElevenLabs API
+        full_audio_path = str(Path(cache_dir) / audio_path)
+        self.text_to_speech(text, full_audio_path)
+
+        json_dict = {
+            "input_text": text,
+            "input_data": input_data,
+            "original_audio": audio_path,
+        }
+
+        return json_dict
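
Caching here hinges on `get_data_hash` producing the same key for logically identical requests; `json.dumps(..., sort_keys=True)` makes the hash independent of dict insertion order. A standalone sketch of the same scheme:

```python
import hashlib
import json

def get_data_hash(input_data: dict) -> str:
    # Sorted keys normalize the serialization so equal dicts hash equally.
    data_str = json.dumps(input_data, sort_keys=True)
    return hashlib.sha256(data_str.encode('utf-8')).hexdigest()

a = get_data_hash({"input_text": "hello", "voice_id": "abc"})
b = get_data_hash({"voice_id": "abc", "input_text": "hello"})
print(a == b)  # True
```

The 64-character hex digest doubles as the cached file's basename, so re-requesting the same text, voice, and settings hits the cache instead of the API.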
task_generator/prompts_raw/prompt_code_generation.txt CHANGED
@@ -17,7 +17,7 @@ Scene Technical Implementation:
17
 
18
  1. **Scene Class:** Class name `Scene{scene_number}`, where `{scene_number}` is replaced by the scene number (e.g., `Scene1`, `Scene2`). The scene class should at least inherit from `VoiceoverScene`. However, you can add more Manim Scene classes on top of VoiceoverScene for multiple inheritance if needed.
19
  2. **Imports:** Include ALL necessary imports explicitly at the top of the file, based on used Manim classes, functions, colors, and constants. Do not rely on implicit imports. Double-check for required modules, classes, functions, colors, and constants, *ensuring all imports are valid and consistent with the Manim Documentation*. **Include imports for any used Manim plugins.**
20
- 3. **Speech Service:** Initialize `KokoroService()`. You MUST import like this: `from src.utils.kokoro_voiceover import KokoroService` as this is our custom voiceover service.
21
  4. **Reusable Animations:** Implement functions for each animation sequence to create modular and reusable code. Structure code into well-defined functions, following function definition patterns from Manim Documentation.
22
  5. **Voiceover:** Use `with self.voiceover(text="...")` for speech synchronization, precisely matching the narration script and animation timings from the Animation and Narration Plan.
23
  6. **Comments:** Add clear and concise comments for complex animations, spatial logic (positioning, arrangements), and object lifecycle management. *Use comments extensively to explain code logic, especially for spatial positioning, animation sequences, and constraint enforcement, mirroring commenting style in Manim Documentation*. **Add comments to explain the purpose and usage of any Manim plugins.**
@@ -51,7 +51,7 @@ Scene Technical Implementation:
51
  * **Reusable Object Creation Functions:** Define reusable functions within helper classes for creating specific Manim objects (e.g., `create_axes`, `create_formula_tex`, `create_explanation_text`).
52
  * **Clear Comments and Variable Names:** Use clear, concise comments to explain code sections and logic. Employ descriptive variable names (e.g., `linear_function_formula`, `logistic_plot`) for better readability.
53
  * **Text Elements:** Create text elements using `Tex` or `MathTex` for formulas and explanations, styling them with `color` and `font_size` as needed.
54
- * **Manim Best Practices:** Follow Manim best practices, including using `VoiceoverScene`, `KokoroService`, common Manim objects, animations, relative positioning, and predefined colors.
55
 
56
  You MUST generate the Python code in the following format (from <CODE> to </CODE>):
57
  <CODE>
@@ -59,7 +59,8 @@ You MUST generate the Python code in the following format (from <CODE> to </CODE
59
  from manim import *
60
  from manim import config as global_config
61
  from manim_voiceover import VoiceoverScene
62
- from src.utils.kokoro_voiceover import KokoroService # You MUST import like this as this is our custom voiceover service.
 
63
 
64
  # plugins imports, don't change the import statements
65
  from manim_circuit import *
@@ -68,6 +69,14 @@ from manim_chemistry import *
68
  from manim_dsa import *
69
  from manim_ml import *
70
 
 
 
 
 
 
 
 
 
71
  # Helper Functions/Classes (Implement and use helper classes and functions for improved code reusability and organization)
72
  class Scene{scene_number}_Helper: # Example: class Scene1_Helper:
73
  # Helper class containing utility functions for scene {scene_number}.
@@ -115,7 +124,7 @@ class Scene{scene_number}(VoiceoverScene, MovingCameraScene): # Note: You can a
      # Reminder: This scene class is fully self-contained. There is no dependency on the implementation from previous or subsequent scenes.
      def construct(self):
          # Initialize speech service
-         self.set_speech_service(KokoroService())

          # Instantiate helper class (as per plan)
          helper = Scene{scene_number}_Helper(self) # Example: helper = Scene1_Helper(self)
@@ -133,7 +142,7 @@ class Scene{scene_number}(VoiceoverScene, MovingCameraScene): # Note: You can a
          with self.voiceover(text="[Narration for Stage 1 - from Animation and Narration Plan]") as tracker: # Voiceover for Stage 1
              # Object Creation using helper functions (as per plan)
              axes = helper.create_axes() # Example: axes = helper.create_axes()
-             formula = helper.create_formula_tex("...", BLUE_C) # Example: formula = helper.create_formula_tex("...", BLUE_C)
              explanation = helper.create_explanation_text("...") # Example: explanation = helper.create_explanation_text("...")

              # Positioning objects (relative positioning, constraint validation - as per plan)
@@ -161,6 +170,9 @@ The `get_center_of_edges` helper function is particularly useful for:
  1. Finding the midpoint of polygon edges for label placement
  2. Calculating offset positions for side labels that don't overlap with the polygon
  3. Creating consistent label positioning across different polygon sizes and orientations

  Example usage in your scene:
  ```python
@@ -172,4 +184,31 @@ def label_triangle_sides(self, triangle, labels=["a", "b", "c"]):
          tex = MathTex(label).move_to(center)
          labeled_sides.add(tex)
      return labeled_sides
- ```
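The edge-midpoint idea behind `get_center_of_edges` can be sketched without Manim. The helper below is a hypothetical, framework-free illustration (the name `edge_label_positions` and the `buff` default are not part of the repo): average each pair of consecutive vertices, then push the midpoint outward from the polygon's centroid so labels do not overlap the shape.

```python
# Hypothetical sketch of edge-midpoint label placement (not repo code).
def edge_label_positions(vertices, buff=0.4):
    """Return one label anchor per polygon edge; vertices are 2D tuples."""
    n = len(vertices)
    # Centroid of the polygon's vertices, used as the "inside" reference point
    cx = sum(v[0] for v in vertices) / n
    cy = sum(v[1] for v in vertices) / n
    anchors = []
    for i in range(n):
        (x1, y1), (x2, y2) = vertices[i], vertices[(i + 1) % n]
        mx, my = (x1 + x2) / 2, (y1 + y2) / 2  # midpoint of the edge
        # Offset away from the centroid so the label sits outside the polygon
        dx, dy = mx - cx, my - cy
        norm = (dx * dx + dy * dy) ** 0.5 or 1.0
        anchors.append((mx + buff * dx / norm, my + buff * dy / norm))
    return anchors
```

In a scene, each returned anchor would become the `move_to` target for one `MathTex` label, as in the `label_triangle_sides` example above.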
  1. **Scene Class:** Class name `Scene{scene_number}`, where `{scene_number}` is replaced by the scene number (e.g., `Scene1`, `Scene2`). The scene class should at least inherit from `VoiceoverScene`. However, you can add more Manim Scene classes on top of VoiceoverScene for multiple inheritance if needed.
  2. **Imports:** Include ALL necessary imports explicitly at the top of the file, based on used Manim classes, functions, colors, and constants. Do not rely on implicit imports. Double-check for required modules, classes, functions, colors, and constants, *ensuring all imports are valid and consistent with the Manim Documentation*. **Include imports for any used Manim plugins.**
+ 3. **Speech Service:** Initialize `ElevenLabsService()`. You MUST import like this: `from src.utils.elevenlabs_voiceover import ElevenLabsService` as this is our custom voiceover service.
  4. **Reusable Animations:** Implement functions for each animation sequence to create modular and reusable code. Structure code into well-defined functions, following function definition patterns from Manim Documentation.
  5. **Voiceover:** Use `with self.voiceover(text="...")` for speech synchronization, precisely matching the narration script and animation timings from the Animation and Narration Plan.
  6. **Comments:** Add clear and concise comments for complex animations, spatial logic (positioning, arrangements), and object lifecycle management. *Use comments extensively to explain code logic, especially for spatial positioning, animation sequences, and constraint enforcement, mirroring commenting style in Manim Documentation*. **Add comments to explain the purpose and usage of any Manim plugins.**
  * **Reusable Object Creation Functions:** Define reusable functions within helper classes for creating specific Manim objects (e.g., `create_axes`, `create_formula_tex`, `create_explanation_text`).
  * **Clear Comments and Variable Names:** Use clear, concise comments to explain code sections and logic. Employ descriptive variable names (e.g., `linear_function_formula`, `logistic_plot`) for better readability.
  * **Text Elements:** Create text elements using `Tex` or `MathTex` for formulas and explanations, styling them with `color` and `font_size` as needed.
+ * **Manim Best Practices:** Follow Manim best practices, including using `VoiceoverScene`, `ElevenLabsService`, common Manim objects, animations, relative positioning, and predefined colors.

  You MUST generate the Python code in the following format (from <CODE> to </CODE>):
  <CODE>
  from manim import *
  from manim import config as global_config
  from manim_voiceover import VoiceoverScene
+ import sys
+ from src.utils.elevenlabs_voiceover import ElevenLabsService # You MUST import like this as this is our custom voiceover service.

  # plugins imports, don't change the import statements
  from manim_circuit import *

  from manim_dsa import *
  from manim_ml import *

+ # Define frame boundaries for constraint checking
+ FRAME_WIDTH = 14.0
+ FRAME_HEIGHT = 8.0
+ FRAME_X_MIN = -FRAME_WIDTH / 2
+ FRAME_X_MAX = FRAME_WIDTH / 2
+ FRAME_Y_MIN = -FRAME_HEIGHT / 2
+ FRAME_Y_MAX = FRAME_HEIGHT / 2
+
  # Helper Functions/Classes (Implement and use helper classes and functions for improved code reusability and organization)
  class Scene{scene_number}_Helper: # Example: class Scene1_Helper:
      # Helper class containing utility functions for scene {scene_number}.

      # Reminder: This scene class is fully self-contained. There is no dependency on the implementation from previous or subsequent scenes.
      def construct(self):
          # Initialize speech service
+         self.init_voiceover(ElevenLabsService())

          # Instantiate helper class (as per plan)
          helper = Scene{scene_number}_Helper(self) # Example: helper = Scene1_Helper(self)

          with self.voiceover(text="[Narration for Stage 1 - from Animation and Narration Plan]") as tracker: # Voiceover for Stage 1
              # Object Creation using helper functions (as per plan)
              axes = helper.create_axes() # Example: axes = helper.create_axes()
+             formula = helper.create_formula_tex(r"...", BLUE_C) # Example: formula = helper.create_formula_tex("...", BLUE_C)
              explanation = helper.create_explanation_text("...") # Example: explanation = helper.create_explanation_text("...")

              # Positioning objects (relative positioning, constraint validation - as per plan)
  1. Finding the midpoint of polygon edges for label placement
  2. Calculating offset positions for side labels that don't overlap with the polygon
  3. Creating consistent label positioning across different polygon sizes and orientations
+ 4. Using raw strings for Tex and MathTex (e.g. r"my\_string") is recommended to avoid issues with escape characters.
+ 5. When using animations like `Write` on multiple objects, either apply the animation to each object separately: `self.play(Write(obj1), Write(obj2))` or group them in a `VGroup`: `self.play(Write(VGroup(obj1, obj2)))`.
+ 6. Do not repeat keyword arguments in function calls.

  Example usage in your scene:
  ```python
184
  tex = MathTex(label).move_to(center)
185
  labeled_sides.add(tex)
186
  return labeled_sides
187
+ ```
188
+
189
+ **CRITICAL ASSET GUIDELINES:**
190
+ - NEVER use `SVGMobject()` with external files like "car.svg", "person.svg", etc.
191
+ - ALWAYS use built-in Manim objects and basic geometric shapes:
192
+ * For cars: Use `Rectangle()` with `RoundedRectangle()` for wheels
193
+ * For people: Use `Circle()` for head, `Rectangle()` for body
194
+ * For objects: Use `Circle()`, `Rectangle()`, `Triangle()`, `Polygon()`, etc.
195
+ - Use `Text()` or `Tex()` for labels, avoid complex LaTeX when possible
196
+ - Create simple visual representations rather than loading external assets
197
+
198
+ **TEXT RENDERING BEST PRACTICES:**
199
+ - Prefer `Text()` over `Tex()` for simple labels and titles
200
+ - For mathematical expressions, use basic `MathTex()` with simple formulas
201
+ - Avoid complex LaTeX packages or special characters that might cause compilation issues
202
+ - Use `DecimalNumber()` for numeric displays instead of LaTeX formatting
203
+
204
+ **MANIM API BEST PRACTICES:**
205
+ - For curved lines, use `CurvedArrow()` or `ArcBetweenPoints()` instead of manually creating bezier curves
206
+ - Avoid using `add_cubic_bezier_curve_to()` - use built-in curved objects instead
207
+ - For dashed lines: use `DashedLine(start, end)` without additional curve modifications
208
+ - For simple curves: use `Arc()`, `Circle()`, or `ArcBetweenPoints()`
209
+ - Check Manim documentation for correct method signatures and parameters
210
+ - **Arrow3D Usage**: When using `Arrow3D`, do NOT use the `buff` parameter as it's not supported. Use only `start`, `end`, `color`, and `thickness` parameters
211
+ - **3D Objects**: For 3D scenes, ensure proper camera setup and avoid mixing 2D positioning methods with 3D objects
212
+
213
+ **VOICEOVER INITIALIZATION:**
214
+ - ALWAYS use `
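The FRAME_* constants introduced in this commit are described as being "for constraint checking" but the diff does not show a checker. A minimal, framework-free sketch of such a check, assuming the constants above (the helper name `within_frame` is hypothetical, not part of the repo), might look like:

```python
# Frame boundaries as defined in the generated scene code (assumed values)
FRAME_WIDTH = 14.0
FRAME_HEIGHT = 8.0
FRAME_X_MIN = -FRAME_WIDTH / 2
FRAME_X_MAX = FRAME_WIDTH / 2
FRAME_Y_MIN = -FRAME_HEIGHT / 2
FRAME_Y_MAX = FRAME_HEIGHT / 2

def within_frame(center, width, height, margin=0.0):
    """Check that a box of the given size, placed at `center`, stays on screen."""
    x, y = center
    return (FRAME_X_MIN + margin <= x - width / 2
            and x + width / 2 <= FRAME_X_MAX - margin
            and FRAME_Y_MIN + margin <= y - height / 2
            and y + height / 2 <= FRAME_Y_MAX - margin)
```

A scene could call this on a mobject's center and bounding-box size before playing a positioning animation, and reposition the object if the check fails.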
test_deployment.py ADDED
@@ -0,0 +1,202 @@
+ #!/usr/bin/env python3
+ """
+ Test script to verify deployment readiness for Theorem Explanation Agent
+ """
+
+ import os
+ import sys
+ import traceback
+ from pathlib import Path
+
+ def test_imports():
+     """Test if all required imports work."""
+     print("Testing imports...")
+
+     try:
+         import gradio as gr
+         print("✅ Gradio imported successfully")
+         print(f" Version: {gr.__version__}")
+     except ImportError as e:
+         print(f"❌ Failed to import Gradio: {e}")
+         return False
+
+     try:
+         import numpy as np
+         print("✅ NumPy imported successfully")
+     except ImportError as e:
+         print(f"❌ Failed to import NumPy: {e}")
+         return False
+
+     try:
+         import requests
+         print("✅ Requests imported successfully")
+     except ImportError as e:
+         print(f"❌ Failed to import Requests: {e}")
+         return False
+
+     # Test optional dependencies
+     try:
+         import manim
+         print("✅ Manim imported successfully")
+     except ImportError:
+         print("⚠️ Manim not available - will run in demo mode")
+
+     return True
+
+ def test_app_functionality():
47
+ """Test if the app can be imported and basic functions work."""
48
+ print("\nTesting app functionality...")
49
+
50
+ try:
51
+ # Set demo mode for testing
52
+ os.environ["DEMO_MODE"] = "true"
53
+
54
+ # Import app components
55
+ sys.path.insert(0, str(Path(__file__).parent))
56
+ from app import (
57
+ initialize_video_generator,
58
+ simulate_video_generation,
59
+ list_available_models,
60
+ get_example_topics
61
+ )
62
+
63
+ print("βœ… App components imported successfully")
64
+
65
+ # Test initialization
66
+ init_result = initialize_video_generator()
67
+ print(f" Initialization: {init_result}")
68
+
69
+ # Test simulation
70
+ sim_result = simulate_video_generation("test topic", "test context", 3)
71
+ print(f" Simulation result: {sim_result['success']}")
72
+
73
+ # Test model listing
74
+ models = list_available_models()
75
+ print(f" Available models: {len(models)} models")
76
+
77
+ # Test examples
78
+ examples = get_example_topics()
79
+ print(f" Example topics: {len(examples)} examples")
80
+
81
+ print("βœ… Basic app functionality works")
82
+ return True
83
+
84
+ except Exception as e:
85
+ print(f"❌ App functionality test failed: {e}")
86
+ traceback.print_exc()
87
+ return False
88
+
89
+ def test_gradio_interface():
90
+ """Test if Gradio interface can be created."""
91
+ print("\nTesting Gradio interface...")
92
+
93
+ try:
94
+ os.environ["DEMO_MODE"] = "true"
95
+ from app import create_gradio_interface, create_api_endpoints
96
+
97
+ # Test main interface creation
98
+ interface = create_gradio_interface()
99
+ print("βœ… Main Gradio interface created successfully")
100
+
101
+ # Test API interface creation
102
+ api_interface = create_api_endpoints()
103
+ print("βœ… API interface created successfully")
104
+
105
+ return True
106
+
107
+ except Exception as e:
108
+ print(f"❌ Gradio interface test failed: {e}")
109
+ traceback.print_exc()
110
+ return False
111
+
112
+ def test_environment():
113
+ """Test environment variables and configuration."""
114
+ print("\nTesting environment...")
115
+
116
+ # Check demo mode
117
+ demo_mode = os.getenv("DEMO_MODE", "false").lower() == "true"
118
+ print(f" Demo mode: {demo_mode}")
119
+
120
+ # Check for API keys (optional)
121
+ api_keys = {
122
+ "GEMINI_API_KEY": os.getenv("GEMINI_API_KEY"),
123
+ "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY"),
124
+ "ELEVENLABS_API_KEY": os.getenv("ELEVENLABS_API_KEY")
125
+ }
126
+
127
+ for key, value in api_keys.items():
128
+ if value:
129
+ print(f" {key}: βœ… Set")
130
+ else:
131
+ print(f" {key}: ⚠️ Not set (demo mode will work)")
132
+
133
+ # Check Python version
134
+ python_version = sys.version_info
135
+ print(f" Python version: {python_version.major}.{python_version.minor}.{python_version.micro}")
136
+
137
+ if python_version >= (3, 8):
138
+ print("βœ… Python version is compatible")
139
+ else:
140
+ print("❌ Python version too old (requires 3.8+)")
141
+ return False
142
+
143
+ return True
144
+
+ def main():
+     """Run all tests."""
+     print("🧪 Testing Theorem Explanation Agent Deployment Readiness\n")
+
+     tests = [
+         ("Environment", test_environment),
+         ("Imports", test_imports),
+         ("App Functionality", test_app_functionality),
+         ("Gradio Interface", test_gradio_interface)
+     ]
+
+     results = []
+     for test_name, test_func in tests:
+         print(f"\n{'='*50}")
+         print(f"Running {test_name} test...")
+         print("="*50)
+
+         try:
+             result = test_func()
+             results.append((test_name, result))
+         except Exception as e:
+             print(f"❌ {test_name} test crashed: {e}")
+             results.append((test_name, False))
+
+     # Summary
+     print(f"\n{'='*50}")
+     print("TEST SUMMARY")
+     print("="*50)
+
+     all_passed = True
+     for test_name, result in results:
+         status = "✅ PASS" if result else "❌ FAIL"
+         print(f"{test_name}: {status}")
+         if not result:
+             all_passed = False
+
+     print(f"\n{'='*50}")
+     if all_passed:
+         print("🎉 ALL TESTS PASSED - Ready for deployment!")
+         print("\n📋 Deployment Instructions:")
+         print("1. Push code to GitHub repository")
+         print("2. Create new Hugging Face Space")
+         print("3. Connect to your repository")
+         print("4. Set DEMO_MODE=false in Space settings (if you have API keys)")
+         print("5. Add API keys as Space secrets (optional)")
+         print("6. Deploy and test!")
+     else:
+         print("❌ SOME TESTS FAILED - Fix issues before deployment")
+         print("\n🔧 Recommended actions:")
+         print("- Install missing dependencies")
+         print("- Fix import errors")
+         print("- Ensure Python 3.8+ is being used")
+
+     return all_passed
+
+ if __name__ == "__main__":
+     success = main()
+     sys.exit(0 if success else 1)