LazySloth committed on
Commit 8f1899c · verified · 1 Parent(s): 6b746d4

Added README.md

  1. README.md +338 -0
README.md ADDED
@@ -0,0 +1,338 @@
1
+ # manimator
2
+
3
+ ![manimator](https://github.com/HyperCluster-Tech/manimator/blob/main/assets/manimator.png)
4
+ [![GitHub Stars](https://img.shields.io/github/stars/HyperCluster-Tech/manimator?style=social)](https://github.com/HyperCluster-Tech/manimator/stargazers)
5
+ [![GitHub Forks](https://img.shields.io/github/forks/HyperCluster-Tech/manimator?style=social)](https://github.com/HyperCluster-Tech/manimator/network/members)
6
+ [![GitHub Issues](https://img.shields.io/github/issues/HyperCluster-Tech/manimator)](https://github.com/HyperCluster-Tech/manimator/issues)
7
+ [![GitHub Pull Requests](https://img.shields.io/github/issues-pr/HyperCluster-Tech/manimator)](https://github.com/HyperCluster-Tech/manimator/pulls)
8
+ [![License](https://img.shields.io/github/license/HyperCluster-Tech/manimator)](https://github.com/HyperCluster-Tech/manimator/blob/main/LICENSE)
9
+ [![Website](https://img.shields.io/badge/Website-manimator.hypercluster.tech-blue)](https://manimator.hypercluster.tech/)
10
+
11
+ ### What is _manimator_?
12
+
13
+ manimator is a tool that transforms research papers and mathematical concepts into stunning visual explanations, powered by AI and the [manim](https://github.com/ManimCommunity/manim) engine.
14
+
15
+ Building on the incredible work by 3Blue1Brown and the manim community, _manimator_ turns complex research papers and user prompts into clear, animated explainer videos.
16
+
17
+ ### 🔗 Try it out:
18
+
19
+ - Gradio Demo: [![On Gradio (Hugging Face)](https://huggingface.co/datasets/huggingface/badges/raw/main/open-in-hf-spaces-md-dark.svg)](https://huggingface.co/spaces/HyperCluster/manimator)
20
+ - Or replace `arxiv.org` with `manimator.hypercluster.tech` in any arXiv PDF URL for instant visualizations!
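As a rough illustration of the URL swap described above, the following Python sketch rewrites an arXiv link by replacing only the host. The helper name `to_manimator_url` is hypothetical and not part of manimator, and it assumes the rest of the URL (path and query) carries over unchanged.

```python
from urllib.parse import urlsplit, urlunsplit

MANIMATOR_HOST = "manimator.hypercluster.tech"  # host named in this README


def to_manimator_url(arxiv_url: str) -> str:
    """Swap the arXiv host for the manimator host, keeping path and query.

    Hypothetical helper for illustration; not part of the manimator codebase.
    """
    parts = urlsplit(arxiv_url)
    # Only rewrite links that actually point at arXiv.
    if parts.hostname not in ("arxiv.org", "www.arxiv.org"):
        raise ValueError(f"not an arXiv URL: {arxiv_url!r}")
    return urlunsplit(parts._replace(netloc=MANIMATOR_HOST))
```

For example, `to_manimator_url("https://arxiv.org/pdf/2507.14306")` yields `https://manimator.hypercluster.tech/pdf/2507.14306`.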
21
+
22
+ ### 🌟 Highlights so far:
23
+
24
+ - Over **1,000 uses** within 24 hours of launch and over **5,000 uses** within a week
25
+ - Featured as Hugging Face's **Space of the Week**!
26
+ - 16th in Hugging Face's Top Trending Spaces
27
+ - Take a look at the paper on arXiv here: https://www.arxiv.org/abs/2507.14306
28
+
29
+ ## 🎥 Demo Videos:
30
+
31
+ <table border="0" style="width: 100%; text-align: center; margin: 20px 0;">
32
+ <tr>
33
+ <td width="50%">
34
+ <video src="https://github.com/user-attachments/assets/5aa9ebea-06f1-489d-9a68-c8c62cdc6915" width="100%" controls autoplay loop></video>
35
+ <p align="center">arXiv Usage Walkthrough</p>
36
+ </td>
37
+ <td width="50%">
38
+ <video src="https://github.com/user-attachments/assets/820142c4-7931-4aa8-b9b7-c1d5d46b23b5" width="100%" controls autoplay loop></video>
39
+ <p align="center">Gradio Walkthrough</p>
40
+ </td>
41
+ </tr>
42
+ </table>
43
+
44
+ ## Installation
45
+
46
+ > [!IMPORTANT]
47
+ > This project uses [Poetry](https://python-poetry.org/) to manage Python packages and dependencies. Install it with the [official installer](https://python-poetry.org/docs/#installing-with-the-official-installer) to run this project, or use the Docker image instead.
48
+ > This project also depends on the [manim](https://github.com/ManimCommunity/manim) engine, which has its own system dependencies. See the [manim installation guide](https://docs.manim.community/en/stable/installation.html) for details.
49
+
50
+ ```bash
52
+ git clone https://github.com/HyperCluster-Tech/manimator
53
+ cd manimator
54
+ ```
55
+
56
+ Install Dependencies:
57
+ `poetry install`
58
+
59
+ Activate the environment:
60
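+ `eval $(poetry env activate)`
61
+
62
+ (In Poetry 2.0 and later, `poetry env activate` only prints the activation command, so its output has to be evaluated by the shell. If you're using a version before Poetry 2.0, use `poetry shell` instead.)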
63
+
64
+ ## Usage
65
+
66
+ After installing the project dependencies and the manim dependencies, set the environment variables in a `.env` file, using `.env.example` as a template.
67
+
68
+ Run the FastAPI server:
69
+
70
+ ```
71
+ poetry run app
72
+ ```
73
+
74
+ Then visit `http://localhost:8000/docs` to open Swagger UI.
75
+
76
+ Run the Gradio interface:
77
+
78
+ ```
79
+ poetry run gradio-app
80
+ ```
81
+
82
+ Then open `http://localhost:7860` in your browser.
83
+
84
+ ### Notes
85
+
86
+ To change the models, set the model environment variables using [LiteLLM syntax](https://docs.litellm.ai/docs/providers) and provide the API keys for the corresponding providers.
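LiteLLM model identifiers generally take the form `provider/model-name`. The sketch below shows one way to read such an identifier from the environment and split it. The variable name `CODE_GEN_MODEL` and the fallback value are assumptions made for illustration; check `.env.example` for the names manimator actually reads.

```python
import os
from typing import Optional, Tuple


def split_model_id(model_id: str) -> Tuple[Optional[str], str]:
    """Split 'provider/model' into (provider, model); provider is None if absent."""
    provider, sep, name = model_id.partition("/")
    if not sep:
        # No provider prefix, e.g. a bare model name.
        return None, model_id
    return provider, name


if __name__ == "__main__":
    # CODE_GEN_MODEL is a hypothetical variable name, not taken from the project.
    model_id = os.environ.get("CODE_GEN_MODEL", "groq/llama-3.3-70b-versatile")
    print(split_model_id(model_id))
```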
87
+
88
+ To tune the prompts for your use case, modify the system prompts in `utils/system_prompts.py` and the few-shot examples in `few_shot/few_shot_prompts.py`.
89
+
90
+ ## 🛳️ Docker
91
+
92
+ To use manimator with Docker, execute the following commands:
93
+
94
+ 1. Clone the manimator repo to get the Dockerfile (we plan to publish the image on Docker Hub soon)
95
+ 2. Build the image locally, then run the container, exposing port 8000 for the FastAPI server or 7860 for the Gradio interface
96
+
97
+ Build the Docker image locally:
98
+
99
+ `docker build -t manimator .`
100
+
101
+ To run the FastAPI server:
102
+
103
+ `docker run -p 8000:8000 manimator`
104
+
105
+ Or, to run the Gradio interface:
106
+
107
+ `docker run -p 7860:7860 manimator`
108
+
109
+ <details>
110
+ <summary><h2>API Endpoints</h2></summary>
111
+
112
+ - [API Endpoints](#api-endpoints)
113
+   - [Health Check](#health-check)
114
+   - [PDF Processing](#pdf-processing)
115
+     - [Generate PDF Scene](#generate-pdf-scene)
116
+     - [Process ArXiv PDF](#process-arxiv-pdf)
117
+   - [Scene Generation](#scene-generation)
118
+     - [Generate Prompt Scene](#generate-prompt-scene)
119
+   - [Animation Generation](#animation-generation)
120
+     - [Generate Animation](#generate-animation)
121
+
122
+ ### Health Check
123
+
124
+ #### Check API Health Status
125
+
126
+ Endpoint: `/health-check`
127
+ Method: GET
128
+
129
+ Returns the health status of the API.
130
+
131
+ Response:
132
+
133
+ ```json
134
+ {
135
+ "status": "ok"
136
+ }
137
+ ```
138
+
139
+ Curl command:
140
+
141
+ ```bash
142
+ curl http://localhost:8000/health-check
143
+ ```
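The same check can be scripted. This is a minimal Python sketch using only the standard library; the function names are illustrative, and it assumes the server runs at the default `localhost:8000` address used elsewhere in this README.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default FastAPI address used in this README


def is_healthy(raw_body: bytes) -> bool:
    """Return True if the body matches the documented {"status": "ok"} shape."""
    try:
        payload = json.loads(raw_body)
    except ValueError:
        # Covers invalid JSON and undecodable bytes.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_health(base_url: str = BASE_URL, timeout: float = 5.0) -> bool:
    """Call GET /health-check; requires the server to be running."""
    url = f"{base_url}/health-check"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status == 200 and is_healthy(resp.read())
```

With the server running, `check_health()` returns `True` when the API reports `ok`.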
144
+
145
+ ### PDF Processing
146
+
147
+ #### Generate PDF Scene
148
+
149
+ Endpoint: `/generate-pdf-scene`
150
+ Method: POST
151
+
152
+ Processes a PDF file and generates a scene description for animation.
153
+
154
+ Request:
155
+
156
+ - Content-Type: `multipart/form-data`
157
+ - Body: PDF file
158
+
159
+ Response:
160
+
161
+ ```json
162
+ {
163
+ "scene_description": "Generated scene description based on PDF content"
164
+ }
165
+ ```
166
+
167
+ Curl command:
168
+
169
+ ```bash
170
+ curl -X POST -F "file=@/path/to/file.pdf" http://localhost:8000/generate-pdf-scene
171
+ ```
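For callers who prefer Python over curl, the upload can be sketched with the standard library alone. The form field name `file` comes from the curl command above; the helper name `build_pdf_upload` and the hand-rolled multipart encoding are illustrative assumptions, not manimator code.

```python
import uuid
import urllib.request

BASE_URL = "http://localhost:8000"


def build_pdf_upload(filename: str, pdf_bytes: bytes,
                     base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a multipart/form-data POST carrying the PDF in a field named 'file'."""
    boundary = uuid.uuid4().hex  # random separator between multipart sections
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/generate-pdf-scene",
        data=head + pdf_bytes + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

Passing the returned request to `urllib.request.urlopen` sends it. A higher-level HTTP client would hide the boundary handling, but the explicit version shows what `-F` does on the wire.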
172
+
173
+ #### Process ArXiv PDF
174
+
175
+ Endpoint: `/pdf/{arxiv_id}`
176
+ Method: GET
177
+
178
+ Downloads and processes an arXiv paper by ID to generate a scene description.
179
+
180
+ Parameters:
181
+
182
+ - `arxiv_id`: The arXiv paper identifier
183
+
184
+ Response:
185
+
186
+ ```json
187
+ {
188
+ "scene_description": "Generated scene description based on arXiv paper"
189
+ }
190
+ ```
191
+
192
+ Curl command:
193
+
194
+ ```bash
195
+ curl http://localhost:8000/pdf/2312.12345
196
+ ```
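If you start from an arXiv link rather than a bare identifier, a small helper can build the endpoint URL. This sketch only recognises new-style identifiers (such as `2312.12345`, optionally with a version suffix); whether the endpoint accepts version suffixes or old-style identifiers is not stated here, so treat that as an assumption. The name `arxiv_endpoint` is hypothetical.

```python
import re

BASE_URL = "http://localhost:8000"

# Matches new-style arXiv identifiers such as 2507.14306 or 2507.14306v2.
_ARXIV_ID = re.compile(r"\d{4}\.\d{4,5}(?:v\d+)?")


def arxiv_endpoint(url_or_id: str, base_url: str = BASE_URL) -> str:
    """Build the /pdf/{arxiv_id} URL from an arXiv link or a bare identifier."""
    match = _ARXIV_ID.search(url_or_id)
    if match is None:
        raise ValueError(f"no arXiv identifier found in {url_or_id!r}")
    return f"{base_url}/pdf/{match.group(0)}"
```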
197
+
198
+ ### Scene Generation
199
+
200
+ #### Generate Prompt Scene
201
+
202
+ Endpoint: `/generate-prompt-scene`
203
+ Method: POST
204
+
205
+ Generates a scene description from a text prompt.
206
+
207
+ Request:
208
+
209
+ - Content-Type: `application/json`
210
+ - Body:
211
+
212
+ ```json
213
+ {
214
+ "prompt": "Your scene description prompt"
215
+ }
216
+ ```
217
+
218
+ Response:
219
+
220
+ ```json
221
+ {
222
+ "scene_description": "Generated scene description based on prompt"
223
+ }
224
+ ```
225
+
226
+ Curl command:
227
+
228
+ ```bash
229
+ curl -X POST \
230
+ -H "Content-Type: application/json" \
231
+ -d '{"prompt": "Explain how neural networks work"}' \
232
+ http://localhost:8000/generate-prompt-scene
233
+ ```
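An equivalent request in Python might look like the sketch below. It relies only on the request and response shapes documented above; the function names are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_prompt_request(prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the JSON POST for /generate-prompt-scene."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/generate-prompt-scene",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def scene_description(raw_body: bytes) -> str:
    """Extract the documented 'scene_description' field from a response body."""
    return json.loads(raw_body)["scene_description"]
```

With the server running, the two pieces combine as `scene_description(urllib.request.urlopen(build_prompt_request("...")).read())`.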
234
+
235
+ ### Animation Generation
236
+
237
+ #### Generate Animation
238
+
239
+ Endpoint: `/generate-animation`
240
+ Method: POST
241
+
242
+ Generates a Manim animation based on a text prompt.
243
+
244
+ Request:
245
+
246
+ - Content-Type: `application/json`
247
+ - Body:
248
+
249
+ ```json
250
+ {
251
+ "prompt": "Your animation prompt"
252
+ }
253
+ ```
254
+
255
+ Response:
256
+
257
+ - Content-Type: `video/mp4`
258
+ - Body: Generated MP4 animation file
259
+
260
+ Curl command:
261
+
262
+ ```bash
263
+ curl -X POST \
264
+ -H "Content-Type: application/json" \
265
+ -d '{"prompt": "Create an animation explaining quantum computing"}' \
266
+ --output animation.mp4 \
267
+ http://localhost:8000/generate-animation
268
+ ```
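Because the response body is the video itself rather than JSON, a Python client should stream it to disk instead of parsing it. The following is a minimal sketch under that assumption; `save_animation` and `generate_animation` are illustrative names.

```python
import json
import shutil
import urllib.request
from typing import BinaryIO

BASE_URL = "http://localhost:8000"


def save_animation(response: BinaryIO, out_path: str, chunk_size: int = 64 * 1024) -> int:
    """Stream a video/mp4 response body to disk and return the bytes written."""
    with open(out_path, "wb") as out_file:
        # Copy in chunks so the whole video never has to sit in memory.
        shutil.copyfileobj(response, out_file, chunk_size)
        return out_file.tell()


def generate_animation(prompt: str, out_path: str = "animation.mp4",
                       base_url: str = BASE_URL) -> int:
    """POST the prompt and save the returned MP4; requires a running server."""
    request = urllib.request.Request(
        f"{base_url}/generate-animation",
        data=json.dumps({"prompt": prompt}).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return save_animation(response, out_path)
```

Rendering can take a while, so callers may want to pass a generous `timeout` to `urlopen`.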
269
+
270
+ ### Error Handling
271
+
272
+ All endpoints follow consistent error handling:
273
+
274
+ - 400: Bad Request - Invalid input or missing required fields
275
+ - 500: Internal Server Error - Processing or generation failure
276
+
277
+ Error responses include a detail message:
278
+
279
+ ```json
280
+ {
281
+ "detail": "Error description"
282
+ }
283
+ ```
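A client can surface these errors by reading the `detail` field. The helper below is a hedged sketch: `describe_error` is a hypothetical name, and the fallback behaviour for non-JSON bodies is an assumption rather than documented API behaviour.

```python
import json


def describe_error(status: int, raw_body: bytes) -> str:
    """Turn an error response into a readable message using its 'detail' field."""
    try:
        payload = json.loads(raw_body)
        detail = payload.get("detail") if isinstance(payload, dict) else None
    except ValueError:
        detail = None
    if detail is None:
        # Fall back to the raw text when the body is not the documented JSON shape.
        detail = raw_body.decode("utf-8", errors="replace") or "no detail provided"
    kind = {400: "Bad Request", 500: "Internal Server Error"}.get(status, "Error")
    return f"{status} {kind}: {detail}"
```

With `urllib`, this would typically be called from an `except urllib.error.HTTPError as err:` block as `describe_error(err.code, err.read())`.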
284
+
285
+ ### Notes
286
+
287
+ 1. The API processes PDFs and generates animations using the Manim library
288
+ 2. Scene descriptions are generated using large language models (LLMs)
289
+ 3. Animations are rendered with Manim at low quality with preview enabled (the `-pql` flags)
290
+ 4. All generated files are handled in temporary directories and cleaned up automatically
291
+ 5. PDF processing includes automatic compression for optimal performance
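To make notes 3 and 4 concrete, here is one way a render step could combine a low-quality Manim invocation with a self-cleaning temporary directory. This is a sketch under stated assumptions, not manimator's actual implementation: the function names are hypothetical, the preview flag `-p` is dropped because opening a player is rarely useful on a server, and it assumes the Manim Community CLI (with its `--media_dir` option) is installed.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import List


def build_render_command(scene_file: str, scene_name: str, media_dir: str) -> List[str]:
    """Build a low-quality (-ql) Manim render command writing into media_dir."""
    return ["manim", "-ql", "--media_dir", media_dir, scene_file, scene_name]


def render_in_tempdir(scene_code: str, scene_name: str) -> bytes:
    """Render a scene inside a temporary directory and return the MP4 bytes."""
    with tempfile.TemporaryDirectory() as workdir:
        scene_file = Path(workdir) / "scene.py"
        scene_file.write_text(scene_code, encoding="utf-8")
        command = build_render_command(str(scene_file), scene_name, workdir)
        subprocess.run(command, check=True, cwd=workdir)
        # Manim names the final video after the scene class.
        videos = sorted(Path(workdir).rglob(f"{scene_name}.mp4"))
        if not videos:
            raise FileNotFoundError(f"no rendered video found for {scene_name}")
        # Read the bytes before the with-block exits and deletes the directory.
        return videos[0].read_bytes()
```

Returning the bytes before the `with` block exits is what allows the directory, including intermediate files, to be removed automatically.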
292
+
293
+ </details>
294
+
295
+ ## Coming Soon
296
+
297
+ - **Improved Generation Quality**
298
+ Enhance the clarity and precision of generated animations and videos.
299
+
300
+ - **Video Transcription**
301
+ Automatically generate scripts explaining how concepts in the video relate to the research paper.
302
+
303
+ - **Adding Audio**
304
+ Support for adding voiceovers and background music to create more engaging visualizations.
305
+
306
+ - **Chrome Extension**
307
+ Based on the code graciously contributed by [Dr. Seth Dobrin](https://drsethdobrin.com/) under the [Creative Commons License](https://github.com/HyperCluster-Tech/manimator-chrome-extension/blob/main/LICENSE), we will be releasing a Chrome Extension on the Chrome Web Store soon!
308
+
309
+ ## Limitations
310
+
311
+ - **LLM Limitations**
312
+ Accurate document parsing and code generation require large models such as Gemini, DeepSeek V3 and Qwen 2.5 Coder 32B, which are impractical to run locally on typical hardware.
313
+
314
+ - **Video Generation Limitations**
315
+ The generated video may sometimes exhibit overlap between scenes and rendered elements, leading to visual inconsistencies. Additionally, it sometimes fails to effectively visualize complex papers in a relevant and meaningful manner.
316
+
317
+ ## License
318
+
319
+ manimator is licensed under the MIT License. See `LICENSE` for more information.
320
+ The project uses the [Manim engine](https://github.com/ManimCommunity/manim) under the hood, which is double-licensed under the MIT license, with copyright by 3blue1brown LLC and copyright by Manim Community Developers.
321
+
322
+ ## Acknowledgements
323
+
324
+ We acknowledge the [Manim Community](https://www.manim.community/) and [3Blue1Brown](https://github.com/3b1b/manim) for developing and maintaining the Manim library, which serves as the foundation for this project. We also thank the project developers, [Samarth P](https://github.com/samarth777), [Vyoman Jain](https://github.com/VyoJ), [Shiva Golugula](https://github.com/Shiva4113), and [M Sai Sathvik](https://github.com/User-LazySloth), for their efforts in developing **manimator**.
325
+
326
+ Models and Providers being used:
327
+
328
+ - DeepSeek-V3
329
+ - Llama 3.3 70B via Groq
330
+ - Gemini 1.5 Flash / 2.0 Flash-experimental
331
+
332
+ ---
333
+
334
+ ## Contact
335
+
336
+ For any inquiries, please contact us at [email protected] or refer to our website [hypercluster.tech](https://www.hypercluster.tech/)
337
+
338
+ <img src="https://api.star-history.com/svg?repos=HyperCluster-Tech/manimator&type=Date" alt="Star History Chart">