Pravin Barapatre committed on
Commit db8251f · 0 Parent(s):

Pin dependencies for Hugging Face Spaces compatibility and remove submodule issue

.gitignore ADDED
@@ -0,0 +1,79 @@
1
+ # Generated videos
2
+ *.mp4
3
+ *.avi
4
+ *.mov
5
+ *.mkv
6
+ *.webm
7
+
8
+ # Model caches
9
+ .cache/
10
+ models/
11
+ checkpoints/
12
+
13
+ # Python
14
+ __pycache__/
15
+ *.py[cod]
16
+ *$py.class
17
+ *.so
18
+ .Python
19
+ build/
20
+ develop-eggs/
21
+ dist/
22
+ downloads/
23
+ eggs/
24
+ .eggs/
25
+ lib/
26
+ lib64/
27
+ parts/
28
+ sdist/
29
+ var/
30
+ wheels/
31
+ *.egg-info/
32
+ .installed.cfg
33
+ *.egg
34
+ MANIFEST
35
+
36
+ # Virtual environments
37
+ venv/
38
+ env/
39
+ ENV/
40
+ env.bak/
41
+ venv.bak/
42
+
43
+ # IDE
44
+ .vscode/
45
+ .idea/
46
+ *.swp
47
+ *.swo
48
+ *~
49
+
50
+ # OS
51
+ .DS_Store
52
+ .DS_Store?
53
+ ._*
54
+ .Spotlight-V100
55
+ .Trashes
56
+ ehthumbs.db
57
+ Thumbs.db
58
+
59
+ # Logs
60
+ *.log
61
+ logs/
62
+
63
+ # Temporary files
64
+ tmp/
65
+ temp/
66
+ *.tmp
67
+
68
+ # Hugging Face cache
69
+ .huggingface/
70
+
71
+ # Jupyter Notebook
72
+ .ipynb_checkpoints
73
+
74
+ # Environment variables
75
+ .env
76
+ .env.local
77
+ .env.development.local
78
+ .env.test.local
79
+ .env.production.local
.gradio/certificate.pem ADDED
@@ -0,0 +1,31 @@
1
+ -----BEGIN CERTIFICATE-----
2
+ MIIFazCCA1OgAwIBAgIRAIIQz7DSQONZRGPgu2OCiwAwDQYJKoZIhvcNAQELBQAw
3
+ TzELMAkGA1UEBhMCVVMxKTAnBgNVBAoTIEludGVybmV0IFNlY3VyaXR5IFJlc2Vh
4
+ cmNoIEdyb3VwMRUwEwYDVQQDEwxJU1JHIFJvb3QgWDEwHhcNMTUwNjA0MTEwNDM4
5
+ WhcNMzUwNjA0MTEwNDM4WjBPMQswCQYDVQQGEwJVUzEpMCcGA1UEChMgSW50ZXJu
6
+ ZXQgU2VjdXJpdHkgUmVzZWFyY2ggR3JvdXAxFTATBgNVBAMTDElTUkcgUm9vdCBY
7
+ MTCCAiIwDQYJKoZIhvcNAQEBBQADggIPADCCAgoCggIBAK3oJHP0FDfzm54rVygc
8
+ h77ct984kIxuPOZXoHj3dcKi/vVqbvYATyjb3miGbESTtrFj/RQSa78f0uoxmyF+
9
+ 0TM8ukj13Xnfs7j/EvEhmkvBioZxaUpmZmyPfjxwv60pIgbz5MDmgK7iS4+3mX6U
10
+ A5/TR5d8mUgjU+g4rk8Kb4Mu0UlXjIB0ttov0DiNewNwIRt18jA8+o+u3dpjq+sW
11
+ T8KOEUt+zwvo/7V3LvSye0rgTBIlDHCNAymg4VMk7BPZ7hm/ELNKjD+Jo2FR3qyH
12
+ B5T0Y3HsLuJvW5iB4YlcNHlsdu87kGJ55tukmi8mxdAQ4Q7e2RCOFvu396j3x+UC
13
+ B5iPNgiV5+I3lg02dZ77DnKxHZu8A/lJBdiB3QW0KtZB6awBdpUKD9jf1b0SHzUv
14
+ KBds0pjBqAlkd25HN7rOrFleaJ1/ctaJxQZBKT5ZPt0m9STJEadao0xAH0ahmbWn
15
+ OlFuhjuefXKnEgV4We0+UXgVCwOPjdAvBbI+e0ocS3MFEvzG6uBQE3xDk3SzynTn
16
+ jh8BCNAw1FtxNrQHusEwMFxIt4I7mKZ9YIqioymCzLq9gwQbooMDQaHWBfEbwrbw
17
+ qHyGO0aoSCqI3Haadr8faqU9GY/rOPNk3sgrDQoo//fb4hVC1CLQJ13hef4Y53CI
18
+ rU7m2Ys6xt0nUW7/vGT1M0NPAgMBAAGjQjBAMA4GA1UdDwEB/wQEAwIBBjAPBgNV
19
+ HRMBAf8EBTADAQH/MB0GA1UdDgQWBBR5tFnme7bl5AFzgAiIyBpY9umbbjANBgkq
20
+ hkiG9w0BAQsFAAOCAgEAVR9YqbyyqFDQDLHYGmkgJykIrGF1XIpu+ILlaS/V9lZL
21
+ ubhzEFnTIZd+50xx+7LSYK05qAvqFyFWhfFQDlnrzuBZ6brJFe+GnY+EgPbk6ZGQ
22
+ 3BebYhtF8GaV0nxvwuo77x/Py9auJ/GpsMiu/X1+mvoiBOv/2X/qkSsisRcOj/KK
23
+ NFtY2PwByVS5uCbMiogziUwthDyC3+6WVwW6LLv3xLfHTjuCvjHIInNzktHCgKQ5
24
+ ORAzI4JMPJ+GslWYHb4phowim57iaztXOoJwTdwJx4nLCgdNbOhdjsnvzqvHu7Ur
25
+ TkXWStAmzOVyyghqpZXjFaH3pO3JLF+l+/+sKAIuvtd7u+Nxe5AW0wdeRlN8NwdC
26
+ jNPElpzVmbUq4JUagEiuTDkHzsxHpFKVK7q4+63SM1N95R1NbdWhscdCb+ZAJzVc
27
+ oyi3B43njTOQ5yOf+1CceWxG1bQVs5ZufpsMljq4Ui0/1lvh+wjChP4kqKOJ2qxq
28
+ 4RgqsahDYVvTH9w7jXbyLeiNdd8XM2w9U/t7y0Ff/9yi0GE44Za4rF2LN9d11TPA
29
+ mRGunUHBcnWEvgJBQl9nJEiU0Zsnvgc/ubhPgXRR4Xq37Z0j4r7g1SgEEzwxA57d
30
+ emyPxgcYxn/eR44/KJ4EBs+lVDR3veyJm+kXQ99b21/+jh5Xos1AnX5iItreGCc=
31
+ -----END CERTIFICATE-----
README.md ADDED
@@ -0,0 +1,282 @@
1
+ # Text-to-Video Generation with Hugging Face Models
2
+
3
+ A powerful text-to-video generation application using state-of-the-art AI models from Hugging Face. Generate high-quality videos from text descriptions with an intuitive web interface or command-line tool.
4
+
5
+ ## Features
6
+
7
+ - **Multiple Models**: Support for various text-to-video models including DAMO, Zeroscope, and Stable Video Diffusion
8
+ - **Web Interface**: Beautiful Gradio-based web UI for easy interaction
9
+ - **Command Line**: Simple command-line interface for automation and scripting
10
+ - **GPU Optimization**: Automatic GPU detection and memory optimization
11
+ - **Customizable Parameters**: Control video length, quality, and generation parameters
12
+ - **Reproducible Results**: Seed-based generation for consistent outputs
13
+
14
+ ## Supported Models
15
+
16
+ | Model | Description | Max Frames | FPS | Quality |
17
+ |-------|-------------|------------|-----|---------|
18
+ | `damo-vilab/text-to-video-ms-1.7b` | Fast and efficient text-to-video model | 16 | 8 | Good |
19
+ | `cerspense/zeroscope_v2_XL` | High-quality text-to-video model | 24 | 6 | Excellent |
20
+ | `stabilityai/stable-video-diffusion-img2vid-xt` | Image-to-video model (requires initial image) | 25 | 6 | Excellent |
21
+
22
+ ## Installation
23
+
24
+ ### Prerequisites
25
+
26
+ - Python 3.8 or higher
27
+ - CUDA-compatible GPU (recommended for faster generation)
28
+ - At least 8GB RAM (16GB+ recommended)
29
+
30
+ ### Setup
31
+
32
+ 1. **Clone or download this repository**
33
+
34
+ 2. **Install dependencies**:
35
+ ```bash
36
+ pip install -r requirements.txt
37
+ ```
38
+
39
+ 3. **Verify installation**:
40
+ ```bash
41
+ python -c "import torch; print(f'PyTorch version: {torch.__version__}'); print(f'CUDA available: {torch.cuda.is_available()}')"
42
+ ```
43
+
44
+ ## Usage
45
+
46
+ ### Web Interface (Recommended)
47
+
48
+ Launch the interactive web interface:
49
+
50
+ ```bash
51
+ python app.py
52
+ ```
53
+
54
+ The interface will be available at `http://localhost:7860` and will also provide a public shareable link.
55
+
56
+ **Features of the web interface:**
57
+ - Intuitive parameter controls
58
+ - Real-time model information
59
+ - Example prompts to get started
60
+ - Live video preview
61
+ - Easy parameter adjustment
62
+
63
+ ### Command Line Interface
64
+
65
+ For automation or scripting, use the command-line interface:
66
+
67
+ ```bash
68
+ python simple_generator.py "A beautiful sunset over the ocean"
69
+ ```
70
+
71
+ **Command-line options:**
72
+ ```bash
73
+ python simple_generator.py --help
74
+ ```
75
+
76
+ **Example commands:**
77
+
78
+ Basic generation:
79
+ ```bash
80
+ python simple_generator.py "A cat playing with a ball of yarn"
81
+ ```
82
+
83
+ Advanced generation with custom parameters:
84
+ ```bash
85
+ python simple_generator.py "A futuristic city with flying cars" \
86
+ --model cerspense/zeroscope_v2_XL \
87
+ --frames 24 \
88
+ --fps 6 \
89
+ --steps 30 \
90
+ --guidance 8.0 \
91
+ --seed 42 \
92
+ --output my_video.mp4
93
+ ```
94
+
95
+ ## Free Hosting Options
96
+
97
+ ### 1. Hugging Face Spaces (Recommended)
98
+
99
+ **Pros:**
100
+ - Completely free
101
+ - Optimized for AI applications
102
+ - Free CPU hardware (GPU tiers available as a paid upgrade)
103
+ - Easy deployment
104
+ - Built-in model caching
105
+
106
+ **Deployment Steps:**
107
+
108
+ 1. **Create a Hugging Face account** at https://huggingface.co
109
+
110
+ 2. **Create a new Space:**
111
+ - Go to https://huggingface.co/spaces
112
+ - Click "Create new Space"
113
+ - Choose "Gradio" as the SDK
114
+ - Select "CPU" or "GPU" (GPU requires verification)
115
+
116
+ 3. **Upload your files:**
117
+ - Upload `app.py` (already created for you)
118
+ - Upload `requirements.txt`
119
+ - Upload `README.md` (its front matter configures the Space; see the sketch below)
120
+
121
+ 4. **Your app will be live** at: `https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME`
122
+
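For reference, a Gradio Space reads its settings from a YAML front-matter block at the top of the Space's `README.md`; the one included in this commit (under `text-to-video-generator/README.md`) looks like this:

```yaml
---
title: Text To Video Generator
emoji: 🔥
colorFrom: gray
colorTo: gray
sdk: gradio
sdk_version: 5.37.0
app_file: app.py
pinned: false
---
```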
123
+ ### 2. Streamlit Cloud
124
+
125
+ **Pros:**
126
+ - Free tier available
127
+ - Easy deployment
128
+ - Good for data science apps
129
+
130
+ **Deployment:**
131
+ - Push your repository to GitHub
132
+ - Create a new app at https://share.streamlit.io and connect the repository
133
+ - Streamlit Community Cloud installs dependencies from `requirements.txt` automatically
135
+
136
+ ### 3. Railway
137
+
138
+ **Pros:**
139
+ - Free tier with $5 credit
140
+ - Easy deployment
141
+ - Good performance
142
+
143
+ **Deployment:**
144
+ ```bash
145
+ npm install -g @railway/cli
146
+ railway login
147
+ railway init
148
+ railway up
149
+ ```
150
+
151
+ ### 4. Render
152
+
153
+ **Pros:**
154
+ - Free tier available
155
+ - Easy deployment
156
+ - Good documentation
157
+
158
+ **Deployment:**
159
+ - Connect your GitHub repository
160
+ - Choose "Web Service"
161
+ - Set build command and start command
162
+
163
+ ### 5. Google Colab (For Testing)
164
+
165
+ **Pros:**
166
+ - Free GPU access
167
+ - Good for testing
168
+ - Jupyter notebook interface
169
+
170
+ **Usage:**
171
+ ```python
172
+ !pip install gradio diffusers transformers
173
+ !git clone https://github.com/your-repo/text-to-video
174
+ %cd text-to-video
175
+ !python app.py
176
+ ```
177
+
178
+ ## Parameters Explained
179
+
180
+ - **Text Prompt**: The description of the video you want to generate
181
+ - **Model**: Choose from available Hugging Face models
182
+ - **Number of Frames**: Controls video length (more frames = longer video)
183
+ - **FPS**: Frames per second (affects playback speed)
184
+ - **Inference Steps**: Number of denoising steps (more steps = better quality but slower)
185
+ - **Guidance Scale**: How closely to follow the prompt (higher = more adherence)
186
+ - **Seed**: Random seed for reproducible results (these options map onto the pipeline call sketched below)
187
+
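Taken together, these map almost one-to-one onto the `DiffusionPipeline` call in `simple_generator.py`. A condensed sketch, using the repository's default model and example values (prompt and seed are illustrative):

```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = DiffusionPipeline.from_pretrained(
    "damo-vilab/text-to-video-ms-1.7b",          # Model
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)

torch.manual_seed(42)                            # Seed: reproducible results
frames = pipe(
    "A beautiful sunset over the ocean",         # Text Prompt
    num_frames=16,                               # Number of Frames (video length)
    num_inference_steps=25,                      # Inference Steps (quality vs. speed)
    guidance_scale=7.5,                          # Guidance Scale (prompt adherence)
).frames
export_to_video(frames, "generated_video.mp4", fps=8)  # FPS controls playback speed
```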
188
+ ## Performance Tips
189
+
190
+ ### For Faster Generation:
191
+ - Use fewer inference steps (10-20)
192
+ - Use the DAMO model for speed
193
+ - Reduce number of frames
194
+ - Use GPU if available
195
+
196
+ ### For Better Quality:
197
+ - Increase inference steps (30-50)
198
+ - Use Zeroscope or Stable Video Diffusion models
199
+ - Increase guidance scale (8-12)
200
+ - Use more frames for longer videos
201
+
202
+ ### Memory Optimization:
203
+ - The application automatically enables memory optimizations on GPU (see the sketch below)
204
+ - For limited GPU memory, use fewer frames and steps
205
+ - Consider using CPU if GPU memory is insufficient
206
+
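Concretely, the optimizations enabled when a CUDA device is detected are (excerpted from `simple_generator.py`; `app.py` does the same):

```python
if device == "cuda":
    pipeline.enable_model_cpu_offload()  # keep idle sub-modules on the CPU to lower peak VRAM
    pipeline.enable_vae_slicing()        # decode the VAE in slices instead of all frames at once
```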
207
+ ## Troubleshooting
208
+
209
+ ### Common Issues
210
+
211
+ 1. **CUDA Out of Memory**:
212
+ - Reduce number of frames or inference steps
213
+ - Use CPU instead of GPU
214
+ - Close other GPU-intensive applications (a quick VRAM check is sketched after this list)
215
+
216
+ 2. **Model Loading Errors**:
217
+ - Check internet connection
218
+ - Ensure sufficient disk space for model downloads
219
+ - Try a different model
220
+
221
+ 3. **Slow Generation**:
222
+ - Use GPU if available
223
+ - Reduce inference steps
224
+ - Use the DAMO model for speed
225
+
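When hitting out-of-memory errors, it can also help to check how much VRAM is actually free before lowering settings. A quick check with PyTorch, assuming a CUDA build:

```python
import torch

if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info()  # bytes free / total on the current device
    print(f"Free VRAM: {free / 1e9:.1f} GB of {total / 1e9:.1f} GB")
else:
    print("No CUDA device detected; generation will run on the CPU.")
```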
226
+ ### System Requirements
227
+
228
+ - **Minimum**: 8GB RAM, CPU-only
229
+ - **Recommended**: 16GB+ RAM, CUDA-compatible GPU with 8GB+ VRAM
230
+ - **Optimal**: 32GB+ RAM, RTX 3080/4080 or better
231
+
232
+ ## Model Information
233
+
234
+ ### DAMO Text-to-Video MS-1.7B
235
+ - **Best for**: Fast prototyping and quick results
236
+ - **Speed**: Very fast
237
+ - **Quality**: Good
238
+ - **Use case**: Quick demos, iterative testing
239
+
240
+ ### Zeroscope v2 XL
241
+ - **Best for**: High-quality production videos
242
+ - **Speed**: Medium
243
+ - **Quality**: Excellent
244
+ - **Use case**: Final outputs, professional content
245
+
246
+ ### Stable Video Diffusion XT
247
+ - **Best for**: Image-to-video generation
248
+ - **Speed**: Medium
249
+ - **Quality**: Excellent
250
+ - **Use case**: Animating static images
251
+
252
+ ## Examples
253
+
254
+ Try these example prompts to get started:
255
+
256
+ - "A beautiful sunset over the ocean with waves crashing on the shore"
257
+ - "A cat playing with a ball of yarn in a cozy living room"
258
+ - "A futuristic city with flying cars and neon lights"
259
+ - "A butterfly emerging from a cocoon in a garden"
260
+ - "A rocket launching into space with fire and smoke"
261
+ - "A dancer performing ballet in a grand theater"
262
+ - "A robot walking through a snowy forest"
263
+ - "A flower blooming in time-lapse"
264
+
265
+ ## Contributing
266
+
267
+ Feel free to contribute by:
268
+ - Adding new models
269
+ - Improving the interface
270
+ - Optimizing performance
271
+ - Adding new features
272
+
273
+ ## License
274
+
275
+ This project uses open-source models and libraries. Please check the individual model licenses on Hugging Face for commercial usage restrictions.
276
+
277
+ ## Resources
278
+
279
+ - [Hugging Face Diffusers Documentation](https://huggingface.co/docs/diffusers/index)
280
+ - [Text-to-Video Models on Hugging Face](https://huggingface.co/models?pipeline_tag=text-to-video)
281
+ - [PyTorch Documentation](https://pytorch.org/docs/)
282
+ - [Gradio Documentation](https://gradio.app/docs/)
app.py ADDED
@@ -0,0 +1,651 @@
1
+ import torch
2
+ import gradio as gr
3
+ from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler
4
+ from diffusers.utils import export_to_video
5
+ import numpy as np
6
+ import os
7
+ import logging
8
+ from gtts import gTTS
9
+ from moviepy.editor import VideoFileClip, AudioFileClip, concatenate_audioclips
10
+ import tempfile
11
+
12
+ # Set up logging
13
+ logging.basicConfig(level=logging.INFO)
14
+ logger = logging.getLogger(__name__)
15
+
16
+ class TextToVideoGenerator:
17
+ def __init__(self):
18
+ self.pipeline = None
19
+ self.current_model = None
20
+ self.device = "cuda" if torch.cuda.is_available() else "cpu"
21
+ logger.info(f"Using device: {self.device}")
22
+
23
+ # Available models - including the advanced Wan2.1 model
24
+ self.models = {
25
+ "damo-vilab/text-to-video-ms-1.7b": {
26
+ "name": "DAMO Text-to-Video MS-1.7B",
27
+ "description": "Fast and efficient text-to-video model",
28
+ "max_frames": 16,
29
+ "fps": 8,
30
+ "quality": "Good",
31
+ "speed": "Fast"
32
+ },
33
+ "cerspense/zeroscope_v2_XL": {
34
+ "name": "Zeroscope v2 XL",
35
+ "description": "High-quality text-to-video model",
36
+ "max_frames": 24,
37
+ "fps": 6,
38
+ "quality": "Excellent",
39
+ "speed": "Medium"
40
+ },
41
+ "Wan-AI/Wan2.1-T2V-14B": {
42
+ "name": "Wan2.1-T2V-14B (SOTA)",
43
+ "description": "State-of-the-art text-to-video model with 14B parameters",
44
+ "max_frames": 32,
45
+ "fps": 8,
46
+ "quality": "SOTA",
47
+ "speed": "Medium",
48
+ "resolutions": ["480P", "720P"],
49
+ "features": ["Chinese & English text", "High motion dynamics", "Best quality"]
50
+ }
51
+ }
52
+
53
+ # Voice options (gTTS only supports language, not gender/age)
54
+ self.voices = {
55
+ "Default (English)": "en"
56
+ }
57
+
58
+ def generate_audio(self, text, voice_type):
59
+ """Generate audio from text using gTTS"""
60
+ try:
61
+ lang = self.voices[voice_type]
62
+ tts = gTTS(text=text, lang=lang)
63
+ with tempfile.NamedTemporaryFile(suffix=".mp3", delete=False) as temp_audio:
64
+ audio_path = temp_audio.name
65
+ tts.save(audio_path)
66
+ logger.info(f"Audio generated successfully: {audio_path}")
67
+ return audio_path
68
+ except Exception as e:
69
+ logger.error(f"Error generating audio: {str(e)}")
70
+ return None
71
+
72
+ def merge_audio_video(self, video_path, audio_path, output_path):
73
+ """Merge audio and video using moviepy"""
74
+ try:
75
+ # Load video and audio
76
+ video_clip = VideoFileClip(video_path)
77
+ audio_clip = AudioFileClip(audio_path)
78
+
79
+ # Ensure audio duration matches video duration
80
+ if audio_clip.duration > video_clip.duration:
81
+ audio_clip = audio_clip.subclip(0, video_clip.duration)
82
+ elif audio_clip.duration < video_clip.duration:
83
+ # Loop audio if it's shorter than video
84
+ loops_needed = int(video_clip.duration / audio_clip.duration) + 1
85
+ audio_clip = concatenate_audioclips([audio_clip] * loops_needed).subclip(0, video_clip.duration)
86
+
87
+ # Merge audio and video
88
+ final_clip = video_clip.set_audio(audio_clip)
89
+
90
+ # Write final video with audio
91
+ final_clip.write_videofile(output_path, codec='libx264', audio_codec='aac')
92
+
93
+ # Clean up
94
+ video_clip.close()
95
+ audio_clip.close()
96
+ final_clip.close()
97
+
98
+ logger.info(f"Audio and video merged successfully: {output_path}")
99
+ return output_path
100
+
101
+ except Exception as e:
102
+ logger.error(f"Error merging audio and video: {str(e)}")
103
+ return None
104
+
105
+ def load_model(self, model_id):
106
+ """Load the specified model"""
107
+ if self.current_model == model_id and self.pipeline is not None:
108
+ return f"Model {self.models[model_id]['name']} is already loaded"
109
+
110
+ try:
111
+ logger.info(f"Loading model: {model_id}")
112
+
113
+ # Clear GPU memory if needed
114
+ if torch.cuda.is_available():
115
+ torch.cuda.empty_cache()
116
+
117
+ # Special handling for Wan2.1 model
118
+ if model_id == "Wan-AI/Wan2.1-T2V-14B":
119
+ # Wan2.1 requires specific configuration
120
+ self.pipeline = DiffusionPipeline.from_pretrained(
121
+ model_id,
122
+ torch_dtype=torch.float16 if self.device == "cuda" else torch.float32,
123
+ variant="fp16" if self.device == "cuda" else None,
124
+ use_safetensors=True
125
+ )
126
+ else:
127
+ # Standard loading for other models
128
+ self.pipeline = DiffusionPipeline.from_pretrained(
129
+ model_id,
130
+ torch_dtype=torch.float16 if self.device == "cuda" else torch.float32,
131
+ variant="fp16" if self.device == "cuda" else None
132
+ )
133
+
134
+ # Move to device
135
+ self.pipeline = self.pipeline.to(self.device)
136
+
137
+ # Optimize scheduler for faster inference
138
+ if hasattr(self.pipeline, 'scheduler'):
139
+ self.pipeline.scheduler = DPMSolverMultistepScheduler.from_config(
140
+ self.pipeline.scheduler.config
141
+ )
142
+
143
+ # Enable memory efficient attention if available
144
+ if self.device == "cuda":
145
+ self.pipeline.enable_model_cpu_offload()
146
+ self.pipeline.enable_vae_slicing()
147
+
148
+ self.current_model = model_id
149
+ logger.info(f"Successfully loaded model: {model_id}")
150
+ return f"Successfully loaded {self.models[model_id]['name']}"
151
+
152
+ except Exception as e:
153
+ logger.error(f"Error loading model: {str(e)}")
154
+ return f"Error loading model: {str(e)}"
155
+
156
+ def generate_video(self, prompt, model_id, num_frames=16, fps=8, num_inference_steps=25, guidance_scale=7.5, seed=None, resolution="480P", voice_script="", voice_type="Default (English)", add_voice=True):
157
+ """Generate video from text prompt with optional voice"""
158
+ try:
159
+ # Use prompt as voice script if voice_script is empty
160
+ if not voice_script.strip() and add_voice:
161
+ voice_script = prompt
162
+
163
+ # Load model if not already loaded
164
+ if self.current_model != model_id:
165
+ load_result = self.load_model(model_id)
166
+ if "Error" in load_result:
167
+ return None, load_result
168
+
169
+ # Set seed for reproducibility
170
+ if seed is not None:
171
+ torch.manual_seed(seed)
172
+ if torch.cuda.is_available():
173
+ torch.cuda.manual_seed(seed)
174
+
175
+ # Get model config
176
+ model_config = self.models[model_id]
177
+ num_frames = min(num_frames, model_config["max_frames"])
178
+ fps = model_config["fps"]
179
+
180
+ # Special handling for Wan2.1 model
181
+ if model_id == "Wan-AI/Wan2.1-T2V-14B":
182
+ # Wan2.1 specific parameters
183
+ if resolution == "720P":
184
+ width, height = 1280, 720
185
+ else: # 480P
186
+ width, height = 832, 480
187
+
188
+ logger.info(f"Generating Wan2.1 video with prompt: {prompt}")
189
+ logger.info(f"Parameters: frames={num_frames}, fps={fps}, steps={num_inference_steps}, resolution={resolution}")
190
+
191
+ # Generate video with Wan2.1 specific settings
192
+ result = self.pipeline(
193
+ prompt,
194
+ num_inference_steps=num_inference_steps,
195
+ guidance_scale=guidance_scale,
196
+ num_frames=num_frames,
197
+ width=width,
198
+ height=height
199
+ )
200
+ video_frames = result['frames'] if isinstance(result, dict) else result.frames
201
+ else:
202
+ # Standard generation for other models
203
+ logger.info(f"Generating video with prompt: {prompt}")
204
+ logger.info(f"Parameters: frames={num_frames}, fps={fps}, steps={num_inference_steps}")
205
+
206
+ result = self.pipeline(
207
+ prompt,
208
+ num_inference_steps=num_inference_steps,
209
+ guidance_scale=guidance_scale,
210
+ num_frames=num_frames
211
+ )
212
+ video_frames = result['frames'] if isinstance(result, dict) else result.frames
213
+
214
+ # Convert to numpy array
215
+ video_frames = np.array(video_frames)
216
+
217
+ # Save video
218
+ output_path = f"generated_video_{seed if seed else 'random'}.mp4"
219
+ export_to_video(video_frames, output_path, fps=fps)
220
+
221
+ logger.info(f"Video saved to: {output_path}")
222
+
223
+ # Add voice if requested
224
+ if add_voice and voice_script.strip():
225
+ logger.info(f"Generating voice for script: {voice_script}")
226
+
227
+ # Generate audio
228
+ audio_path = self.generate_audio(voice_script, voice_type)
229
+
230
+ if audio_path:
231
+ # Create final output path with voice
232
+ final_output_path = f"generated_video_with_voice_{seed if seed else 'random'}.mp4"
233
+
234
+ # Merge audio and video
235
+ final_path = self.merge_audio_video(output_path, audio_path, final_output_path)
236
+
237
+ # Clean up temporary files
238
+ try:
239
+ os.unlink(audio_path)
240
+ os.unlink(output_path)
241
+ except:
242
+ pass
243
+
244
+ if final_path:
245
+ return final_path, f"Video with voice generated successfully! Saved as {final_path}"
246
+ else:
247
+ return output_path, f"Video generated but voice merging failed. Saved as {output_path}"
248
+ else:
249
+ return output_path, f"Video generated but voice generation failed. Saved as {output_path}"
250
+ else:
251
+ return output_path, f"Video generated successfully! Saved as {output_path}"
252
+
253
+ except Exception as e:
254
+ logger.error(f"Error generating video: {str(e)}")
255
+ return None, f"Error generating video: {str(e)}"
256
+
257
+ def get_available_models(self):
258
+ """Get list of available models"""
259
+ return list(self.models.keys())
260
+
261
+ def get_model_info(self, model_id):
262
+ """Get information about a specific model"""
263
+ if model_id in self.models:
264
+ return self.models[model_id]
265
+ return None
266
+
267
+ def get_available_voices(self):
268
+ """Get list of available voices"""
269
+ return list(self.voices.keys())
270
+
271
+ # Initialize the generator
272
+ generator = TextToVideoGenerator()
273
+
274
+ def create_interface():
275
+ """Create Gradio interface"""
276
+
277
+ def generate_video_interface(prompt, model_id, num_frames, fps, num_inference_steps, guidance_scale, seed, resolution, voice_script, voice_type, add_voice):
278
+ if not prompt.strip():
279
+ return None, "Please enter a prompt"
280
+
281
+ return generator.generate_video(
282
+ prompt=prompt,
283
+ model_id=model_id,
284
+ num_frames=num_frames,
285
+ fps=fps,
286
+ num_inference_steps=num_inference_steps,
287
+ guidance_scale=guidance_scale,
288
+ seed=seed,
289
+ resolution=resolution,
290
+ voice_script=voice_script,
291
+ voice_type=voice_type,
292
+ add_voice=add_voice
293
+ )
294
+
295
+ # Custom CSS for professional styling
296
+ custom_css = """
297
+ .gradio-container {
298
+ max-width: 1200px !important;
299
+ margin: 0 auto !important;
300
+ }
301
+
302
+ .header {
303
+ text-align: center;
304
+ padding: 2rem 0;
305
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
306
+ color: white;
307
+ border-radius: 15px;
308
+ margin-bottom: 2rem;
309
+ }
310
+
311
+ .header h1 {
312
+ font-size: 2.5rem;
313
+ font-weight: 700;
314
+ margin: 0;
315
+ text-shadow: 2px 2px 4px rgba(0,0,0,0.3);
316
+ }
317
+
318
+ .header p {
319
+ font-size: 1.1rem;
320
+ margin: 0.5rem 0 0 0;
321
+ opacity: 0.9;
322
+ }
323
+
324
+ .feature-card {
325
+ background: white;
326
+ border-radius: 10px;
327
+ padding: 1.5rem;
328
+ box-shadow: 0 4px 6px rgba(0,0,0,0.1);
329
+ margin-bottom: 1rem;
330
+ border-left: 4px solid #667eea;
331
+ }
332
+
333
+ .feature-card h3 {
334
+ color: #333;
335
+ margin: 0 0 0.5rem 0;
336
+ font-size: 1.2rem;
337
+ }
338
+
339
+ .feature-card p {
340
+ color: #666;
341
+ margin: 0;
342
+ font-size: 0.9rem;
343
+ }
344
+
345
+ .model-info {
346
+ background: #f8f9fa;
347
+ border-radius: 8px;
348
+ padding: 1rem;
349
+ border: 1px solid #e9ecef;
350
+ }
351
+
352
+ .model-info h4 {
353
+ color: #495057;
354
+ margin: 0 0 0.5rem 0;
355
+ font-size: 1rem;
356
+ }
357
+
358
+ .model-info p {
359
+ color: #6c757d;
360
+ margin: 0.25rem 0;
361
+ font-size: 0.85rem;
362
+ }
363
+
364
+ .generate-btn {
365
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%) !important;
366
+ border: none !important;
367
+ color: white !important;
368
+ font-weight: 600 !important;
369
+ padding: 1rem 2rem !important;
370
+ border-radius: 10px !important;
371
+ font-size: 1.1rem !important;
372
+ transition: all 0.3s ease !important;
373
+ }
374
+
375
+ .generate-btn:hover {
376
+ transform: translateY(-2px) !important;
377
+ box-shadow: 0 6px 12px rgba(102, 126, 234, 0.4) !important;
378
+ }
379
+
380
+ .example-card {
381
+ background: #f8f9fa;
382
+ border-radius: 8px;
383
+ padding: 1rem;
384
+ margin: 0.5rem 0;
385
+ border: 1px solid #e9ecef;
386
+ cursor: pointer;
387
+ transition: all 0.2s ease;
388
+ }
389
+
390
+ .example-card:hover {
391
+ background: #e9ecef;
392
+ transform: translateX(5px);
393
+ }
394
+
395
+ .status-box {
396
+ background: #e3f2fd;
397
+ border: 1px solid #2196f3;
398
+ border-radius: 8px;
399
+ padding: 1rem;
400
+ }
401
+
402
+ .pricing-info {
403
+ background: linear-gradient(135deg, #ffecd2 0%, #fcb69f 100%);
404
+ border-radius: 10px;
405
+ padding: 1rem;
406
+ text-align: center;
407
+ margin: 1rem 0;
408
+ }
409
+
410
+ .pricing-info h4 {
411
+ color: #d84315;
412
+ margin: 0 0 0.5rem 0;
413
+ }
414
+
415
+ .pricing-info p {
416
+ color: #bf360c;
417
+ margin: 0;
418
+ font-size: 0.9rem;
419
+ }
420
+ """
421
+
422
+ # Create interface
423
+ with gr.Blocks(title="AI Video Creator Pro", theme=gr.themes.Soft(), css=custom_css) as interface:
424
+
425
+ # Professional Header
426
+ with gr.Group(elem_classes="header"):
427
+ gr.Markdown("""
428
+ # 🎬 AI Video Creator Pro
429
+ ### Transform Your Ideas Into Stunning Videos with AI-Powered Generation
430
+ """)
431
+
432
+ with gr.Row():
433
+ with gr.Column(scale=2):
434
+ # Main Input Section
435
+ with gr.Group(elem_classes="feature-card"):
436
+ gr.Markdown("## 🎯 Video Generation")
437
+
438
+ prompt = gr.Textbox(
439
+ label="📝 Video Description",
440
+ placeholder="Describe the video you want to create... (e.g., 'A majestic dragon soaring through a mystical forest with glowing mushrooms')",
441
+ lines=3,
442
+ max_lines=5,
443
+ container=True
444
+ )
445
+
446
+ with gr.Row():
447
+ model_id = gr.Dropdown(
448
+ choices=generator.get_available_models(),
449
+ value=generator.get_available_models()[0],
450
+ label="🤖 AI Model",
451
+ info="Choose the AI model for video generation",
452
+ container=True
453
+ )
454
+
455
+ resolution = gr.Dropdown(
456
+ choices=["480P", "720P"],
457
+ value="480P",
458
+ label="📐 Resolution (Wan2.1 only)",
459
+ info="Select video resolution",
460
+ visible=False,
461
+ container=True
462
+ )
463
+
464
+ with gr.Row():
465
+ num_frames = gr.Slider(
466
+ minimum=8,
467
+ maximum=32,
468
+ value=16,
469
+ step=1,
470
+ label="🎞️ Video Length (Frames)",
471
+ info="More frames = longer video"
472
+ )
473
+
474
+ fps = gr.Slider(
475
+ minimum=4,
476
+ maximum=12,
477
+ value=8,
478
+ step=1,
479
+ label="⚡ FPS",
480
+ info="Frames per second"
481
+ )
482
+
483
+ with gr.Row():
484
+ num_inference_steps = gr.Slider(
485
+ minimum=10,
486
+ maximum=50,
487
+ value=25,
488
+ step=1,
489
+ label="🎨 Quality Steps",
490
+ info="More steps = better quality but slower"
491
+ )
492
+
493
+ guidance_scale = gr.Slider(
494
+ minimum=1.0,
495
+ maximum=20.0,
496
+ value=7.5,
497
+ step=0.5,
498
+ label="🎯 Guidance Scale",
499
+ info="Higher values = more prompt adherence"
500
+ )
501
+
502
+ seed = gr.Number(
503
+ label="🎲 Seed (Optional)",
504
+ value=None,
505
+ info="Set for reproducible results",
506
+ container=True
507
+ )
508
+
509
+ # Voice Section
510
+ with gr.Group(elem_classes="feature-card"):
511
+ gr.Markdown("## 🎤 Voice & Audio")
512
+
513
+ with gr.Row():
514
+ add_voice = gr.Checkbox(
515
+ label="🎵 Add Voice Narration",
516
+ value=True,
517
+ info="Enable to add professional voice-over"
518
+ )
519
+
520
+ voice_type = gr.Dropdown(
521
+ choices=generator.get_available_voices(),
522
+ value="Default (English)",
523
+ label="🗣️ Voice Type",
524
+ info="Select the voice for narration",
525
+ container=True
526
+ )
527
+
528
+ voice_script = gr.Textbox(
529
+ label="📜 Narration Script (Optional)",
530
+ placeholder="Enter your narration script here... (Leave blank to use video description)",
531
+ lines=2,
532
+ max_lines=3,
533
+ info="If left blank, the video description will be used as narration",
534
+ container=True
535
+ )
536
+
537
+ # Generate Button
538
+ generate_btn = gr.Button("🚀 Generate Professional Video", variant="primary", size="lg", elem_classes="generate-btn")
539
+
540
+ # Output Section
541
+ with gr.Group(elem_classes="feature-card"):
542
+ gr.Markdown("## 📺 Generated Video")
543
+ status_text = gr.Textbox(label="📊 Status", interactive=False, elem_classes="status-box")
544
+ video_output = gr.Video(label="🎬 Your Video", elem_classes="status-box")
545
+
546
+ with gr.Column(scale=1):
547
+ # Model Information
548
+ with gr.Group(elem_classes="model-info"):
549
+ gr.Markdown("## 🤖 AI Model Details")
550
+ model_info = gr.JSON(label="Current Model Specifications", elem_classes="model-info")
551
+
552
+ # Pricing Information
553
+ with gr.Group(elem_classes="pricing-info"):
554
+ gr.Markdown("## 💰 Pricing")
555
+ gr.Markdown("""
556
+ **Free Tier:** 5 videos per day
557
+
558
+ **Pro Plan:** $9.99/month
559
+ - Unlimited videos
560
+ - Priority processing
561
+ - HD quality
562
+ - Advanced features
563
+
564
+ **Enterprise:** Contact us
565
+ """)
566
+
567
+ # Examples
568
+ with gr.Group():
569
+ gr.Markdown("## 💡 Inspiration Examples")
570
+ examples = [
571
+ ["A beautiful sunset over the ocean with waves crashing on the shore"],
572
+ ["A cat playing with a ball of yarn in a cozy living room"],
573
+ ["A futuristic city with flying cars and neon lights"],
574
+ ["A butterfly emerging from a cocoon in a garden"],
575
+ ["A rocket launching into space with fire and smoke"],
576
+ ["Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage"],
577
+ ["A majestic dragon soaring through a mystical forest with glowing mushrooms"]
578
+ ]
579
+ gr.Examples(
580
+ examples=examples,
581
+ inputs=prompt,
582
+ label="Click to try these examples",
583
+ elem_classes="example-card"
584
+ )
585
+
586
+ # Features
587
+ with gr.Group():
588
+ gr.Markdown("## ✨ Features")
589
+ gr.Markdown("""
590
+ 🎬 **Multiple AI Models**
591
+ - State-of-the-art video generation
592
+ - Quality vs speed options
593
+
594
+ 🎤 **Professional Voice-Over**
595
+ - Multiple voice types
596
+ - Custom narration scripts
597
+
598
+ 🎨 **Advanced Controls**
599
+ - Quality settings
600
+ - Resolution options
601
+ - Reproducible results
602
+
603
+ ⚡ **Fast Processing**
604
+ - GPU acceleration
605
+ - Optimized pipelines
606
+ """)
607
+
608
+ # Event handlers
609
+ generate_btn.click(
610
+ fn=generate_video_interface,
611
+ inputs=[prompt, model_id, num_frames, fps, num_inference_steps, guidance_scale, seed, resolution, voice_script, voice_type, add_voice],
612
+ outputs=[video_output, status_text]
613
+ )
614
+
615
+ # Update model info when model changes
616
+ def update_model_info(model_id):
617
+ info = generator.get_model_info(model_id)
618
+ return info
619
+
620
+ # Show/hide resolution selector based on model
621
+ def update_resolution_visibility(model_id):
622
+ if model_id == "Wan-AI/Wan2.1-T2V-14B":
623
+ return gr.Dropdown(visible=True)
624
+ else:
625
+ return gr.Dropdown(visible=False)
626
+
627
+ model_id.change(
628
+ fn=update_model_info,
629
+ inputs=model_id,
630
+ outputs=model_info
631
+ )
632
+
633
+ model_id.change(
634
+ fn=update_resolution_visibility,
635
+ inputs=model_id,
636
+ outputs=resolution
637
+ )
638
+
639
+ # Load initial model info
640
+ interface.load(lambda: generator.get_model_info(generator.get_available_models()[0]), outputs=model_info)
641
+
642
+ return interface
643
+
644
+ # Create and launch the interface
645
+ interface = create_interface()
646
+ interface.launch(
647
+ server_name="0.0.0.0",
648
+ server_port=7860,
649
+ share=True,
650
+ show_error=True
651
+ )
demo.py ADDED
@@ -0,0 +1,57 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Demo script for text-to-video generation
4
+ This script demonstrates how to use the text-to-video generator with a simple example.
5
+ """
6
+
7
+ import os
8
+ import sys
9
+ from simple_generator import generate_video_from_text
10
+
11
+ def main():
12
+ print("Text-to-Video Generation Demo")
13
+ print("=" * 40)
14
+
15
+ # Demo prompt
16
+ demo_prompt = "A beautiful butterfly flying through a colorful garden with flowers"
17
+
18
+ print(f"Generating video for prompt: '{demo_prompt}'")
19
+ print("This may take a few minutes depending on your hardware...")
20
+ print()
21
+
22
+ try:
23
+ # Generate video with default settings
24
+ output_path = generate_video_from_text(
25
+ prompt=demo_prompt,
26
+ model_id="damo-vilab/text-to-video-ms-1.7b", # Fast model for demo
27
+ num_frames=16,
28
+ fps=8,
29
+ num_inference_steps=20, # Reduced for faster demo
30
+ guidance_scale=7.5,
31
+ seed=42, # Fixed seed for reproducible demo
32
+ output_path="demo_video.mp4"
33
+ )
34
+
35
+ print("=" * 40)
36
+ print("Demo completed successfully!")
37
+ print(f"Video saved as: {output_path}")
38
+ print()
39
+ print("You can now:")
40
+ print("1. Open the video file to view the result")
41
+ print("2. Run 'python text_to_video.py' for the web interface")
42
+ print("3. Try different prompts with 'python simple_generator.py'")
43
+
44
+ except Exception as e:
45
+ print(f"Error during demo: {str(e)}")
46
+ print()
47
+ print("Troubleshooting tips:")
48
+ print("- Make sure all dependencies are installed: pip install -r requirements.txt")
49
+ print("- Check if you have sufficient disk space")
50
+ print("- Ensure you have a stable internet connection for model download")
51
+ print("- Try running with CPU if GPU memory is insufficient")
52
+ return 1
53
+
54
+ return 0
55
+
56
+ if __name__ == "__main__":
57
+ exit(main())
requirements.txt ADDED
@@ -0,0 +1,16 @@
1
+ torch==2.2.2
2
+ torchvision==0.17.2
3
+ diffusers==0.27.2
4
+ transformers==4.39.3
5
+ accelerate==0.28.0
6
+ safetensors==0.4.2
7
+ opencv-python==4.9.0.80
8
+ pillow==10.3.0
9
+ numpy==1.24.4
10
+ gradio==4.25.0
11
+ huggingface-hub==0.23.0
12
+ xformers==0.0.25
13
+ imageio==2.34.0
14
+ imageio-ffmpeg==0.4.9
15
+ gTTS==2.5.1
16
+ moviepy==1.0.3
simple_generator.py ADDED
@@ -0,0 +1,134 @@
1
+ import torch
2
+ from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler
3
+ from diffusers.utils import export_to_video
4
+ import numpy as np
5
+ import argparse
6
+ import logging
7
+
8
+ # Set up logging
9
+ logging.basicConfig(level=logging.INFO)
10
+ logger = logging.getLogger(__name__)
11
+
12
+ def generate_video_from_text(
13
+ prompt,
14
+ model_id="damo-vilab/text-to-video-ms-1.7b",
15
+ num_frames=16,
16
+ fps=8,
17
+ num_inference_steps=25,
18
+ guidance_scale=7.5,
19
+ seed=None,
20
+ output_path="generated_video.mp4"
21
+ ):
22
+ """
23
+ Generate a video from text prompt using Hugging Face models
24
+
25
+ Args:
26
+ prompt (str): Text description of the video
27
+ model_id (str): Hugging Face model ID
28
+ num_frames (int): Number of frames to generate
29
+ fps (int): Frames per second
30
+ num_inference_steps (int): Number of denoising steps
31
+ guidance_scale (float): Guidance scale for generation
32
+ seed (int): Random seed for reproducibility
33
+ output_path (str): Output video file path
34
+ """
35
+
36
+ # Check device
37
+ device = "cuda" if torch.cuda.is_available() else "cpu"
38
+ logger.info(f"Using device: {device}")
39
+
40
+ try:
41
+ # Set seed for reproducibility
42
+ if seed is not None:
43
+ torch.manual_seed(seed)
44
+ if torch.cuda.is_available():
45
+ torch.cuda.manual_seed(seed)
46
+
47
+ logger.info(f"Loading model: {model_id}")
48
+
49
+ # Load pipeline
50
+ pipeline = DiffusionPipeline.from_pretrained(
51
+ model_id,
52
+ torch_dtype=torch.float16 if device == "cuda" else torch.float32,
53
+ variant="fp16" if device == "cuda" else None
54
+ )
55
+
56
+ # Move to device
57
+ pipeline = pipeline.to(device)
58
+
59
+ # Optimize scheduler for faster inference
60
+ if hasattr(pipeline, 'scheduler'):
61
+ pipeline.scheduler = DPMSolverMultistepScheduler.from_config(
62
+ pipeline.scheduler.config
63
+ )
64
+
65
+ # Enable memory efficient attention if available
66
+ if device == "cuda":
67
+ pipeline.enable_model_cpu_offload()
68
+ pipeline.enable_vae_slicing()
69
+
70
+ logger.info(f"Generating video with prompt: {prompt}")
71
+ logger.info(f"Parameters: frames={num_frames}, fps={fps}, steps={num_inference_steps}")
72
+
73
+ # Generate video
74
+ video_frames = pipeline(
75
+ prompt,
76
+ num_inference_steps=num_inference_steps,
77
+ guidance_scale=guidance_scale,
78
+ num_frames=num_frames
79
+ ).frames
80
+
81
+ # Convert to numpy array
82
+ video_frames = np.array(video_frames)
83
+
84
+ # Save video
85
+ export_to_video(video_frames, output_path, fps=fps)
86
+
87
+ logger.info(f"Video saved to: {output_path}")
88
+ return output_path
89
+
90
+ except Exception as e:
91
+ logger.error(f"Error generating video: {str(e)}")
92
+ raise
93
+
94
+ def main():
95
+ parser = argparse.ArgumentParser(description="Generate video from text using Hugging Face models")
96
+ parser.add_argument("prompt", help="Text description of the video to generate")
97
+ parser.add_argument("--model", default="damo-vilab/text-to-video-ms-1.7b",
98
+ help="Hugging Face model ID to use")
99
+ parser.add_argument("--frames", type=int, default=16,
100
+ help="Number of frames to generate (default: 16)")
101
+ parser.add_argument("--fps", type=int, default=8,
102
+ help="Frames per second (default: 8)")
103
+ parser.add_argument("--steps", type=int, default=25,
104
+ help="Number of inference steps (default: 25)")
105
+ parser.add_argument("--guidance", type=float, default=7.5,
106
+ help="Guidance scale (default: 7.5)")
107
+ parser.add_argument("--seed", type=int, default=None,
108
+ help="Random seed for reproducibility")
109
+ parser.add_argument("--output", default="generated_video.mp4",
110
+ help="Output video file path (default: generated_video.mp4)")
111
+
112
+ args = parser.parse_args()
113
+
114
+ try:
115
+ output_path = generate_video_from_text(
116
+ prompt=args.prompt,
117
+ model_id=args.model,
118
+ num_frames=args.frames,
119
+ fps=args.fps,
120
+ num_inference_steps=args.steps,
121
+ guidance_scale=args.guidance,
122
+ seed=args.seed,
123
+ output_path=args.output
124
+ )
125
+ print(f"Video generated successfully: {output_path}")
126
+
127
+ except Exception as e:
128
+ print(f"Error: {str(e)}")
129
+ return 1
130
+
131
+ return 0
132
+
133
+ if __name__ == "__main__":
134
+ exit(main())
test_app.py ADDED
@@ -0,0 +1,479 @@
1
+ import gradio as gr
2
+ import logging
3
+ import tempfile
4
+ import os
5
+
6
+ # Set up logging
7
+ logging.basicConfig(level=logging.INFO)
8
+ logger = logging.getLogger(__name__)
9
+
10
+ class TextToVideoGenerator:
11
+ def __init__(self):
12
+ self.device = "cpu" # Simplified for testing
13
+
14
+ # Available models - including the advanced Wan2.1 model
15
+ self.models = {
16
+ "damo-vilab/text-to-video-ms-1.7b": {
17
+ "name": "DAMO Text-to-Video MS-1.7B",
18
+ "description": "Fast and efficient text-to-video model",
19
+ "max_frames": 16,
20
+ "fps": 8,
21
+ "quality": "Good",
22
+ "speed": "Fast"
23
+ },
24
+ "cerspense/zeroscope_v2_XL": {
25
+ "name": "Zeroscope v2 XL",
26
+ "description": "High-quality text-to-video model",
27
+ "max_frames": 24,
28
+ "fps": 6,
29
+ "quality": "Excellent",
30
+ "speed": "Medium"
31
+ },
32
+ "Wan-AI/Wan2.1-T2V-14B": {
33
+ "name": "Wan2.1-T2V-14B (SOTA)",
34
+ "description": "State-of-the-art text-to-video model with 14B parameters",
35
+ "max_frames": 32,
36
+ "fps": 8,
37
+ "quality": "SOTA",
38
+ "speed": "Medium",
39
+ "resolutions": ["480P", "720P"],
40
+ "features": ["Chinese & English text", "High motion dynamics", "Best quality"]
41
+ }
42
+ }
43
+
44
+ # Voice options (gTTS only supports language, not gender/age)
45
+ self.voices = {
46
+ "Default (English)": "en"
47
+ }
48
+
49
+ def generate_video(self, prompt, model_id, num_frames=16, fps=8, num_inference_steps=25, guidance_scale=7.5, seed=None, resolution="480P", voice_script="", voice_type="Default (English)", add_voice=True):
50
+ """Generate video from text prompt with optional voice (DEMO VERSION)"""
51
+ try:
52
+ # This is a demo version that simulates video generation
53
+ logger.info(f"DEMO: Would generate video with prompt: {prompt}")
54
+ logger.info(f"DEMO: Model: {model_id}, Frames: {num_frames}, FPS: {fps}")
55
+
56
+ if add_voice and voice_script.strip():
57
+ logger.info(f"DEMO: Would add voice narration: {voice_script}")
58
+
59
+ # Create a dummy video file for demonstration
60
+ dummy_video_path = "demo_video.mp4"
61
+
62
+ # For demo purposes, return a success message
63
+ return dummy_video_path, f"DEMO: Video generation completed! (This is a test version - no actual video generated)"
64
+
65
+ except Exception as e:
66
+ logger.error(f"Error in demo video generation: {str(e)}")
67
+ return None, f"Demo error: {str(e)}"
68
+
69
+ def get_available_models(self):
70
+ """Get list of available models"""
71
+ return list(self.models.keys())
72
+
73
+ def get_model_info(self, model_id):
74
+ """Get information about a specific model"""
75
+ if model_id in self.models:
76
+ return self.models[model_id]
77
+ return None
78
+
79
+ def get_available_voices(self):
80
+ """Get list of available voices"""
81
+ return list(self.voices.keys())
82
+
83
+ # Initialize the generator
84
+ generator = TextToVideoGenerator()
85
+
86
+ def create_interface():
87
+ """Create Gradio interface"""
88
+
89
+ def generate_video_interface(prompt, model_id, num_frames, fps, num_inference_steps, guidance_scale, seed, resolution, voice_script, voice_type, add_voice):
90
+ if not prompt.strip():
91
+ return None, "Please enter a prompt"
92
+
93
+ return generator.generate_video(
94
+ prompt=prompt,
95
+ model_id=model_id,
96
+ num_frames=num_frames,
97
+ fps=fps,
98
+ num_inference_steps=num_inference_steps,
99
+ guidance_scale=guidance_scale,
100
+ seed=seed,
101
+ resolution=resolution,
102
+ voice_script=voice_script,
103
+ voice_type=voice_type,
104
+ add_voice=add_voice
105
+ )
106
+
107
+ # Custom CSS for professional styling
108
+ custom_css = """
109
+ .gradio-container {
110
+ max-width: 1200px !important;
111
+ margin: 0 auto !important;
112
+ }
113
+
114
+ .header {
115
+ text-align: center;
116
+ padding: 2rem 0;
117
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
118
+ color: white;
119
+ border-radius: 15px;
120
+ margin-bottom: 2rem;
121
+ }
122
+
123
+ .header h1 {
124
+ font-size: 2.5rem;
125
+ font-weight: 700;
126
+ margin: 0;
127
+ text-shadow: 2px 2px 4px rgba(0,0,0,0.3);
128
+ }
129
+
130
+ .header p {
131
+ font-size: 1.1rem;
132
+ margin: 0.5rem 0 0 0;
133
+ opacity: 0.9;
134
+ }
135
+
136
+ .feature-card {
137
+ background: white;
138
+ border-radius: 10px;
139
+ padding: 1.5rem;
140
+ box-shadow: 0 4px 6px rgba(0,0,0,0.1);
141
+ margin-bottom: 1rem;
142
+ border-left: 4px solid #667eea;
143
+ }
144
+
145
+ .feature-card h3 {
146
+ color: #333;
147
+ margin: 0 0 0.5rem 0;
148
+ font-size: 1.2rem;
149
+ }
150
+
151
+ .feature-card p {
152
+ color: #666;
153
+ margin: 0;
154
+ font-size: 0.9rem;
155
+ }
156
+
157
+ .model-info {
158
+ background: #f8f9fa;
159
+ border-radius: 8px;
160
+ padding: 1rem;
161
+ border: 1px solid #e9ecef;
162
+ }
163
+
164
+ .model-info h4 {
165
+ color: #495057;
166
+ margin: 0 0 0.5rem 0;
167
+ font-size: 1rem;
168
+ }
169
+
170
+ .model-info p {
171
+ color: #6c757d;
172
+ margin: 0.25rem 0;
173
+ font-size: 0.85rem;
174
+ }
175
+
176
+ .generate-btn {
177
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%) !important;
178
+ border: none !important;
179
+ color: white !important;
180
+ font-weight: 600 !important;
181
+ padding: 1rem 2rem !important;
182
+ border-radius: 10px !important;
183
+ font-size: 1.1rem !important;
184
+ transition: all 0.3s ease !important;
185
+ }
186
+
187
+ .generate-btn:hover {
188
+ transform: translateY(-2px) !important;
189
+ box-shadow: 0 6px 12px rgba(102, 126, 234, 0.4) !important;
190
+ }
191
+
192
+ .example-card {
193
+ background: #f8f9fa;
194
+ border-radius: 8px;
195
+ padding: 1rem;
196
+ margin: 0.5rem 0;
197
+ border: 1px solid #e9ecef;
198
+ cursor: pointer;
199
+ transition: all 0.2s ease;
200
+ }
201
+
202
+ .example-card:hover {
203
+ background: #e9ecef;
204
+ transform: translateX(5px);
205
+ }
206
+
207
+ .status-box {
208
+ background: #e3f2fd;
209
+ border: 1px solid #2196f3;
210
+ border-radius: 8px;
211
+ padding: 1rem;
212
+ }
213
+
214
+ .pricing-info {
215
+ background: linear-gradient(135deg, #ffecd2 0%, #fcb69f 100%);
216
+ border-radius: 10px;
217
+ padding: 1rem;
218
+ text-align: center;
219
+ margin: 1rem 0;
220
+ }
221
+
222
+ .pricing-info h4 {
223
+ color: #d84315;
224
+ margin: 0 0 0.5rem 0;
225
+ }
226
+
227
+ .pricing-info p {
228
+ color: #bf360c;
229
+ margin: 0;
230
+ font-size: 0.9rem;
231
+ }
232
+
233
+ .demo-notice {
234
+ background: linear-gradient(135deg, #fff3cd 0%, #ffeaa7 100%);
235
+ border: 1px solid #ffc107;
236
+ border-radius: 8px;
237
+ padding: 1rem;
238
+ margin: 1rem 0;
239
+ text-align: center;
240
+ }
241
+ """
242
+
243
+ # Create interface
244
+ with gr.Blocks(title="AI Video Creator Pro - DEMO", theme=gr.themes.Soft(), css=custom_css) as interface:
245
+
246
+ # Professional Header
247
+ with gr.Group(elem_classes="header"):
248
+ gr.Markdown("""
249
+ # 🎬 AI Video Creator Pro
250
+ ### Transform Your Ideas Into Stunning Videos with AI-Powered Generation
251
+ """)
252
+
253
+ # Demo Notice
254
+ with gr.Group(elem_classes="demo-notice"):
255
+ gr.Markdown("""
256
+ ## 🚧 DEMO VERSION
257
+ This is a demonstration of the professional UI. Video generation is simulated for testing purposes.
258
+ The full version with actual AI video generation will be available once dependencies are resolved.
259
+ """)
260
+
261
+ with gr.Row():
262
+ with gr.Column(scale=2):
263
+ # Main Input Section
264
+ with gr.Group(elem_classes="feature-card"):
265
+ gr.Markdown("## 🎯 Video Generation")
266
+
267
+ prompt = gr.Textbox(
268
+ label="📝 Video Description",
269
+ placeholder="Describe the video you want to create... (e.g., 'A majestic dragon soaring through a mystical forest with glowing mushrooms')",
270
+ lines=3,
271
+ max_lines=5,
272
+ container=True
273
+ )
274
+
275
+ with gr.Row():
276
+ model_id = gr.Dropdown(
277
+ choices=generator.get_available_models(),
278
+ value=generator.get_available_models()[0],
279
+ label="🤖 AI Model",
280
+ info="Choose the AI model for video generation",
281
+ container=True
282
+ )
283
+
284
+ resolution = gr.Dropdown(
285
+ choices=["480P", "720P"],
286
+ value="480P",
287
+ label="📐 Resolution (Wan2.1 only)",
288
+ info="Select video resolution",
289
+ visible=False,
290
+ container=True
291
+ )
292
+
293
+ with gr.Row():
294
+ num_frames = gr.Slider(
295
+ minimum=8,
296
+ maximum=32,
297
+ value=16,
298
+ step=1,
299
+ label="🎞️ Video Length (Frames)",
300
+ info="More frames = longer video"
301
+ )
302
+
303
+ fps = gr.Slider(
304
+ minimum=4,
305
+ maximum=12,
306
+ value=8,
307
+ step=1,
308
+ label="⚡ FPS",
309
+ info="Frames per second"
310
+ )
311
+
312
+ with gr.Row():
313
+ num_inference_steps = gr.Slider(
314
+ minimum=10,
315
+ maximum=50,
316
+ value=25,
317
+ step=1,
318
+ label="🎨 Quality Steps",
319
+ info="More steps = better quality but slower"
320
+ )
321
+
322
+ guidance_scale = gr.Slider(
323
+ minimum=1.0,
324
+ maximum=20.0,
325
+ value=7.5,
326
+ step=0.5,
327
+ label="🎯 Guidance Scale",
328
+ info="Higher values = more prompt adherence"
329
+ )
330
+
331
+ seed = gr.Number(
332
+ label="🎲 Seed (Optional)",
333
+ value=None,
334
+ info="Set for reproducible results",
335
+ container=True
336
+ )
337
+
338
+ # Voice Section
339
+ with gr.Group(elem_classes="feature-card"):
340
+ gr.Markdown("## 🎤 Voice & Audio")
341
+
342
+ with gr.Row():
343
+ add_voice = gr.Checkbox(
344
+ label="🎵 Add Voice Narration",
345
+ value=True,
346
+ info="Enable to add professional voice-over"
347
+ )
348
+
349
+ voice_type = gr.Dropdown(
350
+ choices=generator.get_available_voices(),
351
+ value="Default (English)",
352
+ label="🗣️ Voice Type",
353
+ info="Select the voice for narration",
354
+ container=True
355
+ )
356
+
357
+ voice_script = gr.Textbox(
358
+ label="📜 Narration Script (Optional)",
359
+ placeholder="Enter your narration script here... (Leave blank to use video description)",
360
+ lines=2,
361
+ max_lines=3,
362
+ info="If left blank, the video description will be used as narration",
363
+ container=True
364
+ )
365
+
366
+ # Generate Button
367
+ generate_btn = gr.Button("🚀 Generate Professional Video (DEMO)", variant="primary", size="lg", elem_classes="generate-btn")
368
+
369
+ # Output Section
370
+ with gr.Group(elem_classes="feature-card"):
371
+ gr.Markdown("## 📺 Generated Video")
372
+ status_text = gr.Textbox(label="📊 Status", interactive=False, elem_classes="status-box")
373
+ video_output = gr.Video(label="🎬 Your Video", elem_classes="status-box")
374
+
375
+ with gr.Column(scale=1):
376
+ # Model Information
377
+ with gr.Group(elem_classes="model-info"):
378
+ gr.Markdown("## 🤖 AI Model Details")
379
+ model_info = gr.JSON(label="Current Model Specifications", elem_classes="model-info")
380
+
381
+ # Pricing Information
382
+ with gr.Group(elem_classes="pricing-info"):
383
+ gr.Markdown("## 💰 Pricing")
384
+ gr.Markdown("""
385
+ **Free Tier:** 5 videos per day
386
+
387
+ **Pro Plan:** $9.99/month
388
+ - Unlimited videos
389
+ - Priority processing
390
+ - HD quality
391
+ - Advanced features
392
+
393
+ **Enterprise:** Contact us
394
+ """)
395
+
396
+ # Examples
397
+ with gr.Group():
398
+ gr.Markdown("## 💡 Inspiration Examples")
399
+ examples = [
400
+ ["A beautiful sunset over the ocean with waves crashing on the shore"],
401
+ ["A cat playing with a ball of yarn in a cozy living room"],
402
+ ["A futuristic city with flying cars and neon lights"],
403
+ ["A butterfly emerging from a cocoon in a garden"],
404
+ ["A rocket launching into space with fire and smoke"],
405
+ ["Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage"],
406
+ ["A majestic dragon soaring through a mystical forest with glowing mushrooms"]
407
+ ]
408
+ gr.Examples(
409
+ examples=examples,
410
+ inputs=prompt,
411
+ label="Click to try these examples"
412
+ )
413
+
414
+ # Features
415
+ with gr.Group():
416
+ gr.Markdown("## ✨ Features")
417
+ gr.Markdown("""
418
+ 🎬 **Multiple AI Models**
419
+ - State-of-the-art video generation
420
+ - Quality vs speed options
421
+
422
+ 🎤 **Professional Voice-Over**
423
+ - Multiple voice types
424
+ - Custom narration scripts
425
+
426
+ 🎨 **Advanced Controls**
427
+ - Quality settings
428
+ - Resolution options
429
+ - Reproducible results
430
+
431
+ ⚡ **Fast Processing**
432
+ - GPU acceleration
433
+ - Optimized pipelines
434
+ """)
435
+
436
+ # Event handlers
437
+ generate_btn.click(
438
+ fn=generate_video_interface,
439
+ inputs=[prompt, model_id, num_frames, fps, num_inference_steps, guidance_scale, seed, resolution, voice_script, voice_type, add_voice],
440
+ outputs=[video_output, status_text]
441
+ )
442
+
443
+ # Update model info when model changes
444
+ def update_model_info(model_id):
445
+ info = generator.get_model_info(model_id)
446
+ return info
447
+
448
+ # Show/hide resolution selector based on model
449
+ def update_resolution_visibility(model_id):
450
+ if model_id == "Wan-AI/Wan2.1-T2V-14B":
451
+ return gr.Dropdown(visible=True)
452
+ else:
453
+ return gr.Dropdown(visible=False)
454
+
455
+ model_id.change(
456
+ fn=update_model_info,
457
+ inputs=model_id,
458
+ outputs=model_info
459
+ )
460
+
461
+ model_id.change(
462
+ fn=update_resolution_visibility,
463
+ inputs=model_id,
464
+ outputs=resolution
465
+ )
466
+
467
+ # Load initial model info
468
+ interface.load(lambda: generator.get_model_info(generator.get_available_models()[0]), outputs=model_info)
469
+
470
+ return interface
471
+
472
+ # Create and launch the interface
473
+ interface = create_interface()
474
+ interface.launch(
475
+ server_name="0.0.0.0",
476
+ server_port=7861,
477
+ share=True,
478
+ show_error=True
479
+ )
text-to-video-generator/.gitattributes ADDED
@@ -0,0 +1,35 @@
1
+ *.7z filter=lfs diff=lfs merge=lfs -text
2
+ *.arrow filter=lfs diff=lfs merge=lfs -text
3
+ *.bin filter=lfs diff=lfs merge=lfs -text
4
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
5
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
6
+ *.ftz filter=lfs diff=lfs merge=lfs -text
7
+ *.gz filter=lfs diff=lfs merge=lfs -text
8
+ *.h5 filter=lfs diff=lfs merge=lfs -text
9
+ *.joblib filter=lfs diff=lfs merge=lfs -text
10
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
11
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
12
+ *.model filter=lfs diff=lfs merge=lfs -text
13
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
14
+ *.npy filter=lfs diff=lfs merge=lfs -text
15
+ *.npz filter=lfs diff=lfs merge=lfs -text
16
+ *.onnx filter=lfs diff=lfs merge=lfs -text
17
+ *.ot filter=lfs diff=lfs merge=lfs -text
18
+ *.parquet filter=lfs diff=lfs merge=lfs -text
19
+ *.pb filter=lfs diff=lfs merge=lfs -text
20
+ *.pickle filter=lfs diff=lfs merge=lfs -text
21
+ *.pkl filter=lfs diff=lfs merge=lfs -text
22
+ *.pt filter=lfs diff=lfs merge=lfs -text
23
+ *.pth filter=lfs diff=lfs merge=lfs -text
24
+ *.rar filter=lfs diff=lfs merge=lfs -text
25
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
26
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
27
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
28
+ *.tar filter=lfs diff=lfs merge=lfs -text
29
+ *.tflite filter=lfs diff=lfs merge=lfs -text
30
+ *.tgz filter=lfs diff=lfs merge=lfs -text
31
+ *.wasm filter=lfs diff=lfs merge=lfs -text
32
+ *.xz filter=lfs diff=lfs merge=lfs -text
33
+ *.zip filter=lfs diff=lfs merge=lfs -text
34
+ *.zst filter=lfs diff=lfs merge=lfs -text
35
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
text-to-video-generator/README.md ADDED
@@ -0,0 +1,12 @@
1
+ ---
2
+ title: Text To Video Generator
3
+ emoji: 🔥
4
+ colorFrom: gray
5
+ colorTo: gray
6
+ sdk: gradio
7
+ sdk_version: 5.37.0
8
+ app_file: app.py
9
+ pinned: false
10
+ ---
11
+
12
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
text-to-video-generator/app.py ADDED
@@ -0,0 +1,651 @@
1
+ import torch
2
+ import gradio as gr
3
+ from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler
4
+ from diffusers.utils import export_to_video
5
+ import numpy as np
6
+ import os
7
+ import logging
8
+ from gtts import gTTS
9
+ from moviepy.editor import VideoFileClip, AudioFileClip, concatenate_audioclips
10
+ import tempfile
11
+
12
+ # Set up logging
13
+ logging.basicConfig(level=logging.INFO)
14
+ logger = logging.getLogger(__name__)
15
+
16
+ class TextToVideoGenerator:
17
+ def __init__(self):
18
+ self.pipeline = None
19
+ self.current_model = None
20
+ self.device = "cuda" if torch.cuda.is_available() else "cpu"
21
+ logger.info(f"Using device: {self.device}")
22
+
23
+ # Available models - including the advanced Wan2.1 model
24
+ self.models = {
25
+ "damo-vilab/text-to-video-ms-1.7b": {
26
+ "name": "DAMO Text-to-Video MS-1.7B",
27
+ "description": "Fast and efficient text-to-video model",
28
+ "max_frames": 16,
29
+ "fps": 8,
30
+ "quality": "Good",
31
+ "speed": "Fast"
32
+ },
33
+ "cerspense/zeroscope_v2_XL": {
34
+ "name": "Zeroscope v2 XL",
35
+ "description": "High-quality text-to-video model",
36
+ "max_frames": 24,
37
+ "fps": 6,
38
+ "quality": "Excellent",
39
+ "speed": "Medium"
40
+ },
41
+ "Wan-AI/Wan2.1-T2V-14B": {
42
+ "name": "Wan2.1-T2V-14B (SOTA)",
43
+ "description": "State-of-the-art text-to-video model with 14B parameters",
44
+ "max_frames": 32,
45
+ "fps": 8,
46
+ "quality": "SOTA",
47
+ "speed": "Medium",
48
+ "resolutions": ["480P", "720P"],
49
+ "features": ["Chinese & English text", "High motion dynamics", "Best quality"]
50
+ }
51
+ }
52
+
53
+ # Voice options (gTTS only supports language, not gender/age)
54
+ self.voices = {
55
+ "Default (English)": "en"
56
+ }
57
+
58
+ def generate_audio(self, text, voice_type):
59
+ """Generate audio from text using gTTS"""
60
+ try:
61
+ lang = self.voices[voice_type]
62
+ tts = gTTS(text=text, lang=lang)
63
+ with tempfile.NamedTemporaryFile(suffix=".mp3", delete=False) as temp_audio:
64
+ audio_path = temp_audio.name
65
+ tts.save(audio_path)
66
+ logger.info(f"Audio generated successfully: {audio_path}")
67
+ return audio_path
68
+ except Exception as e:
69
+ logger.error(f"Error generating audio: {str(e)}")
70
+ return None
71
+
72
+ def merge_audio_video(self, video_path, audio_path, output_path):
73
+ """Merge audio and video using moviepy"""
74
+ try:
75
+ # Load video and audio
76
+ video_clip = VideoFileClip(video_path)
77
+ audio_clip = AudioFileClip(audio_path)
78
+
79
+ # Ensure audio duration matches video duration
80
+ if audio_clip.duration > video_clip.duration:
81
+ audio_clip = audio_clip.subclip(0, video_clip.duration)
82
+ elif audio_clip.duration < video_clip.duration:
83
+ # Loop audio if it's shorter than video
84
+ loops_needed = int(video_clip.duration / audio_clip.duration) + 1
85
+ audio_clip = concatenate_audioclips([audio_clip] * loops_needed).subclip(0, video_clip.duration)
86
+
87
+ # Merge audio and video
88
+ final_clip = video_clip.set_audio(audio_clip)
89
+
90
+ # Write final video with audio
91
+ final_clip.write_videofile(output_path, codec='libx264', audio_codec='aac')
92
+
93
+ # Clean up
94
+ video_clip.close()
95
+ audio_clip.close()
96
+ final_clip.close()
97
+
98
+ logger.info(f"Audio and video merged successfully: {output_path}")
99
+ return output_path
100
+
101
+ except Exception as e:
102
+ logger.error(f"Error merging audio and video: {str(e)}")
103
+ return None
104
+
105
+ def load_model(self, model_id):
106
+ """Load the specified model"""
107
+ if self.current_model == model_id and self.pipeline is not None:
108
+ return f"Model {self.models[model_id]['name']} is already loaded"
109
+
110
+ try:
111
+ logger.info(f"Loading model: {model_id}")
112
+
113
+ # Clear GPU memory if needed
114
+ if torch.cuda.is_available():
115
+ torch.cuda.empty_cache()
116
+
117
+ # Special handling for Wan2.1 model
118
+ if model_id == "Wan-AI/Wan2.1-T2V-14B":
119
+ # Wan2.1 requires specific configuration
120
+ self.pipeline = DiffusionPipeline.from_pretrained(
121
+ model_id,
122
+ torch_dtype=torch.float16 if self.device == "cuda" else torch.float32,
123
+ variant="fp16" if self.device == "cuda" else None,
124
+ use_safetensors=True
125
+ )
126
+ else:
127
+ # Standard loading for other models
128
+ self.pipeline = DiffusionPipeline.from_pretrained(
129
+ model_id,
130
+ torch_dtype=torch.float16 if self.device == "cuda" else torch.float32,
131
+ variant="fp16" if self.device == "cuda" else None
132
+ )
133
+
134
+ # Move to device
135
+ self.pipeline = self.pipeline.to(self.device)
136
+
137
+ # Optimize scheduler for faster inference
138
+ if hasattr(self.pipeline, 'scheduler'):
139
+ self.pipeline.scheduler = DPMSolverMultistepScheduler.from_config(
140
+ self.pipeline.scheduler.config
141
+ )
142
+
143
+ # Enable memory efficient attention if available
144
+ if self.device == "cuda":
145
+ self.pipeline.enable_model_cpu_offload()
146
+ self.pipeline.enable_vae_slicing()
147
+
148
+ self.current_model = model_id
149
+ logger.info(f"Successfully loaded model: {model_id}")
150
+ return f"Successfully loaded {self.models[model_id]['name']}"
151
+
152
+ except Exception as e:
153
+ logger.error(f"Error loading model: {str(e)}")
154
+ return f"Error loading model: {str(e)}"
155
+
156
+ def generate_video(self, prompt, model_id, num_frames=16, fps=8, num_inference_steps=25, guidance_scale=7.5, seed=None, resolution="480P", voice_script="", voice_type="Default (English)", add_voice=True):
157
+ """Generate video from text prompt with optional voice"""
158
+ try:
159
+ # Use prompt as voice script if voice_script is empty
160
+ if not voice_script.strip() and add_voice:
161
+ voice_script = prompt
162
+
163
+ # Load model if not already loaded
164
+ if self.current_model != model_id:
165
+ load_result = self.load_model(model_id)
166
+ if "Error" in load_result:
167
+ return None, load_result
168
+
169
+ # Set seed for reproducibility
170
+ if seed is not None:
171
+ torch.manual_seed(seed)
172
+ if torch.cuda.is_available():
173
+ torch.cuda.manual_seed(seed)
174
+
175
+ # Get model config
176
+ model_config = self.models[model_id]
177
+ num_frames = min(num_frames, model_config["max_frames"])
178
+ fps = model_config["fps"]
179
+
180
+ # Special handling for Wan2.1 model
181
+ if model_id == "Wan-AI/Wan2.1-T2V-14B":
182
+ # Wan2.1 specific parameters
183
+ if resolution == "720P":
184
+ width, height = 1280, 720
185
+ else: # 480P
186
+ width, height = 832, 480
187
+
188
+ logger.info(f"Generating Wan2.1 video with prompt: {prompt}")
189
+ logger.info(f"Parameters: frames={num_frames}, fps={fps}, steps={num_inference_steps}, resolution={resolution}")
190
+
191
+ # Generate video with Wan2.1 specific settings
192
+ result = self.pipeline(
193
+ prompt,
194
+ num_inference_steps=num_inference_steps,
195
+ guidance_scale=guidance_scale,
196
+ num_frames=num_frames,
197
+ width=width,
198
+ height=height
199
+ )
200
+ video_frames = result['frames'] if isinstance(result, dict) else result.frames
201
+ else:
202
+ # Standard generation for other models
203
+ logger.info(f"Generating video with prompt: {prompt}")
204
+ logger.info(f"Parameters: frames={num_frames}, fps={fps}, steps={num_inference_steps}")
205
+
206
+ result = self.pipeline(
207
+ prompt,
208
+ num_inference_steps=num_inference_steps,
209
+ guidance_scale=guidance_scale,
210
+ num_frames=num_frames
211
+ )
212
+ video_frames = result['frames'] if isinstance(result, dict) else result.frames
213
+
214
+ # Convert to numpy array
215
+ video_frames = np.array(video_frames)
216
+
217
+ # Save video
218
+ output_path = f"generated_video_{seed if seed else 'random'}.mp4"
219
+ export_to_video(video_frames, output_path, fps=fps)
220
+
221
+ logger.info(f"Video saved to: {output_path}")
222
+
223
+ # Add voice if requested
224
+ if add_voice and voice_script.strip():
225
+ logger.info(f"Generating voice for script: {voice_script}")
226
+
227
+ # Generate audio
228
+ audio_path = self.generate_audio(voice_script, voice_type)
229
+
230
+ if audio_path:
231
+ # Create final output path with voice
232
+ final_output_path = f"generated_video_with_voice_{seed if seed else 'random'}.mp4"
233
+
234
+ # Merge audio and video
235
+ final_path = self.merge_audio_video(output_path, audio_path, final_output_path)
236
+
237
+ # Clean up temporary files
238
+ try:
239
+ os.unlink(audio_path)
240
+ os.unlink(output_path)
241
+ except OSError:
242
+ pass
243
+
244
+ if final_path:
245
+ return final_path, f"Video with voice generated successfully! Saved as {final_path}"
246
+ else:
247
+ return output_path, f"Video generated but voice merging failed. Saved as {output_path}"
248
+ else:
249
+ return output_path, f"Video generated but voice generation failed. Saved as {output_path}"
250
+ else:
251
+ return output_path, f"Video generated successfully! Saved as {output_path}"
252
+
253
+ except Exception as e:
254
+ logger.error(f"Error generating video: {str(e)}")
255
+ return None, f"Error generating video: {str(e)}"
256
+
257
+ def get_available_models(self):
258
+ """Get list of available models"""
259
+ return list(self.models.keys())
260
+
261
+ def get_model_info(self, model_id):
262
+ """Get information about a specific model"""
263
+ if model_id in self.models:
264
+ return self.models[model_id]
265
+ return None
266
+
267
+ def get_available_voices(self):
268
+ """Get list of available voices"""
269
+ return list(self.voices.keys())
270
+
271
+ # Initialize the generator
272
+ generator = TextToVideoGenerator()
273
+
274
+ def create_interface():
275
+ """Create Gradio interface"""
276
+
277
+ def generate_video_interface(prompt, model_id, num_frames, fps, num_inference_steps, guidance_scale, seed, resolution, voice_script, voice_type, add_voice):
278
+ if not prompt.strip():
279
+ return None, "Please enter a prompt"
280
+
281
+ return generator.generate_video(
282
+ prompt=prompt,
283
+ model_id=model_id,
284
+ num_frames=num_frames,
285
+ fps=fps,
286
+ num_inference_steps=num_inference_steps,
287
+ guidance_scale=guidance_scale,
288
+ seed=seed,
289
+ resolution=resolution,
290
+ voice_script=voice_script,
291
+ voice_type=voice_type,
292
+ add_voice=add_voice
293
+ )
294
+
295
+ # Custom CSS for professional styling
296
+ custom_css = """
297
+ .gradio-container {
298
+ max-width: 1200px !important;
299
+ margin: 0 auto !important;
300
+ }
301
+
302
+ .header {
303
+ text-align: center;
304
+ padding: 2rem 0;
305
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
306
+ color: white;
307
+ border-radius: 15px;
308
+ margin-bottom: 2rem;
309
+ }
310
+
311
+ .header h1 {
312
+ font-size: 2.5rem;
313
+ font-weight: 700;
314
+ margin: 0;
315
+ text-shadow: 2px 2px 4px rgba(0,0,0,0.3);
316
+ }
317
+
318
+ .header p {
319
+ font-size: 1.1rem;
320
+ margin: 0.5rem 0 0 0;
321
+ opacity: 0.9;
322
+ }
323
+
324
+ .feature-card {
325
+ background: white;
326
+ border-radius: 10px;
327
+ padding: 1.5rem;
328
+ box-shadow: 0 4px 6px rgba(0,0,0,0.1);
329
+ margin-bottom: 1rem;
330
+ border-left: 4px solid #667eea;
331
+ }
332
+
333
+ .feature-card h3 {
334
+ color: #333;
335
+ margin: 0 0 0.5rem 0;
336
+ font-size: 1.2rem;
337
+ }
338
+
339
+ .feature-card p {
340
+ color: #666;
341
+ margin: 0;
342
+ font-size: 0.9rem;
343
+ }
344
+
345
+ .model-info {
346
+ background: #f8f9fa;
347
+ border-radius: 8px;
348
+ padding: 1rem;
349
+ border: 1px solid #e9ecef;
350
+ }
351
+
352
+ .model-info h4 {
353
+ color: #495057;
354
+ margin: 0 0 0.5rem 0;
355
+ font-size: 1rem;
356
+ }
357
+
358
+ .model-info p {
359
+ color: #6c757d;
360
+ margin: 0.25rem 0;
361
+ font-size: 0.85rem;
362
+ }
363
+
364
+ .generate-btn {
365
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%) !important;
366
+ border: none !important;
367
+ color: white !important;
368
+ font-weight: 600 !important;
369
+ padding: 1rem 2rem !important;
370
+ border-radius: 10px !important;
371
+ font-size: 1.1rem !important;
372
+ transition: all 0.3s ease !important;
373
+ }
374
+
375
+ .generate-btn:hover {
376
+ transform: translateY(-2px) !important;
377
+ box-shadow: 0 6px 12px rgba(102, 126, 234, 0.4) !important;
378
+ }
379
+
380
+ .example-card {
381
+ background: #f8f9fa;
382
+ border-radius: 8px;
383
+ padding: 1rem;
384
+ margin: 0.5rem 0;
385
+ border: 1px solid #e9ecef;
386
+ cursor: pointer;
387
+ transition: all 0.2s ease;
388
+ }
389
+
390
+ .example-card:hover {
391
+ background: #e9ecef;
392
+ transform: translateX(5px);
393
+ }
394
+
395
+ .status-box {
396
+ background: #e3f2fd;
397
+ border: 1px solid #2196f3;
398
+ border-radius: 8px;
399
+ padding: 1rem;
400
+ }
401
+
402
+ .pricing-info {
403
+ background: linear-gradient(135deg, #ffecd2 0%, #fcb69f 100%);
404
+ border-radius: 10px;
405
+ padding: 1rem;
406
+ text-align: center;
407
+ margin: 1rem 0;
408
+ }
409
+
410
+ .pricing-info h4 {
411
+ color: #d84315;
412
+ margin: 0 0 0.5rem 0;
413
+ }
414
+
415
+ .pricing-info p {
416
+ color: #bf360c;
417
+ margin: 0;
418
+ font-size: 0.9rem;
419
+ }
420
+ """
421
+
422
+ # Create interface
423
+ with gr.Blocks(title="AI Video Creator Pro", theme=gr.themes.Soft(), css=custom_css) as interface:
424
+
425
+ # Professional Header
426
+ with gr.Group(elem_classes="header"):
427
+ gr.Markdown("""
428
+ # 🎬 AI Video Creator Pro
429
+ ### Transform Your Ideas Into Stunning Videos with AI-Powered Generation
430
+ """)
431
+
432
+ with gr.Row():
433
+ with gr.Column(scale=2):
434
+ # Main Input Section
435
+ with gr.Group(elem_classes="feature-card"):
436
+ gr.Markdown("## 🎯 Video Generation")
437
+
438
+ prompt = gr.Textbox(
439
+ label="📝 Video Description",
440
+ placeholder="Describe the video you want to create... (e.g., 'A majestic dragon soaring through a mystical forest with glowing mushrooms')",
441
+ lines=3,
442
+ max_lines=5,
443
+ container=True
444
+ )
445
+
446
+ with gr.Row():
447
+ model_id = gr.Dropdown(
448
+ choices=generator.get_available_models(),
449
+ value=generator.get_available_models()[0],
450
+ label="🤖 AI Model",
451
+ info="Choose the AI model for video generation",
452
+ container=True
453
+ )
454
+
455
+ resolution = gr.Dropdown(
456
+ choices=["480P", "720P"],
457
+ value="480P",
458
+ label="📐 Resolution (Wan2.1 only)",
459
+ info="Select video resolution",
460
+ visible=False,
461
+ container=True
462
+ )
463
+
464
+ with gr.Row():
465
+ num_frames = gr.Slider(
466
+ minimum=8,
467
+ maximum=32,
468
+ value=16,
469
+ step=1,
470
+ label="🎞️ Video Length (Frames)",
471
+ info="More frames = longer video"
472
+ )
473
+
474
+ fps = gr.Slider(
475
+ minimum=4,
476
+ maximum=12,
477
+ value=8,
478
+ step=1,
479
+ label="⚡ FPS",
480
+ info="Frames per second"
481
+ )
482
+
483
+ with gr.Row():
484
+ num_inference_steps = gr.Slider(
485
+ minimum=10,
486
+ maximum=50,
487
+ value=25,
488
+ step=1,
489
+ label="🎨 Quality Steps",
490
+ info="More steps = better quality but slower"
491
+ )
492
+
493
+ guidance_scale = gr.Slider(
494
+ minimum=1.0,
495
+ maximum=20.0,
496
+ value=7.5,
497
+ step=0.5,
498
+ label="🎯 Guidance Scale",
499
+ info="Higher values = more prompt adherence"
500
+ )
501
+
502
+ seed = gr.Number(
503
+ label="🎲 Seed (Optional)",
504
+ value=None,
505
+ info="Set for reproducible results",
506
+ container=True
507
+ )
508
+
509
+ # Voice Section
510
+ with gr.Group(elem_classes="feature-card"):
511
+ gr.Markdown("## 🎤 Voice & Audio")
512
+
513
+ with gr.Row():
514
+ add_voice = gr.Checkbox(
515
+ label="🎵 Add Voice Narration",
516
+ value=True,
517
+ info="Enable to add professional voice-over"
518
+ )
519
+
520
+ voice_type = gr.Dropdown(
521
+ choices=generator.get_available_voices(),
522
+ value="Default (English)",
523
+ label="🗣️ Voice Type",
524
+ info="Select the voice for narration",
525
+ container=True
526
+ )
527
+
528
+ voice_script = gr.Textbox(
529
+ label="📜 Narration Script (Optional)",
530
+ placeholder="Enter your narration script here... (Leave blank to use video description)",
531
+ lines=2,
532
+ max_lines=3,
533
+ info="If left blank, the video description will be used as narration",
534
+ container=True
535
+ )
536
+
537
+ # Generate Button
538
+ generate_btn = gr.Button("🚀 Generate Professional Video", variant="primary", size="lg", elem_classes="generate-btn")
539
+
540
+ # Output Section
541
+ with gr.Group(elem_classes="feature-card"):
542
+ gr.Markdown("## 📺 Generated Video")
543
+ status_text = gr.Textbox(label="📊 Status", interactive=False, elem_classes="status-box")
544
+ video_output = gr.Video(label="🎬 Your Video", elem_classes="status-box")
545
+
546
+ with gr.Column(scale=1):
547
+ # Model Information
548
+ with gr.Group(elem_classes="model-info"):
549
+ gr.Markdown("## 🤖 AI Model Details")
550
+ model_info = gr.JSON(label="Current Model Specifications", elem_classes="model-info")
551
+
552
+ # Pricing Information
553
+ with gr.Group(elem_classes="pricing-info"):
554
+ gr.Markdown("## 💰 Pricing")
555
+ gr.Markdown("""
556
+ **Free Tier:** 5 videos per day
557
+
558
+ **Pro Plan:** $9.99/month
559
+ - Unlimited videos
560
+ - Priority processing
561
+ - HD quality
562
+ - Advanced features
563
+
564
+ **Enterprise:** Contact us
565
+ """)
566
+
567
+ # Examples
568
+ with gr.Group():
569
+ gr.Markdown("## 💡 Inspiration Examples")
570
+ examples = [
571
+ ["A beautiful sunset over the ocean with waves crashing on the shore"],
572
+ ["A cat playing with a ball of yarn in a cozy living room"],
573
+ ["A futuristic city with flying cars and neon lights"],
574
+ ["A butterfly emerging from a cocoon in a garden"],
575
+ ["A rocket launching into space with fire and smoke"],
576
+ ["Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage"],
577
+ ["A majestic dragon soaring through a mystical forest with glowing mushrooms"]
578
+ ]
579
+ gr.Examples(
580
+ examples=examples,
581
+ inputs=prompt,
582
+ label="Click to try these examples",
583
+ elem_classes="example-card"
584
+ )
585
+
586
+ # Features
587
+ with gr.Group():
588
+ gr.Markdown("## ✨ Features")
589
+ gr.Markdown("""
590
+ 🎬 **Multiple AI Models**
591
+ - State-of-the-art video generation
592
+ - Quality vs speed options
593
+
594
+ 🎤 **Professional Voice-Over**
595
+ - Multiple voice types
596
+ - Custom narration scripts
597
+
598
+ 🎨 **Advanced Controls**
599
+ - Quality settings
600
+ - Resolution options
601
+ - Reproducible results
602
+
603
+ ⚡ **Fast Processing**
604
+ - GPU acceleration
605
+ - Optimized pipelines
606
+ """)
607
+
608
+ # Event handlers
609
+ generate_btn.click(
610
+ fn=generate_video_interface,
611
+ inputs=[prompt, model_id, num_frames, fps, num_inference_steps, guidance_scale, seed, resolution, voice_script, voice_type, add_voice],
612
+ outputs=[video_output, status_text]
613
+ )
614
+
615
+ # Update model info when model changes
616
+ def update_model_info(model_id):
617
+ info = generator.get_model_info(model_id)
618
+ return info
619
+
620
+ # Show/hide resolution selector based on model
621
+ def update_resolution_visibility(model_id):
622
+ if model_id == "Wan-AI/Wan2.1-T2V-14B":
623
+ return gr.Dropdown(visible=True)
624
+ else:
625
+ return gr.Dropdown(visible=False)
626
+
627
+ model_id.change(
628
+ fn=update_model_info,
629
+ inputs=model_id,
630
+ outputs=model_info
631
+ )
632
+
633
+ model_id.change(
634
+ fn=update_resolution_visibility,
635
+ inputs=model_id,
636
+ outputs=resolution
637
+ )
638
+
639
+ # Load initial model info
640
+ interface.load(lambda: generator.get_model_info(generator.get_available_models()[0]), outputs=model_info)
641
+
642
+ return interface
643
+
644
+ # Create and launch the interface
645
+ interface = create_interface()
646
+ interface.launch(
647
+ server_name="0.0.0.0",
648
+ server_port=7860,
649
+ share=True,
650
+ show_error=True
651
+ )
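The voice-over step above (gTTS narration merged onto the rendered clip with moviepy) can be exercised on its own. A minimal sketch, assuming moviepy 1.x and an existing clip; the file names are placeholders:

from gtts import gTTS
from moviepy.editor import AudioFileClip, VideoFileClip

video_in, narration_mp3, video_out = "clip.mp4", "narration.mp3", "clip_with_voice.mp4"

# 1. Synthesize narration; gTTS only needs the text and a language code.
gTTS(text="A beautiful sunset over the ocean", lang="en").save(narration_mp3)

# 2. Attach the narration, trimmed to the clip length, and re-encode.
video = VideoFileClip(video_in)
audio = AudioFileClip(narration_mp3)
audio = audio.subclip(0, min(audio.duration, video.duration))
video.set_audio(audio).write_videofile(video_out, codec="libx264", audio_codec="aac")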
text-to-video-generator/requirements.txt ADDED
@@ -0,0 +1,16 @@
1
+ torch==2.2.2
2
+ torchvision==0.17.2
3
+ diffusers==0.27.2
4
+ transformers==4.39.3
5
+ accelerate==0.28.0
6
+ safetensors==0.4.2
7
+ opencv-python==4.9.0.80
8
+ pillow==10.3.0
9
+ numpy==1.24.4
10
+ gradio==4.25.0
11
+ huggingface-hub==0.23.0
12
+ xformers==0.0.25
13
+ imageio==2.34.0
14
+ imageio-ffmpeg==0.4.9
15
+ gTTS==2.5.1
16
+ moviepy==1.0.3
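A quick way to confirm that the pinned stack above resolves and imports together (a minimal smoke test, assuming the packages were installed with pip install -r requirements.txt):

import diffusers
import gradio
import moviepy
import torch

# Print the resolved versions so any mismatch with the pins above is obvious.
for name, module in [("torch", torch), ("diffusers", diffusers),
                     ("gradio", gradio), ("moviepy", moviepy)]:
    print(f"{name}=={module.__version__}")
print("CUDA available:", torch.cuda.is_available())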
text_to_video.py ADDED
@@ -0,0 +1,289 @@
1
+ import torch
2
+ import gradio as gr
3
+ from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler
4
+ from diffusers.utils import export_to_video
5
+ import numpy as np
6
+ from PIL import Image
7
+ import os
8
+ import logging
9
+
10
+ # Set up logging
11
+ logging.basicConfig(level=logging.INFO)
12
+ logger = logging.getLogger(__name__)
13
+
14
+ class TextToVideoGenerator:
15
+ def __init__(self):
16
+ self.pipeline = None
17
+ self.current_model = None
18
+ self.device = "cuda" if torch.cuda.is_available() else "cpu"
19
+ logger.info(f"Using device: {self.device}")
20
+
21
+ # Available models
22
+ self.models = {
23
+ "damo-vilab/text-to-video-ms-1.7b": {
24
+ "name": "DAMO Text-to-Video MS-1.7B",
25
+ "description": "Fast and efficient text-to-video model",
26
+ "max_frames": 16,
27
+ "fps": 8
28
+ },
29
+ "cerspense/zeroscope_v2_XL": {
30
+ "name": "Zeroscope v2 XL",
31
+ "description": "High-quality text-to-video model",
32
+ "max_frames": 24,
33
+ "fps": 6
34
+ },
35
+ "stabilityai/stable-video-diffusion-img2vid-xt": {
36
+ "name": "Stable Video Diffusion XT",
37
+ "description": "Image-to-video model (requires initial image)",
38
+ "max_frames": 25,
39
+ "fps": 6
40
+ }
41
+ }
42
+
43
+ def load_model(self, model_id):
44
+ """Load the specified model"""
45
+ if self.current_model == model_id and self.pipeline is not None:
46
+ return f"Model {self.models[model_id]['name']} is already loaded"
47
+
48
+ try:
49
+ logger.info(f"Loading model: {model_id}")
50
+
51
+ # Clear GPU memory if needed
52
+ if torch.cuda.is_available():
53
+ torch.cuda.empty_cache()
54
+
55
+ # Load pipeline
56
+ self.pipeline = DiffusionPipeline.from_pretrained(
57
+ model_id,
58
+ torch_dtype=torch.float16 if self.device == "cuda" else torch.float32,
59
+ variant="fp16" if self.device == "cuda" else None
60
+ )
61
+
62
+ # Move to device
63
+ self.pipeline = self.pipeline.to(self.device)
64
+
65
+ # Optimize scheduler for faster inference
66
+ if hasattr(self.pipeline, 'scheduler'):
67
+ self.pipeline.scheduler = DPMSolverMultistepScheduler.from_config(
68
+ self.pipeline.scheduler.config
69
+ )
70
+
71
+ # Enable memory efficient attention if available
72
+ if self.device == "cuda":
73
+ self.pipeline.enable_model_cpu_offload()
74
+ self.pipeline.enable_vae_slicing()
75
+
76
+ self.current_model = model_id
77
+ logger.info(f"Successfully loaded model: {model_id}")
78
+ return f"Successfully loaded {self.models[model_id]['name']}"
79
+
80
+ except Exception as e:
81
+ logger.error(f"Error loading model: {str(e)}")
82
+ return f"Error loading model: {str(e)}"
83
+
84
+ def generate_video(self, prompt, model_id, num_frames=16, fps=8, num_inference_steps=25, guidance_scale=7.5, seed=None):
85
+ """Generate video from text prompt"""
86
+ try:
87
+ # Load model if not already loaded
88
+ if self.current_model != model_id:
89
+ load_result = self.load_model(model_id)
90
+ if "Error" in load_result:
91
+ return None, load_result
92
+
93
+ # Set seed for reproducibility
94
+ if seed is not None:
95
+ torch.manual_seed(seed)
96
+ if torch.cuda.is_available():
97
+ torch.cuda.manual_seed(seed)
98
+
99
+ # Get model config
100
+ model_config = self.models[model_id]
101
+ num_frames = min(num_frames, model_config["max_frames"])
102
+ fps = model_config["fps"]
103
+
104
+ logger.info(f"Generating video with prompt: {prompt}")
105
+ logger.info(f"Parameters: frames={num_frames}, fps={fps}, steps={num_inference_steps}")
106
+
107
+ # Generate video
108
+ video_frames = self.pipeline(
109
+ prompt,
110
+ num_inference_steps=num_inference_steps,
111
+ guidance_scale=guidance_scale,
112
+ num_frames=num_frames
113
+ ).frames
114
+
115
+ # Convert to numpy array
116
+ video_frames = np.array(video_frames)
117
+
118
+ # Save video
119
+ output_path = f"generated_video_{seed if seed else 'random'}.mp4"
120
+ export_to_video(video_frames, output_path, fps=fps)
121
+
122
+ logger.info(f"Video saved to: {output_path}")
123
+ return output_path, f"Video generated successfully! Saved as {output_path}"
124
+
125
+ except Exception as e:
126
+ logger.error(f"Error generating video: {str(e)}")
127
+ return None, f"Error generating video: {str(e)}"
128
+
129
+ def get_available_models(self):
130
+ """Get list of available models"""
131
+ return list(self.models.keys())
132
+
133
+ def get_model_info(self, model_id):
134
+ """Get information about a specific model"""
135
+ if model_id in self.models:
136
+ return self.models[model_id]
137
+ return None
138
+
139
+ # Initialize the generator
140
+ generator = TextToVideoGenerator()
141
+
142
+ def create_interface():
143
+ """Create Gradio interface"""
144
+
145
+ def generate_video_interface(prompt, model_id, num_frames, fps, num_inference_steps, guidance_scale, seed):
146
+ if not prompt.strip():
147
+ return None, "Please enter a prompt"
148
+
149
+ return generator.generate_video(
150
+ prompt=prompt,
151
+ model_id=model_id,
152
+ num_frames=num_frames,
153
+ fps=fps,
154
+ num_inference_steps=num_inference_steps,
155
+ guidance_scale=guidance_scale,
156
+ seed=seed
157
+ )
158
+
159
+ # Create interface
160
+ with gr.Blocks(title="Text-to-Video Generator", theme=gr.themes.Soft()) as interface:
161
+ gr.Markdown("# Text-to-Video Generation with Hugging Face Models")
162
+ gr.Markdown("Generate videos from text descriptions using state-of-the-art AI models")
163
+
164
+ with gr.Row():
165
+ with gr.Column(scale=2):
166
+ # Input section
167
+ with gr.Group():
168
+ gr.Markdown("## Input Parameters")
169
+
170
+ prompt = gr.Textbox(
171
+ label="Text Prompt",
172
+ placeholder="Enter your video description here...",
173
+ lines=3,
174
+ max_lines=5
175
+ )
176
+
177
+ model_id = gr.Dropdown(
178
+ choices=generator.get_available_models(),
179
+ value=generator.get_available_models()[0],
180
+ label="Model",
181
+ info="Select the model to use for generation"
182
+ )
183
+
184
+ with gr.Row():
185
+ num_frames = gr.Slider(
186
+ minimum=8,
187
+ maximum=24,
188
+ value=16,
189
+ step=1,
190
+ label="Number of Frames",
191
+ info="More frames = longer video"
192
+ )
193
+
194
+ fps = gr.Slider(
195
+ minimum=4,
196
+ maximum=12,
197
+ value=8,
198
+ step=1,
199
+ label="FPS",
200
+ info="Frames per second"
201
+ )
202
+
203
+ with gr.Row():
204
+ num_inference_steps = gr.Slider(
205
+ minimum=10,
206
+ maximum=50,
207
+ value=25,
208
+ step=1,
209
+ label="Inference Steps",
210
+ info="More steps = better quality but slower"
211
+ )
212
+
213
+ guidance_scale = gr.Slider(
214
+ minimum=1.0,
215
+ maximum=20.0,
216
+ value=7.5,
217
+ step=0.5,
218
+ label="Guidance Scale",
219
+ info="Higher values = more prompt adherence"
220
+ )
221
+
222
+ seed = gr.Number(
223
+ label="Seed (Optional)",
224
+ value=None,
225
+ info="Set for reproducible results"
226
+ )
227
+
228
+ generate_btn = gr.Button("Generate Video", variant="primary", size="lg")
229
+
230
+ # Output section
231
+ with gr.Group():
232
+ gr.Markdown("## Output")
233
+ status_text = gr.Textbox(label="Status", interactive=False)
234
+ video_output = gr.Video(label="Generated Video")
235
+
236
+ with gr.Column(scale=1):
237
+ # Model information
238
+ with gr.Group():
239
+ gr.Markdown("## Model Information")
240
+ model_info = gr.JSON(label="Current Model Details")
241
+
242
+ # Examples
243
+ with gr.Group():
244
+ gr.Markdown("## Example Prompts")
245
+ examples = [
246
+ ["A beautiful sunset over the ocean with waves crashing on the shore"],
247
+ ["A cat playing with a ball of yarn in a cozy living room"],
248
+ ["A futuristic city with flying cars and neon lights"],
249
+ ["A butterfly emerging from a cocoon in a garden"],
250
+ ["A rocket launching into space with fire and smoke"]
251
+ ]
252
+ gr.Examples(
253
+ examples=examples,
254
+ inputs=prompt,
255
+ label="Try these examples"
256
+ )
257
+
258
+ # Event handlers
259
+ generate_btn.click(
260
+ fn=generate_video_interface,
261
+ inputs=[prompt, model_id, num_frames, fps, num_inference_steps, guidance_scale, seed],
262
+ outputs=[video_output, status_text]
263
+ )
264
+
265
+ # Update model info when model changes
266
+ def update_model_info(model_id):
267
+ info = generator.get_model_info(model_id)
268
+ return info
269
+
270
+ model_id.change(
271
+ fn=update_model_info,
272
+ inputs=model_id,
273
+ outputs=model_info
274
+ )
275
+
276
+ # Load initial model info
277
+ interface.load(lambda: generator.get_model_info(generator.get_available_models()[0]), outputs=model_info)
278
+
279
+ return interface
280
+
281
+ if __name__ == "__main__":
282
+ # Create and launch the interface
283
+ interface = create_interface()
284
+ interface.launch(
285
+ server_name="0.0.0.0",
286
+ server_port=7860,
287
+ share=True,
288
+ show_error=True
289
+ )
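For completeness, the generator class defined above can be driven without the Gradio UI. A minimal sketch, assuming this file is importable as text_to_video and that the chosen model can be downloaded (the first call fetches several GB of weights):

from text_to_video import TextToVideoGenerator

gen = TextToVideoGenerator()
model_id = gen.get_available_models()[0]  # "damo-vilab/text-to-video-ms-1.7b"

video_path, status = gen.generate_video(
    prompt="A rocket launching into space with fire and smoke",
    model_id=model_id,
    num_frames=16,
    num_inference_steps=25,
    guidance_scale=7.5,
    seed=42,
)
print(status)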