saint marzi

ausntmarzi

AI & ML interests

None yet

Recent Activity

reacted to openfree's post with 🚀 about 2 months ago
🌏 Whisper-OCR Multilingual Translation Space 🚀
Welcome! This Space takes English audio, video, images, and PDFs and instantly converts them into Chinese (ZH), Thai (TH), and Russian (RU), with no other source language required.
https://huggingface.co/spaces/VIDraft/voice-trans
✨ Key Features
🎤 Microphone – Record English speech → transcript + 3-language translation
🔊 Audio File – Upload English audio → transcript + translation
🎬 Video File – Auto-extract audio with FFmpeg → transcript + translation
🖼️ Image – Nanonets-OCR pulls text → translation
📄 PDF – Up to 50 pages of text & tables → translation
🔄 Realtime Mode – Flush every 10-15 s; newest lines appear at the top
🛠️ Quick Start
Click “Duplicate” to fork, or launch directly.
Pick a tab (🎤/🔊/🎬/🖼️/📄/🔄) and feed it English input.
After a few seconds, see the 📜 original and 🌐 3-language translation side by side.
⚡ Tech Stack
openai/whisper-large-v3-turbo: fast, high-accuracy ASR
Nanonets-OCR-s (+ Flash Attention 2): document/image OCR
Gradio Blocks: clean tabbed UI
PyTorch + CUDA: auto GPU allocation & ThreadPool load balancing
📌 Notes
Translation quality depends on audio quality, lighting, and resolution.
Huge videos hit the HF Space upload cap (~2 GB).
Realtime tab requires browser microphone permission.
reacted to seawolf2357's post with 🔥 about 2 months ago
⚡ FusionX Enhanced Wan 2.1 I2V (14B) 🎬
🚀 Revolutionary Image-to-Video Generation Model
Generate cinematic-quality videos in just 8 steps!
https://huggingface.co/spaces/Heartsync/WAN2-1-fast-T2V-FusioniX
✨ Key Features
🎯 Ultra-Fast Generation: Premium quality in just 8-10 steps
🎬 Cinematic Quality: Smooth motion with detailed textures
🔥 FusionX Technology: Enhanced with CausVid + MPS Rewards LoRA
📐 Optimized Resolution: 576×1024 default settings
⚡ 50% Speed Boost: Faster rendering compared to base models
🛠️ Technical Stack
Base Model: Wan2.1 I2V 14B
Enhancement Technologies:
🔗 CausVid LoRA (1.0 strength) - Motion modeling
🔗 MPS Rewards LoRA (0.7 strength) - Detail optimization
Scheduler: UniPC Multistep (flow_shift=8.0)
Auto Prompt Enhancement: Automatic cinematic keyword injection
🎨 How to Use
Upload Image - Select your starting image
Enter Prompt - Describe desired motion and style
Adjust Settings - 8 steps, 2-5 seconds recommended
Generate - Complete in just minutes!
💡 Optimization Tips
✅ Recommended Settings: 8-10 steps, 576×1024 resolution
✅ Prompting: Use "cinematic motion, smooth animation" keywords
✅ Duration: 2-5 seconds for optimal quality
✅ Motion: Emphasize natural movement and camera work
🏆 FusionX Enhanced vs Standard Models
Performance Comparison: While standard models typically require 15-20 inference steps to achieve decent quality, our FusionX Enhanced version delivers premium results in just 8-10 steps - that's more than 50% faster! The rendering speed has been dramatically improved through optimized LoRA fusion, allowing creators to iterate quickly without sacrificing quality. Motion quality has been significantly enhanced with advanced causal modeling, producing smoother, more realistic animations compared to base implementations. Detail preservation is substantially better thanks to MPS Rewards training, maintaining crisp textures and consistent temporal coherence throughout the generated sequences.
liked a Space 2 months ago
aiqtech/flux-claude-monet-lora

Organizations

None yet