saint marzi
ausntmarzi
4 followers · 1 following
AI & ML interests
None yet
Recent Activity
reacted to openfree's post about 2 months ago
Whisper-OCR Multilingual Translation Space

Welcome! This Space takes English audio, video, images, and PDFs and converts them into Chinese (ZH), Thai (TH), and Russian (RU). English is the only supported source language.
https://huggingface.co/spaces/VIDraft/voice-trans

Key Features
Microphone: Record English speech → transcript + 3-language translation
Audio File: Upload English audio → transcript + translation
Video File: Auto-extract audio with FFmpeg → transcript + translation
Image: Nanonets-OCR pulls text → translation
PDF: Up to 50 pages of text & tables → translation
Realtime Mode: Flush every 10-15 s; newest lines appear at the top

Quick Start
Click "Duplicate" to fork, or launch directly.
Pick a tab (Microphone, Audio File, Video File, Image, PDF, or Realtime Mode) and feed it English input.
After a few seconds, see the original and the 3-language translation side by side.

Tech Stack
openai/whisper-large-v3-turbo: fast, high-accuracy ASR
Nanonets-OCR-s (+ Flash Attention 2): document/image OCR
Gradio Blocks: clean tabbed UI
PyTorch + CUDA: auto GPU allocation & ThreadPool load balancing

Notes
Translation quality depends on audio quality, lighting, and resolution.
Huge videos hit the HF Space upload cap (~2 GB).
Realtime tab requires browser microphone permission.
reacted to seawolf2357's post with 🔥 about 2 months ago
FusionX Enhanced Wan 2.1 I2V (14B)

Revolutionary Image-to-Video Generation Model
Generate cinematic-quality videos in just 8 steps!
https://huggingface.co/spaces/Heartsync/WAN2-1-fast-T2V-FusioniX

Key Features
Ultra-Fast Generation: Premium quality in just 8-10 steps
Cinematic Quality: Smooth motion with detailed textures
FusionX Technology: Enhanced with CausVid + MPS Rewards LoRA
Optimized Resolution: 576×1024 default settings
50% Speed Boost: Faster rendering compared to base models

Technical Stack
Base Model: Wan2.1 I2V 14B
Enhancement Technologies:
CausVid LoRA (1.0 strength) - Motion modeling
MPS Rewards LoRA (0.7 strength) - Detail optimization
Scheduler: UniPC Multistep (flow_shift=8.0)
Auto Prompt Enhancement: Automatic cinematic keyword injection

How to Use
Upload Image - Select your starting image
Enter Prompt - Describe desired motion and style
Adjust Settings - 8 steps, 2-5 seconds recommended
Generate - Complete in just minutes!

Optimization Tips
Recommended Settings: 8-10 steps, 576×1024 resolution
Prompting: Use "cinematic motion, smooth animation" keywords
Duration: 2-5 seconds for optimal quality
Motion: Emphasize natural movement and camera work

FusionX Enhanced vs Standard Models
Performance Comparison: While standard models typically require 15-20 inference steps to achieve decent quality, our FusionX Enhanced version delivers premium results in just 8-10 steps - that's more than 50% faster! The rendering speed has been dramatically improved through optimized LoRA fusion, allowing creators to iterate quickly without sacrificing quality. Motion quality has been significantly enhanced with advanced causal modeling, producing smoother, more realistic animations compared to base implementations. Detail preservation is substantially better thanks to MPS Rewards training, maintaining crisp textures and consistent temporal coherence throughout the generated sequences.
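The step counts quoted in the post (15-20 for standard models, 8-10 for the FusionX version) can be turned into a rough bound on the savings. The sketch below is only an illustration: `step_reduction` is a hypothetical helper, not part of any Wan or FusionX code, and it assumes each inference step costs about the same time in both setups, which the post does not state.

```python
def step_reduction(baseline_steps: int, reduced_steps: int) -> float:
    """Return the fraction of inference steps saved relative to the baseline."""
    if baseline_steps <= 0 or reduced_steps <= 0:
        raise ValueError("step counts must be positive")
    if reduced_steps > baseline_steps:
        raise ValueError("reduced_steps must not exceed baseline_steps")
    return 1.0 - reduced_steps / baseline_steps


# Step ranges as quoted in the post (assumed to be directly comparable).
BASELINE_RANGE = (15, 20)
FUSIONX_RANGE = (8, 10)

# Least favourable pairing: fewest baseline steps against most FusionX steps.
smallest_saving = step_reduction(BASELINE_RANGE[0], FUSIONX_RANGE[1])
# Most favourable pairing: most baseline steps against fewest FusionX steps.
largest_saving = step_reduction(BASELINE_RANGE[1], FUSIONX_RANGE[0])

print(f"steps saved: {smallest_saving:.0%} to {largest_saving:.0%}")
```

Under that assumption the saving in step count ranges from about a third to 60%, so the "more than 50%" figure holds only toward the favourable end of the quoted ranges.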
liked a Space 2 months ago: aiqtech/flux-claude-monet-lora
Organizations
None yet
spaces
2
Running
Pulsar Display
🔥
16 X 16 DOT
Running
8
Vibe Coding Tetris
vibe coding
models
0
None public yet
datasets
0
None public yet