# XTTS v2 Mobile - TorchScript Edition
✨ **UPDATED:** Now with proper TorchScript models ready for mobile deployment!
Optimized XTTS v2 models exported to TorchScript format for direct mobile deployment on Android and iOS devices.
## 🎯 Key Features
- **TorchScript Format**: Self-contained `.ts` files that run directly on mobile
- **Optimized for Mobile**: Models processed with PyTorch Mobile optimizations
- **Multiple Variants**: Choose based on your device capabilities
- **17 Languages**: Full multilingual support maintained
- **24kHz Output**: High-quality audio generation
## 📦 Model Variants
| Variant | Size | Memory | Target Devices | Quality |
|---|---|---|---|---|
| Original | 1.16 GB | ~1.5 GB | High-end (4 GB+ RAM) | Best |
| FP16 | 581 MB | ~800 MB | Mid-range (3 GB+ RAM) | Excellent |
**Recommendation:** Use the FP16 variant for most devices; it offers the best balance of size, memory usage, and quality.
## 🚀 Quick Start
### Download Models
```python
from huggingface_hub import hf_hub_download

# Download FP16 variant (recommended)
model_path = hf_hub_download(
    repo_id="GenMedLabs/xtts-mobile",
    filename="fp16/xtts_infer_fp16.ts"
)
```
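If you would rather fetch the model on-device than ship it in the app bundle, here is a minimal Kotlin sketch; the `downloadModel` helper and the `xtts_model.ts` destination filename are illustrative (not part of this repo), and the resolve URL follows the same pattern as the React Native example further down.

```kotlin
import java.io.File
import java.net.URL

// Hypothetical helper: stream the TorchScript file from the Hugging Face
// resolve URL into app storage. Call from a background thread/coroutine,
// never from the UI thread.
fun downloadModel(destDir: File, variant: String = "fp16"): File {
    val url = "https://huggingface.co/GenMedLabs/xtts-mobile/resolve/main/" +
        "$variant/xtts_infer_$variant.ts?download=true"
    val dest = File(destDir, "xtts_model.ts")
    if (!dest.exists()) {
        URL(url).openStream().use { input ->
            dest.outputStream().use { output -> input.copyTo(output) }
        }
    }
    return dest
}
```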
### Android Integration (Kotlin)
```groovy
// Add to build.gradle
dependencies {
    implementation 'org.pytorch:pytorch_android_lite:2.1.0'
}
```

```kotlin
import android.content.Context
import org.pytorch.IValue
import org.pytorch.Module

// Load and use model
class XTTSModule(context: Context) {
    private var module: Module? = null

    fun initialize(modelPath: String) {
        module = Module.load(modelPath)
    }

    // Returns raw mono PCM samples at the model's 24 kHz output rate
    fun generateSpeech(text: String, language: String): FloatArray {
        val output = module?.forward(
            IValue.from(text),
            IValue.from(language)
        )?.toTensor()
        return output?.dataAsFloatArray ?: floatArrayOf()
    }
}
```
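The `FloatArray` returned by `generateSpeech` is raw mono PCM at the model's 24 kHz output rate, so it still needs to be handed to the audio stack. A minimal sketch using Android's `AudioTrack` (the `playAudio` helper is illustrative, not part of this repo):

```kotlin
import android.media.AudioAttributes
import android.media.AudioFormat
import android.media.AudioTrack

// Hypothetical helper: play the FloatArray returned by generateSpeech()
// as mono float PCM at the model's 24 kHz sample rate.
fun playAudio(samples: FloatArray, sampleRate: Int = 24_000) {
    val format = AudioFormat.Builder()
        .setEncoding(AudioFormat.ENCODING_PCM_FLOAT)
        .setSampleRate(sampleRate)
        .setChannelMask(AudioFormat.CHANNEL_OUT_MONO)
        .build()
    val track = AudioTrack.Builder()
        .setAudioAttributes(
            AudioAttributes.Builder()
                .setUsage(AudioAttributes.USAGE_MEDIA)
                .setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
                .build()
        )
        .setAudioFormat(format)
        .setBufferSizeInBytes(samples.size * Float.SIZE_BYTES)
        .setTransferMode(AudioTrack.MODE_STATIC)
        .build()
    // MODE_STATIC: write the whole clip first, then start playback
    track.write(samples, 0, samples.size, AudioTrack.WRITE_BLOCKING)
    track.play()
}
```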
### iOS Integration (Swift)
```swift
// LibTorch has no first-class Swift API: TorchModule here is the
// Objective-C wrapper from the PyTorch iOS examples, exposed to Swift
// via a bridging header and extended to accept a (text, language) pair.
class XTTSModule {
    private var module: TorchModule?

    func initialize(modelPath: String) {
        module = TorchModule(fileAtPath: modelPath)
    }

    func generateSpeech(text: String, language: String) -> [Float] {
        guard let module = module else { return [] }
        let output = module.forward([text, language])
        return output.toArray()
    }
}
```
### React Native Integration
```javascript
import RNFS from 'react-native-fs';
import { NativeModules } from 'react-native';

const { XTTSModule } = NativeModules; // native wrapper from the sections above

// Download model from HuggingFace
const HF_BASE = "https://huggingface.co/GenMedLabs/xtts-mobile/resolve/main";

async function downloadModel(variant = 'fp16') {
  const url = `${HF_BASE}/${variant}/xtts_infer_${variant}.ts?download=true`;
  const destPath = `${RNFS.DocumentDirectoryPath}/xtts_model.ts`;
  await RNFS.downloadFile({
    fromUrl: url,
    toFile: destPath,
    background: true
  }).promise;
  return destPath;
}

// Initialize native module
const modelPath = await downloadModel('fp16');
await XTTSModule.initialize(modelPath);

// Generate speech
const audio = await XTTSModule.speak("Hello world", "en");
```
## 📊 Memory Requirements
| Device RAM | Recommended Variant | Expected Performance |
|---|---|---|
| < 3 GB | FP16 with streaming | May require optimization |
| 3-4 GB | FP16 | Smooth performance |
| 4 GB+ | Original or FP16 | Excellent performance |
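To apply this table at runtime, you can inspect total device RAM via `ActivityManager`. A minimal Kotlin sketch, assuming the `chooseVariant` helper name and its 4 GB threshold are illustrative:

```kotlin
import android.app.ActivityManager
import android.content.Context

// Hypothetical helper: map total device RAM to a variant per the table above.
fun chooseVariant(context: Context): String {
    val am = context.getSystemService(Context.ACTIVITY_SERVICE) as ActivityManager
    val memInfo = ActivityManager.MemoryInfo()
    am.getMemoryInfo(memInfo)
    val totalGb = memInfo.totalMem / (1024.0 * 1024.0 * 1024.0)
    // FP16 also runs well on 4 GB+ devices; Original trades size for quality
    return if (totalGb >= 4.0) "original" else "fp16"
}
```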
## 🌍 Supported Languages
`en` (English), `es` (Spanish), `fr` (French), `de` (German), `it` (Italian), `pt` (Portuguese), `pl` (Polish), `tr` (Turkish), `ru` (Russian), `nl` (Dutch), `cs` (Czech), `ar` (Arabic), `zh` (Chinese), `ja` (Japanese), `ko` (Korean), `hu` (Hungarian), `hi` (Hindi)
## 🔧 Technical Details
- Model Architecture: XTTS v2 with GPT-style backbone
- Export Method: TorchScript with mobile optimizations
- PyTorch Version: 2.8.0 (use matching LibTorch version)
- Sample Rate: 24,000 Hz
- Quantization: FP16 uses half-precision floating point
## 💡 Tips for Mobile Deployment
**Memory Management** (see the load-once sketch below):
- Load the model once at app startup
- Keep the model in memory for multiple generations
- Use `module.setNumThreads(1)` to reduce memory usage
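One way to follow the first two tips is a lazily initialized app-wide holder. A minimal Kotlin sketch; the `XTTSHolder` object name and the double-checked locking pattern are illustrative, not part of this repo:

```kotlin
import org.pytorch.Module

// Hypothetical app-wide holder: load once at startup, reuse for every
// subsequent generation instead of re-loading the 581 MB+ model.
object XTTSHolder {
    @Volatile private var module: Module? = null

    fun get(modelPath: String): Module =
        module ?: synchronized(this) {
            module ?: Module.load(modelPath).also { module = it }
        }
}
```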
**Performance Optimization** (see the sketch after this list):
- Warm up the model with a dummy input on first load
- Use the FP16 variant for the best balance
- Consider chunking long texts
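A minimal sketch of the warm-up and chunking tips; the helper names, the dummy input, and the 200-character limit are illustrative assumptions, not values from this repo:

```kotlin
import org.pytorch.IValue
import org.pytorch.Module

// One dummy forward pass so later calls avoid first-run allocation cost.
fun warmUp(module: Module) {
    module.forward(IValue.from("Hello."), IValue.from("en"))
}

// Naive sentence-boundary chunking; real splitting may need
// per-language rules (e.g. for zh/ja, which lack these terminators).
fun chunkText(text: String, maxChars: Int = 200): List<String> {
    val chunks = mutableListOf<String>()
    var current = StringBuilder()
    for (sentence in text.split(Regex("(?<=[.!?])\\s+"))) {
        if (current.isNotEmpty() && current.length + sentence.length + 1 > maxChars) {
            chunks.add(current.toString())
            current = StringBuilder()
        }
        if (current.isNotEmpty()) current.append(' ')
        current.append(sentence)
    }
    if (current.isNotEmpty()) chunks.add(current.toString())
    return chunks
}
```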
**Error Handling:**
```kotlin
try {
    module = Module.load(modelPath)
} catch (e: Exception) {
    // Fall back to server-side TTS
    Log.e("XTTS", "Failed to load model: ${e.message}")
}
```
## 📝 Changelog
- **2024-09-23**: Initial release with TorchScript models
  - Added Original and FP16 variants
  - Optimized for PyTorch Mobile
  - Fixed compatibility issues
## 📄 License
Apache 2.0
## 🙏 Acknowledgments
Based on the official XTTS v2 model. Optimized for mobile deployment.
## 📚 Citation
```bibtex
@misc{xtts2024mobile,
  title={XTTS v2 Mobile - TorchScript Edition},
  author={GenMedLabs},
  year={2024},
  publisher={HuggingFace}
}
```
## ⚠️ Important Notes
- These are TorchScript models (`.ts` files), not PyTorch checkpoints (`.pth`)
- Models are self-contained and include all necessary weights
- No additional tokenizer files needed; tokenization is built into the model
- INT8 quantization is not available for ARM-based systems