Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens

Overview

Spark-TTS is a text-to-speech system built on a large language model (LLM) for accurate, natural-sounding voice synthesis. It is designed to be efficient and flexible for both research and production use.

This repository provides the model from https://huggingface.co/SparkAudio/Spark-TTS-0.5B with ONNX weights.

Usage

python test_spark_tts.py --text "Your text to synthesize" --model_dir "/path/to/model"
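
The same invocation can be driven from Python. The sketch below is illustrative only: it assumes the `huggingface_hub` package is installed, that `test_spark_tts.py` is in the current working directory, and that the weights live in the `Fhrozen/Spark-TTS-0.5B-ONNX` repository. The helper names `download_model`, `build_command`, and `synthesize` are hypothetical and not part of this project; only the `--text` and `--model_dir` flags come from the usage line above.

```python
import subprocess
import sys


def download_model(repo_id="Fhrozen/Spark-TTS-0.5B-ONNX", local_dir="./Spark-TTS-0.5B-ONNX"):
    """Fetch the model files from the Hugging Face Hub (repo id is an assumption)."""
    # Imported lazily so the rest of the module works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


def build_command(text, model_dir, script="test_spark_tts.py"):
    """Build the argument list for the usage line shown above."""
    return [sys.executable, script, "--text", text, "--model_dir", str(model_dir)]


def synthesize(text, model_dir):
    """Run the script and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_command(text, model_dir), check=True)
```

A typical call would be `synthesize("Your text to synthesize", download_model())`. Passing the arguments as a list rather than a single shell string avoids quoting problems when the text contains spaces or quotes.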