Blaze - Mini

Blaze is a text-to-music generation model capable of producing high-quality music samples from natural language prompts.
It is a single-stage autoregressive Transformer trained on tokens from a 32 kHz EnCodec tokenizer with 4 audio codebooks, each sampled at 50 Hz.

Unlike earlier methods that depend on intermediate semantic representations, Blaze directly predicts all 4 codebooks in a single forward pass.
By introducing a small delay between codebooks, Blaze can predict them in parallel, which leaves only about 50 autoregressive steps per second of audio.
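The delay idea can be illustrated with a small sketch. It assumes each codebook is shifted by exactly one step relative to the previous one, which this card does not state, and the names `PAD`, `apply_delay_pattern`, and `undo_delay_pattern` are hypothetical helpers, not part of Blaze or Transformers. Under that assumption, T frames over K codebooks take T + K - 1 decoding steps instead of the T × K steps needed when the codebooks are flattened into one sequence.

```python
# Minimal sketch of a codebook delay pattern (assumed layout, not Blaze's actual code).
PAD = -1  # hypothetical placeholder for positions that hold no token yet

FRAME_RATE_HZ = 50   # frames per second per codebook (from the model description)
NUM_CODEBOOKS = 4    # number of EnCodec codebooks (from the model description)


def apply_delay_pattern(codes, pad=PAD):
    """Shift codebook k right by k steps so all codebooks advance in parallel.

    codes: list of K lists, each holding T token ids.
    Returns K lists of length T + K - 1.
    """
    num_codebooks = len(codes)
    num_frames = len(codes[0])
    total_steps = num_frames + num_codebooks - 1
    delayed = []
    for k, row in enumerate(codes):
        right_pad = total_steps - num_frames - k
        delayed.append([pad] * k + list(row) + [pad] * right_pad)
    return delayed


def undo_delay_pattern(delayed):
    """Invert apply_delay_pattern and recover the aligned K x T layout."""
    num_codebooks = len(delayed)
    num_frames = len(delayed[0]) - num_codebooks + 1
    return [row[k:k + num_frames] for k, row in enumerate(delayed)]


# One second of dummy tokens: codebook k, frame t -> k * 1000 + t
codes = [[k * 1000 + t for t in range(FRAME_RATE_HZ)] for k in range(NUM_CODEBOOKS)]
delayed = apply_delay_pattern(codes)

steps_with_delay = len(delayed[0])                 # T + K - 1
steps_flattened = NUM_CODEBOOKS * FRAME_RATE_HZ    # T * K
print(steps_with_delay, steps_flattened)
```

The K - 1 extra steps are a fixed overhead per clip, so the cost approaches 50 steps per second of audio as the clip gets longer.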

🤗 Transformers Usage

You can use Blaze via the 🤗 Transformers text-to-audio pipeline:

1. Install required packages:

pip install --upgrade pip
pip install --upgrade transformers scipy

2. Run text-to-audio inference:

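from transformers import pipeline
import scipy.io.wavfile  # import the submodule explicitly so scipy.io.wavfile is always available

# Load the text-to-audio pipeline with the Blaze checkpoint
synthesizer = pipeline("text-to-audio", "SVECTOR-CORPORATION/Blaze")

# Generate a clip from the prompt; do_sample=True samples tokens instead of decoding greedily
music = synthesizer("lo-fi music with a soothing melody", forward_params={"do_sample": True})

# Save the waveform at the sampling rate reported by the pipeline
scipy.io.wavfile.write("blaze_output.wav", rate=music["sampling_rate"], data=music["audio"])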
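Clip length can be controlled through the number of generated tokens. The sketch below assumes that one generated step corresponds to one 50 Hz frame, as the model description suggests, and that the checkpoint accepts the standard `max_new_tokens` generation argument through `forward_params`. Neither is confirmed by this card. `max_new_tokens_for` and `generate_clip` are hypothetical helper names.

```python
import math

FRAME_RATE_HZ = 50  # frames per second of audio, from the model description


def max_new_tokens_for(seconds):
    """Approximate number of decoding steps for a clip of the given length.

    Assumes one generated step per 50 Hz frame; the few extra steps a
    codebook delay would add are ignored.
    """
    return math.ceil(seconds * FRAME_RATE_HZ)


def generate_clip(prompt, seconds):
    """Generate roughly `seconds` of audio for `prompt` (downloads the model)."""
    # Imported here so the helper above can be used without transformers installed.
    from transformers import pipeline

    synthesizer = pipeline("text-to-audio", "SVECTOR-CORPORATION/Blaze")
    return synthesizer(
        prompt,
        forward_params={
            "do_sample": True,
            "max_new_tokens": max_new_tokens_for(seconds),
        },
    )


print(max_new_tokens_for(5))
```

Calling `generate_clip("lo-fi music with a soothing melody", 5)` would then return the same dictionary as the example above, with the `"audio"` and `"sampling_rate"` keys.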

Intended Use

Primary Use:

Research on generative AI in music
Music prototyping guided by text
Exploring transformer models for creative generation

Out of Scope:

Commercial deployment without a license
Generation of harmful, biased, or culturally disrespectful content

Model size: 0.6B params (Safetensors)
Tensor type: F32