Blaze - Mini

Blaze is a text-to-music generation model capable of producing high-quality music samples from natural language prompts.
It is a single-stage, auto-regressive Transformer trained over a 32 kHz EnCodec tokenizer using 4 audio codebooks sampled at 50 Hz.
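
The rates quoted above determine how many tokens represent a clip. The sketch below only does that arithmetic; the helper name `token_budget` is hypothetical and not part of any Blaze or Transformers API.

```python
# Figures taken from the model description above.
SAMPLE_RATE_HZ = 32_000  # audio sample rate of the EnCodec tokenizer
FRAME_RATE_HZ = 50       # codebook frames per second
NUM_CODEBOOKS = 4        # tokens emitted per frame


def token_budget(seconds):
    """Hypothetical helper: sizes of the waveform and token grid for a clip."""
    frames = round(seconds * FRAME_RATE_HZ)
    return {
        "samples": round(seconds * SAMPLE_RATE_HZ),
        "frames": frames,
        "tokens": frames * NUM_CODEBOOKS,
        # Each frame summarises this many waveform samples.
        "samples_per_frame": SAMPLE_RATE_HZ // FRAME_RATE_HZ,
    }


budget = token_budget(10)
print(budget)
```

For a 10-second clip this gives 500 frames and 2,000 tokens, with each frame covering 640 audio samples.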

Unlike earlier methods that depend on intermediate semantic representations, Blaze directly predicts all 4 codebooks in a single forward pass.
By introducing a slight delay between codebooks, Blaze achieves efficient parallel generation, reducing autoregressive steps to just 50 per second of audio.
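
To illustrate the idea, here is a minimal sketch of a codebook delay pattern. It assumes a delay of one step per codebook (codebook k is shifted by k steps) and uses a hypothetical `PAD` placeholder; the text does not state the exact offsets Blaze uses, so treat this as an illustration rather than the model's implementation.

```python
PAD = -1  # hypothetical placeholder for "no token at this step"


def apply_delay_pattern(codes, pad=PAD):
    """Shift codebook k right by k steps (assumed offset of one step each).

    codes: list of K token lists, one per codebook, each of length T.
    Returns K lists of length T + K - 1.
    """
    num_codebooks = len(codes)
    return [
        [pad] * k + list(row) + [pad] * (num_codebooks - 1 - k)
        for k, row in enumerate(codes)
    ]


def undo_delay_pattern(delayed):
    """Invert apply_delay_pattern, recovering the aligned K x T grid."""
    num_codebooks = len(delayed)
    num_frames = len(delayed[0]) - (num_codebooks - 1)
    return [row[k:k + num_frames] for k, row in enumerate(delayed)]


# One second of audio: 50 frames, 4 codebooks (dummy token values).
frames_per_second = 50
num_codebooks = 4
codes = [[100 * k + t for t in range(frames_per_second)] for k in range(num_codebooks)]

delayed = apply_delay_pattern(codes)
steps_delayed = len(delayed[0])                       # one column per decoding step
steps_flattened = frames_per_second * num_codebooks   # one token per decoding step
print(steps_delayed, steps_flattened)
```

Under these assumptions each decoding step emits one token per codebook, so a clip needs T + K - 1 steps instead of the T × K steps required when codebooks are flattened into a single sequence. The K - 1 extra steps are a fixed overhead, which is why the rate approaches 50 steps per second of audio.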


🤗 Transformers Usage

You can use Blaze via the 🤗 Transformers text-to-audio pipeline:

1. Install required packages:

pip install --upgrade pip
pip install --upgrade transformers scipy

2. Run text-to-audio inference:

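from transformers import pipeline
import scipy.io.wavfile  # import the submodule explicitly so wavfile.write is available

# Load the text-to-audio pipeline with the Blaze checkpoint
synthesizer = pipeline("text-to-audio", "SVECTOR-CORPORATION/Blaze")

# Sample from the model instead of decoding greedily
music = synthesizer("lo-fi music with a soothing melody", forward_params={"do_sample": True})

# Save the generated waveform at the sampling rate reported by the pipeline
scipy.io.wavfile.write("blaze_output.wav", rate=music["sampling_rate"], data=music["audio"])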
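
The example above does not set a clip length. A rough sketch of one way to do so follows, assuming one generation step per 50 Hz frame and assuming the Blaze checkpoint honors the standard `max_new_tokens` generation argument (not confirmed by this card). The helpers `steps_for_duration` and `generate_clip` are hypothetical names introduced for illustration.

```python
FRAME_RATE_HZ = 50  # frames per second, from the model description


def steps_for_duration(seconds, frame_rate_hz=FRAME_RATE_HZ):
    """Hypothetical helper: generation steps for a target duration in seconds."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return int(round(seconds * frame_rate_hz))


def generate_clip(prompt, seconds, model_id="SVECTOR-CORPORATION/Blaze"):
    """Hypothetical wrapper around the pipeline call shown above."""
    # Imported lazily so steps_for_duration works without transformers installed.
    from transformers import pipeline

    synthesizer = pipeline("text-to-audio", model_id)
    return synthesizer(
        prompt,
        forward_params={
            "do_sample": True,
            # Assumption: the model's generate() accepts max_new_tokens.
            "max_new_tokens": steps_for_duration(seconds),
        },
    )


print(steps_for_duration(5))
```

Calling `generate_clip("lo-fi music with a soothing melody", seconds=5)` would then request about five seconds of audio. The resulting duration is approximate, since the codebook delay consumes a few steps.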

Intended Use

Primary Use:

  • Research on generative AI in music
  • Music prototyping guided by text
  • Exploring transformer models for creative generation

Out of Scope:

  • Commercial deployment without a license
  • Generation of harmful, biased, or culturally disrespectful content

Model size: 591M params (Safetensors, tensor type F32)