---
license: mit
---
# SenseVoice.cpp Jetson Nano Binaries

**SenseVoice.cpp** is a high-performance, open-source C++ speech-to-text implementation aimed at edge devices. It leverages the [GGML](https://github.com/ggerganov/ggml) inference framework and supports multiple backends, including CUDA for GPU acceleration.

This repository hosts prebuilt binaries optimized for **NVIDIA Jetson Nano**, so you can skip the build step and start transcribing right away.

Original project: [https://github.com/lovemefan/SenseVoice.cpp](https://github.com/lovemefan/SenseVoice.cpp)

---

## ✨ Key Features

* **Multi-language ASR**: Supports Chinese (Mandarin), Cantonese, English, Japanese, and Korean.
* **Low latency**: Efficient inference with optional **flash-attn**.
* **Quantization**: Q3, Q4, Q5, Q6, Q8 quantized models to reduce memory footprint.
* **Flexible backends**:
  * CPU (all platforms)
  * CUDA (NVIDIA GPUs)
  * BLAS, Metal, Vulkan (supported upstream; the libraries shipped here cover only CPU and CUDA)
* **Voice Activity Detection (VAD)**: Built-in silence-based VAD with tunable parameters.
* **Inverse Text Normalization (ITN)**: Optionally output punctuation and formatted text.

*For full feature details (streaming mode, extra backends), see the [upstream documentation](https://github.com/lovemefan/SenseVoice.cpp/blob/main/README.md).*

---

## 📁 Deliverable Directory Structure

```text
project-root/
├── bin/                     # Executables
│   ├── sense-voice-main     # Main ASR program
│   ├── sense-voice-quantize # Model quantization utility
│   └── sense-voice-zcr-main # Zero-Crossing Rate detection example
└── lib/                     # Libraries
    ├── libcommon.a          # Common static library
    ├── libggml-base.so      # GGML base operations
    ├── libggml-cpu.so       # GGML CPU support
    ├── libggml-cuda.so      # GGML CUDA support
    ├── libggml.so           # GGML core
    └── libsense-voice-core.a # SenseVoice core
```

* **bin/**: Standalone executables for Jetson Nano.
* **lib/**: Shared (`.so`) libraries that the executables load at runtime, plus static (`.a`) libraries for linking your own programs.

---

## 🚀 Quick Deployment

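Follow these steps to deploy and run on Ubuntu-based systems such as JetPack 4.5.1 on Jetson Nano. Steps 2 and 3 apply only if you maintain the repository and push binaries; skip them if you only want to run the prebuilt files.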

### 1. Clone the Repo with Git LFS Support

If you haven’t installed Git LFS yet, do so and initialize:

```bash
# Add the Git LFS package repository and install the client
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs
# Enable Git LFS for your user account (only needed once)
git lfs install
```

Clone the repository:

```bash
git clone https://huggingface.co/<YOUR_USERNAME>/sensevoice-jetson-nano.git
cd sensevoice-jetson-nano
git lfs pull
```

### 2. Track Large Binary Files with Git LFS

If you maintain this repository, make sure the shared libraries are tracked by Git LFS; otherwise pushes of large files may be rejected:

```bash
git lfs track "lib/*.so"
git add .gitattributes
```
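To confirm that the tracking rule took effect before committing, you can ask Git which filter applies to a path. This is a minimal sketch: the helper name `is_lfs_tracked` is hypothetical, while `git check-attr` is a standard Git command.

```bash
# Hypothetical helper: succeeds when Git would route the given path through LFS.
is_lfs_tracked() {
  # `git check-attr filter -- PATH` prints "PATH: filter: VALUE";
  # VALUE is "lfs" for tracked paths and "unspecified" otherwise.
  [ "$(git check-attr filter -- "$1" | awk '{print $NF}')" = "lfs" ]
}

# Usage (run inside the repository):
#   is_lfs_tracked lib/libggml.so && echo "tracked by LFS"
```

The check reads only `.gitattributes`, so it works even before the file itself has been added.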

### 3. Uploading New Binaries

When you update or add new `.so` files in `lib/`, commit and push as usual:

```bash
git add lib/*.so
git commit -m "Add updated shared libraries via LFS"
git push
```

### 4. Make Binaries Executable

```bash
chmod +x bin/*
```

### 5. Install Shared Libraries System-wide

```bash
sudo mkdir -p /usr/local/lib/sensevoice
sudo cp lib/*.so /usr/local/lib/sensevoice/
echo "/usr/local/lib/sensevoice" | sudo tee /etc/ld.so.conf.d/sensevoice.conf
sudo ldconfig
```

Alternatively, set `LD_LIBRARY_PATH` locally:

```bash
export LD_LIBRARY_PATH="$PWD/lib:$LD_LIBRARY_PATH"
```
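Whichever method you choose, it can help to verify that the dynamic loader resolves every library before running the program. The sketch below relies on the standard `ldd` tool; the helper name `missing_libs` is hypothetical.

```bash
# Hypothetical helper: list shared libraries the dynamic loader cannot resolve.
missing_libs() {
  # ldd marks unresolved dependencies with "not found";
  # `|| true` keeps the exit status 0 when grep finds no such line.
  ldd "$1" 2>/dev/null | grep "not found" || true
}

# Usage:
#   missing_libs bin/sense-voice-main
# Empty output means every library was found.
```

If a `libggml*.so` entry is listed, re-run `sudo ldconfig` or check that `LD_LIBRARY_PATH` points at the `lib/` directory.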

### 6. Model Setup

Download or convert a GGUF model (e.g., `sense-voice-small-q4_k.gguf`):

```bash
# From Hugging Face (Git LFS must be installed so the .gguf files are actually downloaded)
git clone https://huggingface.co/lovemefan/sense-voice-gguf.git models
```
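If Git LFS was not active during the clone, the `.gguf` files may be small text pointer files rather than real models. A quick sanity check is to look at the first four bytes, which are the ASCII characters `GGUF` in a valid GGUF file. The helper name `is_gguf` below is hypothetical.

```bash
# Hypothetical helper: succeeds when the file starts with the GGUF magic bytes.
is_gguf() {
  [ "$(head -c 4 "$1" 2>/dev/null)" = "GGUF" ]
}

# Usage:
#   is_gguf models/sense-voice-small-q4_k.gguf \
#     || echo "not a GGUF file; try running 'git lfs pull' inside models/"
```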

### 7. Run Examples

#### Speech-to-Text (non-streaming)

```bash
bin/sense-voice-main \
  -m models/sense-voice-small-q4_k.gguf \
  -f input.wav \
  -t 4 \
  -l zh \
  --use-itn \
  --flash-attn
```

**Options**:

* `-t N` / `--threads N`: Number of decode threads (default: 4)
* `-l LANG` / `--language LANG`: `auto`, `zh`, `en`, `yue`, `ja`, `ko`
* `--min_speech_duration_ms`, `--max_speech_duration_ms`: Minimum and maximum speech segment length used by the VAD, in milliseconds
* `--no-gpu` (`-ng`): Disable GPU
* `--use-itn` (`-itn`): Enable inverse text normalization
* `--flash-attn` (`-fa`): Enable Flash Attention decoder

#### Quantization Utility

```bash
bin/sense-voice-quantize \
  --input models/sense-voice-small.bin \
  --output models/sense-voice-small-q4_k.gguf \
  --type q4_k
```

Supported quant types: `q3`, `q4_k`, `q4_0`, `q5_0`, `q6_k`, `q8`.

#### Zero-Crossing Rate Demo

```bash
bin/sense-voice-zcr-main input.wav
```

*For streaming ASR or advanced examples, please refer to upstream's `sense-voice-stream` in the original repo.*

---

## 🛠 Compatibility

* **Hardware**: NVIDIA Jetson Nano
* **OS**: Ubuntu 18.04 / JetPack 4.5.1
* **CUDA**: 10.2
* **C++**: C++17
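
Before deploying, you may want to check that the target machine matches this list. The sketch below is hedged: the helper names are hypothetical, and the two version files are where L4T/JetPack and CUDA 10.x usually record their versions, so either may be absent on other systems.

```bash
# Hypothetical helper: succeeds on 64-bit ARM, which is what Jetson Nano reports.
# Pass a value explicitly to test a specific architecture string.
is_supported_arch() {
  [ "${1:-$(uname -m)}" = "aarch64" ]
}

# Print a short platform report; missing files are silently skipped.
report_platform() {
  if is_supported_arch; then
    echo "arch: aarch64 (ok)"
  else
    echo "arch: $(uname -m) (these binaries target aarch64)"
  fi
  [ -r /etc/nv_tegra_release ] && head -n 1 /etc/nv_tegra_release
  [ -r /usr/local/cuda/version.txt ] && cat /usr/local/cuda/version.txt
  return 0
}

# Usage:
#   report_platform
```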

---

## 📜 License

MIT License. See [LICENSE](LICENSE) for details.

For comprehensive build instructions, extra examples, and advanced backend support, visit the [official SenseVoice.cpp documentation](https://github.com/lovemefan/SenseVoice.cpp/blob/main/docs/build.md). Happy prototyping! 🎙️💕