test #17
opened by ravivarmai

README.md CHANGED
@@ -1,5 +1,4 @@
 ---
-library_name: mistral-common
 language:
 - en
 - fr
@@ -10,18 +9,20 @@ language:
 - nl
 - hi
 license: apache-2.0
+library_name: vllm
 inference: false
 extra_gated_description: >-
   If you want to learn more about how we process your personal data, please read
   our <a href="https://mistral.ai/terms/">Privacy Policy</a>.
+pipeline_tag: audio-text-to-text
 tags:
--
+- transformers
 ---
 # Voxtral Mini 1.0 (3B) - 2507
 
 Voxtral Mini is an enhancement of [Ministral 3B](https://mistral.ai/news/ministraux), incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translation and audio understanding.
 
-Learn more about Voxtral in our blog post [here](https://mistral.ai/news/voxtral)
+Learn more about Voxtral in our blog post [here](https://mistral.ai/news/voxtral).
 
 ## Key Features
 
@@ -63,10 +64,10 @@ We recommend using this model with [vLLM](https://github.com/vllm-project/vllm).
 
 #### Installation
 
-Make sure to install vllm
+Make sure to install vllm from "main", we recommend using `uv`:
 
 ```
-uv pip install -U "vllm[audio]" --
+uv pip install -U "vllm[audio]" --torch-backend=auto --extra-index-url https://wheels.vllm.ai/nightly
 ```
 
 Doing so should automatically install [`mistral_common >= 1.8.1`](https://github.com/mistralai/mistral-common/releases/tag/v1.8.1).
@@ -241,11 +242,11 @@ print(response)
 
 ### Transformers 🤗
 
-
+Voxtral is supported in Transformers natively!
 
-Install Transformers:
+Install Transformers from source:
 ```bash
-pip install
+pip install git+https://github.com/huggingface/transformers
 ```
 
 Make sure to have `mistral-common >= 1.8.1` installed with audio dependencies:
@@ -511,7 +512,7 @@ repo_id = "mistralai/Voxtral-Mini-3B-2507"
 processor = AutoProcessor.from_pretrained(repo_id)
 model = VoxtralForConditionalGeneration.from_pretrained(repo_id, torch_dtype=torch.bfloat16, device_map=device)
 
-inputs = processor.
+inputs = processor.apply_transcrition_request(language="en", audio="https://huggingface.co/datasets/hf-internal-testing/dummy-audio-samples/resolve/main/obama.mp3", model_id=repo_id)
 inputs = inputs.to(device, dtype=torch.bfloat16)
 
 outputs = model.generate(**inputs, max_new_tokens=500)
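The front-matter edits in this diff set `library_name: vllm`, add `pipeline_tag: audio-text-to-text`, and register a `transformers` tag, which is what the Hub uses to pick the serving library and the pipeline widget. A minimal stdlib-only sketch of how those keys could be extracted from the patched front matter (the parser and the abridged `FRONT_MATTER` sample below are illustrative, not part of the repo; the real front matter also carries more languages and the gated-access description):

```python
# Abridged version of the README front matter as it looks after this PR.
FRONT_MATTER = """\
---
language:
- en
- fr
license: apache-2.0
library_name: vllm
inference: false
pipeline_tag: audio-text-to-text
tags:
- transformers
---
"""

def parse_front_matter(text):
    """Collect top-level `key: value` pairs and `- item` lists between the --- fences.

    A deliberately tiny illustration, not a full YAML parser.
    """
    meta, current_list_key = {}, None
    for line in text.splitlines():
        if line.strip() == "---":
            continue
        if line.startswith("- ") and current_list_key:
            # Continuation item of the most recent list-valued key.
            meta[current_list_key].append(line[2:].strip())
        elif ":" in line:
            key, _, value = line.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value
                current_list_key = None
            else:
                # Bare `key:` opens a list (e.g. `language:`, `tags:`).
                meta[key] = []
                current_list_key = key
    return meta

meta = parse_front_matter(FRONT_MATTER)
print(meta["library_name"])   # -> vllm
print(meta["pipeline_tag"])   # -> audio-text-to-text
print(meta["tags"])           # -> ['transformers']
```

Checking the parsed keys this way makes it easy to confirm the patch removed `library_name: mistral-common` in favor of `vllm` and that the new tag list is well-formed.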