Clifford Heath committed
readme : add instructions on converting to GGML + "--no-config" to wget (#874)

Files changed:
- README.md +2 -0
- models/README.md +17 -5
- models/download-ggml-model.sh +1 -1
README.md
CHANGED

@@ -71,6 +71,8 @@ Then, download one of the Whisper models converted in [ggml format](models). For
 bash ./models/download-ggml-model.sh base.en
 ```
 
+If you wish to convert the Whisper models to ggml format yourself, instructions are in [models/README.md](models/README.md).
+
 Now build the [main](examples/main) example and transcribe an audio file like this:
 
 ```bash
models/README.md
CHANGED

@@ -1,15 +1,17 @@
 ## Whisper model files in custom ggml format
 
 The [original Whisper PyTorch models provided by OpenAI](https://github.com/openai/whisper/blob/main/whisper/__init__.py#L17-L27)
-
-using the [convert-pt-to-ggml.py](convert-pt-to-ggml.py) script.
-
-
+are converted to custom `ggml` format in order to be able to load them in C/C++.
+Conversion is performed using the [convert-pt-to-ggml.py](convert-pt-to-ggml.py) script.
+
+You can either obtain the original models and generate the `ggml` files yourself using the conversion script,
+or you can use the [download-ggml-model.sh](download-ggml-model.sh) script to download the already converted models.
+Currently, they are hosted on the following locations:
 
 - https://huggingface.co/ggerganov/whisper.cpp
 - https://ggml.ggerganov.com
 
-Sample
+Sample download:
 
 ```java
 $ ./download-ggml-model.sh base.en

@@ -21,6 +23,16 @@ You can now use it like this:
 $ ./main -m models/ggml-base.en.bin -f samples/jfk.wav
 ```
 
+To convert the files yourself, use the convert-pt-to-ggml.py script. Here is an example usage.
+The original PyTorch files are assumed to have been downloaded into ~/.cache/whisper.
+Change `~/path/to/repo/whisper/` to the location of your copy of the Whisper source:
+```
+mkdir models/whisper-medium
+python models/convert-pt-to-ggml.py ~/.cache/whisper/medium.pt ~/path/to/repo/whisper/ ./models/whisper-medium
+mv ./models/whisper-medium/ggml-model.bin models/ggml-medium.bin
+rmdir models/whisper-medium
+```
+
 A third option to obtain the model files is to download them from Hugging Face:
 
 https://huggingface.co/ggerganov/whisper.cpp/tree/main
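The four conversion commands added to models/README.md differ only in the model name, so they generalize to any Whisper model. As an illustrative sketch (not part of the commit), a small POSIX-shell helper that prints the recipe for a given model; the `~/.cache/whisper` and `~/path/to/repo/whisper/` paths are the README's placeholders, not values this repository defines:

```shell
#!/bin/sh
# Print the ggml conversion recipe for a given model name (e.g. "medium").
# The cache path and Whisper source-tree path are the README's placeholders.
print_conversion_steps() {
    model="$1"
    cat <<EOF
mkdir models/whisper-$model
python models/convert-pt-to-ggml.py ~/.cache/whisper/$model.pt ~/path/to/repo/whisper/ ./models/whisper-$model
mv ./models/whisper-$model/ggml-model.bin models/ggml-$model.bin
rmdir models/whisper-$model
EOF
}

# Print the steps for the "medium" model.
print_conversion_steps medium
```

Substituting a different model name (e.g. `base.en`) yields the same four steps with the matching checkpoint and output filenames.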
models/download-ggml-model.sh
CHANGED

@@ -62,7 +62,7 @@ if [ -f "ggml-$model.bin" ]; then
 fi
 
 if [ -x "$(command -v wget)" ]; then
-    wget --quiet --show-progress -O ggml-$model.bin $src/$pfx-$model.bin
+    wget --no-config --quiet --show-progress -O ggml-$model.bin $src/$pfx-$model.bin
 elif [ -x "$(command -v curl)" ]; then
    curl -L --output ggml-$model.bin $src/$pfx-$model.bin
 else
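The one-flag change above adds `--no-config`, which tells wget to skip reading wgetrc configuration files, so a user's local `~/.wgetrc` settings cannot alter the model download; the curl branch is unchanged. A sketch of the script's downloader fallback, with the command composed but not executed so the choice is visible (the `$src`/`$pfx` values here are illustrative stand-ins for the script's variables):

```shell
#!/bin/sh
# Compose the download command the script would run for a given tool.
# $src/$pfx are illustrative stand-ins for the variables defined earlier
# in download-ggml-model.sh.
src="https://huggingface.co/ggerganov/whisper.cpp"
pfx="resolve/main/ggml"

download_cmd() {
    tool="$1"
    model="$2"
    case "$tool" in
        # --no-config: ignore wgetrc files so user config cannot interfere
        wget) echo "wget --no-config --quiet --show-progress -O ggml-$model.bin $src/$pfx-$model.bin" ;;
        curl) echo "curl -L --output ggml-$model.bin $src/$pfx-$model.bin" ;;
        *)    echo "unknown downloader: $tool" >&2; return 1 ;;
    esac
}

# Show the wget form of the command for the base.en model.
download_cmd wget base.en
```

The real script picks the branch with `command -v wget` / `command -v curl`; this sketch just makes both resulting command lines inspectable.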