ggerganov committed
Commit 05261df (unverified) · 1 parent: c71bca8

extra : compute SHA of all models files

Files changed (3):
  1. README.md +24 -17
  2. extra/sha-all.sh +7 -0
  3. models/README.md +14 -0
README.md CHANGED
@@ -59,8 +59,8 @@ For a quick demo, simply run `make base.en`:
 ```java
 $ make base.en
 
-cc -I. -O3 -std=c11 -pthread -DGGML_USE_ACCELERATE -c ggml.c
-c++ -I. -I./examples -O3 -std=c++11 -pthread -c whisper.cpp
+cc -I. -O3 -std=c11 -pthread -DGGML_USE_ACCELERATE -c ggml.c -o ggml.o
+c++ -I. -I./examples -O3 -std=c++11 -pthread -c whisper.cpp -o whisper.o
 c++ -I. -I./examples -O3 -std=c++11 -pthread examples/main/main.cpp whisper.o ggml.o -o main -framework Accelerate
 ./main -h
 
@@ -70,13 +70,17 @@ options:
 -h, --help show this help message and exit
 -s SEED, --seed SEED RNG seed (default: -1)
 -t N, --threads N number of threads to use during computation (default: 4)
+-p N, --processors N number of processors to use during computation (default: 1)
 -ot N, --offset-t N time offset in milliseconds (default: 0)
 -on N, --offset-n N segment index offset (default: 0)
+-mc N, --max-context N maximum number of text context tokens to store (default: max)
+-wt N, --word-thold N word timestamp probability threshold (default: 0.010000)
 -v, --verbose verbose output
 --translate translate from source language to english
 -otxt, --output-txt output result in a text file
 -ovtt, --output-vtt output result in a vtt file
 -osrt, --output-srt output result in a srt file
+-owts, --output-words output word-level timestamps to a text file
 -ps, --print_special print special tokens
 -pc, --print_colors print colors
 -nt, --no_timestamps do not print timestamps
@@ -86,7 +90,7 @@ options:
 
 bash ./models/download-ggml-model.sh base.en
 Downloading ggml model base.en ...
-ggml-base.en.bin 100%[========================>] 141.11M 6.34MB/s in 24s
+ggml-base.en.bin 100%[========================>] 141.11M 6.34MB/s in 24s
 Done! Model 'base.en' saved in 'models/ggml-base.en.bin'
 You can now use it like this:
 
@@ -114,23 +118,26 @@ whisper_model_load: n_text_layer = 6
 whisper_model_load: n_mels = 80
 whisper_model_load: f16 = 1
 whisper_model_load: type = 2
-whisper_model_load: mem_required = 505.00 MB
+whisper_model_load: mem_required = 670.00 MB
 whisper_model_load: adding 1607 extra tokens
-whisper_model_load: ggml ctx size = 163.43 MB
+whisper_model_load: ggml ctx size = 140.60 MB
 whisper_model_load: memory size = 22.83 MB
 whisper_model_load: model size = 140.54 MB
 
+system_info: n_threads = 4 / 10 | AVX2 = 0 | AVX512 = 0 | NEON = 1 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 |
+
-main: processing 'samples/jfk.wav' (176000 samples, 11.0 sec), 4 threads, lang = en, task = transcribe, timestamps = 1 ...
+main: processing 'samples/jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 processors, lang = en, task = transcribe, timestamps = 1 ...
 
-[00:00.000 --> 00:11.000] And so my fellow Americans, ask not what your country can do for you, ask what you can do for your country.
+[00:00:00.000 --> 00:00:11.000] And so my fellow Americans, ask not what your country can do for you, ask what you can do for your country.
 
 
-whisper_print_timings: load time = 87.21 ms
-whisper_print_timings: mel time = 24.26 ms
-whisper_print_timings: sample time = 3.87 ms
-whisper_print_timings: encode time = 323.67 ms / 53.94 ms per layer
-whisper_print_timings: decode time = 83.25 ms / 13.87 ms per layer
-whisper_print_timings: total time = 522.66 ms
+whisper_print_timings: load time = 105.91 ms
+whisper_print_timings: mel time = 24.62 ms
+whisper_print_timings: sample time = 3.63 ms
+whisper_print_timings: encode time = 324.71 ms / 54.12 ms per layer
+whisper_print_timings: decode time = 83.58 ms / 13.93 ms per layer
+whisper_print_timings: total time = 542.81 ms
 ```
 
 The command downloads the `base.en` model converted to custom `ggml` format and runs the inference on all `.wav` samples in the folder `samples`.
@@ -172,8 +179,8 @@ make large
 
 | Model | Disk | Mem | SHA |
 | --- | --- | --- | --- |
-| tiny | 75 MB | ~280 MB | `bd577a113a864445d4c299885e0cb97d4ba92b5f` |
-| base | 142 MB | ~430 MB | `465707469ff3a37a2b9b8d8f89f2f99de7299dac` |
+| tiny | 75 MB | ~390 MB | `bd577a113a864445d4c299885e0cb97d4ba92b5f` |
+| base | 142 MB | ~500 MB | `465707469ff3a37a2b9b8d8f89f2f99de7299dac` |
 | small | 466 MB | ~1.0 GB | `55356645c2b361a969dfd0ef2c5a50d530afd8d5` |
 | medium | 1.5 GB | ~2.6 GB | `fd9727b6e1217c2f614f9b698455c4ffd82463b4` |
 | large | 2.9 GB | ~4.7 GB | `b1caaf735c4cc1429223d5a74f0f4d0b9b59a299` |
@@ -185,7 +192,7 @@ in about half a minute on a MacBook M1 Pro, using `medium.en` model:
 
 <details>
 <summary>Expand to see the result</summary>
-
+
 ```java
 $ ./main -m models/ggml-medium.en.bin -f samples/gb1.wav -t 8
 
@@ -315,7 +322,7 @@ https://user-images.githubusercontent.com/1991296/199337538-b7b0c7a3-2753-4a88-a
 ## Implementation details
 
 - The core tensor operations are implemented in C ([ggml.h](ggml.h) / [ggml.c](ggml.c))
-- The high-level C-style API is implemented in C++ ([whisper.h](whisper.h) / [whisper.cpp](whisper.cpp))
+- The transformer model and the high-level C-style API are implemented in C++ ([whisper.h](whisper.h) / [whisper.cpp](whisper.cpp))
 - Sample usage is demonstrated in [main.cpp](examples/main)
 - Sample real-time audio transcription from the microphone is demonstrated in [stream.cpp](examples/stream)
 - Various other examples are available in the [examples](examples) folder
extra/sha-all.sh ADDED
@@ -0,0 +1,7 @@
+#!/bin/bash
+
+# Compute the SHA1 of all model files in ./models/ggml-*.bin
+
+for f in ./models/ggml-*.bin; do
+    shasum "$f" -a 1
+done
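The script above only prints hashes; a natural companion is to compare a file's hash against an expected value. This is a minimal sketch (not part of the commit) using the same `shasum -a 1` invocation; the helper name and the example path/hash in the comment are illustrative:

```shell
#!/bin/bash
# verify_sha1: hypothetical helper that checks whether a file's SHA1
# matches an expected value, printing OK or MISMATCH accordingly.

verify_sha1() {
    local file="$1" expected="$2" actual
    # shasum prints "<hash>  <file>"; keep only the hash field
    actual=$(shasum -a 1 "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual, want $expected)" >&2
        return 1
    fi
}

# Illustrative usage (path and hash taken from the models table):
# verify_sha1 ./models/ggml-base.en.bin 465707469ff3a37a2b9b8d8f89f2f99de7299dac
```

`shasum` is used rather than `sha1sum` to match the commit's script; it ships with macOS and most Linux distributions as part of Perl.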
models/README.md CHANGED
@@ -22,6 +22,20 @@ A third option to obtain the model files is to download them from Hugging Face:
 
 https://huggingface.co/datasets/ggerganov/whisper.cpp/tree/main
 
+## Available models
+
+| Model | Disk | Mem | SHA |
+| --- | --- | --- | --- |
+| tiny | 75 MB | ~390 MB | `bd577a113a864445d4c299885e0cb97d4ba92b5f` |
+| tiny.en | 75 MB | ~390 MB | `c78c86eb1a8faa21b369bcd33207cc90d64ae9df` |
+| base | 142 MB | ~500 MB | `465707469ff3a37a2b9b8d8f89f2f99de7299dac` |
+| base.en | 142 MB | ~500 MB | `137c40403d78fd54d454da0f9bd998f78703390c` |
+| small | 466 MB | ~1.0 GB | `55356645c2b361a969dfd0ef2c5a50d530afd8d5` |
+| small.en | 466 MB | ~1.0 GB | `db8a495a91d927739e50b3fc1cc4c6b8f6c2d022` |
+| medium | 1.5 GB | ~2.6 GB | `fd9727b6e1217c2f614f9b698455c4ffd82463b4` |
+| medium.en | 1.5 GB | ~2.6 GB | `8c30f0e44ce9560643ebd10bbe50cd20eafd3723` |
+| large | 2.9 GB | ~4.7 GB | `b1caaf735c4cc1429223d5a74f0f4d0b9b59a299` |
+
 ## Model files for testing purposes
 
 The model files prefixed with `for-tests-` are empty (i.e. do not contain any weights) and are used by the CI for testing purposes.
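The SHA column in the table above lets a user confirm that a downloaded model is intact. As a sketch (not part of the commit; the function name and directory argument are illustrative), the table can be turned into a check that verifies whichever model files are present:

```shell
#!/bin/bash
# check_models: hypothetical helper that compares the SHA1 of each
# existing ggml-<model>.bin file in a directory against the expected
# values from the models table, skipping files that are absent.

check_models() {
    local dir="$1" model sha f actual
    while read -r model sha; do
        f="$dir/ggml-$model.bin"
        [ -f "$f" ] || continue
        actual=$(shasum -a 1 "$f" | awk '{print $1}')
        if [ "$actual" = "$sha" ]; then
            echo "$model: OK"
        else
            echo "$model: MISMATCH (got $actual)"
        fi
    done <<'EOF'
tiny bd577a113a864445d4c299885e0cb97d4ba92b5f
tiny.en c78c86eb1a8faa21b369bcd33207cc90d64ae9df
base 465707469ff3a37a2b9b8d8f89f2f99de7299dac
base.en 137c40403d78fd54d454da0f9bd998f78703390c
small 55356645c2b361a969dfd0ef2c5a50d530afd8d5
small.en db8a495a91d927739e50b3fc1cc4c6b8f6c2d022
medium fd9727b6e1217c2f614f9b698455c4ffd82463b4
medium.en 8c30f0e44ce9560643ebd10bbe50cd20eafd3723
large b1caaf735c4cc1429223d5a74f0f4d0b9b59a299
EOF
}

check_models ./models
```

A here-document is used instead of a bash 4 associative array so the sketch also runs under the bash 3.2 that macOS ships.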