Add/update the quantized ONNX model files and README.md for Transformers.js v3 (#3)
- Add/update the quantized ONNX model files and README.md for Transformers.js v3 (f0a7df05368edd7199197c3cf609b0bced620d5b)
Co-authored-by: Yuichiro Tachibana <[email protected]>
- README.md +3 -1
- onnx/model_bnb4.onnx +3 -0
- onnx/model_int8.onnx +3 -0
- onnx/model_q4.onnx +3 -0
- onnx/model_q4f16.onnx +3 -0
- onnx/model_uint8.onnx +3 -0
README.md
CHANGED
@@ -48,7 +48,9 @@ console.log(output.tolist());
 
 By default, an 8-bit quantized version of the model is used, but you can choose to use the full-precision (fp32) version by specifying `{ dtype: 'fp32' }` in the `pipeline` function:
 ```js
-const extractor = await pipeline('feature-extraction', 'Xenova/gte-small', {
+const extractor = await pipeline('feature-extraction', 'Xenova/gte-small', {
+    dtype: 'fp32' // Options: "fp32", "fp16", "q8", "q4"
+});
 ```
 
 ---
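As a rough sketch of how the updated README snippet might be used end to end, the helper below checks a `dtype` against the four options named in the README comment before building the extractor. The names `README_DTYPES`, `pipelineOptions`, and `embed` are hypothetical, and the `@huggingface/transformers` package name and the `pooling`/`normalize` call options are assumptions that do not appear in this diff.

```js
// dtype values listed in the README comment added by this commit.
const README_DTYPES = ['fp32', 'fp16', 'q8', 'q4'];

// Hypothetical helper: build the options object passed to pipeline(),
// rejecting any dtype the README does not mention.
function pipelineOptions(dtype) {
  if (!README_DTYPES.includes(dtype)) {
    throw new RangeError(
      `Unsupported dtype "${dtype}"; expected one of ${README_DTYPES.join(', ')}`
    );
  }
  return { dtype };
}

// Hypothetical wrapper around the README usage.
async function embed(texts, dtype = 'fp32') {
  // Assumption: Transformers.js v3 is installed as '@huggingface/transformers'.
  const { pipeline } = await import('@huggingface/transformers');
  const extractor = await pipeline(
    'feature-extraction',
    'Xenova/gte-small',
    pipelineOptions(dtype)
  );
  // Assumption: mean pooling with normalization; these options are not part of this diff.
  const output = await extractor(texts, { pooling: 'mean', normalize: true });
  return output.tolist();
}
```

Calling `embed(['some text'], 'q4')` would request the 4-bit weights; the model is only downloaded when `embed` is actually invoked.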
onnx/model_bnb4.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:06bf293aba7dc80ddaab6c15fc647310302504d502d504fff773a3f107116986
+size 60147542
onnx/model_int8.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f1337f30686b7e7a410ec5b3ff2c1e814c74d0c92ef69be3512eab1e9ce545b0
+size 33760831
onnx/model_q4.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:21f088eba0a3a6942efbd11ac4bf6fa697c5fcbd2ea81d27764f22df6d873fe1
+size 61474190
onnx/model_q4f16.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c55901040c7ebbc26df6933a54bb8feb79053496153c06dc1b013b0406278e0c
+size 36190171
onnx/model_uint8.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bbebccc991415aa73dec524b3dca5f8b51eaad2f23b0be374f146c739aa6f69b
+size 33760859
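Each added `.onnx` entry above is a Git LFS pointer (three `key value` lines: `version`, `oid`, `size`) rather than the model binary itself. A minimal sketch of reading such a pointer, for example to compare download sizes across quantizations, might look like the following. `parseLfsPointer` is a hypothetical helper, not part of Git LFS or Transformers.js, and it only accepts the `sha256` oid form seen in this commit.

```js
// Hypothetical helper: parse the text of a Git LFS pointer file
// into { version, sha256, size }.
function parseLfsPointer(text) {
  const fields = {};
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (trimmed === '') continue;
    // Each pointer line is "<key> <value>", split on the first space.
    const space = trimmed.indexOf(' ');
    if (space === -1) {
      throw new SyntaxError(`Malformed pointer line: "${trimmed}"`);
    }
    fields[trimmed.slice(0, space)] = trimmed.slice(space + 1);
  }
  const oidMatch = /^sha256:([0-9a-f]{64})$/.exec(fields.oid || '');
  const sizeOk = /^\d+$/.test(fields.size || '');
  if (!fields.version || !oidMatch || !sizeOk) {
    throw new SyntaxError('Not a valid Git LFS pointer');
  }
  return {
    version: fields.version,
    sha256: oidMatch[1],
    size: Number(fields.size), // size of the real file in bytes
  };
}

// Pointer contents for onnx/model_q4f16.onnx, copied from this commit.
const pointer = parseLfsPointer([
  'version https://git-lfs.github.com/spec/v1',
  'oid sha256:c55901040c7ebbc26df6933a54bb8feb79053496153c06dc1b013b0406278e0c',
  'size 36190171',
].join('\n'));

console.log(`model_q4f16.onnx: ${(pointer.size / (1024 * 1024)).toFixed(1)} MiB`);
```

The same function can be applied to the other four pointers to see that the `int8` and `uint8` files are the smallest in this commit and `q4` the largest.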