whitphx HF Staff committed on
Commit
58488b2
·
verified ·
1 Parent(s): dd43b2a

Add/update the quantized ONNX model files and README.md for Transformers.js v3

## Applied Quantizations

### ✅ Based on `decoder_model.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_bnb4.onnx` (added)

### ✅ Based on `encoder_model.onnx` *with* slimming

↳ ✅ `int8`: `encoder_model_int8.onnx` (added)
↳ ✅ `uint8`: `encoder_model_uint8.onnx` (added)
↳ ✅ `q4`: `encoder_model_q4.onnx` (added)
↳ ✅ `q4f16`: `encoder_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `encoder_model_bnb4.onnx` (added)

### ✅ Based on `decoder_with_past_model.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_with_past_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_with_past_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_with_past_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_with_past_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_with_past_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_with_past_model_bnb4.onnx` (added)

### ✅ Based on `decoder_model_merged.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_model_merged_fp16.onnx` (replaced because it was invalid)
↳ ✅ `int8`: `decoder_model_merged_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_merged_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_merged_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_merged_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_merged_bnb4.onnx` (added)
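
The file names listed above follow a `<base>_<quantization>.onnx` pattern inside the `onnx` folder. As a rough sketch, the lookup below records which variants this commit message lists for each base model and builds the matching path. `VARIANTS` and `variantPath` are hypothetical names introduced here for illustration; they are not part of Transformers.js, and the table only reflects this commit, not the full contents of the repository.

```js
// Variants listed in this commit message, keyed by base model name.
// Hypothetical table for illustration; other files may already exist in the repo.
const VARIANTS = {
  decoder_model: ['fp16', 'int8', 'uint8', 'q4', 'q4f16', 'bnb4'],
  encoder_model: ['int8', 'uint8', 'q4', 'q4f16', 'bnb4'],
  decoder_with_past_model: ['fp16', 'int8', 'uint8', 'q4', 'q4f16', 'bnb4'],
  decoder_model_merged: ['fp16', 'int8', 'uint8', 'q4', 'q4f16', 'bnb4'],
};

// Hypothetical helper: returns the path for a listed variant, or null otherwise.
function variantPath(baseName, quantization) {
  const listed = VARIANTS[baseName];
  if (!listed || !listed.includes(quantization)) {
    return null;
  }
  return `onnx/${baseName}_${quantization}.onnx`;
}

console.log(variantPath('decoder_model_merged', 'q4f16'));
console.log(variantPath('encoder_model', 'fp16'));
```

The second call returns `null` because this commit message does not list an `fp16` file for `encoder_model`.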

README.md CHANGED
@@ -6,4 +6,30 @@ pipeline_tag: summarization

  https://huggingface.co/sshleifer/distilbart-xsum-12-1 with ONNX weights to be compatible with Transformers.js.

+ ## Usage (Transformers.js)
+
+ If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+ ```bash
+ npm i @huggingface/transformers
+ ```
+
+ **Example:** Summarization.
+
+ ```js
+ import { pipeline } from '@huggingface/transformers';
+
+ const generator = await pipeline('summarization', 'Xenova/distilbart-xsum-12-1');
+ const text = 'The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, ' +
+   'and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. ' +
+   'During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest ' +
+   'man-made structure in the world, a title it held for 41 years until the Chrysler Building in New ' +
+   'York City was finished in 1930. It was the first structure to reach a height of 300 metres. Due to ' +
+   'the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the ' +
+   'Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second ' +
+   'tallest free-standing structure in France after the Millau Viaduct.';
+ const output = await generator(text, {
+   max_new_tokens: 100,
+ });
+ ```
+
  Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
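
The README example loads the model with default settings. To pick one of the quantized files added in this commit, Transformers.js v3 accepts a `dtype` option, either as a single string or as a per-module map. The sketch below assumes the pipeline loads the `encoder_model` and `decoder_model_merged` files for this model; the choice of `int8` and `q4` is only illustrative, and `summarize` is a hypothetical wrapper name.

```js
// Per-module quantization choices; both variants are listed in this commit.
// The module names are an assumption about which files the pipeline loads.
const dtype = {
  encoder_model: 'int8',
  decoder_model_merged: 'q4',
};

// Hypothetical wrapper: nothing is downloaded until it is called.
async function summarize(text) {
  const { pipeline } = await import('@huggingface/transformers');
  const generator = await pipeline(
    'summarization',
    'Xenova/distilbart-xsum-12-1',
    { dtype },
  );
  const output = await generator(text, { max_new_tokens: 100 });
  // The summarization pipeline returns an array of { summary_text } objects.
  return output[0].summary_text;
}
```

Calling `await summarize(text)` with the `text` from the README example should return a short summary string. Quality and speed will vary with the variants chosen.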
onnx/decoder_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c82c1e2509ea12a6eb7b8d8b697649170120c2487d5dbe7f63c5b0d29cdd42ac
+ size 219665904
onnx/decoder_model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:42d66b7df0f5f352b8a69e77aca8b13b79e38cdfd3c0b0bdabeac6f42583d30b
+ size 138697486
onnx/decoder_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d491670c2427b5c696f3e9497521a16cfb06dbc07dea791e08fb3971cc611ba9
+ size 69454603
onnx/decoder_model_merged_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8ec006d46c1a6a9db4f567ebe41666fa4b591e3b8cffe80b6d3de7bfbac29f08
+ size 219927012
onnx/decoder_model_merged_fp16.onnx CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:044e711f23718d2e8860bed6a355ea10524163af46044ecca83e5c1f34c5014a
- size 138893241
+ oid sha256:9130a5d87b7b2b0e91b026e1fbeadf3c9ffdc8c208087da65af8d84549e0e725
+ size 138858593
onnx/decoder_model_merged_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:35f777b6639c8d0e36d97a449310c913a3f027aeea72617ac1d28347940f2706
+ size 69725287
onnx/decoder_model_merged_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ab431fdcb8128bb57bc9b52c400a4b871badfd7e9f6fcf04a4c099d31a6916f2
+ size 220975435
onnx/decoder_model_merged_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d27bf8ba768757054d51347f9c1880a9c033e65434af73a9a65700731f2d0540
+ size 114743918
onnx/decoder_model_merged_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:206db7864c0e7c0462f62cc1fb281cee7496f52f4cbf6311f002097cfd41b721
+ size 69725290
onnx/decoder_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cb5af8fd31d07a58a8269037b0f7de5e839009a2cac962b8235f3f260b13bf0c
+ size 220714399
onnx/decoder_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1dff6c1607537c26962e7519f828d2e46d29e89ae75c45236a4054c8831bdb2c
+ size 114581722
onnx/decoder_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2efe620ec8ac2b90837c2fe9bf64a2df641bdca79505b916ba944c14ecd9fc01
+ size 69454606
onnx/decoder_with_past_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:917f0ab5ebac934548f806e58d6552e52ff67067c57030f701a29236b00e4487
+ size 218463803
onnx/decoder_with_past_model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:579ffaacf1704d2a6693879e29c006bd1a1ec08e4f0ddfe021b463bfad2d8156
+ size 134485751
onnx/decoder_with_past_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:56e50aa90b35f40681527893e4c43f3571d159e510d5647c58a2c1c855d4990b
+ size 67333183
onnx/decoder_with_past_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:795597068d487ded4221107519343328a18ad4ef1f34099aa961adfefdcf64dd
+ size 219381242
onnx/decoder_with_past_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:26f69364e32efdb35120d2d4120985820b073263263cec7e023a063af18619fa
+ size 113384351
onnx/decoder_with_past_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d664a13fbd6ecf13015f15e29f539d5839bb08481878b224244e452d37f1ebae
+ size 67333186
onnx/encoder_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d40e5630221e57e673ea22610f893a75b34021880731372d46c2d1aac6bd913c
+ size 295912467
onnx/encoder_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d304f1422d1e4a418d028e6fbfea39c5cbef90cceec4e37a5ea704ea1897d04f
+ size 204472178
onnx/encoder_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b9fd017b3489a8025c10c8d9973f5cf18ec085ab7d2e6b9aa61f7e4b9927fa0b
+ size 305349063
onnx/encoder_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8bf8d0d2be3ed2027a91840ffe05f73457948da88ee8177ff907549b1a81f52a
+ size 190546742
onnx/encoder_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9ebe9d46358044e69908239d67ccaae2ebf759cdc1ea4c4f76d9e17406ed46fd
+ size 204472217
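
Each ONNX entry in this diff is a Git LFS pointer: three `key value` lines giving the spec version, the object ID (hash algorithm and digest), and the file size in bytes. As a minimal sketch, the hypothetical `parseLfsPointer` helper below reads one pointer and converts its size to MiB, using the `onnx/decoder_model_int8.onnx` pointer from this commit as input.

```js
// Hypothetical helper: parse a Git LFS pointer file into its fields.
function parseLfsPointer(text) {
  const fields = {};
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (trimmed === '') continue;
    // Each line is "<key> <value>", separated by the first space.
    const space = trimmed.indexOf(' ');
    if (space === -1) {
      throw new Error(`Malformed pointer line: ${trimmed}`);
    }
    fields[trimmed.slice(0, space)] = trimmed.slice(space + 1);
  }
  // The oid value has the form "<algorithm>:<hex digest>".
  const [algorithm, digest] = (fields.oid || '').split(':');
  return {
    version: fields.version,
    algorithm,
    digest,
    size: Number(fields.size),
  };
}

// Pointer contents for onnx/decoder_model_int8.onnx, copied from this commit.
const pointer = parseLfsPointer([
  'version https://git-lfs.github.com/spec/v1',
  'oid sha256:d491670c2427b5c696f3e9497521a16cfb06dbc07dea791e08fb3971cc611ba9',
  'size 69454603',
].join('\n'));

const sizeMiB = pointer.size / (1024 * 1024);
console.log(`${pointer.algorithm} digest, ${sizeMiB.toFixed(1)} MiB`);
```

Applying the same conversion to the other pointers gives a quick way to compare download sizes across the quantized variants.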