rpanchum committed on
Commit 400f089 · verified · 1 Parent(s): 8ec02c1

Upload folder using huggingface_hub
README.md ADDED
@@ -0,0 +1,87 @@
# Falcon3-7B-Instruct OpenVINO INT4

This repository contains the [tiiuae/Falcon3-7B-Instruct](https://huggingface.co/tiiuae/Falcon3-7B-Instruct) model optimized for inference with Intel's OpenVINO runtime. The weights have been quantized to INT4 using the AWQ quantization scheme for improved performance while maintaining quality.

## Model Details

* **Original Model**: [tiiuae/Falcon3-7B-Instruct](https://huggingface.co/tiiuae/Falcon3-7B-Instruct)
* **Model Type**: Instruction-tuned Large Language Model
* **Parameters**: 7B
* **Quantization**: INT4 symmetric AWQ (Activation-aware Weight Quantization)
* **Group Size**: -1 (per-channel quantization)

## Optimization Details

This model was converted from the original Hugging Face model to OpenVINO format using the Optimum Intel library, with the following command:

```bash
optimum-cli export openvino \
    -m tiiuae/Falcon3-7B-Instruct \
    --weight-format int4 \
    --sym \
    --dataset auto \
    --awq \
    --group-size -1 \
    falcon3-7b-instruct-int4-sym-ov
```

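To build intuition for what `--sym --group-size -1` means, here is a minimal pure-Python sketch of symmetric per-channel INT4 weight quantization. This is illustrative only (the actual quantization is performed by NNCF inside Optimum Intel, with AWQ additionally rescaling weights using activation statistics):

```python
def quantize_int4_sym_per_channel(w):
    """Symmetric INT4 with group_size=-1: one scale per output channel (row)."""
    q_rows, scales = [], []
    for row in w:
        # Map the largest |weight| in the channel to the INT4 edge value 7,
        # so the grid {-7..7} is symmetric around zero (no zero-point needed).
        scale = max(abs(x) for x in row) / 7.0
        q_rows.append([max(-8, min(7, round(x / scale))) for x in row])
        scales.append(scale)
    return q_rows, scales

w = [[0.5, -1.2, 3.1, -0.7],
     [0.02, -0.04, 0.01, 0.03]]
q, scales = quantize_int4_sym_per_channel(w)
# Dequantize: multiply each channel's codes by its scale
w_hat = [[qi * s for qi in row] for row, s in zip(q, scales)]
print(q[0])
```

Because the scale is chosen per channel, a row of tiny weights (like the second one above) gets its own small scale instead of being flattened to zero by a shared scale.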
## Usage

### Prerequisites

- OpenVINO 2024.0 or newer
- optimum-intel
- transformers

### Sample inference code with Optimum Intel

```python
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

# Load tokenizer and model
model_id = "rpanchum/falcon3-7b-instruct-int4-sym-ov"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = OVModelForCausalLM.from_pretrained(model_id)

# Generate text (do_sample=True is needed for temperature/top_p to take effect)
prompt = "Write a short story about a robot learning to paint:"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
response = tokenizer.decode(output[0], skip_special_tokens=True)
print(response)
```
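Since Falcon3-7B-Instruct is a chat model, prompts usually work better when wrapped in its chat format. In practice you should use `tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")`, since the exported tokenizer carries the template; the sketch below only mirrors, in simplified form, the `<|system|>`/`<|user|>`/`<|assistant|>` layout that template produces:

```python
def format_falcon3_chat(messages, add_generation_prompt=True):
    # Simplified mirror of Falcon3's chat template layout; prefer
    # tokenizer.apply_chat_template in real code (it also handles EOS tokens).
    parts = [f"<|{m['role']}|>\n{m['content']}\n" for m in messages]
    if add_generation_prompt:
        parts.append("<|assistant|>\n")  # cue the model to answer
    return "".join(parts)

prompt = format_falcon3_chat([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a short story about a robot learning to paint."},
])
print(prompt)
```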

### Sample inference code with OpenVINO GenAI

1. Install packages required for using OpenVINO GenAI.

```bash
pip install openvino-genai huggingface_hub
```

2. Download the model and run inference.

```python
import huggingface_hub as hf_hub
import openvino_genai as ov_genai

model_id = "rpanchum/falcon3-7b-instruct-int4-sym-ov"
model_path = "falcon3-7b-instruct-int4-sym-ov"

hf_hub.snapshot_download(model_id, local_dir=model_path)

device = "CPU"  # use "GPU" to run on an Intel GPU
pipe = ov_genai.LLMPipeline(model_path, device)
print(pipe.generate("What is OpenVINO?", max_length=200))
```
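As a rough sanity check on what INT4 weight compression buys: 7B parameters at bf16 occupy about 14 GB, while 4-bit weights occupy about 3.5 GB before per-channel scales and any higher-precision tensors are added, which is consistent with the ~4.1 GB `openvino_model.bin` in this repository. The back-of-envelope arithmetic:

```python
params = 7e9                   # 7B parameters
bf16_gb = params * 2 / 1e9     # bf16: 2 bytes per weight
int4_gb = params * 0.5 / 1e9   # int4: 4 bits = 0.5 bytes per weight
print(f"bf16 ~{bf16_gb:.0f} GB, int4 ~{int4_gb:.1f} GB, ratio {bf16_gb / int4_gb:.0f}x")
```

Exact on-disk size depends on which tensors stay in higher precision (embeddings, scales), so the real file is somewhat larger than the 3.5 GB floor.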

## License

This model inherits the license of the original [tiiuae/Falcon3-7B-Instruct](https://huggingface.co/tiiuae/Falcon3-7B-Instruct) model.
config.json ADDED
@@ -0,0 +1,30 @@
{
  "_name_or_path": "tiiuae/Falcon3-7B-Instruct",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 11,
  "eos_token_id": 11,
  "head_dim": 256,
  "hidden_act": "silu",
  "hidden_size": 3072,
  "initializer_range": 0.02,
  "intermediate_size": 23040,
  "max_position_embeddings": 32768,
  "mlp_bias": false,
  "model_type": "llama",
  "num_attention_heads": 12,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 1000042,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.48.3",
  "use_cache": true,
  "vocab_size": 131072
}
generation_config.json ADDED
@@ -0,0 +1,6 @@
{
  "_from_model_config": true,
  "bos_token_id": 11,
  "eos_token_id": 11,
  "transformers_version": "4.48.3"
}
openvino_config.json ADDED
@@ -0,0 +1,28 @@
{
  "compression": null,
  "dtype": "int4",
  "input_info": null,
  "optimum_version": "1.24.0",
  "quantization_config": {
    "all_layers": null,
    "backup_precision": null,
    "bits": 4,
    "dataset": "auto",
    "gptq": null,
    "group_size": -1,
    "ignored_scope": null,
    "lora_correction": null,
    "num_samples": null,
    "processor": null,
    "quant_method": "awq",
    "ratio": 1.0,
    "scale_estimation": null,
    "sensitivity_metric": null,
    "sym": true,
    "tokenizer": null,
    "trust_remote_code": false,
    "weight_format": "int4"
  },
  "save_onnx_model": false,
  "transformers_version": "4.48.3"
}
openvino_detokenizer.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8eabdf2a03d9e8e72bab728552871175f3eaf1a89b1b1c69f2145a424140361a
size 1449016
openvino_detokenizer.xml ADDED
@@ -0,0 +1,335 @@
1
+ <?xml version="1.0"?>
2
+ <net name="detokenizer" version="11">
3
+ <layers>
4
+ <layer id="0" name="Parameter_1236312" type="Parameter" version="opset1">
5
+ <data shape="?,?" element_type="i64" />
6
+ <output>
7
+ <port id="0" precision="I64" names="Parameter_1236312">
8
+ <dim>-1</dim>
9
+ <dim>-1</dim>
10
+ </port>
11
+ </output>
12
+ </layer>
13
+ <layer id="1" name="Convert_1236333" type="Convert" version="opset1">
14
+ <data destination_type="i32" />
15
+ <input>
16
+ <port id="0" precision="I64">
17
+ <dim>-1</dim>
18
+ <dim>-1</dim>
19
+ </port>
20
+ </input>
21
+ <output>
22
+ <port id="1" precision="I32">
23
+ <dim>-1</dim>
24
+ <dim>-1</dim>
25
+ </port>
26
+ </output>
27
+ </layer>
28
+ <layer id="2" name="Constant_1236287" type="Const" version="opset1">
29
+ <data element_type="u8" shape="1440855" offset="0" size="1440855" />
30
+ <output>
31
+ <port id="0" precision="U8">
32
+ <dim>1440855</dim>
33
+ </port>
34
+ </output>
35
+ </layer>
36
+ <layer id="3" name="StringTensorUnpack_1236288" type="StringTensorUnpack" version="extension">
37
+ <data mode="begins_ends" />
38
+ <input>
39
+ <port id="0" precision="U8">
40
+ <dim>1440855</dim>
41
+ </port>
42
+ </input>
43
+ <output>
44
+ <port id="1" precision="I32">
45
+ <dim>-1</dim>
46
+ </port>
47
+ <port id="2" precision="I32">
48
+ <dim>-1</dim>
49
+ </port>
50
+ <port id="3" precision="U8">
51
+ <dim>-1</dim>
52
+ </port>
53
+ </output>
54
+ </layer>
55
+ <layer id="4" name="Constant_1236316" type="Const" version="opset1">
56
+ <data element_type="i32" shape="2022" offset="1440855" size="8088" />
57
+ <output>
58
+ <port id="0" precision="I32">
59
+ <dim>2022</dim>
60
+ </port>
61
+ </output>
62
+ </layer>
63
+ <layer id="5" name="Constant_1236314" type="Const" version="opset1">
64
+ <data element_type="i32" shape="1" offset="1448943" size="4" />
65
+ <output>
66
+ <port id="0" precision="I32">
67
+ <dim>1</dim>
68
+ </port>
69
+ </output>
70
+ </layer>
71
+ <layer id="6" name="Constant_1236313" type="Const" version="opset1">
72
+ <data element_type="i32" shape="1" offset="1448947" size="4" />
73
+ <output>
74
+ <port id="0" precision="I32">
75
+ <dim>1</dim>
76
+ </port>
77
+ </output>
78
+ </layer>
79
+ <layer id="7" name="Constant_1236315" type="Const" version="opset1">
80
+ <data element_type="i32" shape="1" offset="1448951" size="4" />
81
+ <output>
82
+ <port id="0" precision="I32">
83
+ <dim>1</dim>
84
+ </port>
85
+ </output>
86
+ </layer>
87
+ <layer id="8" name="Constant_1236318" type="Const" version="opset1">
88
+ <data element_type="i64" shape="1" offset="1448955" size="8" />
89
+ <output>
90
+ <port id="0" precision="I64">
91
+ <dim>1</dim>
92
+ </port>
93
+ </output>
94
+ </layer>
95
+ <layer id="9" name="Slice_1236317" type="Slice" version="opset8">
96
+ <input>
97
+ <port id="0" precision="I32">
98
+ <dim>2022</dim>
99
+ </port>
100
+ <port id="1" precision="I32">
101
+ <dim>1</dim>
102
+ </port>
103
+ <port id="2" precision="I32">
104
+ <dim>1</dim>
105
+ </port>
106
+ <port id="3" precision="I32">
107
+ <dim>1</dim>
108
+ </port>
109
+ <port id="4" precision="I64">
110
+ <dim>1</dim>
111
+ </port>
112
+ </input>
113
+ <output>
114
+ <port id="5" precision="I32">
115
+ <dim>2022</dim>
116
+ </port>
117
+ </output>
118
+ </layer>
119
+ <layer id="10" name="VocabDecoder_1236319" type="VocabDecoder" version="extension">
120
+ <data skip_tokens="" />
121
+ <input>
122
+ <port id="0" precision="I32">
123
+ <dim>-1</dim>
124
+ <dim>-1</dim>
125
+ </port>
126
+ <port id="1" precision="I32">
127
+ <dim>-1</dim>
128
+ </port>
129
+ <port id="2" precision="I32">
130
+ <dim>-1</dim>
131
+ </port>
132
+ <port id="3" precision="U8">
133
+ <dim>-1</dim>
134
+ </port>
135
+ <port id="4" precision="I32">
136
+ <dim>2022</dim>
137
+ </port>
138
+ </input>
139
+ <output>
140
+ <port id="5" precision="I32">
141
+ <dim>-1</dim>
142
+ </port>
143
+ <port id="6" precision="I32">
144
+ <dim>-1</dim>
145
+ </port>
146
+ <port id="7" precision="I32">
147
+ <dim>-1</dim>
148
+ </port>
149
+ <port id="8" precision="I32">
150
+ <dim>-1</dim>
151
+ </port>
152
+ <port id="9" precision="U8">
153
+ <dim>-1</dim>
154
+ </port>
155
+ </output>
156
+ </layer>
157
+ <layer id="11" name="FuzeRagged_1236320" type="FuzeRagged" version="extension">
158
+ <input>
159
+ <port id="0" precision="I32">
160
+ <dim>-1</dim>
161
+ </port>
162
+ <port id="1" precision="I32">
163
+ <dim>-1</dim>
164
+ </port>
165
+ <port id="2" precision="I32">
166
+ <dim>-1</dim>
167
+ </port>
168
+ <port id="3" precision="I32">
169
+ <dim>-1</dim>
170
+ </port>
171
+ </input>
172
+ <output>
173
+ <port id="4" precision="I32">
174
+ <dim>-1</dim>
175
+ </port>
176
+ <port id="5" precision="I32">
177
+ <dim>-1</dim>
178
+ </port>
179
+ </output>
180
+ </layer>
181
+ <layer id="12" name="UTF8Validate_1236321" type="UTF8Validate" version="extension">
182
+ <data replace_mode="true" />
183
+ <input>
184
+ <port id="0" precision="I32">
185
+ <dim>-1</dim>
186
+ </port>
187
+ <port id="1" precision="I32">
188
+ <dim>-1</dim>
189
+ </port>
190
+ <port id="2" precision="U8">
191
+ <dim>-1</dim>
192
+ </port>
193
+ </input>
194
+ <output>
195
+ <port id="3" precision="I32">
196
+ <dim>-1</dim>
197
+ </port>
198
+ <port id="4" precision="I32">
199
+ <dim>-1</dim>
200
+ </port>
201
+ <port id="5" precision="U8">
202
+ <dim>-1</dim>
203
+ </port>
204
+ </output>
205
+ </layer>
206
+ <layer id="13" name="Constant_1236323" type="Const" version="opset1">
207
+ <data element_type="u8" shape="51" offset="1448963" size="51" />
208
+ <output>
209
+ <port id="0" precision="U8">
210
+ <dim>51</dim>
211
+ </port>
212
+ </output>
213
+ </layer>
214
+ <layer id="14" name="Constant_1236325" type="Const" version="opset1">
215
+ <data element_type="u8" shape="2" offset="1449014" size="2" />
216
+ <output>
217
+ <port id="0" precision="U8">
218
+ <dim>2</dim>
219
+ </port>
220
+ </output>
221
+ </layer>
222
+ <layer id="15" name="RegexNormalization_1236326" type="RegexNormalization" version="extension">
223
+ <data global_replace="true" />
224
+ <input>
225
+ <port id="0" precision="I32">
226
+ <dim>-1</dim>
227
+ </port>
228
+ <port id="1" precision="I32">
229
+ <dim>-1</dim>
230
+ </port>
231
+ <port id="2" precision="U8">
232
+ <dim>-1</dim>
233
+ </port>
234
+ <port id="3" precision="U8">
235
+ <dim>51</dim>
236
+ </port>
237
+ <port id="4" precision="U8">
238
+ <dim>2</dim>
239
+ </port>
240
+ </input>
241
+ <output>
242
+ <port id="5" precision="I32">
243
+ <dim>-1</dim>
244
+ </port>
245
+ <port id="6" precision="I32">
246
+ <dim>-1</dim>
247
+ </port>
248
+ <port id="7" precision="U8">
249
+ <dim>-1</dim>
250
+ </port>
251
+ </output>
252
+ </layer>
253
+ <layer id="16" name="StringTensorPack_1236327" type="StringTensorPack" version="extension">
254
+ <data mode="begins_ends" />
255
+ <input>
256
+ <port id="0" precision="I32">
257
+ <dim>-1</dim>
258
+ </port>
259
+ <port id="1" precision="I32">
260
+ <dim>-1</dim>
261
+ </port>
262
+ <port id="2" precision="U8">
263
+ <dim>-1</dim>
264
+ </port>
265
+ </input>
266
+ <output>
267
+ <port id="3" precision="STRING" names="string_output">
268
+ <dim>-1</dim>
269
+ </port>
270
+ </output>
271
+ </layer>
272
+ <layer id="17" name="Result_1236328" type="Result" version="opset1">
273
+ <input>
274
+ <port id="0" precision="STRING">
275
+ <dim>-1</dim>
276
+ </port>
277
+ </input>
278
+ </layer>
279
+ </layers>
280
+ <edges>
281
+ <edge from-layer="0" from-port="0" to-layer="1" to-port="0" />
282
+ <edge from-layer="1" from-port="1" to-layer="10" to-port="0" />
283
+ <edge from-layer="2" from-port="0" to-layer="3" to-port="0" />
284
+ <edge from-layer="3" from-port="1" to-layer="10" to-port="1" />
285
+ <edge from-layer="3" from-port="2" to-layer="10" to-port="2" />
286
+ <edge from-layer="3" from-port="3" to-layer="10" to-port="3" />
287
+ <edge from-layer="4" from-port="0" to-layer="9" to-port="0" />
288
+ <edge from-layer="5" from-port="0" to-layer="9" to-port="1" />
289
+ <edge from-layer="6" from-port="0" to-layer="9" to-port="2" />
290
+ <edge from-layer="7" from-port="0" to-layer="9" to-port="3" />
291
+ <edge from-layer="8" from-port="0" to-layer="9" to-port="4" />
292
+ <edge from-layer="9" from-port="5" to-layer="10" to-port="4" />
293
+ <edge from-layer="10" from-port="7" to-layer="11" to-port="2" />
294
+ <edge from-layer="10" from-port="9" to-layer="12" to-port="2" />
295
+ <edge from-layer="10" from-port="8" to-layer="11" to-port="3" />
296
+ <edge from-layer="10" from-port="6" to-layer="11" to-port="1" />
297
+ <edge from-layer="10" from-port="5" to-layer="11" to-port="0" />
298
+ <edge from-layer="11" from-port="4" to-layer="12" to-port="0" />
299
+ <edge from-layer="11" from-port="5" to-layer="12" to-port="1" />
300
+ <edge from-layer="12" from-port="3" to-layer="15" to-port="0" />
301
+ <edge from-layer="12" from-port="4" to-layer="15" to-port="1" />
302
+ <edge from-layer="12" from-port="5" to-layer="15" to-port="2" />
303
+ <edge from-layer="13" from-port="0" to-layer="15" to-port="3" />
304
+ <edge from-layer="14" from-port="0" to-layer="15" to-port="4" />
305
+ <edge from-layer="15" from-port="5" to-layer="16" to-port="0" />
306
+ <edge from-layer="15" from-port="6" to-layer="16" to-port="1" />
307
+ <edge from-layer="15" from-port="7" to-layer="16" to-port="2" />
308
+ <edge from-layer="16" from-port="3" to-layer="17" to-port="0" />
309
+ </edges>
310
+ <rt_info>
311
+ <add_attention_mask value="True" />
312
+ <add_prefix_space />
313
+ <add_special_tokens value="True" />
314
+ <chat_template value="{%- if tools %}&#10;{{- '&lt;|system|>\n' }}&#10;{%- if messages[0]['role'] == 'system' %}&#10;{{- messages[0]['content'] }}&#10;{%- set remaining_messages = messages[1:] %}&#10;{%- else %}&#10;{%- set remaining_messages = messages %}&#10;{%- endif %}&#10;{{- 'You are a Falcon assistant skilled in function calling. You are helpful, respectful, and concise.\n\n# Tools\n\nYou have access to the following functions. You MUST use them to answer questions when needed. For each function call, you MUST return a JSON object inside &lt;tool_call>&lt;/tool_call> tags.\n\n&lt;tools>' + tools|tojson(indent=2) + '&lt;/tools>\n\n# Output Format\n\nYour response MUST follow this format when making function calls:\n&lt;tool_call>\n[\n {&quot;name&quot;: &quot;function_name&quot;, &quot;arguments&quot;: {&quot;arg1&quot;: &quot;value1&quot;, &quot;arg2&quot;: &quot;value2&quot;}},\n {&quot;name&quot;: &quot;another_function&quot;, &quot;arguments&quot;: {&quot;arg&quot;: &quot;value&quot;}}\n]\n&lt;/tool_call>\nIf no function calls are needed, respond normally without the tool_call tags.\n' }}&#10;{%- for message in remaining_messages %}&#10;{%- if message['role'] == 'user' %}&#10;{{- '&lt;|user|>\n' + message['content'] + '\n' }}&#10;{%- elif message['role'] == 'assistant' %}&#10;{%- if message.content %}&#10;{{- '&lt;|assistant|>\n' + message['content'] }}&#10;{%- endif %}&#10;{%- if message.tool_calls %}&#10;{{- '\n&lt;tool_call>\n' }}&#10;{{- message.tool_calls|tojson(indent=2) }}&#10;{{- '\n&lt;/tool_call>' }}&#10;{%- endif %}&#10;{{- eos_token + '\n' }}&#10;{%- elif message['role'] == 'tool' %}&#10;{{- '&lt;|assistant|>\n&lt;tool_response>\n' + message['content'] + '\n&lt;/tool_response>\n' }}&#10;{%- endif %}&#10;{%- endfor %}&#10;{{- '&lt;|assistant|>\n' if add_generation_prompt }}&#10;{%- else %}&#10;{%- for message in messages %}&#10;{%- if message['role'] == 'system' %}&#10;{{- '&lt;|system|>\n' + message['content'] + '\n' }}&#10;{%- elif 
message['role'] == 'user' %}&#10;{{- '&lt;|user|>\n' + message['content'] + '\n' }}&#10;{%- elif message['role'] == 'assistant' %}&#10;{%- if not loop.last %}&#10;{{- '&lt;|assistant|>\n' + message['content'] + eos_token + '\n' }}&#10;{%- else %}&#10;{{- '&lt;|assistant|>\n' + message['content'] + eos_token }}&#10;{%- endif %}&#10;{%- endif %}&#10;{%- if loop.last and add_generation_prompt %}&#10;{{- '&lt;|assistant|>\n' }}&#10;{%- endif %}&#10;{%- endfor %}&#10;{%- endif %}" />
315
+ <clean_up_tokenization_spaces />
316
+ <detokenizer_input_type value="i64" />
317
+ <eos_token_id value="11" />
318
+ <handle_special_tokens_with_re />
319
+ <number_of_inputs value="1" />
320
+ <openvino_tokenizers_version value="2025.0.0.0" />
321
+ <openvino_version value="2025.0.0" />
322
+ <original_tokenizer_class value="&lt;class 'transformers.tokenization_utils_fast.PreTrainedTokenizerFast'>" />
323
+ <pad_token_id value="2023" />
324
+ <sentencepiece_version value="0.2.0" />
325
+ <skip_special_tokens value="True" />
326
+ <streaming_detokenizer value="False" />
327
+ <tokenizer_output_type value="i64" />
328
+ <tokenizers_version value="0.21.0" />
329
+ <transformers_version value="4.48.3" />
330
+ <use_max_padding value="False" />
331
+ <use_sentencepiece_backend value="False" />
332
+ <utf8_replace_mode value="replace" />
333
+ <with_detokenizer value="True" />
334
+ </rt_info>
335
+ </net>
openvino_model.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4fb06861bb9e3b337275a80e2978d9c5291ba11649b67573810f1e7f4d8d7892
size 4135039764
openvino_model.xml ADDED
The diff for this file is too large to render. See raw diff
 
openvino_tokenizer.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a07162a70d6f7c661f3641206c430c4c5a06ba34ab73bf59fb7ddd45d770f78e
size 3446806
openvino_tokenizer.xml ADDED
@@ -0,0 +1,822 @@
1
+ <?xml version="1.0"?>
2
+ <net name="tokenizer" version="11">
3
+ <layers>
4
+ <layer id="0" name="Parameter_1236200" type="Parameter" version="opset1">
5
+ <data shape="?" element_type="string" />
6
+ <output>
7
+ <port id="0" precision="STRING" names="Parameter_1236200">
8
+ <dim>-1</dim>
9
+ </port>
10
+ </output>
11
+ </layer>
12
+ <layer id="1" name="Constant_1236206" type="Const" version="opset1">
13
+ <data element_type="i64" shape="" offset="0" size="8" />
14
+ <output>
15
+ <port id="0" precision="I64" />
16
+ </output>
17
+ </layer>
18
+ <layer id="2" name="StringTensorUnpack_1236201" type="StringTensorUnpack" version="extension">
19
+ <data mode="begins_ends" />
20
+ <input>
21
+ <port id="0" precision="STRING">
22
+ <dim>-1</dim>
23
+ </port>
24
+ </input>
25
+ <output>
26
+ <port id="1" precision="I32">
27
+ <dim>-1</dim>
28
+ </port>
29
+ <port id="2" precision="I32">
30
+ <dim>-1</dim>
31
+ </port>
32
+ <port id="3" precision="U8">
33
+ <dim>-1</dim>
34
+ </port>
35
+ </output>
36
+ </layer>
37
+ <layer id="3" name="ShapeOf_1236202" type="ShapeOf" version="opset3">
38
+ <data output_type="i64" />
39
+ <input>
40
+ <port id="0" precision="I32">
41
+ <dim>-1</dim>
42
+ </port>
43
+ </input>
44
+ <output>
45
+ <port id="1" precision="I64">
46
+ <dim>1</dim>
47
+ </port>
48
+ </output>
49
+ </layer>
50
+ <layer id="4" name="Constant_1236203" type="Const" version="opset1">
51
+ <data element_type="i64" shape="" offset="0" size="8" />
52
+ <output>
53
+ <port id="0" precision="I64" />
54
+ </output>
55
+ </layer>
56
+ <layer id="5" name="Constant_1236204" type="Const" version="opset1">
57
+ <data element_type="i64" shape="" offset="0" size="8" />
58
+ <output>
59
+ <port id="0" precision="I64" />
60
+ </output>
61
+ </layer>
62
+ <layer id="6" name="Gather_1236205" type="Gather" version="opset8">
63
+ <data batch_dims="0" />
64
+ <input>
65
+ <port id="0" precision="I64">
66
+ <dim>1</dim>
67
+ </port>
68
+ <port id="1" precision="I64" />
69
+ <port id="2" precision="I64" />
70
+ </input>
71
+ <output>
72
+ <port id="3" precision="I64" />
73
+ </output>
74
+ </layer>
75
+ <layer id="7" name="Constant_1236207" type="Const" version="opset1">
76
+ <data element_type="i64" shape="" offset="8" size="8" />
77
+ <output>
78
+ <port id="0" precision="I64" />
79
+ </output>
80
+ </layer>
81
+ <layer id="8" name="Range_1236208" type="Range" version="opset4">
82
+ <data output_type="i32" />
83
+ <input>
84
+ <port id="0" precision="I64" />
85
+ <port id="1" precision="I64" />
86
+ <port id="2" precision="I64" />
87
+ </input>
88
+ <output>
89
+ <port id="3" precision="I32">
90
+ <dim>-1</dim>
91
+ </port>
92
+ </output>
93
+ </layer>
94
+ <layer id="9" name="Constant_1236209" type="Const" version="opset1">
95
+ <data element_type="i64" shape="" offset="8" size="8" />
96
+ <output>
97
+ <port id="0" precision="I64" />
98
+ </output>
99
+ </layer>
100
+ <layer id="10" name="Constant_1236210" type="Const" version="opset1">
101
+ <data element_type="i64" shape="" offset="8" size="8" />
102
+ <output>
103
+ <port id="0" precision="I64" />
104
+ </output>
105
+ </layer>
106
+ <layer id="11" name="Add_1236211" type="Add" version="opset1">
107
+ <data auto_broadcast="numpy" />
108
+ <input>
109
+ <port id="0" precision="I64" />
110
+ <port id="1" precision="I64" />
111
+ </input>
112
+ <output>
113
+ <port id="2" precision="I64" />
114
+ </output>
115
+ </layer>
116
+ <layer id="12" name="Constant_1236212" type="Const" version="opset1">
117
+ <data element_type="i64" shape="" offset="8" size="8" />
118
+ <output>
119
+ <port id="0" precision="I64" />
120
+ </output>
121
+ </layer>
122
+ <layer id="13" name="Range_1236213" type="Range" version="opset4">
123
+ <data output_type="i32" />
124
+ <input>
125
+ <port id="0" precision="I64" />
126
+ <port id="1" precision="I64" />
127
+ <port id="2" precision="I64" />
128
+ </input>
129
+ <output>
130
+ <port id="3" precision="I32">
131
+ <dim>-1</dim>
132
+ </port>
133
+ </output>
134
+ </layer>
135
+ <layer id="14" name="Constant_1236275" type="Const" version="opset1">
136
+ <data element_type="u8" shape="42834" offset="16" size="42834" />
137
+ <output>
138
+ <port id="0" precision="U8">
139
+ <dim>42834</dim>
140
+ </port>
141
+ </output>
142
+ </layer>
143
+ <layer id="15" name="SpecialTokensSplit_1236276" type="SpecialTokensSplit" version="extension">
144
+ <input>
145
+ <port id="0" precision="I32">
146
+ <dim>-1</dim>
147
+ </port>
148
+ <port id="1" precision="I32">
149
+ <dim>-1</dim>
150
+ </port>
151
+ <port id="2" precision="I32">
152
+ <dim>-1</dim>
153
+ </port>
154
+ <port id="3" precision="I32">
155
+ <dim>-1</dim>
156
+ </port>
157
+ <port id="4" precision="U8">
158
+ <dim>-1</dim>
159
+ </port>
160
+ <port id="5" precision="U8">
161
+ <dim>42834</dim>
162
+ </port>
163
+ </input>
164
+ <output>
165
+ <port id="6" precision="I32">
166
+ <dim>-1</dim>
167
+ </port>
168
+ <port id="7" precision="I32">
169
+ <dim>-1</dim>
170
+ </port>
171
+ <port id="8" precision="I32">
172
+ <dim>-1</dim>
173
+ </port>
174
+ <port id="9" precision="I32">
175
+ <dim>-1</dim>
176
+ </port>
177
+ <port id="10" precision="U8">
178
+ <dim>-1</dim>
179
+ </port>
180
+ <port id="11" precision="BOOL">
181
+ <dim>-1</dim>
182
+ </port>
183
+ </output>
184
+ </layer>
185
+ <layer id="16" name="Constant_1236278" type="Const" version="opset1">
186
+ <data element_type="u8" shape="5" offset="42850" size="5" />
187
+ <output>
188
+ <port id="0" precision="U8">
189
+ <dim>5</dim>
190
+ </port>
191
+ </output>
192
+ </layer>
193
+ <layer id="17" name="RegexSplit_1236279" type="RegexSplit" version="extension">
194
+ <data behaviour="contiguous" invert="false" max_splits="-1" />
195
+ <input>
196
+ <port id="0" precision="I32">
197
+ <dim>-1</dim>
198
+ </port>
199
+ <port id="1" precision="I32">
200
+ <dim>-1</dim>
201
+ </port>
202
+ <port id="2" precision="I32">
203
+ <dim>-1</dim>
204
+ </port>
205
+ <port id="3" precision="I32">
206
+ <dim>-1</dim>
207
+ </port>
208
+ <port id="4" precision="U8">
209
+ <dim>-1</dim>
210
+ </port>
211
+ <port id="5" precision="BOOL">
212
+ <dim>-1</dim>
213
+ </port>
214
+ <port id="6" precision="U8">
215
+ <dim>5</dim>
216
+ </port>
217
+ </input>
218
+ <output>
219
+ <port id="7" precision="I32">
220
+ <dim>-1</dim>
221
+ </port>
222
+ <port id="8" precision="I32">
223
+ <dim>-1</dim>
224
+ </port>
225
+ <port id="9" precision="I32">
226
+ <dim>-1</dim>
227
+ </port>
228
+ <port id="10" precision="I32">
229
+ <dim>-1</dim>
230
+ </port>
231
+ <port id="11" precision="U8">
232
+ <dim>-1</dim>
233
+ </port>
234
+ <port id="12" precision="BOOL">
235
+ <dim>-1</dim>
236
+ </port>
237
+ </output>
238
+ </layer>
239
+ <layer id="18" name="Constant_1236281" type="Const" version="opset1">
240
+ <data element_type="u8" shape="64" offset="42855" size="64" />
241
+ <output>
242
+ <port id="0" precision="U8">
243
+ <dim>64</dim>
244
+ </port>
245
+ </output>
246
+ </layer>
247
+ <layer id="19" name="RegexSplit_1236282" type="RegexSplit" version="extension">
248
+ <data behaviour="isolate" invert="false" max_splits="-1" />
249
+ <input>
250
+ <port id="0" precision="I32">
251
+ <dim>-1</dim>
252
+ </port>
253
+ <port id="1" precision="I32">
254
+ <dim>-1</dim>
255
+ </port>
256
+ <port id="2" precision="I32">
257
+ <dim>-1</dim>
258
+ </port>
259
+ <port id="3" precision="I32">
260
+ <dim>-1</dim>
261
+ </port>
262
+ <port id="4" precision="U8">
263
+ <dim>-1</dim>
264
+ </port>
265
+ <port id="5" precision="BOOL">
266
+ <dim>-1</dim>
267
+ </port>
268
+ <port id="6" precision="U8">
269
+ <dim>64</dim>
270
+ </port>
271
+ </input>
272
+ <output>
273
+ <port id="7" precision="I32">
274
+ <dim>-1</dim>
275
+ </port>
276
+ <port id="8" precision="I32">
277
+ <dim>-1</dim>
278
+ </port>
279
+ <port id="9" precision="I32">
280
+ <dim>-1</dim>
281
+ </port>
282
+ <port id="10" precision="I32">
283
+ <dim>-1</dim>
284
+ </port>
285
+ <port id="11" precision="U8">
286
+ <dim>-1</dim>
287
+ </port>
288
+ <port id="12" precision="BOOL">
289
+ <dim>-1</dim>
290
+ </port>
291
+ </output>
292
+ </layer>
293
+ <layer id="20" name="Constant_1236284" type="Const" version="opset1">
294
+ <data element_type="u8" shape="20" offset="42919" size="20" />
295
+ <output>
296
+ <port id="0" precision="U8">
297
+ <dim>20</dim>
298
+ </port>
299
+ </output>
300
+ </layer>
301
+ <layer id="21" name="RegexSplit_1236285" type="RegexSplit" version="extension">
302
+ <data behaviour="isolate" invert="false" max_splits="-1" />
303
+ <input>
304
+ <port id="0" precision="I32">
305
+ <dim>-1</dim>
306
+ </port>
307
+ <port id="1" precision="I32">
308
+ <dim>-1</dim>
309
+ </port>
310
+ <port id="2" precision="I32">
311
+ <dim>-1</dim>
312
+ </port>
313
+ <port id="3" precision="I32">
314
+ <dim>-1</dim>
315
+ </port>
316
+ <port id="4" precision="U8">
317
+ <dim>-1</dim>
318
+ </port>
319
+ <port id="5" precision="BOOL">
320
+ <dim>-1</dim>
321
+ </port>
+ <port id="6" precision="U8">
+ <dim>20</dim>
+ </port>
+ </input>
+ <output>
+ <port id="7" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="8" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="9" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="10" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="11" precision="U8">
+ <dim>-1</dim>
+ </port>
+ <port id="12" precision="BOOL">
+ <dim>-1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="22" name="Constant_1236287" type="Const" version="opset1">
+ <data element_type="u8" shape="1440855" offset="42939" size="1440855" />
+ <output>
+ <port id="0" precision="U8">
+ <dim>1440855</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="23" name="StringTensorUnpack_1236288" type="StringTensorUnpack" version="extension">
+ <data mode="begins_ends" />
+ <input>
+ <port id="0" precision="U8">
+ <dim>1440855</dim>
+ </port>
+ </input>
+ <output>
+ <port id="1" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="2" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="3" precision="U8">
+ <dim>-1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="24" name="Constant_1236293" type="Const" version="opset1">
+ <data element_type="u8" shape="974381" offset="1483794" size="974381" />
+ <output>
+ <port id="0" precision="U8">
+ <dim>974381</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="25" name="StringTensorUnpack_1236294" type="StringTensorUnpack" version="extension">
+ <data mode="begins_ends" />
+ <input>
+ <port id="0" precision="U8">
+ <dim>974381</dim>
+ </port>
+ </input>
+ <output>
+ <port id="1" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="2" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="3" precision="U8">
+ <dim>-1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="26" name="Constant_1236296" type="Const" version="opset1">
+ <data element_type="u8" shape="943761" offset="2458175" size="943761" />
+ <output>
+ <port id="0" precision="U8">
+ <dim>943761</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="27" name="StringTensorUnpack_1236297" type="StringTensorUnpack" version="extension">
+ <data mode="begins_ends" />
+ <input>
+ <port id="0" precision="U8">
+ <dim>943761</dim>
+ </port>
+ </input>
+ <output>
+ <port id="1" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="2" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="3" precision="U8">
+ <dim>-1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="28" name="Constant_1236290" type="Const" version="opset1">
+ <data element_type="u8" shape="36766" offset="3401936" size="36766" />
+ <output>
+ <port id="0" precision="U8">
+ <dim>36766</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="29" name="StringTensorUnpack_1236291" type="StringTensorUnpack" version="extension">
+ <data mode="begins_ends" />
+ <input>
+ <port id="0" precision="U8">
+ <dim>36766</dim>
+ </port>
+ </input>
+ <output>
+ <port id="1" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="2" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="3" precision="U8">
+ <dim>-1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="30" name="Constant_1236298" type="Const" version="opset1">
+ <data element_type="i32" shape="2023" offset="3438702" size="8092" />
+ <output>
+ <port id="0" precision="I32">
+ <dim>2023</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="31" name="BPETokenizer_1236299" type="BPETokenizer" version="extension">
+ <data unk_token="" fuse_unk="false" suffix_indicator="" end_suffix="" byte_fallback="false" cache_capacity="26214" />
+ <input>
+ <port id="0" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="1" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="2" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="3" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="4" precision="U8">
+ <dim>-1</dim>
+ </port>
+ <port id="5" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="6" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="7" precision="U8">
+ <dim>-1</dim>
+ </port>
+ <port id="8" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="9" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="10" precision="U8">
+ <dim>-1</dim>
+ </port>
+ <port id="11" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="12" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="13" precision="U8">
+ <dim>-1</dim>
+ </port>
+ <port id="14" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="15" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="16" precision="U8">
+ <dim>-1</dim>
+ </port>
+ <port id="17" precision="I32">
+ <dim>2023</dim>
+ </port>
+ </input>
+ <output>
+ <port id="18" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="19" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="20" precision="I32">
+ <dim>-1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="32" name="Subtract_1236300" type="Subtract" version="opset1">
+ <data auto_broadcast="numpy" />
+ <input>
+ <port id="0" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="1" precision="I32">
+ <dim>-1</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="I32">
+ <dim>-1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="33" name="Constant_1236301" type="Const" version="opset1">
+ <data element_type="i32" shape="" offset="3446794" size="4" />
+ <output>
+ <port id="0" precision="I32" />
+ </output>
+ </layer>
+ <layer id="34" name="Minimum_1236302" type="Minimum" version="opset1">
+ <data auto_broadcast="numpy" />
+ <input>
+ <port id="0" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="1" precision="I32" />
+ </input>
+ <output>
+ <port id="2" precision="I32">
+ <dim>-1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="35" name="Subtract_1236303" type="Subtract" version="opset1">
+ <data auto_broadcast="numpy" />
+ <input>
+ <port id="0" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="1" precision="I32">
+ <dim>-1</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="I32">
+ <dim>-1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="36" name="Subtract_1236304" type="Subtract" version="opset1">
+ <data auto_broadcast="numpy" />
+ <input>
+ <port id="0" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="1" precision="I32">
+ <dim>-1</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="I32">
+ <dim>-1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="37" name="Constant_1236305" type="Const" version="opset1">
+ <data element_type="i32" shape="" offset="3446798" size="4" />
+ <output>
+ <port id="0" precision="I32" />
+ </output>
+ </layer>
+ <layer id="38" name="ReduceMax_1236306" type="ReduceMax" version="opset1">
+ <data keep_dims="false" />
+ <input>
+ <port id="0" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="1" precision="I32" />
+ </input>
+ <output>
+ <port id="2" precision="I32" />
+ </output>
+ </layer>
+ <layer id="39" name="Constant_1236307" type="Const" version="opset1">
+ <data element_type="i32" shape="" offset="3446802" size="4" />
+ <output>
+ <port id="0" precision="I32" />
+ </output>
+ </layer>
+ <layer id="40" name="RaggedToDense_1236308" type="RaggedToDense" version="extension">
+ <data pad_right="false" />
+ <input>
+ <port id="0" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="1" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="2" precision="I32">
+ <dim>-1</dim>
+ </port>
+ <port id="3" precision="I32" />
+ <port id="4" precision="I32" />
+ </input>
+ <output>
+ <port id="5" precision="I32">
+ <dim>-1</dim>
+ <dim>-1</dim>
+ </port>
+ <port id="6" precision="BOOL">
+ <dim>-1</dim>
+ <dim>-1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="41" name="Convert_1236309" type="Convert" version="opset1">
+ <data destination_type="i32" />
+ <input>
+ <port id="0" precision="BOOL">
+ <dim>-1</dim>
+ <dim>-1</dim>
+ </port>
+ </input>
+ <output>
+ <port id="1" precision="I32">
+ <dim>-1</dim>
+ <dim>-1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="42" name="Convert_1236309.0" type="Convert" version="opset1">
+ <data destination_type="i64" />
+ <input>
+ <port id="0" precision="I32">
+ <dim>-1</dim>
+ <dim>-1</dim>
+ </port>
+ </input>
+ <output>
+ <port id="1" precision="I64" names="attention_mask">
+ <dim>-1</dim>
+ <dim>-1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="44" name="RaggedToDense_1236308.0" type="Convert" version="opset1">
+ <data destination_type="i64" />
+ <input>
+ <port id="0" precision="I32">
+ <dim>-1</dim>
+ <dim>-1</dim>
+ </port>
+ </input>
+ <output>
+ <port id="1" precision="I64" names="input_ids">
+ <dim>-1</dim>
+ <dim>-1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="45" name="Result_1236310" type="Result" version="opset1">
+ <input>
+ <port id="0" precision="I64">
+ <dim>-1</dim>
+ <dim>-1</dim>
+ </port>
+ </input>
+ </layer>
+ <layer id="43" name="Result_1236311" type="Result" version="opset1">
+ <input>
+ <port id="0" precision="I64">
+ <dim>-1</dim>
+ <dim>-1</dim>
+ </port>
+ </input>
+ </layer>
+ </layers>
+ <edges>
+ <edge from-layer="0" from-port="0" to-layer="2" to-port="0" />
+ <edge from-layer="1" from-port="0" to-layer="8" to-port="0" />
+ <edge from-layer="2" from-port="1" to-layer="3" to-port="0" />
+ <edge from-layer="2" from-port="3" to-layer="15" to-port="4" />
+ <edge from-layer="2" from-port="2" to-layer="15" to-port="3" />
+ <edge from-layer="2" from-port="1" to-layer="15" to-port="2" />
+ <edge from-layer="3" from-port="1" to-layer="6" to-port="0" />
+ <edge from-layer="4" from-port="0" to-layer="6" to-port="1" />
+ <edge from-layer="5" from-port="0" to-layer="6" to-port="2" />
+ <edge from-layer="6" from-port="3" to-layer="8" to-port="1" />
+ <edge from-layer="6" from-port="3" to-layer="11" to-port="0" />
+ <edge from-layer="7" from-port="0" to-layer="8" to-port="2" />
+ <edge from-layer="8" from-port="3" to-layer="15" to-port="0" />
+ <edge from-layer="9" from-port="0" to-layer="13" to-port="0" />
+ <edge from-layer="10" from-port="0" to-layer="11" to-port="1" />
+ <edge from-layer="11" from-port="2" to-layer="13" to-port="1" />
+ <edge from-layer="12" from-port="0" to-layer="13" to-port="2" />
+ <edge from-layer="13" from-port="3" to-layer="15" to-port="1" />
+ <edge from-layer="14" from-port="0" to-layer="15" to-port="5" />
+ <edge from-layer="15" from-port="6" to-layer="17" to-port="0" />
+ <edge from-layer="15" from-port="7" to-layer="17" to-port="1" />
+ <edge from-layer="15" from-port="8" to-layer="17" to-port="2" />
+ <edge from-layer="15" from-port="9" to-layer="17" to-port="3" />
+ <edge from-layer="15" from-port="10" to-layer="17" to-port="4" />
+ <edge from-layer="15" from-port="11" to-layer="17" to-port="5" />
+ <edge from-layer="16" from-port="0" to-layer="17" to-port="6" />
+ <edge from-layer="17" from-port="8" to-layer="19" to-port="1" />
+ <edge from-layer="17" from-port="9" to-layer="19" to-port="2" />
+ <edge from-layer="17" from-port="10" to-layer="19" to-port="3" />
+ <edge from-layer="17" from-port="11" to-layer="19" to-port="4" />
+ <edge from-layer="17" from-port="12" to-layer="19" to-port="5" />
+ <edge from-layer="17" from-port="7" to-layer="19" to-port="0" />
+ <edge from-layer="18" from-port="0" to-layer="19" to-port="6" />
+ <edge from-layer="19" from-port="12" to-layer="21" to-port="5" />
+ <edge from-layer="19" from-port="11" to-layer="21" to-port="4" />
+ <edge from-layer="19" from-port="10" to-layer="21" to-port="3" />
+ <edge from-layer="19" from-port="9" to-layer="21" to-port="2" />
+ <edge from-layer="19" from-port="8" to-layer="21" to-port="1" />
+ <edge from-layer="19" from-port="7" to-layer="21" to-port="0" />
+ <edge from-layer="20" from-port="0" to-layer="21" to-port="6" />
+ <edge from-layer="21" from-port="7" to-layer="31" to-port="0" />
+ <edge from-layer="21" from-port="8" to-layer="31" to-port="1" />
+ <edge from-layer="21" from-port="9" to-layer="31" to-port="2" />
+ <edge from-layer="21" from-port="10" to-layer="31" to-port="3" />
+ <edge from-layer="21" from-port="11" to-layer="31" to-port="4" />
+ <edge from-layer="22" from-port="0" to-layer="23" to-port="0" />
+ <edge from-layer="23" from-port="3" to-layer="31" to-port="7" />
+ <edge from-layer="23" from-port="2" to-layer="31" to-port="6" />
+ <edge from-layer="23" from-port="1" to-layer="31" to-port="5" />
+ <edge from-layer="24" from-port="0" to-layer="25" to-port="0" />
+ <edge from-layer="25" from-port="1" to-layer="31" to-port="8" />
+ <edge from-layer="25" from-port="2" to-layer="31" to-port="9" />
+ <edge from-layer="25" from-port="3" to-layer="31" to-port="10" />
+ <edge from-layer="26" from-port="0" to-layer="27" to-port="0" />
+ <edge from-layer="27" from-port="1" to-layer="31" to-port="11" />
+ <edge from-layer="27" from-port="2" to-layer="31" to-port="12" />
+ <edge from-layer="27" from-port="3" to-layer="31" to-port="13" />
+ <edge from-layer="28" from-port="0" to-layer="29" to-port="0" />
+ <edge from-layer="29" from-port="1" to-layer="31" to-port="14" />
+ <edge from-layer="29" from-port="2" to-layer="31" to-port="15" />
+ <edge from-layer="29" from-port="3" to-layer="31" to-port="16" />
+ <edge from-layer="30" from-port="0" to-layer="31" to-port="17" />
+ <edge from-layer="31" from-port="19" to-layer="35" to-port="0" />
+ <edge from-layer="31" from-port="20" to-layer="40" to-port="2" />
+ <edge from-layer="31" from-port="19" to-layer="40" to-port="1" />
+ <edge from-layer="31" from-port="19" to-layer="36" to-port="0" />
+ <edge from-layer="31" from-port="18" to-layer="32" to-port="1" />
+ <edge from-layer="31" from-port="19" to-layer="32" to-port="0" />
+ <edge from-layer="32" from-port="2" to-layer="34" to-port="0" />
+ <edge from-layer="33" from-port="0" to-layer="34" to-port="1" />
+ <edge from-layer="34" from-port="2" to-layer="35" to-port="1" />
+ <edge from-layer="35" from-port="2" to-layer="36" to-port="1" />
+ <edge from-layer="35" from-port="2" to-layer="40" to-port="0" />
+ <edge from-layer="36" from-port="2" to-layer="38" to-port="0" />
+ <edge from-layer="37" from-port="0" to-layer="38" to-port="1" />
+ <edge from-layer="38" from-port="2" to-layer="40" to-port="3" />
+ <edge from-layer="39" from-port="0" to-layer="40" to-port="4" />
+ <edge from-layer="40" from-port="6" to-layer="41" to-port="0" />
+ <edge from-layer="40" from-port="5" to-layer="44" to-port="0" />
+ <edge from-layer="41" from-port="1" to-layer="42" to-port="0" />
+ <edge from-layer="42" from-port="1" to-layer="43" to-port="0" />
+ <edge from-layer="44" from-port="1" to-layer="45" to-port="0" />
+ </edges>
+ <rt_info>
+ <add_attention_mask value="True" />
+ <add_prefix_space />
+ <add_special_tokens value="True" />
+ <chat_template value="{%- if tools %}&#10;{{- '&lt;|system|>\n' }}&#10;{%- if messages[0]['role'] == 'system' %}&#10;{{- messages[0]['content'] }}&#10;{%- set remaining_messages = messages[1:] %}&#10;{%- else %}&#10;{%- set remaining_messages = messages %}&#10;{%- endif %}&#10;{{- 'You are a Falcon assistant skilled in function calling. You are helpful, respectful, and concise.\n\n# Tools\n\nYou have access to the following functions. You MUST use them to answer questions when needed. For each function call, you MUST return a JSON object inside &lt;tool_call>&lt;/tool_call> tags.\n\n&lt;tools>' + tools|tojson(indent=2) + '&lt;/tools>\n\n# Output Format\n\nYour response MUST follow this format when making function calls:\n&lt;tool_call>\n[\n {&quot;name&quot;: &quot;function_name&quot;, &quot;arguments&quot;: {&quot;arg1&quot;: &quot;value1&quot;, &quot;arg2&quot;: &quot;value2&quot;}},\n {&quot;name&quot;: &quot;another_function&quot;, &quot;arguments&quot;: {&quot;arg&quot;: &quot;value&quot;}}\n]\n&lt;/tool_call>\nIf no function calls are needed, respond normally without the tool_call tags.\n' }}&#10;{%- for message in remaining_messages %}&#10;{%- if message['role'] == 'user' %}&#10;{{- '&lt;|user|>\n' + message['content'] + '\n' }}&#10;{%- elif message['role'] == 'assistant' %}&#10;{%- if message.content %}&#10;{{- '&lt;|assistant|>\n' + message['content'] }}&#10;{%- endif %}&#10;{%- if message.tool_calls %}&#10;{{- '\n&lt;tool_call>\n' }}&#10;{{- message.tool_calls|tojson(indent=2) }}&#10;{{- '\n&lt;/tool_call>' }}&#10;{%- endif %}&#10;{{- eos_token + '\n' }}&#10;{%- elif message['role'] == 'tool' %}&#10;{{- '&lt;|assistant|>\n&lt;tool_response>\n' + message['content'] + '\n&lt;/tool_response>\n' }}&#10;{%- endif %}&#10;{%- endfor %}&#10;{{- '&lt;|assistant|>\n' if add_generation_prompt }}&#10;{%- else %}&#10;{%- for message in messages %}&#10;{%- if message['role'] == 'system' %}&#10;{{- '&lt;|system|>\n' + message['content'] + '\n' }}&#10;{%- elif message['role'] == 'user' %}&#10;{{- '&lt;|user|>\n' + message['content'] + '\n' }}&#10;{%- elif message['role'] == 'assistant' %}&#10;{%- if not loop.last %}&#10;{{- '&lt;|assistant|>\n' + message['content'] + eos_token + '\n' }}&#10;{%- else %}&#10;{{- '&lt;|assistant|>\n' + message['content'] + eos_token }}&#10;{%- endif %}&#10;{%- endif %}&#10;{%- if loop.last and add_generation_prompt %}&#10;{{- '&lt;|assistant|>\n' }}&#10;{%- endif %}&#10;{%- endfor %}&#10;{%- endif %}" />
+ <clean_up_tokenization_spaces />
+ <detokenizer_input_type value="i64" />
+ <eos_token_id value="11" />
+ <handle_special_tokens_with_re />
+ <number_of_inputs value="1" />
+ <openvino_tokenizers_version value="2025.0.0.0" />
+ <openvino_version value="2025.0.0" />
+ <original_tokenizer_class value="&lt;class 'transformers.tokenization_utils_fast.PreTrainedTokenizerFast'>" />
+ <pad_token_id value="2023" />
+ <sentencepiece_version value="0.2.0" />
+ <skip_special_tokens value="True" />
+ <streaming_detokenizer value="False" />
+ <tokenizer_output_type value="i64" />
+ <tokenizers_version value="0.21.0" />
+ <transformers_version value="4.48.3" />
+ <use_max_padding value="False" />
+ <use_sentencepiece_backend value="False" />
+ <utf8_replace_mode value="replace" />
+ <with_detokenizer value="True" />
+ </rt_info>
+ </net>
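The `chat_template` stored in the tokenizer's `rt_info` above defines the Falcon3 prompt layout. As an illustration only, its non-tool branch can be approximated in plain Python; the function name `falcon3_prompt` is ours, and real applications should rely on `tokenizer.apply_chat_template` rather than this sketch:

```python
def falcon3_prompt(messages, add_generation_prompt=True, eos_token="<|endoftext|>"):
    """Approximate the non-tool branch of the embedded Jinja chat template.

    Each message dict has "role" and "content" keys, mirroring the
    transformers chat format. Illustrative sketch, not the canonical renderer.
    """
    parts = []
    for i, msg in enumerate(messages):
        role, content = msg["role"], msg["content"]
        last = i == len(messages) - 1
        if role == "system":
            parts.append(f"<|system|>\n{content}\n")
        elif role == "user":
            parts.append(f"<|user|>\n{content}\n")
        elif role == "assistant":
            # The template appends the eos token after each assistant turn,
            # with a trailing newline only when it is not the final message.
            parts.append(f"<|assistant|>\n{content}{eos_token}" + ("" if last else "\n"))
        if last and add_generation_prompt:
            # Open an assistant turn so the model continues from here.
            parts.append("<|assistant|>\n")
    return "".join(parts)

prompt = falcon3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me a haiku about tensors."},
])
print(prompt)
```

The `eos_token` default matches the `<|endoftext|>` entry in `special_tokens_map.json` below (token id 11, per `eos_token_id` in `rt_info`).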
special_tokens_map.json ADDED
@@ -0,0 +1,41 @@
+ {
+ "additional_special_tokens": [
+ ">>TITLE<<",
+ ">>ABSTRACT<<",
+ ">>INTRODUCTION<<",
+ ">>SUMMARY<<",
+ ">>COMMENT<<",
+ ">>ANSWER<<",
+ ">>QUESTION<<",
+ ">>DOMAIN<<",
+ ">>EMAIL_ADDRESS<<",
+ ">>IP_ADDRESS<<",
+ "<|startoftext|>",
+ ">>IP_ADDRESS_0<<",
+ ">>IP_ADDRESS_1<<",
+ ">>IP_ADDRESS_2<<",
+ ">>IP_ADDRESS_3<<",
+ ">>IP_ADDRESS_4<<",
+ ">>IP_ADDRESS_5<<",
+ ">>IP_ADDRESS_6<<",
+ ">>IP_ADDRESS_7<<",
+ ">>IP_ADDRESS_8<<",
+ ">>IP_ADDRESS_9<<",
+ ">>PASSWORD<<",
+ ">>KEY<<"
+ ],
+ "eos_token": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "<|pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
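The `eos_token` and `pad_token` entries above use the `AddedToken` dict form rather than bare strings; both shapes are legal in `special_tokens_map.json`. A small helper (ours, for illustration) normalizes either shape when wiring stop tokens into a generation loop:

```python
def token_content(entry):
    """Return the token string from a special_tokens_map entry.

    Entries may be plain strings or AddedToken-style dicts with a
    "content" key; this handles both shapes.
    """
    return entry if isinstance(entry, str) else entry["content"]

# Values copied from the special_tokens_map.json shown above.
special_tokens_map = {
    "eos_token": {"content": "<|endoftext|>", "lstrip": False, "normalized": False,
                  "rstrip": False, "single_word": False},
    "pad_token": {"content": "<|pad|>", "lstrip": False, "normalized": False,
                  "rstrip": False, "single_word": False},
}

eos = token_content(special_tokens_map["eos_token"])  # "<|endoftext|>"
pad = token_content(special_tokens_map["pad_token"])  # "<|pad|>"
```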
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
The diff for this file is too large to render. See raw diff