cnmoro committed
Commit 64170fe · verified · 1 Parent(s): 19565bb

Add SetFit model

1_Pooling/config.json CHANGED
@@ -1,10 +1,10 @@
  {
- "word_embedding_dimension": 384,
- "pooling_mode_cls_token": true,
- "pooling_mode_mean_tokens": false,
- "pooling_mode_max_tokens": false,
- "pooling_mode_mean_sqrt_len_tokens": false,
- "pooling_mode_weightedmean_tokens": false,
- "pooling_mode_lasttoken": false,
- "include_prompt": true
+ "word_embedding_dimension": 384,
+ "pooling_mode_cls_token": true,
+ "pooling_mode_mean_tokens": false,
+ "pooling_mode_max_tokens": false,
+ "pooling_mode_mean_sqrt_len_tokens": false,
+ "pooling_mode_weightedmean_tokens": false,
+ "pooling_mode_lasttoken": false,
+ "include_prompt": true
  }
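
The pooling block above selects CLS-token pooling over 384-dimensional token embeddings (its content is unchanged by this commit). As a minimal sketch, assuming only a standard sentence-transformers installation, this is the module such a config describes:

```python
# Sketch only (not part of this repository): the Pooling module that a
# 1_Pooling/config.json like the one above maps onto.
from sentence_transformers import models

pooling = models.Pooling(
    word_embedding_dimension=384,   # "word_embedding_dimension"
    pooling_mode_cls_token=True,    # "pooling_mode_cls_token": true -> use the [CLS] embedding
    pooling_mode_mean_tokens=False,
    pooling_mode_max_tokens=False,
)
print(pooling.get_pooling_mode_str())  # "cls"
```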
README.md CHANGED
@@ -5,16 +5,11 @@ tags:
  - text-classification
  - generated_from_setfit_trainer
  widget:
- - text: Solicite um relatório financeiro trimestral via ERP conectado.
- - text: >-
-     If you save $200 monthly, how much money will you have saved after 18
-     months?
- - text: Get the stock price history of Tesla for the last month.
- - text: >-
-     Given a historical archive of economic indicators, build a forecasting model
-     that predicts recessions, incorporating leading, lagging, and coincident
-     indicators with explainable outputs.
- - text: Narrate the experience of a character born without the ability to dream.
+ - text: Check the availability and prices of iPhone 13 models across online retailers.
+ - text: Se um copo está metade cheio, quanto falta para encher completamente?
+ - text: Compose an epic poem about the journey of a single grain of sand.
+ - text: Resuma os conceitos básicos e aplicações de um novo material científico.
+ - text: Explain the importance of the human microbiome.
  metrics:
  - accuracy
  pipeline_tag: text-classification
@@ -33,12 +28,8 @@ model-index:
  split: test
  metrics:
  - type: accuracy
- value: 0.9966555183946488
+ value: 0.9908675799086758
  name: Accuracy
- license: apache-2.0
- language:
- - pt
- - en
  ---

  # SetFit with ibm-granite/granite-embedding-107m-multilingual
@@ -57,7 +48,7 @@ The model has been trained using an efficient few-shot learning technique that i
  - **Sentence Transformer body:** [ibm-granite/granite-embedding-107m-multilingual](https://huggingface.co/ibm-granite/granite-embedding-107m-multilingual)
  - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
  - **Maximum Sequence Length:** 512 tokens
- - **Number of Classes:** 8 classes
+ - **Number of Classes:** 9 classes
  <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
  <!-- - **Language:** Unknown -->
  <!-- - **License:** Unknown -->
@@ -71,21 +62,22 @@ The model has been trained using an efficient few-shot learning technique that i
  ### Model Labels
  | Label | Examples |
  |:------------------|:--------------------------------------------------------------------------------------------------------------------------|
- | summarization | <ul><li>'Resuma um texto acadêmico sobre psicologia do comportamento.'</li><li>'Summarize the timeline and outcomes of a historical event based on multiple eyewitness accounts.'</li><li>'Extract and summarize the key lessons learned from multiple post-project reviews.'</li></ul> |
- | general_knowledge | <ul><li>'Qual é a importância da agricultura para a economia brasileira?'</li><li>'Quais são os principais países membros da Organização dos Países Exportadores de Petróleo (OPEP)?'</li><li>'What is the mechanism by which vaccines provide immunity?'</li></ul> |
- | roleplay | <ul><li>'Personifique um chef pâtissier criando uma sobremesa para um júri exigente.'</li><li>'You are a software tester devising scenarios to uncover bugs in a complex system.'</li><li>'Simule uma reunião de conselho editorial decidindo o rumo de uma grande publicação.'</li></ul> |
- | creativity | <ul><li>'Write a thriller in which the protagonist communicates only through artwork.'</li><li>'Imagine um poema narrativo sobre a relação entre o sertão e a poesia de uma geração esquecida.'</li><li>'Write a story from the perspective of a shadow that gains independence.'</li></ul> |
- | complex_reasoning | <ul><li>'Analise as implicações do uso de drones autônomos para entregas em áreas urbanas densas.'</li><li>'Proponha um sistema para avaliação automatizada e justa de currículos em processos seletivos corporativos.'</li><li>'Proponha um modelo para prever o crescimento urbano sustentável considerando variáveis ambientais e sociais.'</li></ul> |
- | coding | <ul><li>'Implemente uma função para decompor números inteiros em fatores primos eficientemente para valores grandes.'</li><li>'Create an integration that consumes streaming data from an external message broker and processes events in real-time with backpressure management.'</li><li>'Escreva um algoritmo para encontrar os pontos de articulação (cut vertices) em um grafo não direcionado.'</li></ul> |
- | basic_reasoning | <ul><li>'Se um carro consome 12 litros de gasolina para 100 km, quantos litros usará para 150 km?'</li><li>'If a ladder leans against a wall forming a 60-degree angle and the ladder length is 10 feet, how high does it reach on the wall?'</li><li>'Quantos centímetros tem 1 metro?'</li></ul> |
- | tool | <ul><li>'Fetch comprehensive user reviews and ratings for a mobile app across platforms.'</li><li>'Analyze sentiment of a tweet and classify it as positive, neutral, or negative.'</li><li>'Retrieve country-wise COVID-19 vaccination rates from an authoritative source.'</li></ul> |
+ | coding | <ul><li>'Desenvolva uma função que gere mapas de calor baseados em dados geoespaciais para visualização de clusters.'</li><li>'Crie um sistema que implemente um cache LRU (Least Recently Used) para otimizar buscas repetitivas.'</li><li>'Desenvolva uma função que gere todos os anagramas possíveis de uma palavra sem repetições redundantes.'</li></ul> |
+ | general_knowledge | <ul><li>'Which planets in the solar system have rings and what are they made of?'</li><li>'Describe the process of cell division including mitosis and meiosis.'</li><li>'How do vaccines develop herd immunity in populations?'</li></ul> |
+ | complex_reasoning | <ul><li>'Projete uma estratégia para otimizar a alocação de recursos em desastres naturais usando análise preditiva.'</li><li>'Develop an AI for cross-domain transfer learning that can generalize control policies from simulated environments to real-world robots.'</li><li>'Descreva as vantagens e desvantagens da utilização de contratos inteligentes para gestão de cadeias de suprimentos.'</li></ul> |
+ | summarization | <ul><li>'Extract and summarize the key lessons learned from multiple post-project reviews.'</li><li>'Resuma os principais conceitos de um curso online em administração.'</li><li>'Create a summary that captures the essential themes and motifs in a series of poems.'</li></ul> |
+ | extraction | <ul><li>"It's important to recognize the main technical hurdles the engineering team overcame during the last sprint."</li><li>'The task is to classify the types of user feedback from the beta test, organizing it by severity and feature area.'</li><li>'Analyze the primary factors contributing to the recent decline in user engagement on the platform.'</li></ul> |
+ | roleplay | <ul><li>'Finja ser um arqueólogo submarino explorando um navio naufragado cheio de tesouros.'</li><li>'Act as an archaeologist analyzing bone fragments to determine ancient diets.'</li><li>'Act as a diplomat negotiating trade terms with a foreign delegation hostile to your country.'</li></ul> |
+ | basic_reasoning | <ul><li>'Se hoje é 15 de março, que dia será daqui a 10 dias?'</li><li>'If an object travels 100 meters in 20 seconds, what is its speed in meters per second?'</li><li>'If the average of five numbers is 20, what is their sum?'</li></ul> |
+ | tool | <ul><li>'Retrieve the current weather forecast for Tokyo for the next 7 days.'</li><li>'Retrieve a business’s credit score and financial risk rating from a commercial database.'</li><li>'Busque restaurantes italianos abertos agora no Rio de Janeiro.'</li></ul> |
+ | creativity | <ul><li>'Describe a character who can taste emotions and how they use this ability.'</li><li>'Escreva uma crônica sobre o impacto da migração rural-urbana no comportamento social nas periferias.'</li><li>'Describe a society where storytelling is forbidden and the underground movements that resist this law.'</li></ul> |

  ## Evaluation

  ### Metrics
  | Label | Accuracy |
  |:--------|:---------|
- | **all** | 0.9967 |
+ | **all** | 0.9909 |

  ## Uses

@@ -105,7 +97,7 @@ from setfit import SetFitModel
  # Download from the 🤗 Hub
  model = SetFitModel.from_pretrained("cnmoro/prompt-router")
  # Run inference
- preds = model("Get the stock price history of Tesla for the last month.")
+ preds = model("Explain the importance of the human microbiome.")
  ```

  <!--
@@ -137,18 +129,19 @@ preds = model("Get the stock price history of Tesla for the last month.")
  ### Training Set Metrics
  | Training set | Min | Median | Max |
  |:-------------|:----|:--------|:----|
- | Word count | 5 | 13.6792 | 38 |
+ | Word count | 5 | 14.5374 | 38 |

  | Label | Training Sample Count |
  |:------------------|:----------------------|
- | summarization | 160 |
- | tool | 144 |
- | general_knowledge | 154 |
- | roleplay | 145 |
- | complex_reasoning | 130 |
- | creativity | 164 |
- | coding | 152 |
- | basic_reasoning | 148 |
+ | extraction | 284 |
+ | coding | 150 |
+ | creativity | 169 |
+ | tool | 171 |
+ | general_knowledge | 175 |
+ | basic_reasoning | 158 |
+ | roleplay | 173 |
+ | summarization | 172 |
+ | complex_reasoning | 152 |

  ### Training Hyperparameters
  - batch_size: (8, 8)
@@ -172,64 +165,64 @@ preds = model("Get the stock price history of Tesla for the last month.")
  ### Training Results
  | Epoch | Step | Training Loss | Validation Loss |
  |:------:|:----:|:-------------:|:---------------:|
- | 0.0004 | 1 | 0.1954 | - |
- | 0.0208 | 50 | 0.2125 | - |
- | 0.0417 | 100 | 0.2131 | - |
- | 0.0625 | 150 | 0.2072 | - |
- | 0.0833 | 200 | 0.2029 | 0.1902 |
- | 0.1042 | 250 | 0.1925 | - |
- | 0.125 | 300 | 0.1764 | - |
- | 0.1458 | 350 | 0.1512 | - |
- | 0.1667 | 400 | 0.1229 | 0.1072 |
- | 0.1875 | 450 | 0.1015 | - |
- | 0.2083 | 500 | 0.0862 | - |
- | 0.2292 | 550 | 0.065 | - |
- | 0.25 | 600 | 0.0505 | 0.0504 |
- | 0.2708 | 650 | 0.0532 | - |
- | 0.2917 | 700 | 0.0427 | - |
- | 0.3125 | 750 | 0.0378 | - |
- | 0.3333 | 800 | 0.0357 | 0.0322 |
- | 0.3542 | 850 | 0.0286 | - |
- | 0.375 | 900 | 0.0381 | - |
- | 0.3958 | 950 | 0.0333 | - |
- | 0.4167 | 1000 | 0.0307 | 0.0235 |
- | 0.4375 | 1050 | 0.0245 | - |
- | 0.4583 | 1100 | 0.0245 | - |
- | 0.4792 | 1150 | 0.0217 | - |
- | 0.5 | 1200 | 0.0193 | 0.0168 |
- | 0.5208 | 1250 | 0.0167 | - |
- | 0.5417 | 1300 | 0.0158 | - |
- | 0.5625 | 1350 | 0.02 | - |
- | 0.5833 | 1400 | 0.0167 | 0.0120 |
- | 0.6042 | 1450 | 0.0176 | - |
- | 0.625 | 1500 | 0.0159 | - |
- | 0.6458 | 1550 | 0.0141 | - |
- | 0.6667 | 1600 | 0.0131 | 0.0094 |
- | 0.6875 | 1650 | 0.0097 | - |
- | 0.7083 | 1700 | 0.0109 | - |
- | 0.7292 | 1750 | 0.0126 | - |
- | 0.75 | 1800 | 0.0115 | 0.0079 |
- | 0.7708 | 1850 | 0.0122 | - |
- | 0.7917 | 1900 | 0.0104 | - |
- | 0.8125 | 1950 | 0.0111 | - |
- | 0.8333 | 2000 | 0.011 | 0.0071 |
- | 0.8542 | 2050 | 0.0095 | - |
- | 0.875 | 2100 | 0.009 | - |
- | 0.8958 | 2150 | 0.0107 | - |
- | 0.9167 | 2200 | 0.0099 | 0.0067 |
- | 0.9375 | 2250 | 0.0084 | - |
- | 0.9583 | 2300 | 0.0086 | - |
- | 0.9792 | 2350 | 0.0089 | - |
- | 1.0 | 2400 | 0.0098 | 0.0066 |
+ | 0.0004 | 1 | 0.2456 | - |
+ | 0.0208 | 50 | 0.2121 | - |
+ | 0.0417 | 100 | 0.212 | - |
+ | 0.0625 | 150 | 0.2158 | - |
+ | 0.0833 | 200 | 0.2074 | 0.1897 |
+ | 0.1042 | 250 | 0.2023 | - |
+ | 0.125 | 300 | 0.1833 | - |
+ | 0.1458 | 350 | 0.1766 | - |
+ | 0.1667 | 400 | 0.1602 | 0.1255 |
+ | 0.1875 | 450 | 0.1327 | - |
+ | 0.2083 | 500 | 0.1187 | - |
+ | 0.2292 | 550 | 0.0915 | - |
+ | 0.25 | 600 | 0.073 | 0.0499 |
+ | 0.2708 | 650 | 0.0618 | - |
+ | 0.2917 | 700 | 0.0575 | - |
+ | 0.3125 | 750 | 0.0559 | - |
+ | 0.3333 | 800 | 0.0463 | 0.0307 |
+ | 0.3542 | 850 | 0.0409 | - |
+ | 0.375 | 900 | 0.033 | - |
+ | 0.3958 | 950 | 0.0356 | - |
+ | 0.4167 | 1000 | 0.0331 | 0.0212 |
+ | 0.4375 | 1050 | 0.0353 | - |
+ | 0.4583 | 1100 | 0.0337 | - |
+ | 0.4792 | 1150 | 0.0326 | - |
+ | 0.5 | 1200 | 0.0274 | 0.0162 |
+ | 0.5208 | 1250 | 0.0281 | - |
+ | 0.5417 | 1300 | 0.0245 | - |
+ | 0.5625 | 1350 | 0.0235 | - |
+ | 0.5833 | 1400 | 0.0237 | 0.0130 |
+ | 0.6042 | 1450 | 0.0249 | - |
+ | 0.625 | 1500 | 0.0232 | - |
+ | 0.6458 | 1550 | 0.0196 | - |
+ | 0.6667 | 1600 | 0.0231 | 0.0114 |
+ | 0.6875 | 1650 | 0.0219 | - |
+ | 0.7083 | 1700 | 0.0198 | - |
+ | 0.7292 | 1750 | 0.0237 | - |
+ | 0.75 | 1800 | 0.0151 | 0.0104 |
+ | 0.7708 | 1850 | 0.0193 | - |
+ | 0.7917 | 1900 | 0.016 | - |
+ | 0.8125 | 1950 | 0.0214 | - |
+ | 0.8333 | 2000 | 0.0122 | 0.0090 |
+ | 0.8542 | 2050 | 0.016 | - |
+ | 0.875 | 2100 | 0.0152 | - |
+ | 0.8958 | 2150 | 0.0152 | - |
+ | 0.9167 | 2200 | 0.0157 | 0.0084 |
+ | 0.9375 | 2250 | 0.0174 | - |
+ | 0.9583 | 2300 | 0.0171 | - |
+ | 0.9792 | 2350 | 0.0136 | - |
+ | 1.0 | 2400 | 0.0123 | 0.0083 |

  ### Framework Versions
  - Python: 3.11.11
  - SetFit: 1.2.0.dev0
- - Sentence Transformers: 4.0.2
- - Transformers: 4.51.3
- - PyTorch: 2.6.0+cu124
- - Datasets: 3.5.0
- - Tokenizers: 0.21.1
+ - Sentence Transformers: 5.0.0
+ - Transformers: 4.53.2
+ - PyTorch: 2.7.1+cu126
+ - Datasets: 3.2.0
+ - Tokenizers: 0.21.0

  ## Citation

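Beyond the single-prompt call in the README's usage section, here is a minimal batch-routing sketch, assuming the public `cnmoro/prompt-router` checkpoint and the standard `setfit` API (`predict`, `predict_proba`); the example prompts are widget entries added in this commit:

```python
# Sketch only: batch inference with the updated 9-class prompt router.
from setfit import SetFitModel

model = SetFitModel.from_pretrained("cnmoro/prompt-router")

prompts = [
    "Explain the importance of the human microbiome.",
    "Se um copo está metade cheio, quanto falta para encher completamente?",
]
labels = model.predict(prompts)        # one label per prompt (names from config_setfit.json)
probas = model.predict_proba(prompts)  # per-class scores from the LogisticRegression head
for prompt, label in zip(prompts, labels):
    print(label, "<-", prompt)
```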
config.json CHANGED
@@ -19,7 +19,7 @@
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
- "transformers_version": "4.51.3",
+ "transformers_version": "4.53.2",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 250002
config_sentence_transformers.json CHANGED
@@ -1,10 +1,14 @@
  {
+ "model_type": "SentenceTransformer",
  "__version__": {
- "sentence_transformers": "4.0.2",
- "transformers": "4.51.3",
- "pytorch": "2.6.0+cu124"
+ "sentence_transformers": "5.0.0",
+ "transformers": "4.53.2",
+ "pytorch": "2.7.1+cu126"
+ },
+ "prompts": {
+ "query": "",
+ "document": ""
  },
- "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": "cosine"
  }
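
The updated config also introduces a `prompts` mapping with empty `query` and `document` entries. A minimal sketch, assuming sentence-transformers 5.x, of how such named prompts are applied at encode time; with empty strings they are effectively no-ops:

```python
# Sketch only: named prompts are text prefixes selected via prompt_name at encode time.
from sentence_transformers import SentenceTransformer

st = SentenceTransformer("ibm-granite/granite-embedding-107m-multilingual")
st.prompts.update({"query": "", "document": ""})  # mirrors the updated config

text = "Busque restaurantes italianos abertos agora no Rio de Janeiro."
plain = st.encode(text)
prompted = st.encode(text, prompt_name="query")
# The two embeddings match here because the configured "query" prefix is empty.
```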
config_setfit.json CHANGED
@@ -1,13 +1,14 @@
  {
  "normalize_embeddings": false,
  "labels": [
- "summarization",
+ "extraction",
+ "coding",
+ "creativity",
  "tool",
  "general_knowledge",
+ "basic_reasoning",
  "roleplay",
- "complex_reasoning",
- "creativity",
- "coding",
- "basic_reasoning"
+ "summarization",
+ "complex_reasoning"
  ]
  }
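
Because the label list was also reordered, the diff above obscures how little the label set actually changed. A short sketch using only the names listed above makes it explicit:

```python
# Label sets before and after this commit, copied from config_setfit.json.
old_labels = {"summarization", "tool", "general_knowledge", "roleplay",
              "complex_reasoning", "creativity", "coding", "basic_reasoning"}
new_labels = {"extraction", "coding", "creativity", "tool", "general_knowledge",
              "basic_reasoning", "roleplay", "summarization", "complex_reasoning"}

print(sorted(new_labels - old_labels))  # ['extraction'] -- the only class added
print(sorted(old_labels - new_labels))  # [] -- no class was removed
```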
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:281a011d611debd6600bae6fee3b7e3087e1bb491a62cbb3401dcb39ac51fdc2
+ oid sha256:daac8d67133b1e9161d174dc9327bbc7c185d34b1b0b66e201d84c6d5d041483
  size 427988744
model_head.pkl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:2b9f216693a908380e6acdd99416fb9d40e83a484a36ccd262021cd06bd50ada
- size 26023
+ oid sha256:221391f5090fdd0937071b7d658818b67e6c998633ac40560aa487dcfa5b6c3c
+ size 29167
sentence_bert_config.json CHANGED
@@ -1,4 +1,4 @@
  {
- "max_seq_length": 512,
- "do_lower_case": false
+ "max_seq_length": 512,
+ "do_lower_case": false
  }