Bo8dady committed
Commit 2764fbf · verified · 1 Parent(s): 82ed5af

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
  "word_embedding_dimension": 768,
  "pooling_mode_cls_token": false,
  "pooling_mode_mean_tokens": true,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
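For context on what this configuration selects, here is a minimal, editorial sketch (not part of the committed files) of the mean pooling that `pooling_mode_mean_tokens: true` enables: token embeddings are averaged over the attention mask to give one 768-dimensional vector per sentence. The tensors below are random stand-ins for a transformer's output.

```python
import torch

# Stand-ins for one batch from the transformer:
# token_embeddings: (batch, seq_len, 768), attention_mask: (batch, seq_len)
token_embeddings = torch.randn(2, 16, 768)
attention_mask = torch.ones(2, 16)

# Mean pooling as configured above: sum the valid token vectors, divide by the token count
mask = attention_mask.unsqueeze(-1)  # (batch, seq_len, 1)
sentence_embeddings = (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
print(sentence_embeddings.shape)  # torch.Size([2, 768]) == word_embedding_dimension
```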
README.md ADDED
@@ -0,0 +1,1305 @@
1
+ ---
2
+ tags:
3
+ - sentence-transformers
4
+ - sentence-similarity
5
+ - feature-extraction
6
+ - generated_from_trainer
7
+ - dataset_size:4030
8
+ - loss:MultipleNegativesRankingLoss
9
+ base_model: sentence-transformers/all-distilroberta-v1
10
+ widget:
11
+ - source_sentence: What is the contact email for Dr. Amr Ashraf Mohamed Amin?
12
+ sentences:
13
+ - "Topic: Second Level Courses (Mainstream)\nSummary: Outlines the course list for\
14
+ \ the third and fourth semesters, including course codes, titles, credit hours,\
15
+ \ and prerequisites.\nChunk: \"Second Level Courses (Mainstream) \nThird Semester\n\
16
+ \ • HUM113: Report Writing (2 Credit Hours) \n• CIS250: Object-Oriented Programming\
17
+ \ (3 Credit Hours) – Prerequisite: CIS150 \n(Structured Programming) \n• BSC221:\
18
+ \ Discrete Mathematics (3 Credit Hours) \n• CIS260: Logic Design (3 Credit Hours)\
19
+ \ – Prerequisite: BSC121 (Physics I) \n• CIS280: Database Management Systems (3\
20
+ \ Credit Hours) – Prerequisite: CIS150 \n(Structured Programming) \n• CIS240:\
21
+ \ Statistical Analysis (3 Credit Hours) – Prerequisite: BSC123 (Probability &\
22
+ \ \nStatistics) \n• Total Credit Hours: 17 \nFourth Semester \n• CIS220: Computer\
23
+ \ Organization & Architecture (3 Credit Hours) – Prerequisite: CIS260 \n(Logic\
24
+ \ Design) \n• CIS270: Data Structure (3 Credit Hours) – Prerequisite: CIS250 (Object-Oriented\
25
+ \ \nProgramming) \n• BSC225: Linear Algebra (3 Credit Hours) \n• CIS230: Operations\
26
+ \ Research (3 Credit Hours) \n• CIS243: Artificial Intelligence (3 Credit Hours)\
27
+ \ – Prerequisite: CIS150 (Structured \nProgramming) \n• Total Credit Hours: 15\""
28
+ - 'The final exam for the Structured programming course, offered by the general
29
+ department, from 2022, is available at the following link: [https://drive.google.com/file/d/1Bpqoa78DcFNC8335i7vucV0nBN-J01v9/view?usp=sharing'
30
+ - Dr. Amr Ashraf Mohamed Amin is part of the Unknown department and can be reached
31
32
+ - source_sentence: What systems have been developed for quickly locating missing children?
33
+ sentences:
34
+ - 'The final exam for Digital Signal Processing course, offered by the computer
35
+ science department, from 2024, is available at the following link: [https://drive.google.com/file/d/1RO0aPoom-TA-qgsopwR9krszD_pQIzfJ/view?usp=sharing'
36
+ - '**Lost People Finder**
37
+
38
+
39
+ ### **Abstract**
40
+
41
+
42
+ **Missing Persons Statistics**
43
+
44
+ Recently, there has been a clear increase in the population. As stated in a 2005
45
+ report, published by the US Department of Justice, over 340,500 of children''s
46
+ population go missing, from their parents, for at least an hour. Not only was
47
+ this issue minor in between children, but also it has been evident that the elderly
48
+ and people with special needs seem missing whenever their guardians get distracted.
49
+
50
+
51
+ **Lost People Finder Application**
52
+
53
+ Through the Lost People Finder application, we can search for missing people quickly
54
+ and efficiently by entering the missing person''s picture in the application,
55
+ and the application searches for him immediately.'
56
+ - 'The final exam for the English 1course, offered by the general department, from
57
+ 2022, is available at the following link: [https://drive.google.com/file/d/1IbqLbHuyZoDyhsL1BERpI2P0iLFZmgt8/view].'
58
+ - source_sentence: What are the conditions for the College Council granting a final
59
+ chance?
60
+ sentences:
61
+ - Dr. Zeina Rayan is part of the Unknown department and can be reached at [email protected].
62
+ - 'Topic: Academic Warning and Dismissal
63
+
64
+ Summary: Students receive academic warnings for low GPAs and may be dismissed
65
+ if the GPA remains low for six semesters or if graduation requirements aren''t
66
+ met within double the study years. Students can re-study courses to improve their
67
+ average, with certain conditions and grade limits.
68
+
69
+ Chunk: "Academic warning - dismissal from study - mechanisms of raising the cumulative
70
+ average
71
+
72
+ 1. The student is given an academic warning if he obtains a cumulative average
73
+ less than "2" for any semester that he must raise his cumulative average to at
74
+ least 2.00.
75
+
76
+ 2. A student who is academically probated is dismissed from the study if the GPA
77
+ drops below 2.00 is repeated during six main semesters.
78
+
79
+ 3. If the student does not meet the graduation requirements within the maximum
80
+ period of study, which is double the years of study according to the law, he will
81
+ be dismissed.
82
+
83
+ 4. The College Council may consider the possibility of granting the student exposed
84
+ to dismissal as a result of his inability to raise his cumulative average to At
85
+ least one and final chance of two semesters to raise his/her GPA to 2.00 and meet
86
+ graduation requirements if he/she has successfully completed at least 80% of the
87
+ credit hours required for graduation.
88
+
89
+ 5. The student may re-study the courses in which he has previously passed in order
90
+ to improve the cumulative average, and the repetition is a study and an exam,
91
+ and the grade he obtained the last time he studied the course is calculated for
92
+ him. A maximum of (5) courses unless the improvement is for the purpose of raising
93
+ the academic warning or achieving the graduation requirements, and in all cases,
94
+ both grades are mentioned in his academic record.
95
+
96
+ 6. For the student to re-study a course in which he has previously obtained a
97
+ grade of (F), the grade he obtained in the repetition is calculated with a maximum
98
+ of (B), and for calculating the cumulative average, the last grade is calculated
99
+ for him only, provided that both grades are mentioned in the student''s academic
100
+ record."'
101
+ - '**Abstract**
102
+
103
+
104
+ **Introduction to Renewable Energy**
105
+
106
+ Renewable energy is gaining great importance nowadays. Solar energy is one of
107
+ the most popular renewable energy sources as it is carbon dioxide free, has low
108
+ operating costs, and its exploitation helps improve public health.
109
+
110
+
111
+ **Project Overview**
112
+
113
+ This project deals with the introduction of an embedded automatic solar energy
114
+ tracking system that can be monitored remotely. The main objective of the system
115
+ is to exploit the maximum amount of sunlight and convert it into electricity so
116
+ that it can be used easily and efficiently. This can be done by rendering and
117
+ aligning a model that drives the solar panels to be perpendicular to and track
118
+ the sun''s rays so that more energy is generated.
119
+
120
+
121
+ **Advantages of the Tracker System**
122
+
123
+ The main advantage of this tracker is that the various readings received from
124
+ the sensors can be tracked remotely with a decentralized technological system
125
+ that allows analysis of results, detection of faults and making tracking decisions.
126
+ The advantage of this system is to provide access to a permanent and contamination-free
127
+ power supply source. When connected to large battery banks, they can independently
128
+ fill the needs of local areas.'
129
+ - source_sentence: How can I contact Dr. Doaa Mahmoud?
130
+ sentences:
131
+ - Dr. Hanan Hindy is part of the CS department and can be reached at [email protected].
132
+ - 'The final exam for Database Management System course, offered by the general
133
+ department, from 2019, is available at the following link: [https://drive.google.com/file/d/1OOIPr48WI8Cm3TVzPdel2Dh3SZUQTVxA/view'
134
+ - Dr. Doaa Mahmoud is part of the Unknown department and can be reached at [email protected].
135
+ - source_sentence: Where can I find Abdel Badi Salem's email address?
136
+ sentences:
137
+ - '# **Abstract**
138
+
139
+
140
+ ## **Introduction**
141
+
142
+ One of the main issues we are aiming to help in society are those of the disabled.
143
+ Disabilities do not have a single type or manner in which it attacks the body
144
+ but comes in a very wide range. At the present time, the amount of disabled people
145
+ is **increasing annually**, so we aim to make a standard wheelchair to aid the
146
+ mobility of disabled people who cannot walk; by designing two mechanisms, one
147
+ uses eye-movement guidance and the other uses EEG Signals, which goes through
148
+ pre-processing stage to extract more information from the data. This'' done by
149
+ segmentation using a window of size 200 (Sampling frequency), then features extraction.
150
+ That takes us to classification, the highest accuracy we got is on subject [E]
151
+ for motor imaginary dataset on Classical paradigm, Multi Level Perceptron classifier
152
+ (with accuracy of 60.5%), The result of this classification''s used as a command
153
+ to move the wheelchair after that.'
154
+ - '# **Abstract**
155
+
156
+
157
+ ## **Sports Analytics Overview**
158
+
159
+ Sports analytics has been successfully applied in sports like football and basketball.
160
+ However, its application in soccer has been limited. Research in soccer analytics
161
+ with Machine Learning techniques is limited and is mostly employed only for predictions.
162
+ There is a need to find out if the application of Machine Learning can bring better
163
+ and more insightful results in soccer analytics. In this thesis, we perform descriptive
164
+ as well as predictive analysis of soccer matches and player performances.
165
+
166
+
167
+ ## **Football Rating Analysis**
168
+
169
+ In football, it is popular to rely on ratings by experts to assess a player''s
170
+ performance. However, the experts do not unravel the criteria they use for their
171
+ rating. We attempt to identify the most important attributes of player''s performance
172
+ which determine the expert ratings. In this way we find the latent knowledge which
173
+ the experts use to assign ratings to players. We performed a series of classifications
174
+ with three different pruning strategies and an array of Machine Learning algorithms.
175
+ The best results for predicting ratings using performance metrics had mean absolute
176
+ error of 0.17. We obtained a list of most important performance metrics for each
177
+ of the playing positions which approximates the attributes considered by the experts
178
+ for assigning ratings. Then we find the most influential performance metrics of
179
+ the players for determining the match outcome and we examine the extent to which
180
+ the outcome is characterized by the performance attributes of the players. We
181
+ found 34 performance attributes'
182
+ - Dr. Abdel Badi Salem is part of the CS department and can be reached at [email protected].
183
+ pipeline_tag: sentence-similarity
184
+ library_name: sentence-transformers
185
+ metrics:
186
+ - cosine_accuracy@1
187
+ - cosine_accuracy@3
188
+ - cosine_accuracy@5
189
+ - cosine_accuracy@10
190
+ - cosine_precision@1
191
+ - cosine_precision@3
192
+ - cosine_precision@5
193
+ - cosine_precision@10
194
+ - cosine_recall@1
195
+ - cosine_recall@3
196
+ - cosine_recall@5
197
+ - cosine_recall@10
198
+ - cosine_ndcg@10
199
+ - cosine_mrr@10
200
+ - cosine_map@100
201
+ model-index:
202
+ - name: SentenceTransformer based on sentence-transformers/all-distilroberta-v1
203
+ results:
204
+ - task:
205
+ type: information-retrieval
206
+ name: Information Retrieval
207
+ dataset:
208
+ name: ai college validation
209
+ type: ai-college-validation
210
+ metrics:
211
+ - type: cosine_accuracy@1
212
+ value: 0.18810557968593383
213
+ name: Cosine Accuracy@1
214
+ - type: cosine_accuracy@3
215
+ value: 0.4186435015035082
216
+ name: Cosine Accuracy@3
217
+ - type: cosine_accuracy@5
218
+ value: 0.5676578683595055
219
+ name: Cosine Accuracy@5
220
+ - type: cosine_accuracy@10
221
+ value: 0.8463080521216171
222
+ name: Cosine Accuracy@10
223
+ - type: cosine_precision@1
224
+ value: 0.18810557968593383
225
+ name: Cosine Precision@1
226
+ - type: cosine_precision@3
227
+ value: 0.13954783383450275
228
+ name: Cosine Precision@3
229
+ - type: cosine_precision@5
230
+ value: 0.1135315736719011
231
+ name: Cosine Precision@5
232
+ - type: cosine_precision@10
233
+ value: 0.08463080521216171
234
+ name: Cosine Precision@10
235
+ - type: cosine_recall@1
236
+ value: 0.18810557968593383
237
+ name: Cosine Recall@1
238
+ - type: cosine_recall@3
239
+ value: 0.4186435015035082
240
+ name: Cosine Recall@3
241
+ - type: cosine_recall@5
242
+ value: 0.5676578683595055
243
+ name: Cosine Recall@5
244
+ - type: cosine_recall@10
245
+ value: 0.8463080521216171
246
+ name: Cosine Recall@10
247
+ - type: cosine_ndcg@10
248
+ value: 0.47259073953229414
249
+ name: Cosine Ndcg@10
250
+ - type: cosine_mrr@10
251
+ value: 0.3588172667440963
252
+ name: Cosine Mrr@10
253
+ - type: cosine_map@100
254
+ value: 0.3678298256041653
255
+ name: Cosine Map@100
256
+ - type: cosine_accuracy@1
257
+ value: 0.18843969261610424
258
+ name: Cosine Accuracy@1
259
+ - type: cosine_accuracy@3
260
+ value: 0.4173070497828266
261
+ name: Cosine Accuracy@3
262
+ - type: cosine_accuracy@5
263
+ value: 0.5669896424991647
264
+ name: Cosine Accuracy@5
265
+ - type: cosine_accuracy@10
266
+ value: 0.8456398262612763
267
+ name: Cosine Accuracy@10
268
+ - type: cosine_precision@1
269
+ value: 0.18843969261610424
270
+ name: Cosine Precision@1
271
+ - type: cosine_precision@3
272
+ value: 0.13910234992760886
273
+ name: Cosine Precision@3
274
+ - type: cosine_precision@5
275
+ value: 0.11339792849983296
276
+ name: Cosine Precision@5
277
+ - type: cosine_precision@10
278
+ value: 0.08456398262612765
279
+ name: Cosine Precision@10
280
+ - type: cosine_recall@1
281
+ value: 0.18843969261610424
282
+ name: Cosine Recall@1
283
+ - type: cosine_recall@3
284
+ value: 0.4173070497828266
285
+ name: Cosine Recall@3
286
+ - type: cosine_recall@5
287
+ value: 0.5669896424991647
288
+ name: Cosine Recall@5
289
+ - type: cosine_recall@10
290
+ value: 0.8456398262612763
291
+ name: Cosine Recall@10
292
+ - type: cosine_ndcg@10
293
+ value: 0.47223133269915585
294
+ name: Cosine Ndcg@10
295
+ - type: cosine_mrr@10
296
+ value: 0.3585802056650706
297
+ name: Cosine Mrr@10
298
+ - type: cosine_map@100
299
+ value: 0.3676667485080777
300
+ name: Cosine Map@100
301
+ - type: cosine_accuracy@1
302
+ value: 0.1102813476901702
303
+ name: Cosine Accuracy@1
304
+ - type: cosine_accuracy@3
305
+ value: 0.3218131295588746
306
+ name: Cosine Accuracy@3
307
+ - type: cosine_accuracy@5
308
+ value: 0.5451545675581799
309
+ name: Cosine Accuracy@5
310
+ - type: cosine_accuracy@10
311
+ value: 0.8817297672803056
312
+ name: Cosine Accuracy@10
313
+ - type: cosine_precision@1
314
+ value: 0.1102813476901702
315
+ name: Cosine Precision@1
316
+ - type: cosine_precision@3
317
+ value: 0.1072710431862915
318
+ name: Cosine Precision@3
319
+ - type: cosine_precision@5
320
+ value: 0.10903091351163598
321
+ name: Cosine Precision@5
322
+ - type: cosine_precision@10
323
+ value: 0.08817297672803058
324
+ name: Cosine Precision@10
325
+ - type: cosine_recall@1
326
+ value: 0.1102813476901702
327
+ name: Cosine Recall@1
328
+ - type: cosine_recall@3
329
+ value: 0.3218131295588746
330
+ name: Cosine Recall@3
331
+ - type: cosine_recall@5
332
+ value: 0.5451545675581799
333
+ name: Cosine Recall@5
334
+ - type: cosine_recall@10
335
+ value: 0.8817297672803056
336
+ name: Cosine Recall@10
337
+ - type: cosine_ndcg@10
338
+ value: 0.4323392922230707
339
+ name: Cosine Ndcg@10
340
+ - type: cosine_mrr@10
341
+ value: 0.2959338835684789
342
+ name: Cosine Mrr@10
343
+ - type: cosine_map@100
344
+ value: 0.30305652186931414
345
+ name: Cosine Map@100
346
+ - type: cosine_accuracy@1
347
+ value: 0.18576678917474107
348
+ name: Cosine Accuracy@1
349
+ - type: cosine_accuracy@3
350
+ value: 0.42064817908453056
351
+ name: Cosine Accuracy@3
352
+ - type: cosine_accuracy@5
353
+ value: 0.5699966588706983
354
+ name: Cosine Accuracy@5
355
+ - type: cosine_accuracy@10
356
+ value: 0.858002004677581
357
+ name: Cosine Accuracy@10
358
+ - type: cosine_precision@1
359
+ value: 0.18576678917474107
360
+ name: Cosine Precision@1
361
+ - type: cosine_precision@3
362
+ value: 0.14021605969484352
363
+ name: Cosine Precision@3
364
+ - type: cosine_precision@5
365
+ value: 0.11399933177413965
366
+ name: Cosine Precision@5
367
+ - type: cosine_precision@10
368
+ value: 0.08580020046775809
369
+ name: Cosine Precision@10
370
+ - type: cosine_recall@1
371
+ value: 0.18576678917474107
372
+ name: Cosine Recall@1
373
+ - type: cosine_recall@3
374
+ value: 0.42064817908453056
375
+ name: Cosine Recall@3
376
+ - type: cosine_recall@5
377
+ value: 0.5699966588706983
378
+ name: Cosine Recall@5
379
+ - type: cosine_recall@10
380
+ value: 0.858002004677581
381
+ name: Cosine Recall@10
382
+ - type: cosine_ndcg@10
383
+ value: 0.47488287423350733
384
+ name: Cosine Ndcg@10
385
+ - type: cosine_mrr@10
386
+ value: 0.35840307277828215
387
+ name: Cosine Mrr@10
388
+ - type: cosine_map@100
389
+ value: 0.3669503238927413
390
+ name: Cosine Map@100
391
+ - type: cosine_accuracy@1
392
+ value: 0.1827597728032075
393
+ name: Cosine Accuracy@1
394
+ - type: cosine_accuracy@3
395
+ value: 0.42198463080521215
396
+ name: Cosine Accuracy@3
397
+ - type: cosine_accuracy@5
398
+ value: 0.5750083528232542
399
+ name: Cosine Accuracy@5
400
+ - type: cosine_accuracy@10
401
+ value: 0.8683595055128633
402
+ name: Cosine Accuracy@10
403
+ - type: cosine_precision@1
404
+ value: 0.1827597728032075
405
+ name: Cosine Precision@1
406
+ - type: cosine_precision@3
407
+ value: 0.14066154360173738
408
+ name: Cosine Precision@3
409
+ - type: cosine_precision@5
410
+ value: 0.11500167056465085
411
+ name: Cosine Precision@5
412
+ - type: cosine_precision@10
413
+ value: 0.08683595055128634
414
+ name: Cosine Precision@10
415
+ - type: cosine_recall@1
416
+ value: 0.1827597728032075
417
+ name: Cosine Recall@1
418
+ - type: cosine_recall@3
419
+ value: 0.42198463080521215
420
+ name: Cosine Recall@3
421
+ - type: cosine_recall@5
422
+ value: 0.5750083528232542
423
+ name: Cosine Recall@5
424
+ - type: cosine_recall@10
425
+ value: 0.8683595055128633
426
+ name: Cosine Recall@10
427
+ - type: cosine_ndcg@10
428
+ value: 0.4780584736286147
429
+ name: Cosine Ndcg@10
430
+ - type: cosine_mrr@10
431
+ value: 0.3594039531393358
432
+ name: Cosine Mrr@10
433
+ - type: cosine_map@100
434
+ value: 0.3674823360981191
435
+ name: Cosine Map@100
436
+ - type: cosine_accuracy@1
437
+ value: 0.17674574006014032
438
+ name: Cosine Accuracy@1
439
+ - type: cosine_accuracy@3
440
+ value: 0.42098229201470094
441
+ name: Cosine Accuracy@3
442
+ - type: cosine_accuracy@5
443
+ value: 0.5720013364517207
444
+ name: Cosine Accuracy@5
445
+ - type: cosine_accuracy@10
446
+ value: 0.8763782158369529
447
+ name: Cosine Accuracy@10
448
+ - type: cosine_precision@1
449
+ value: 0.17674574006014032
450
+ name: Cosine Precision@1
451
+ - type: cosine_precision@3
452
+ value: 0.140327430671567
453
+ name: Cosine Precision@3
454
+ - type: cosine_precision@5
455
+ value: 0.11440026729034415
456
+ name: Cosine Precision@5
457
+ - type: cosine_precision@10
458
+ value: 0.08763782158369529
459
+ name: Cosine Precision@10
460
+ - type: cosine_recall@1
461
+ value: 0.17674574006014032
462
+ name: Cosine Recall@1
463
+ - type: cosine_recall@3
464
+ value: 0.42098229201470094
465
+ name: Cosine Recall@3
466
+ - type: cosine_recall@5
467
+ value: 0.5720013364517207
468
+ name: Cosine Recall@5
469
+ - type: cosine_recall@10
470
+ value: 0.8763782158369529
471
+ name: Cosine Recall@10
472
+ - type: cosine_ndcg@10
473
+ value: 0.47784861917490756
474
+ name: Cosine Ndcg@10
475
+ - type: cosine_mrr@10
476
+ value: 0.356773211567732
477
+ name: Cosine Mrr@10
478
+ - type: cosine_map@100
479
+ value: 0.3644323168133691
480
+ name: Cosine Map@100
481
+ - type: cosine_accuracy@1
482
+ value: 0.18843969261610424
483
+ name: Cosine Accuracy@1
484
+ - type: cosine_accuracy@3
485
+ value: 0.42398930838623455
486
+ name: Cosine Accuracy@3
487
+ - type: cosine_accuracy@5
488
+ value: 0.5833611760775143
489
+ name: Cosine Accuracy@5
490
+ - type: cosine_accuracy@10
491
+ value: 0.884062813230872
492
+ name: Cosine Accuracy@10
493
+ - type: cosine_precision@1
494
+ value: 0.18843969261610424
495
+ name: Cosine Precision@1
496
+ - type: cosine_precision@3
497
+ value: 0.14132976946207818
498
+ name: Cosine Precision@3
499
+ - type: cosine_precision@5
500
+ value: 0.11667223521550285
501
+ name: Cosine Precision@5
502
+ - type: cosine_precision@10
503
+ value: 0.08840628132308721
504
+ name: Cosine Precision@10
505
+ - type: cosine_recall@1
506
+ value: 0.18843969261610424
507
+ name: Cosine Recall@1
508
+ - type: cosine_recall@3
509
+ value: 0.42398930838623455
510
+ name: Cosine Recall@3
511
+ - type: cosine_recall@5
512
+ value: 0.5833611760775143
513
+ name: Cosine Recall@5
514
+ - type: cosine_recall@10
515
+ value: 0.884062813230872
516
+ name: Cosine Recall@10
517
+ - type: cosine_ndcg@10
518
+ value: 0.48660967480465983
519
+ name: Cosine Ndcg@10
520
+ - type: cosine_mrr@10
521
+ value: 0.36589860468076263
522
+ name: Cosine Mrr@10
523
+ - type: cosine_map@100
524
+ value: 0.3731321313039561
525
+ name: Cosine Map@100
526
+ - type: cosine_accuracy@1
527
+ value: 0.18843969261610424
528
+ name: Cosine Accuracy@1
529
+ - type: cosine_accuracy@3
530
+ value: 0.42332108252589373
531
+ name: Cosine Accuracy@3
532
+ - type: cosine_accuracy@5
533
+ value: 0.5830270631473438
534
+ name: Cosine Accuracy@5
535
+ - type: cosine_accuracy@10
536
+ value: 0.8837287003007016
537
+ name: Cosine Accuracy@10
538
+ - type: cosine_precision@1
539
+ value: 0.18843969261610424
540
+ name: Cosine Precision@1
541
+ - type: cosine_precision@3
542
+ value: 0.14110702750863124
543
+ name: Cosine Precision@3
544
+ - type: cosine_precision@5
545
+ value: 0.11660541262946876
546
+ name: Cosine Precision@5
547
+ - type: cosine_precision@10
548
+ value: 0.08837287003007016
549
+ name: Cosine Precision@10
550
+ - type: cosine_recall@1
551
+ value: 0.18843969261610424
552
+ name: Cosine Recall@1
553
+ - type: cosine_recall@3
554
+ value: 0.42332108252589373
555
+ name: Cosine Recall@3
556
+ - type: cosine_recall@5
557
+ value: 0.5830270631473438
558
+ name: Cosine Recall@5
559
+ - type: cosine_recall@10
560
+ value: 0.8837287003007016
561
+ name: Cosine Recall@10
562
+ - type: cosine_ndcg@10
563
+ value: 0.4864124568497682
564
+ name: Cosine Ndcg@10
565
+ - type: cosine_mrr@10
566
+ value: 0.3657506403831158
567
+ name: Cosine Mrr@10
568
+ - type: cosine_map@100
569
+ value: 0.37301454090905195
570
+ name: Cosine Map@100
571
+ - task:
572
+ type: information-retrieval
573
+ name: Information Retrieval
574
+ dataset:
575
+ name: ai college modefied validation
576
+ type: ai-college_modefied-validation
577
+ metrics:
578
+ - type: cosine_accuracy@1
579
+ value: 0.1127127474817645
580
+ name: Cosine Accuracy@1
581
+ - type: cosine_accuracy@3
582
+ value: 0.3218131295588746
583
+ name: Cosine Accuracy@3
584
+ - type: cosine_accuracy@5
585
+ value: 0.5481069815908302
586
+ name: Cosine Accuracy@5
587
+ - type: cosine_accuracy@10
588
+ value: 0.8931920805835359
589
+ name: Cosine Accuracy@10
590
+ - type: cosine_precision@1
591
+ value: 0.1127127474817645
592
+ name: Cosine Precision@1
593
+ - type: cosine_precision@3
594
+ value: 0.10727104318629153
595
+ name: Cosine Precision@3
596
+ - type: cosine_precision@5
597
+ value: 0.10962139631816603
598
+ name: Cosine Precision@5
599
+ - type: cosine_precision@10
600
+ value: 0.08931920805835358
601
+ name: Cosine Precision@10
602
+ - type: cosine_recall@1
603
+ value: 0.1127127474817645
604
+ name: Cosine Recall@1
605
+ - type: cosine_recall@3
606
+ value: 0.3218131295588746
607
+ name: Cosine Recall@3
608
+ - type: cosine_recall@5
609
+ value: 0.5481069815908302
610
+ name: Cosine Recall@5
611
+ - type: cosine_recall@10
612
+ value: 0.8931920805835359
613
+ name: Cosine Recall@10
614
+ - type: cosine_ndcg@10
615
+ value: 0.4379716529188091
616
+ name: Cosine Ndcg@10
617
+ - type: cosine_mrr@10
618
+ value: 0.2999361137299657
619
+ name: Cosine Mrr@10
620
+ - type: cosine_map@100
621
+ value: 0.30656764876713344
622
+ name: Cosine Map@100
623
+ - type: cosine_accuracy@1
624
+ value: 0.10993400486279958
625
+ name: Cosine Accuracy@1
626
+ - type: cosine_accuracy@3
627
+ value: 0.32737061479680446
628
+ name: Cosine Accuracy@3
629
+ - type: cosine_accuracy@5
630
+ value: 0.5489753386592567
631
+ name: Cosine Accuracy@5
632
+ - type: cosine_accuracy@10
633
+ value: 0.8989232372351511
634
+ name: Cosine Accuracy@10
635
+ - type: cosine_precision@1
636
+ value: 0.10993400486279958
637
+ name: Cosine Precision@1
638
+ - type: cosine_precision@3
639
+ value: 0.10912353826560146
640
+ name: Cosine Precision@3
641
+ - type: cosine_precision@5
642
+ value: 0.10979506773185134
643
+ name: Cosine Precision@5
644
+ - type: cosine_precision@10
645
+ value: 0.08989232372351512
646
+ name: Cosine Precision@10
647
+ - type: cosine_recall@1
648
+ value: 0.10993400486279958
649
+ name: Cosine Recall@1
650
+ - type: cosine_recall@3
651
+ value: 0.32737061479680446
652
+ name: Cosine Recall@3
653
+ - type: cosine_recall@5
654
+ value: 0.5489753386592567
655
+ name: Cosine Recall@5
656
+ - type: cosine_recall@10
657
+ value: 0.8989232372351511
658
+ name: Cosine Recall@10
659
+ - type: cosine_ndcg@10
660
+ value: 0.43927652334969547
661
+ name: Cosine Ndcg@10
662
+ - type: cosine_mrr@10
663
+ value: 0.2998494158575775
664
+ name: Cosine Mrr@10
665
+ - type: cosine_map@100
666
+ value: 0.30624915588054374
667
+ name: Cosine Map@100
668
+ - type: cosine_accuracy@1
669
+ value: 0.10993400486279958
670
+ name: Cosine Accuracy@1
671
+ - type: cosine_accuracy@3
672
+ value: 0.3268496005557485
673
+ name: Cosine Accuracy@3
674
+ - type: cosine_accuracy@5
675
+ value: 0.548627995831886
676
+ name: Cosine Accuracy@5
677
+ - type: cosine_accuracy@10
678
+ value: 0.8989232372351511
679
+ name: Cosine Accuracy@10
680
+ - type: cosine_precision@1
681
+ value: 0.10993400486279958
682
+ name: Cosine Precision@1
683
+ - type: cosine_precision@3
684
+ value: 0.10894986685191616
685
+ name: Cosine Precision@3
686
+ - type: cosine_precision@5
687
+ value: 0.10972559916637721
688
+ name: Cosine Precision@5
689
+ - type: cosine_precision@10
690
+ value: 0.08989232372351512
691
+ name: Cosine Precision@10
692
+ - type: cosine_recall@1
693
+ value: 0.10993400486279958
694
+ name: Cosine Recall@1
695
+ - type: cosine_recall@3
696
+ value: 0.3268496005557485
697
+ name: Cosine Recall@3
698
+ - type: cosine_recall@5
699
+ value: 0.548627995831886
700
+ name: Cosine Recall@5
701
+ - type: cosine_recall@10
702
+ value: 0.8989232372351511
703
+ name: Cosine Recall@10
704
+ - type: cosine_ndcg@10
705
+ value: 0.43919844728741414
706
+ name: Cosine Ndcg@10
707
+ - type: cosine_mrr@10
708
+ value: 0.29975865186875866
709
+ name: Cosine Mrr@10
710
+ - type: cosine_map@100
711
+ value: 0.3061583918917249
712
+ name: Cosine Map@100
713
+ - type: cosine_accuracy@1
714
+ value: 0.10680791941646404
715
+ name: Cosine Accuracy@1
716
+ - type: cosine_accuracy@3
717
+ value: 0.33014935741576934
718
+ name: Cosine Accuracy@3
719
+ - type: cosine_accuracy@5
720
+ value: 0.558179923584578
721
+ name: Cosine Accuracy@5
722
+ - type: cosine_accuracy@10
723
+ value: 0.8997915943035776
724
+ name: Cosine Accuracy@10
725
+ - type: cosine_precision@1
726
+ value: 0.10680791941646404
727
+ name: Cosine Precision@1
728
+ - type: cosine_precision@3
729
+ value: 0.11004978580525644
730
+ name: Cosine Precision@3
731
+ - type: cosine_precision@5
732
+ value: 0.11163598471691559
733
+ name: Cosine Precision@5
734
+ - type: cosine_precision@10
735
+ value: 0.08997915943035775
736
+ name: Cosine Precision@10
737
+ - type: cosine_recall@1
738
+ value: 0.10680791941646404
739
+ name: Cosine Recall@1
740
+ - type: cosine_recall@3
741
+ value: 0.33014935741576934
742
+ name: Cosine Recall@3
743
+ - type: cosine_recall@5
744
+ value: 0.558179923584578
745
+ name: Cosine Recall@5
746
+ - type: cosine_recall@10
747
+ value: 0.8997915943035776
748
+ name: Cosine Recall@10
749
+ - type: cosine_ndcg@10
750
+ value: 0.4393835206266066
751
+ name: Cosine Ndcg@10
752
+ - type: cosine_mrr@10
753
+ value: 0.2994972488242717
754
+ name: Cosine Mrr@10
755
+ - type: cosine_map@100
756
+ value: 0.3060162279226998
757
+ name: Cosine Map@100
758
---

# SentenceTransformer based on sentence-transformers/all-distilroberta-v1

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [sentence-transformers/all-distilroberta-v1](https://huggingface.co/sentence-transformers/all-distilroberta-v1). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/all-distilroberta-v1](https://huggingface.co/sentence-transformers/all-distilroberta-v1) <!-- at revision 842eaed40bee4d61673a81c92d5689a8fed7a09f -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
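For readers who want to see how the printed module stack maps onto the `sentence_transformers.models` API, the sketch below assembles an architecturally equivalent model from the base checkpoint named above. It is illustrative only: it does not load this repository's fine-tuned weights.

```python
from sentence_transformers import SentenceTransformer, models

# Base checkpoint only; the fine-tuned weights live in this repository, not here.
word = models.Transformer("sentence-transformers/all-distilroberta-v1", max_seq_length=512)
pooling = models.Pooling(word.get_word_embedding_dimension(), pooling_mode="mean")
normalize = models.Normalize()

equivalent = SentenceTransformer(modules=[word, pooling, normalize])
print(equivalent)  # prints the same Transformer -> Pooling -> Normalize stack
```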
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Bo8dady/finetuned4-College-embeddings")
# Run inference
sentences = [
    "Where can I find Abdel Badi Salem's email address?",
    'Dr. Abdel Badi Salem is part of the CS department and can be reached at [email protected].',
    "# **Abstract**\n\n## **Sports Analytics Overview**\nSports analytics has been successfully applied in sports like football and basketball. However, its application in soccer has been limited. Research in soccer analytics with Machine Learning techniques is limited and is mostly employed only for predictions. There is a need to find out if the application of Machine Learning can bring better and more insightful results in soccer analytics. In this thesis, we perform descriptive as well as predictive analysis of soccer matches and player performances.\n\n## **Football Rating Analysis**\nIn football, it is popular to rely on ratings by experts to assess a player's performance. However, the experts do not unravel the criteria they use for their rating. We attempt to identify the most important attributes of player's performance which determine the expert ratings. In this way we find the latent knowledge which the experts use to assign ratings to players. We performed a series of classifications with three different pruning strategies and an array of Machine Learning algorithms. The best results for predicting ratings using performance metrics had mean absolute error of 0.17. We obtained a list of most important performance metrics for each of the playing positions which approximates the attributes considered by the experts for assigning ratings. Then we find the most influential performance metrics of the players for determining the match outcome and we examine the extent to which the outcome is characterized by the performance attributes of the players. We found 34 performance attributes",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
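Since the model was trained to match questions to document chunks, a more typical usage pattern is to embed a small corpus once and rank it against a query. A minimal sketch, reusing the repository id above and a toy corpus adapted from the examples in this card:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("Bo8dady/finetuned4-College-embeddings")

# Toy corpus for illustration; in practice these would be your own document chunks.
corpus = [
    "The duration of study at the Faculty of Computers and Information to obtain a bachelor's degree is not less than 4 years.",
    "The final exam for Distributed Computing course, offered by the computer science department, from 2018, is available at the following link.",
    "HUM113: Report Writing (2 Credit Hours) is part of the third semester course list.",
]
query = "How many years does the bachelor's degree take?"

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank the chunks by cosine similarity to the query and print the best matches
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=3)[0]
for hit in hits:
    print(f"{hit['score']:.3f}  {corpus[hit['corpus_id']]}")
```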
824
+ <!--
825
+ ### Direct Usage (Transformers)
826
+
827
+ <details><summary>Click to see the direct usage in Transformers</summary>
828
+
829
+ </details>
830
+ -->
831
+
832
+ <!--
833
+ ### Downstream Usage (Sentence Transformers)
834
+
835
+ You can finetune this model on your own dataset.
836
+
837
+ <details><summary>Click to expand</summary>
838
+
839
+ </details>
840
+ -->
841
+
842
+ <!--
843
+ ### Out-of-Scope Use
844
+
845
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
846
+ -->
847
+
848
## Evaluation

### Metrics

#### Information Retrieval

* Datasets: `ai-college-validation` and `ai-college_modefied-validation` (each evaluated several times during training; the table below reports the latest results for each dataset)
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | ai-college-validation | ai-college_modefied-validation |
|:--------------------|:----------------------|:-------------------------------|
| cosine_accuracy@1   | 0.1884                | 0.1068                         |
| cosine_accuracy@3   | 0.4233                | 0.3301                         |
| cosine_accuracy@5   | 0.583                 | 0.5582                         |
| cosine_accuracy@10  | 0.8837                | 0.8998                         |
| cosine_precision@1  | 0.1884                | 0.1068                         |
| cosine_precision@3  | 0.1411                | 0.11                           |
| cosine_precision@5  | 0.1166                | 0.1116                         |
| cosine_precision@10 | 0.0884                | 0.09                           |
| cosine_recall@1     | 0.1884                | 0.1068                         |
| cosine_recall@3     | 0.4233                | 0.3301                         |
| cosine_recall@5     | 0.583                 | 0.5582                         |
| cosine_recall@10    | 0.8837                | 0.8998                         |
| **cosine_ndcg@10**  | **0.4864**            | **0.4394**                     |
| cosine_mrr@10       | 0.3658                | 0.2995                         |
| cosine_map@100      | 0.373                 | 0.306                          |
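The scores above were produced with `InformationRetrievalEvaluator`. The validation split itself is not published with this card, so the sketch below uses placeholder dictionaries purely to show the expected input shapes (query id → text, chunk id → text, query id → set of relevant chunk ids):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("Bo8dady/finetuned4-College-embeddings")

# Placeholder data; replace with the real question/chunk validation split.
queries = {"q1": "Is there a maximum duration of study specified in the text?"}
corpus = {
    "c1": "The duration of study to obtain a bachelor's degree is not less than 4 years.",
    "c2": "The final exam for Distributed Computing course from 2018 is available online.",
}
relevant_docs = {"q1": {"c1"}}

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="ai-college-validation",
)
results = evaluator(model)
print(results)  # dict of accuracy@k, precision@k, recall@k, ndcg@10, mrr@10, map@100
```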
875
+ #### Information Retrieval
876
+
877
+ * Dataset: `ai-college-validation`
878
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
879
+
880
+ | Metric | Value |
881
+ |:--------------------|:-----------|
882
+ | cosine_accuracy@1 | 0.1884 |
883
+ | cosine_accuracy@3 | 0.4173 |
884
+ | cosine_accuracy@5 | 0.567 |
885
+ | cosine_accuracy@10 | 0.8456 |
886
+ | cosine_precision@1 | 0.1884 |
887
+ | cosine_precision@3 | 0.1391 |
888
+ | cosine_precision@5 | 0.1134 |
889
+ | cosine_precision@10 | 0.0846 |
890
+ | cosine_recall@1 | 0.1884 |
891
+ | cosine_recall@3 | 0.4173 |
892
+ | cosine_recall@5 | 0.567 |
893
+ | cosine_recall@10 | 0.8456 |
894
+ | **cosine_ndcg@10** | **0.4722** |
895
+ | cosine_mrr@10 | 0.3586 |
896
+ | cosine_map@100 | 0.3677 |
897
+
898
+ #### Information Retrieval
899
+
900
+ * Dataset: `ai-college-validation`
901
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
902
+
903
+ | Metric | Value |
904
+ |:--------------------|:-----------|
905
+ | cosine_accuracy@1 | 0.1103 |
906
+ | cosine_accuracy@3 | 0.3218 |
907
+ | cosine_accuracy@5 | 0.5452 |
908
+ | cosine_accuracy@10 | 0.8817 |
909
+ | cosine_precision@1 | 0.1103 |
910
+ | cosine_precision@3 | 0.1073 |
911
+ | cosine_precision@5 | 0.109 |
912
+ | cosine_precision@10 | 0.0882 |
913
+ | cosine_recall@1 | 0.1103 |
914
+ | cosine_recall@3 | 0.3218 |
915
+ | cosine_recall@5 | 0.5452 |
916
+ | cosine_recall@10 | 0.8817 |
917
+ | **cosine_ndcg@10** | **0.4323** |
918
+ | cosine_mrr@10 | 0.2959 |
919
+ | cosine_map@100 | 0.3031 |
920
+
921
+ #### Information Retrieval
922
+
923
+ * Dataset: `ai-college-validation`
924
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
925
+
926
+ | Metric | Value |
927
+ |:--------------------|:-----------|
928
+ | cosine_accuracy@1 | 0.1858 |
929
+ | cosine_accuracy@3 | 0.4206 |
930
+ | cosine_accuracy@5 | 0.57 |
931
+ | cosine_accuracy@10 | 0.858 |
932
+ | cosine_precision@1 | 0.1858 |
933
+ | cosine_precision@3 | 0.1402 |
934
+ | cosine_precision@5 | 0.114 |
935
+ | cosine_precision@10 | 0.0858 |
936
+ | cosine_recall@1 | 0.1858 |
937
+ | cosine_recall@3 | 0.4206 |
938
+ | cosine_recall@5 | 0.57 |
939
+ | cosine_recall@10 | 0.858 |
940
+ | **cosine_ndcg@10** | **0.4749** |
941
+ | cosine_mrr@10 | 0.3584 |
942
+ | cosine_map@100 | 0.367 |
943
+
944
+ <!--
945
+ ## Bias, Risks and Limitations
946
+
947
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
948
+ -->
949
+
950
+ <!--
951
+ ### Recommendations
952
+
953
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
954
+ -->
955
+
956
+ ## Training Details
957
+
958
+ ### Training Dataset
959
+
960
+ #### Unnamed Dataset
961
+
962
+ * Size: 4,030 training samples
963
+ * Columns: <code>Question</code> and <code>chunk</code>
964
+ * Approximate statistics based on the first 1000 samples:
965
+ | | Question | chunk |
966
+ |:--------|:----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
967
+ | type | string | string |
968
+ | details | <ul><li>min: 8 tokens</li><li>mean: 15.99 tokens</li><li>max: 31 tokens</li></ul> | <ul><li>min: 21 tokens</li><li>mean: 133.41 tokens</li><li>max: 512 tokens</li></ul> |
969
+ * Samples:
970
+ | Question | chunk |
971
+ |:------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
972
+ | <code>Could you share the link to the 2018 Distributed Computing final exam?</code> | <code>The final exam for Distributed Computing course, offered by the computer science department, from 2018, is available at the following link: [https://drive.google.com/file/d/1YSzMeYStlFEztP0TloIcBqnfPr60o4ez/view?usp=sharing</code> |
973
+ | <code>What databases exist for footstep recognition research?</code> | <code>**Abstract**<br><br>**Documentation Overview**<br>This documentation reports an experimental analysis of footsteps as a biometric. The focus here is on information extracted from the time domain of signals collected from an array of piezoelectric sensors.<br><br>**Database Information**<br>Results are related to the largest footstep database collected to date, with almost 20,000 valid footstep signals and more than 120 persons, which is well beyond previous related databases.<br><br>**Feature Extraction**<br>Three feature approaches have been extracted, the popular ground reaction force (GRF), the spatial average and the upper and lower contours of the pressure signals.<br><br>**Experimental Results**<br>Experimental work is based on a verification mode with a holistic approach based on PCA and SVM, achieving results in the range of 5 to 15% equal error rate(EER) depending on the experimental conditions of quantity of data used in the reference models.</code> |
974
+ | <code>Is there a maximum duration of study specified in the text?</code> | <code>Topic: Duration of Study<br>Summary: A bachelor's degree at the Faculty of Computers and Information requires at least four years of study, contingent on fulfilling degree requirements.<br>Chunk: "Duration of study<br>• The duration of study at the Faculty of Computers and Information to obtain a bachelor's degree is not less than 4 years, provided that the requirements for obtaining the scientific degree are completed."</code> |
975
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
976
+ ```json
977
+ {
978
+ "scale": 20.0,
979
+ "similarity_fct": "cos_sim"
980
+ }
981
+ ```
982
+
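For intuition about the loss: `MultipleNegativesRankingLoss` treats every other chunk in a batch as a negative for a given question. Cosine similarities between each question and all chunks are scaled (here by 20.0) and passed to a cross-entropy loss whose target is the question's own chunk. A self-contained toy illustration of that computation (random vectors stand in for model embeddings):

```python
import torch
import torch.nn.functional as F

# Toy batch of 4 (question, chunk) embedding pairs; real embeddings come from the model.
questions = F.normalize(torch.randn(4, 768), dim=1)
chunks = F.normalize(torch.randn(4, 768), dim=1)

scores = 20.0 * questions @ chunks.T    # scale=20.0 applied to cosine similarities
labels = torch.arange(4)                # the i-th question should rank its own chunk first
loss = F.cross_entropy(scores, labels)  # other chunks in the batch act as negatives
print(loss.item())
```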
983
+ ### Evaluation Dataset
984
+
985
+ #### Unnamed Dataset
986
+
987
+ * Size: 575 evaluation samples
988
+ * Columns: <code>Question</code> and <code>chunk</code>
989
+ * Approximate statistics based on the first 575 samples:
990
+ | | Question | chunk |
991
+ |:--------|:----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
992
+ | type | string | string |
993
+ | details | <ul><li>min: 9 tokens</li><li>mean: 15.97 tokens</li><li>max: 29 tokens</li></ul> | <ul><li>min: 21 tokens</li><li>mean: 134.83 tokens</li><li>max: 484 tokens</li></ul> |
994
+ * Samples:
995
+ | Question | chunk |
996
+ |:---------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
997
+ | <code>Are there projects that use machine learning for automatic brain tumor identification?</code> | <code># **Abstract**<br><br>## **Brain and Tumor Description**<br>A human brain is center of the nervous system; it is a collection of white mass of cells. A tumor of brain is collection of uncontrolled increasing of these cells abnormally found in different part of the brain namely Glial cells, neurons, lymphatic tissues, blood vessels, pituitary glands and other part of brain which lead to the cancer.<br><br>## **Detection and Identification**<br>Manually it is not so easily possible to detect and identify the tumor. Programming division method by MRI is way to detect and identify the tumor. In order to give precise output a strong segmentation method is needed. Brain tumor identification is really challenging task in early stages of life. But now it became advanced with various machine learning and deep learning algorithms. Now a day's issue of brain tumor automatic identification is of great interest. In Order to detect the brain tumor of a patient we consider the data of patients like MRI images of a pat...</code> |
998
+ | <code>Are there studies that propose solutions to the challenges of plant pest detection using deep learning?</code> | <code>**Abstract**<br><br>**Introduction**<br>Identification of the plant diseases is the key to preventing the losses in the yield and quantity of the agricultural product. Disease diagnosis based on the detection of early symptoms is a usual threshold taken into account for integrated pest management strategies. through deep learning methodologies, plant diseases can be detected and diagnosed.<br><br>**Study Discussion**<br>On this basis, this study discusses possible challenges in practical applications of plant diseases and pests detection based on deep learning. In addition, possible solutions and research ideas are proposed for the challenges, and several suggestions are given. Finally, this study gives the analysis and prospect of the future trend of plant diseases and pests detection based on deep learning.<br><br>5 | Page</code> |
999
+ | <code>Is there a link available for the 2025 Calc 1 course exam?</code> | <code>The final exam for the calculus1 course, offered by the general department, from 2025, is available at the following link: [https://drive.google.com/file/d/1g8iiGUo4HCUzNNWBJJrW1QZAsz-RYehw/view?usp=sharing].</code> |
1000
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
1001
+ ```json
1002
+ {
1003
+ "scale": 20.0,
1004
+ "similarity_fct": "cos_sim"
1005
+ }
1006
+ ```
1007
+
1008
+ ### Training Hyperparameters
1009
+ #### Non-Default Hyperparameters
1010
+
1011
+ - `eval_strategy`: steps
1012
+ - `per_device_train_batch_size`: 16
1013
+ - `per_device_eval_batch_size`: 16
1014
+ - `learning_rate`: 1e-06
1015
+ - `num_train_epochs`: 15
1016
+ - `warmup_ratio`: 0.2
1017
+ - `batch_sampler`: no_duplicates
1018
+
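Assuming the standard Sentence Transformers v3 training loop, the non-default values above translate roughly into the setup sketched below. This is an editorial reconstruction, not the exact training script; the one-row datasets are placeholders for the question/chunk splits described above.

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("sentence-transformers/all-distilroberta-v1")
loss = MultipleNegativesRankingLoss(model)

# Placeholder datasets with the "Question" / "chunk" columns used for training.
train_dataset = Dataset.from_dict({
    "Question": ["Is there a maximum duration of study specified in the text?"],
    "chunk": ["The duration of study to obtain a bachelor's degree is not less than 4 years."],
})
eval_dataset = train_dataset

args = SentenceTransformerTrainingArguments(
    output_dir="finetuned4-College-embeddings",
    eval_strategy="steps",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=1e-6,
    num_train_epochs=15,
    warmup_ratio=0.2,
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # avoid duplicate texts within a batch
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()
```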
1019
+ #### All Hyperparameters
1020
+ <details><summary>Click to expand</summary>
1021
+
1022
+ - `overwrite_output_dir`: False
1023
+ - `do_predict`: False
1024
+ - `eval_strategy`: steps
1025
+ - `prediction_loss_only`: True
1026
+ - `per_device_train_batch_size`: 16
1027
+ - `per_device_eval_batch_size`: 16
1028
+ - `per_gpu_train_batch_size`: None
1029
+ - `per_gpu_eval_batch_size`: None
1030
+ - `gradient_accumulation_steps`: 1
1031
+ - `eval_accumulation_steps`: None
1032
+ - `torch_empty_cache_steps`: None
1033
+ - `learning_rate`: 1e-06
1034
+ - `weight_decay`: 0.0
1035
+ - `adam_beta1`: 0.9
1036
+ - `adam_beta2`: 0.999
1037
+ - `adam_epsilon`: 1e-08
1038
+ - `max_grad_norm`: 1.0
1039
+ - `num_train_epochs`: 15
1040
+ - `max_steps`: -1
1041
+ - `lr_scheduler_type`: linear
1042
+ - `lr_scheduler_kwargs`: {}
1043
+ - `warmup_ratio`: 0.2
1044
+ - `warmup_steps`: 0
1045
+ - `log_level`: passive
1046
+ - `log_level_replica`: warning
1047
+ - `log_on_each_node`: True
1048
+ - `logging_nan_inf_filter`: True
1049
+ - `save_safetensors`: True
1050
+ - `save_on_each_node`: False
1051
+ - `save_only_model`: False
1052
+ - `restore_callback_states_from_checkpoint`: False
1053
+ - `no_cuda`: False
1054
+ - `use_cpu`: False
1055
+ - `use_mps_device`: False
1056
+ - `seed`: 42
1057
+ - `data_seed`: None
1058
+ - `jit_mode_eval`: False
1059
+ - `use_ipex`: False
1060
+ - `bf16`: False
1061
+ - `fp16`: False
1062
+ - `fp16_opt_level`: O1
1063
+ - `half_precision_backend`: auto
1064
+ - `bf16_full_eval`: False
1065
+ - `fp16_full_eval`: False
1066
+ - `tf32`: None
1067
+ - `local_rank`: 0
1068
+ - `ddp_backend`: None
1069
+ - `tpu_num_cores`: None
1070
+ - `tpu_metrics_debug`: False
1071
+ - `debug`: []
1072
+ - `dataloader_drop_last`: False
1073
+ - `dataloader_num_workers`: 0
1074
+ - `dataloader_prefetch_factor`: None
1075
+ - `past_index`: -1
1076
+ - `disable_tqdm`: False
1077
+ - `remove_unused_columns`: True
1078
+ - `label_names`: None
1079
+ - `load_best_model_at_end`: False
1080
+ - `ignore_data_skip`: False
1081
+ - `fsdp`: []
1082
+ - `fsdp_min_num_params`: 0
1083
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
1084
+ - `tp_size`: 0
1085
+ - `fsdp_transformer_layer_cls_to_wrap`: None
1086
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
1087
+ - `deepspeed`: None
1088
+ - `label_smoothing_factor`: 0.0
1089
+ - `optim`: adamw_torch
1090
+ - `optim_args`: None
1091
+ - `adafactor`: False
1092
+ - `group_by_length`: False
1093
+ - `length_column_name`: length
1094
+ - `ddp_find_unused_parameters`: None
1095
+ - `ddp_bucket_cap_mb`: None
1096
+ - `ddp_broadcast_buffers`: False
1097
+ - `dataloader_pin_memory`: True
1098
+ - `dataloader_persistent_workers`: False
1099
+ - `skip_memory_metrics`: True
1100
+ - `use_legacy_prediction_loop`: False
1101
+ - `push_to_hub`: False
1102
+ - `resume_from_checkpoint`: None
1103
+ - `hub_model_id`: None
1104
+ - `hub_strategy`: every_save
1105
+ - `hub_private_repo`: None
1106
+ - `hub_always_push`: False
1107
+ - `gradient_checkpointing`: False
1108
+ - `gradient_checkpointing_kwargs`: None
1109
+ - `include_inputs_for_metrics`: False
1110
+ - `include_for_metrics`: []
1111
+ - `eval_do_concat_batches`: True
1112
+ - `fp16_backend`: auto
1113
+ - `push_to_hub_model_id`: None
1114
+ - `push_to_hub_organization`: None
1115
+ - `mp_parameters`:
1116
+ - `auto_find_batch_size`: False
1117
+ - `full_determinism`: False
1118
+ - `torchdynamo`: None
1119
+ - `ray_scope`: last
1120
+ - `ddp_timeout`: 1800
1121
+ - `torch_compile`: False
1122
+ - `torch_compile_backend`: None
1123
+ - `torch_compile_mode`: None
1124
+ - `include_tokens_per_second`: False
1125
+ - `include_num_input_tokens_seen`: False
1126
+ - `neftune_noise_alpha`: None
1127
+ - `optim_target_modules`: None
1128
+ - `batch_eval_metrics`: False
1129
+ - `eval_on_start`: False
1130
+ - `use_liger_kernel`: False
1131
+ - `eval_use_gather_object`: False
1132
+ - `average_tokens_across_devices`: False
1133
+ - `prompts`: None
1134
+ - `batch_sampler`: no_duplicates
1135
+ - `multi_dataset_batch_sampler`: proportional
1136
+
1137
+ </details>
1138
+
1139
+ ### Training Logs
+ <details><summary>Click to expand</summary>
+
+ | Epoch | Step | Training Loss | Validation Loss | ai-college-validation_cosine_ndcg@10 | ai-college_modefied-validation_cosine_ndcg@10 |
+ |:-------:|:----:|:-------------:|:---------------:|:------------------------------------:|:---------------------------------------------:|
+ | -1 | -1 | - | - | 0.4208 | - |
+ | 0.3968 | 100 | 0.1371 | 0.0785 | 0.4483 | - |
+ | 0.7937 | 200 | 0.0575 | 0.0357 | 0.4600 | - |
+ | 1.1905 | 300 | 0.0346 | 0.0286 | 0.4640 | - |
+ | 1.5873 | 400 | 0.0313 | 0.0264 | 0.4698 | - |
+ | 1.9841 | 500 | 0.0189 | 0.0256 | 0.4716 | - |
+ | 2.3810 | 600 | 0.021 | 0.0249 | 0.4703 | - |
+ | 2.7778 | 700 | 0.0264 | 0.0247 | 0.4726 | - |
+ | -1 | -1 | - | - | 0.4252 | - |
+ | 0.3968 | 100 | 0.0132 | 0.0238 | 0.4277 | - |
+ | 0.7937 | 200 | 0.0192 | 0.0221 | 0.4295 | - |
+ | 1.1905 | 300 | 0.0169 | 0.0214 | 0.4316 | - |
+ | 1.5873 | 400 | 0.02 | 0.0200 | 0.4315 | - |
+ | 1.9841 | 500 | 0.0124 | 0.0201 | 0.4315 | - |
+ | 2.3810 | 600 | 0.0152 | 0.0195 | 0.4311 | - |
+ | 2.7778 | 700 | 0.0189 | 0.0193 | 0.4309 | - |
+ | 3.1746 | 800 | 0.0222 | 0.0182 | 0.4307 | - |
+ | 3.5714 | 900 | 0.0158 | 0.0182 | 0.4312 | - |
+ | 3.9683 | 1000 | 0.0144 | 0.0181 | 0.4311 | - |
+ | 4.3651 | 1100 | 0.0237 | 0.0176 | 0.4315 | - |
+ | 4.7619 | 1200 | 0.0132 | 0.0178 | 0.4323 | - |
+ | -1 | -1 | - | - | 0.4749 | 0.4326 |
+ | 0.3968 | 100 | 0.0077 | 0.0175 | - | 0.4322 |
+ | 0.7937 | 200 | 0.0116 | 0.0171 | - | 0.4320 |
+ | 1.1905 | 300 | 0.0093 | 0.0169 | - | 0.4339 |
+ | 1.5873 | 400 | 0.0125 | 0.0160 | - | 0.4340 |
+ | 1.9841 | 500 | 0.0083 | 0.0161 | - | 0.4340 |
+ | 2.3810 | 600 | 0.0105 | 0.0156 | - | 0.4350 |
+ | 2.7778 | 700 | 0.0132 | 0.0155 | - | 0.4357 |
+ | 3.1746 | 800 | 0.0161 | 0.0145 | - | 0.4362 |
+ | 3.5714 | 900 | 0.0113 | 0.0144 | - | 0.4372 |
+ | 3.9683 | 1000 | 0.0112 | 0.0140 | - | 0.4368 |
+ | 4.3651 | 1100 | 0.0185 | 0.0136 | - | 0.4366 |
+ | 4.7619 | 1200 | 0.0101 | 0.0139 | - | 0.4367 |
+ | 5.1587 | 1300 | 0.0118 | 0.0138 | - | 0.4366 |
+ | 5.5556 | 1400 | 0.0145 | 0.0139 | - | 0.4366 |
+ | 5.9524 | 1500 | 0.0104 | 0.0139 | - | 0.4376 |
+ | 6.3492 | 1600 | 0.0105 | 0.0137 | - | 0.4380 |
+ | 6.7460 | 1700 | 0.0125 | 0.0137 | - | 0.4380 |
+ | -1 | -1 | - | - | 0.4781 | 0.4375 |
+ | 0.3968 | 100 | 0.0054 | 0.0135 | - | 0.4380 |
+ | 0.7937 | 200 | 0.0078 | 0.0133 | - | 0.4374 |
+ | 1.1905 | 300 | 0.0053 | 0.0132 | - | 0.4381 |
+ | 1.5873 | 400 | 0.0077 | 0.0127 | - | 0.4387 |
+ | 1.9841 | 500 | 0.0054 | 0.0129 | - | 0.4374 |
+ | 2.3810 | 600 | 0.0067 | 0.0122 | - | 0.4392 |
+ | 2.7778 | 700 | 0.0094 | 0.0120 | - | 0.4387 |
+ | 3.1746 | 800 | 0.0111 | 0.0116 | - | 0.4360 |
+ | 3.5714 | 900 | 0.0079 | 0.0113 | - | 0.4368 |
+ | 3.9683 | 1000 | 0.0081 | 0.0111 | - | 0.4369 |
+ | 4.3651 | 1100 | 0.0134 | 0.0109 | - | 0.4375 |
+ | 4.7619 | 1200 | 0.0072 | 0.0110 | - | 0.4371 |
+ | 5.1587 | 1300 | 0.0091 | 0.0109 | - | 0.4378 |
+ | 5.5556 | 1400 | 0.0121 | 0.0111 | - | 0.4379 |
+ | 5.9524 | 1500 | 0.0081 | 0.0111 | - | 0.4376 |
+ | 6.3492 | 1600 | 0.008 | 0.0110 | - | 0.4376 |
+ | 6.7460 | 1700 | 0.0103 | 0.0109 | - | 0.4389 |
+ | 7.1429 | 1800 | 0.013 | 0.0108 | - | 0.4397 |
+ | 7.5397 | 1900 | 0.0134 | 0.0109 | - | 0.4385 |
+ | 7.9365 | 2000 | 0.0133 | 0.0108 | - | 0.4398 |
+ | 8.3333 | 2100 | 0.0109 | 0.0109 | - | 0.4389 |
+ | 8.7302 | 2200 | 0.0109 | 0.0107 | - | 0.4386 |
+ | 9.1270 | 2300 | 0.0077 | 0.0104 | - | 0.4395 |
+ | 9.5238 | 2400 | 0.0107 | 0.0104 | - | 0.4387 |
+ | 9.9206 | 2500 | 0.0117 | 0.0104 | - | 0.4393 |
+ | -1 | -1 | - | - | 0.4778 | 0.4392 |
+ | 0.3968 | 100 | 0.004 | 0.0104 | 0.4787 | - |
+ | 0.7937 | 200 | 0.0055 | 0.0102 | 0.4785 | - |
+ | 1.1905 | 300 | 0.0035 | 0.0102 | 0.4782 | - |
+ | 1.5873 | 400 | 0.0055 | 0.0100 | 0.4771 | - |
+ | 1.9841 | 500 | 0.0038 | 0.0101 | 0.4770 | - |
+ | 2.3810 | 600 | 0.004 | 0.0097 | 0.4772 | - |
+ | 2.7778 | 700 | 0.0066 | 0.0096 | 0.4797 | - |
+ | 3.1746 | 800 | 0.0073 | 0.0097 | 0.4813 | - |
+ | 3.5714 | 900 | 0.0055 | 0.0092 | 0.4812 | - |
+ | 3.9683 | 1000 | 0.0048 | 0.0095 | 0.4816 | - |
+ | 4.3651 | 1100 | 0.0085 | 0.0093 | 0.4819 | - |
+ | 4.7619 | 1200 | 0.0047 | 0.0091 | 0.4800 | - |
+ | 5.1587 | 1300 | 0.0062 | 0.0091 | 0.4806 | - |
+ | 5.5556 | 1400 | 0.0088 | 0.0091 | 0.4807 | - |
+ | 5.9524 | 1500 | 0.0059 | 0.0091 | 0.4816 | - |
+ | 6.3492 | 1600 | 0.0053 | 0.0092 | 0.4804 | - |
+ | 6.7460 | 1700 | 0.0075 | 0.0092 | 0.4798 | - |
+ | 7.1429 | 1800 | 0.0102 | 0.0090 | 0.4800 | - |
+ | 7.5397 | 1900 | 0.0104 | 0.0090 | 0.4834 | - |
+ | 7.9365 | 2000 | 0.0107 | 0.0088 | 0.4827 | - |
+ | 8.3333 | 2100 | 0.0092 | 0.0088 | 0.4848 | - |
+ | 8.7302 | 2200 | 0.0096 | 0.0086 | 0.4843 | - |
+ | 9.1270 | 2300 | 0.0058 | 0.0084 | 0.4823 | - |
+ | 9.5238 | 2400 | 0.0091 | 0.0084 | 0.4849 | - |
+ | 9.9206 | 2500 | 0.0108 | 0.0083 | 0.4833 | - |
+ | 10.3175 | 2600 | 0.0068 | 0.0083 | 0.4836 | - |
+ | 10.7143 | 2700 | 0.0072 | 0.0083 | 0.4846 | - |
+ | 11.1111 | 2800 | 0.0048 | 0.0082 | 0.4838 | - |
+ | 11.5079 | 2900 | 0.0102 | 0.0082 | 0.4849 | - |
+ | 11.9048 | 3000 | 0.0078 | 0.0082 | 0.4851 | - |
+ | 12.3016 | 3100 | 0.0074 | 0.0082 | 0.4844 | - |
+ | 12.6984 | 3200 | 0.0077 | 0.0081 | 0.4853 | - |
+ | 13.0952 | 3300 | 0.0099 | 0.0081 | 0.4844 | - |
+ | 13.4921 | 3400 | 0.0074 | 0.0081 | 0.4856 | - |
+ | 13.8889 | 3500 | 0.0074 | 0.0081 | 0.4870 | - |
+ | 14.2857 | 3600 | 0.0109 | 0.0081 | 0.4866 | - |
+ | 14.6825 | 3700 | 0.0055 | 0.0081 | 0.4866 | - |
+ | -1 | -1 | - | - | 0.4864 | 0.4394 |
+
+ </details>
+
+ ### Framework Versions
+ - Python: 3.11.11
+ - Sentence Transformers: 3.4.1
+ - Transformers: 4.51.1
+ - PyTorch: 2.5.1+cu124
+ - Accelerate: 1.3.0
+ - Datasets: 3.5.0
+ - Tokenizers: 0.21.0
+
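A small sketch (assumed distribution names) for checking that a local environment roughly matches the versions listed above before loading the checkpoint:

```python
# Minimal sketch: compare locally installed versions against the ones this card lists.
from importlib.metadata import version

expected = {
    "sentence-transformers": "3.4.1",
    "transformers": "4.51.1",
    "torch": "2.5.1",
    "accelerate": "1.3.0",
    "datasets": "3.5.0",
    "tokenizers": "0.21.0",
}
for package, trained_with in expected.items():
    print(f"{package}: installed {version(package)}, card reports {trained_with}")
```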
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+ author = "Reimers, Nils and Gurevych, Iryna",
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+ month = "11",
+ year = "2019",
+ publisher = "Association for Computational Linguistics",
+ url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+ year={2017},
+ eprint={1705.00652},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,27 @@
+ {
+ "architectures": [
+ "RobertaModel"
+ ],
+ "attention_probs_dropout_prob": 0.1,
+ "bos_token_id": 0,
+ "classifier_dropout": null,
+ "eos_token_id": 2,
+ "gradient_checkpointing": false,
+ "hidden_act": "gelu",
+ "hidden_dropout_prob": 0.1,
+ "hidden_size": 768,
+ "initializer_range": 0.02,
+ "intermediate_size": 3072,
+ "layer_norm_eps": 1e-05,
+ "max_position_embeddings": 514,
+ "model_type": "roberta",
+ "num_attention_heads": 12,
+ "num_hidden_layers": 6,
+ "pad_token_id": 1,
+ "position_embedding_type": "absolute",
+ "torch_dtype": "float32",
+ "transformers_version": "4.51.1",
+ "type_vocab_size": 1,
+ "use_cache": true,
+ "vocab_size": 50265
+ }
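This `config.json` describes the backbone: a 6-layer RoBERTa-architecture encoder with 768-dimensional hidden states. A short sketch (hypothetical local path) of inspecting it:

```python
# Minimal sketch (assumed local checkpoint directory): read the backbone config above.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("path/to/this-model")  # placeholder
print(config.model_type)               # "roberta"
print(config.num_hidden_layers)        # 6
print(config.hidden_size)              # 768
print(config.max_position_embeddings)  # 514
```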
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "__version__": {
+ "sentence_transformers": "3.4.1",
+ "transformers": "4.51.1",
+ "pytorch": "2.5.1+cu124"
+ },
+ "prompts": {},
+ "default_prompt_name": null,
+ "similarity_fn_name": "cosine"
+ }
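`similarity_fn_name: cosine` means `model.similarity` scores embedding pairs with cosine similarity, and no prompts are registered. A minimal sketch (placeholder path and example texts):

```python
# Minimal sketch (assumed model path): the "cosine" similarity function declared
# above is what model.similarity() applies to embedding pairs.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("path/to/this-model")  # placeholder
embeddings = model.encode(["first example sentence", "second example sentence"])
scores = model.similarity(embeddings, embeddings)  # 2x2 cosine-similarity matrix
print(scores)
```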
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ee486c13fcf3a15cff2bb4a2c600654a76e93b50a764a5dcc28f38a4468e42a3
+ size 328485128
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+ {
+ "idx": 0,
+ "name": "0",
+ "path": "",
+ "type": "sentence_transformers.models.Transformer"
+ },
+ {
+ "idx": 1,
+ "name": "1",
+ "path": "1_Pooling",
+ "type": "sentence_transformers.models.Pooling"
+ },
+ {
+ "idx": 2,
+ "name": "2",
+ "path": "2_Normalize",
+ "type": "sentence_transformers.models.Normalize"
+ }
+ ]
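`modules.json` declares the encoding pipeline: a Transformer encoder, then Pooling, then Normalize. Because the final module L2-normalizes the output, dot product and cosine similarity coincide. A minimal sketch (placeholder path) of inspecting that pipeline:

```python
# Minimal sketch (assumed local path): the loaded model mirrors the three modules
# listed above — Transformer -> Pooling -> Normalize — so embeddings are unit-length.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("path/to/this-model")  # placeholder
for name, module in model.named_children():
    print(name, type(module).__name__)  # 0 Transformer, 1 Pooling, 2 Normalize

embedding = model.encode("a quick check sentence")
print((embedding ** 2).sum())           # ~1.0 because of the Normalize module
```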
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+ "max_seq_length": 512,
+ "do_lower_case": false
+ }
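`max_seq_length: 512` is the per-input token budget; longer texts are truncated at encode time. A short sketch (placeholder path):

```python
# Minimal sketch (assumed path): inputs longer than max_seq_length are truncated.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("path/to/this-model")  # placeholder
print(model.max_seq_length)  # 512, matching sentence_bert_config.json

model.max_seq_length = 256   # can be lowered for faster encoding of short questions
```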
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+ "bos_token": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "cls_token": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "eos_token": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "mask_token": {
+ "content": "<mask>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "<pad>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "sep_token": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "unk_token": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,65 @@
+ {
+ "add_prefix_space": false,
+ "added_tokens_decoder": {
+ "0": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "1": {
+ "content": "<pad>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "2": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "3": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50264": {
+ "content": "<mask>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "bos_token": "<s>",
+ "clean_up_tokenization_spaces": false,
+ "cls_token": "<s>",
+ "eos_token": "</s>",
+ "errors": "replace",
+ "extra_special_tokens": {},
+ "mask_token": "<mask>",
+ "max_length": 128,
+ "model_max_length": 512,
+ "pad_to_multiple_of": null,
+ "pad_token": "<pad>",
+ "pad_token_type_id": 0,
+ "padding_side": "right",
+ "sep_token": "</s>",
+ "stride": 0,
+ "tokenizer_class": "RobertaTokenizer",
+ "trim_offsets": true,
+ "truncation_side": "right",
+ "truncation_strategy": "longest_first",
+ "unk_token": "<unk>"
+ }
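A short sketch (placeholder path) of loading the tokenizer configured above and checking its special tokens and length limit:

```python
# Minimal sketch (assumed local path): the RoBERTa byte-level BPE tokenizer configured above.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/this-model")  # placeholder
print(type(tokenizer).__name__)                    # RobertaTokenizerFast (or RobertaTokenizer)
print(tokenizer.model_max_length)                  # 512
print(tokenizer.mask_token, tokenizer.pad_token)   # <mask> <pad>
```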
vocab.json ADDED
The diff for this file is too large to render. See raw diff