SentenceTransformer based on BAAI/bge-small-en-v1.5

This is a sentence-transformers model finetuned from BAAI/bge-small-en-v1.5. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-small-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Model Size: 33.4M parameters (F32, safetensors)

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
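Because the pooling module uses the CLS token and the final Normalize module L2-normalizes the output, cosine similarity and dot product produce the same ranking on these embeddings. As a quick sanity check after loading, the key properties above can be read directly off the model (a minimal sketch; the repository ID is the one this card belongs to):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("magnifi/bge-small-en-v1-5-ft-test-run")

# These values should match the model details and architecture listed above.
print(model.max_seq_length)                      # 512
print(model.get_sentence_embedding_dimension())  # 384
print(model)                                     # Transformer -> Pooling (CLS) -> Normalize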

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("magnifi/bge-small-en-v1-5-ft-test-run")
# Run inference
sentences = [
    'Market news from [DATES]',
    '[{"get_news_articles(None,None,None,\'<DATES>\')": "news_data"}, {"get_attribute([\'SPY\'],[\'returns\'],\'<DATES>\')":"SPY_returns"},  {"get_attribute([\'DIA\'],[\'returns\'],\'<DATES>\')":"DIA_returns"}, {"get_attribute([\'QQQ\'],[\'returns\'],\'<DATES>\')":"QQQ_returns"}]',
    '[{"get_dividend_history([\'<TICKER>\'],None)": "<TICKER>_dividend_history"}]',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
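In practice the model is intended for retrieval-style lookups: embed a natural-language request, embed a corpus of candidate tool-call strings (the sentence_1 side of the training data), and take the nearest neighbour by cosine similarity. A minimal sketch using sentence_transformers.util.semantic_search, with the corpus and query strings taken from the examples in this card:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("magnifi/bge-small-en-v1-5-ft-test-run")

# Candidate tool-call plans (sentence_1 strings from the training samples below).
corpus = [
    '[{"get_portfolio([\'marketValue\'],True,None)": "portfolio"}, {"aggregate(\'portfolio\',\'ticker\',\'marketValue\',\'sum\',None)": "total_value"}]',
    '[{"get_dividend_history([\'<TICKER>\'],None)": "<TICKER>_dividend_history"}]',
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

# Embed a user query and retrieve the best-matching plan by cosine similarity.
query_embedding = model.encode("show my holding", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=1)[0]
print(corpus[hits[0]["corpus_id"]], hits[0]["score"])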

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.7277
cosine_accuracy@3 0.933
cosine_accuracy@5 0.9643
cosine_accuracy@10 0.9911
cosine_precision@1 0.7277
cosine_precision@3 0.311
cosine_precision@5 0.1929
cosine_precision@10 0.0991
cosine_recall@1 0.0202
cosine_recall@3 0.0259
cosine_recall@5 0.0268
cosine_recall@10 0.0275
cosine_ndcg@10 0.1915
cosine_mrr@10 0.8297
cosine_map@100 0.0231
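These metric names match the output of the Sentence Transformers InformationRetrievalEvaluator: accuracy@k is the fraction of queries with at least one relevant document in the top k, while recall@k divides by the total number of relevant documents per query, which is why recall@k stays low here even though accuracy@k is high (consistent with each query having many relevant corpus entries). A minimal sketch of how such an evaluation is set up; the query, corpus, and relevance entries below are hypothetical placeholders:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("magnifi/bge-small-en-v1-5-ft-test-run")

# Hypothetical evaluation data: query texts, corpus documents, and relevance judgments.
queries = {"q1": "Market news from last week"}
corpus = {"d1": '[{"get_news_articles(None,None,None,\'<DATES>\')": "news_data"}]'}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="ir-eval")
metrics = evaluator(model)
print(metrics)  # keys include cosine_accuracy@k, cosine_recall@k, cosine_ndcg@10, ...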

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,327 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (string): min 4 tokens, mean 13.03 tokens, max 35 tokens
    • sentence_1 (string): min 20 tokens, mean 81.5 tokens, max 279 tokens
  • Samples (sentence_0 → sentence_1):
    • show my holding → [{"get_portfolio(['marketValue'],True,None)": "portfolio"}, {"aggregate('portfolio','ticker','marketValue','sum',None)": "total_value"}]
    • what are my portfolios holdings → [{"get_portfolio(['marketValue'],True,None)": "portfolio"}, {"aggregate('portfolio','ticker','marketValue','sum',None)": "total_value"}]
    • Provide a summary of my investments → [{"get_portfolio(['marketValue'],True,None)": "portfolio"}, {"aggregate('portfolio','ticker','marketValue','sum',None)": "total_value"}]
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
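MultipleNegativesRankingLoss trains with in-batch negatives: each (sentence_0, sentence_1) pair is a positive, and the sentence_1 values of the other pairs in the same batch act as negatives. A minimal sketch of constructing this loss with the parameters above (shown on the base model; the actual fine-tuning pairs are those described in this section):

from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("BAAI/bge-small-en-v1.5")

# Cosine similarity scaled by 20.0, mirroring the loss parameters listed above.
loss = losses.MultipleNegativesRankingLoss(model=model, scale=20.0, similarity_fct=util.cos_sim)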
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • num_train_epochs: 6
  • multi_dataset_batch_sampler: round_robin
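Putting the dataset, loss, and non-default hyperparameters together, a fine-tuning run along these lines could look as follows. This is a hedged sketch, not the exact training script: the dataset contents and output directory are illustrative, and the eval dataset/evaluator used with eval_strategy="steps" is omitted.

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)

model = SentenceTransformer("BAAI/bge-small-en-v1.5")

# Illustrative two-column dataset in the sentence_0 / sentence_1 format described above.
train_dataset = Dataset.from_dict({
    "sentence_0": ["show my holding"],
    "sentence_1": ['[{"get_portfolio([\'marketValue\'],True,None)": "portfolio"}]'],
})

args = SentenceTransformerTrainingArguments(
    output_dir="bge-small-en-v1-5-ft",  # hypothetical output path
    num_train_epochs=6,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=10,
    multi_dataset_batch_sampler="round_robin",
    # eval_strategy="steps" was also set, together with an eval dataset or evaluator.
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=losses.MultipleNegativesRankingLoss(model),
)
trainer.train()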

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 6
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step Training Loss cosine_ndcg@10
0.0150 2 - 0.0835
0.0301 4 - 0.0837
0.0451 6 - 0.0846
0.0602 8 - 0.0864
0.0752 10 - 0.0886
0.0902 12 - 0.0907
0.1053 14 - 0.0937
0.1203 16 - 0.0976
0.1353 18 - 0.1018
0.1504 20 - 0.1068
0.1654 22 - 0.1113
0.1805 24 - 0.1176
0.1955 26 - 0.1208
0.2105 28 - 0.1231
0.2256 30 - 0.1256
0.2406 32 - 0.1281
0.2556 34 - 0.1302
0.2707 36 - 0.1320
0.2857 38 - 0.1335
0.3008 40 - 0.1342
0.3158 42 - 0.1363
0.3308 44 - 0.1380
0.3459 46 - 0.1393
0.3609 48 - 0.1413
0.3759 50 - 0.1424
0.3910 52 - 0.1434
0.4060 54 - 0.1452
0.4211 56 - 0.1455
0.4361 58 - 0.1467
0.4511 60 - 0.1480
0.4662 62 - 0.1493
0.4812 64 - 0.1504
0.4962 66 - 0.1512
0.5113 68 - 0.1531
0.5263 70 - 0.1538
0.5414 72 - 0.1549
0.5564 74 - 0.1557
0.5714 76 - 0.1570
0.5865 78 - 0.1578
0.6015 80 - 0.1586
0.6165 82 - 0.1589
0.6316 84 - 0.1596
0.6466 86 - 0.1597
0.6617 88 - 0.1607
0.6767 90 - 0.1612
0.6917 92 - 0.1626
0.7068 94 - 0.1632
0.7218 96 - 0.1631
0.7368 98 - 0.1634
0.7519 100 - 0.1639
0.7669 102 - 0.1638
0.7820 104 - 0.1645
0.7970 106 - 0.1648
0.8120 108 - 0.1646
0.8271 110 - 0.1651
0.8421 112 - 0.1652
0.8571 114 - 0.1656
0.8722 116 - 0.1660
0.8872 118 - 0.1670
0.9023 120 - 0.1674
0.9173 122 - 0.1684
0.9323 124 - 0.1682
0.9474 126 - 0.1687
0.9624 128 - 0.1691
0.9774 130 - 0.1689
0.9925 132 - 0.1693
1.0 133 - 0.1696
1.0075 134 - 0.1696
1.0226 136 - 0.1696
1.0376 138 - 0.1694
1.0526 140 - 0.1698
1.0677 142 - 0.1706
1.0827 144 - 0.1711
1.0977 146 - 0.1714
1.1128 148 - 0.1719
1.1278 150 - 0.1720
1.1429 152 - 0.1721
1.1579 154 - 0.1718
1.1729 156 - 0.1722
1.1880 158 - 0.1726
1.2030 160 - 0.1731
1.2180 162 - 0.1740
1.2331 164 - 0.1742
1.2481 166 - 0.1751
1.2632 168 - 0.1754
1.2782 170 - 0.1756
1.2932 172 - 0.1757
1.3083 174 - 0.1765
1.3233 176 - 0.1764
1.3383 178 - 0.1764
1.3534 180 - 0.1766
1.3684 182 - 0.1774
1.3835 184 - 0.1771
1.3985 186 - 0.1767
1.4135 188 - 0.1769
1.4286 190 - 0.1762
1.4436 192 - 0.1762
1.4586 194 - 0.1764
1.4737 196 - 0.1773
1.4887 198 - 0.1775
1.5038 200 - 0.1776
1.5188 202 - 0.1778
1.5338 204 - 0.1778
1.5489 206 - 0.1779
1.5639 208 - 0.1775
1.5789 210 - 0.1777
1.5940 212 - 0.1780
1.6090 214 - 0.1777
1.6241 216 - 0.1783
1.6391 218 - 0.1783
1.6541 220 - 0.1794
1.6692 222 - 0.1792
1.6842 224 - 0.1795
1.6992 226 - 0.1798
1.7143 228 - 0.1794
1.7293 230 - 0.1797
1.7444 232 - 0.1804
1.7594 234 - 0.1803
1.7744 236 - 0.1800
1.7895 238 - 0.1802
1.8045 240 - 0.1808
1.8195 242 - 0.1804
1.8346 244 - 0.1797
1.8496 246 - 0.1806
1.8647 248 - 0.1808
1.8797 250 - 0.1810
1.8947 252 - 0.1810
1.9098 254 - 0.1815
1.9248 256 - 0.1822
1.9398 258 - 0.1821
1.9549 260 - 0.1827
1.9699 262 - 0.1822
1.9850 264 - 0.1826
2.0 266 - 0.1829
2.0150 268 - 0.1826
2.0301 270 - 0.1824
2.0451 272 - 0.1829
2.0602 274 - 0.1832
2.0752 276 - 0.1830
2.0902 278 - 0.1836
2.1053 280 - 0.1841
2.1203 282 - 0.1844
2.1353 284 - 0.1843
2.1504 286 - 0.1842
2.1654 288 - 0.1829
2.1805 290 - 0.1827
2.1955 292 - 0.1825
2.2105 294 - 0.1820
2.2256 296 - 0.1821
2.2406 298 - 0.1822
2.2556 300 - 0.1822
2.2707 302 - 0.1820
2.2857 304 - 0.1823
2.3008 306 - 0.1817
2.3158 308 - 0.1827
2.3308 310 - 0.1831
2.3459 312 - 0.1826
2.3609 314 - 0.1833
2.3759 316 - 0.1834
2.3910 318 - 0.1835
2.4060 320 - 0.1840
2.4211 322 - 0.1849
2.4361 324 - 0.1850
2.4511 326 - 0.1850
2.4662 328 - 0.1847
2.4812 330 - 0.1850
2.4962 332 - 0.1854
2.5113 334 - 0.1855
2.5263 336 - 0.1855
2.5414 338 - 0.1857
2.5564 340 - 0.1856
2.5714 342 - 0.1858
2.5865 344 - 0.1859
2.6015 346 - 0.1858
2.6165 348 - 0.1857
2.6316 350 - 0.1858
2.6466 352 - 0.1862
2.6617 354 - 0.1862
2.6767 356 - 0.1866
2.6917 358 - 0.1865
2.7068 360 - 0.1864
2.7218 362 - 0.1863
2.7368 364 - 0.1869
2.7519 366 - 0.1865
2.7669 368 - 0.1866
2.7820 370 - 0.1866
2.7970 372 - 0.1870
2.8120 374 - 0.1870
2.8271 376 - 0.1869
2.8421 378 - 0.1870
2.8571 380 - 0.1871
2.8722 382 - 0.1875
2.8872 384 - 0.1877
2.9023 386 - 0.1882
2.9173 388 - 0.1884
2.9323 390 - 0.1882
2.9474 392 - 0.1882
2.9624 394 - 0.1887
2.9774 396 - 0.1889
2.9925 398 - 0.1888
3.0 399 - 0.1888
3.0075 400 - 0.1885
3.0226 402 - 0.1886
3.0376 404 - 0.1887
3.0526 406 - 0.1886
3.0677 408 - 0.1885
3.0827 410 - 0.1883
3.0977 412 - 0.1886
3.1128 414 - 0.1883
3.1278 416 - 0.1888
3.1429 418 - 0.1884
3.1579 420 - 0.1879
3.1729 422 - 0.1880
3.1880 424 - 0.1881
3.2030 426 - 0.1881
3.2180 428 - 0.1878
3.2331 430 - 0.1879
3.2481 432 - 0.1882
3.2632 434 - 0.1881
3.2782 436 - 0.1884
3.2932 438 - 0.1880
3.3083 440 - 0.1878
3.3233 442 - 0.1879
3.3383 444 - 0.1882
3.3534 446 - 0.1879
3.3684 448 - 0.1877
3.3835 450 - 0.1877
3.3985 452 - 0.1876
3.4135 454 - 0.1876
3.4286 456 - 0.1870
3.4436 458 - 0.1871
3.4586 460 - 0.1870
3.4737 462 - 0.1867
3.4887 464 - 0.1867
3.5038 466 - 0.1865
3.5188 468 - 0.1862
3.5338 470 - 0.1863
3.5489 472 - 0.1860
3.5639 474 - 0.1859
3.5789 476 - 0.1858
3.5940 478 - 0.1858
3.6090 480 - 0.1854
3.6241 482 - 0.1854
3.6391 484 - 0.1859
3.6541 486 - 0.1861
3.6692 488 - 0.1863
3.6842 490 - 0.1867
3.6992 492 - 0.1874
3.7143 494 - 0.1881
3.7293 496 - 0.1884
3.7444 498 - 0.1884
3.7594 500 0.3408 0.1884
3.7744 502 - 0.1886
3.7895 504 - 0.1889
3.8045 506 - 0.1885
3.8195 508 - 0.1886
3.8346 510 - 0.1886
3.8496 512 - 0.1885
3.8647 514 - 0.1883
3.8797 516 - 0.1886
3.8947 518 - 0.1884
3.9098 520 - 0.1883
3.9248 522 - 0.1887
3.9398 524 - 0.1887
3.9549 526 - 0.1890
3.9699 528 - 0.1891
3.9850 530 - 0.1892
4.0 532 - 0.1890
4.0150 534 - 0.1888
4.0301 536 - 0.1889
4.0451 538 - 0.1887
4.0602 540 - 0.1887
4.0752 542 - 0.1885
4.0902 544 - 0.1884
4.1053 546 - 0.1888
4.1203 548 - 0.1894
4.1353 550 - 0.1897
4.1504 552 - 0.1901
4.1654 554 - 0.1904
4.1805 556 - 0.1905
4.1955 558 - 0.1903
4.2105 560 - 0.1904
4.2256 562 - 0.1908
4.2406 564 - 0.1907
4.2556 566 - 0.1906
4.2707 568 - 0.1908
4.2857 570 - 0.1909
4.3008 572 - 0.1908
4.3158 574 - 0.1902
4.3308 576 - 0.1902
4.3459 578 - 0.1906
4.3609 580 - 0.1904
4.3759 582 - 0.1907
4.3910 584 - 0.1909
4.4060 586 - 0.1909
4.4211 588 - 0.1909
4.4361 590 - 0.1909
4.4511 592 - 0.1908
4.4662 594 - 0.1907
4.4812 596 - 0.1905
4.4962 598 - 0.1906
4.5113 600 - 0.1903
4.5263 602 - 0.1902
4.5414 604 - 0.1900
4.5564 606 - 0.1900
4.5714 608 - 0.1900
4.5865 610 - 0.1902
4.6015 612 - 0.1903
4.6165 614 - 0.1903
4.6316 616 - 0.1902
4.6466 618 - 0.1901
4.6617 620 - 0.1899
4.6767 622 - 0.1899
4.6917 624 - 0.1898
4.7068 626 - 0.1896
4.7218 628 - 0.1898
4.7368 630 - 0.1897
4.7519 632 - 0.1897
4.7669 634 - 0.1897
4.7820 636 - 0.1891
4.7970 638 - 0.1895
4.8120 640 - 0.1897
4.8271 642 - 0.1899
4.8421 644 - 0.1898
4.8571 646 - 0.1898
4.8722 648 - 0.1898
4.8872 650 - 0.1897
4.9023 652 - 0.1897
4.9173 654 - 0.1895
4.9323 656 - 0.1893
4.9474 658 - 0.1893
4.9624 660 - 0.1894
4.9774 662 - 0.1895
4.9925 664 - 0.1900
5.0 665 - 0.1900
5.0075 666 - 0.1900
5.0226 668 - 0.1901
5.0376 670 - 0.1902
5.0526 672 - 0.1901
5.0677 674 - 0.1901
5.0827 676 - 0.1903
5.0977 678 - 0.1904
5.1128 680 - 0.1903
5.1278 682 - 0.1905
5.1429 684 - 0.1905
5.1579 686 - 0.1906
5.1729 688 - 0.1906
5.1880 690 - 0.1908
5.2030 692 - 0.1908
5.2180 694 - 0.1909
5.2331 696 - 0.1911
5.2481 698 - 0.1911
5.2632 700 - 0.1911
5.2782 702 - 0.1913
5.2932 704 - 0.1910
5.3083 706 - 0.1912
5.3233 708 - 0.1911
5.3383 710 - 0.1910
5.3534 712 - 0.1912
5.3684 714 - 0.1912
5.3835 716 - 0.1910
5.3985 718 - 0.1909
5.4135 720 - 0.1910
5.4286 722 - 0.1910
5.4436 724 - 0.1910
5.4586 726 - 0.1912
5.4737 728 - 0.1912
5.4887 730 - 0.1914
5.5038 732 - 0.1914
5.5188 734 - 0.1914
5.5338 736 - 0.1912
5.5489 738 - 0.1912
5.5639 740 - 0.1914
5.5789 742 - 0.1914
5.5940 744 - 0.1915

Framework Versions

  • Python: 3.12.2
  • Sentence Transformers: 3.4.1
  • Transformers: 4.50.0
  • PyTorch: 2.6.0
  • Accelerate: 1.5.2
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1
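To reproduce this environment, the listed versions can be pinned explicitly (one possible install command; choose the PyTorch build appropriate for your hardware):

pip install sentence-transformers==3.4.1 transformers==4.50.0 torch==2.6.0 accelerate==1.5.2 datasets==3.6.0 tokenizers==0.21.1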

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}