SentenceTransformer based on BAAI/bge-small-en-v1.5

This is a sentence-transformers model finetuned from BAAI/bge-small-en-v1.5. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-small-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Model Size: 33.4M parameters (F32, safetensors)

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
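Because the pooling module uses the CLS token and the final Normalize module L2-normalizes the output, cosine similarity and dot product produce the same ranking on these embeddings. As a quick sanity check after loading, the key properties above can be read directly off the model (a minimal sketch; the repository ID is the one this card belongs to):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("magnifi/bge-small-en-v1-5-ft-test-run")

# These values should match the model details and architecture listed above.
print(model.max_seq_length)                      # 512
print(model.get_sentence_embedding_dimension())  # 384
print(model)                                     # Transformer -> Pooling (CLS) -> Normalize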

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("magnifi/bge-small-en-v1-5-ft-test-run")
# Run inference
sentences = [
    'Market news from [DATES]',
    '[{"get_news_articles(None,None,None,\'<DATES>\')": "news_data"}, {"get_attribute([\'SPY\'],[\'returns\'],\'<DATES>\')":"SPY_returns"},  {"get_attribute([\'DIA\'],[\'returns\'],\'<DATES>\')":"DIA_returns"}, {"get_attribute([\'QQQ\'],[\'returns\'],\'<DATES>\')":"QQQ_returns"}]',
    '[{"get_dividend_history([\'<TICKER>\'],None)": "<TICKER>_dividend_history"}]',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
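In practice the model is intended for retrieval-style lookups: embed a natural-language request, embed a corpus of candidate tool-call strings (the sentence_1 side of the training data), and take the nearest neighbour by cosine similarity. A minimal sketch using sentence_transformers.util.semantic_search, with the corpus and query strings taken from the examples in this card:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("magnifi/bge-small-en-v1-5-ft-test-run")

# Candidate tool-call plans (sentence_1 strings from the training samples below).
corpus = [
    '[{"get_portfolio([\'marketValue\'],True,None)": "portfolio"}, {"aggregate(\'portfolio\',\'ticker\',\'marketValue\',\'sum\',None)": "total_value"}]',
    '[{"get_dividend_history([\'<TICKER>\'],None)": "<TICKER>_dividend_history"}]',
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

# Embed a user query and retrieve the best-matching plan by cosine similarity.
query_embedding = model.encode("show my holding", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=1)[0]
print(corpus[hits[0]["corpus_id"]], hits[0]["score"])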

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.7277
cosine_accuracy@3 0.933
cosine_accuracy@5 0.9643
cosine_accuracy@10 0.9911
cosine_precision@1 0.7277
cosine_precision@3 0.311
cosine_precision@5 0.1929
cosine_precision@10 0.0991
cosine_recall@1 0.0202
cosine_recall@3 0.0259
cosine_recall@5 0.0268
cosine_recall@10 0.0275
cosine_ndcg@10 0.1915
cosine_mrr@10 0.8297
cosine_map@100 0.0231
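These metric names match the output of the Sentence Transformers InformationRetrievalEvaluator: accuracy@k is the fraction of queries with at least one relevant document in the top k, while recall@k divides by the total number of relevant documents per query, which is why recall@k stays low here even though accuracy@k is high (consistent with each query having many relevant corpus entries). A minimal sketch of how such an evaluation is set up; the query, corpus, and relevance entries below are hypothetical placeholders:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("magnifi/bge-small-en-v1-5-ft-test-run")

# Hypothetical evaluation data: query texts, corpus documents, and relevance judgments.
queries = {"q1": "Market news from last week"}
corpus = {"d1": '[{"get_news_articles(None,None,None,\'<DATES>\')": "news_data"}]'}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="ir-eval")
metrics = evaluator(model)
print(metrics)  # keys include cosine_accuracy@k, cosine_recall@k, cosine_ndcg@10, ...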

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,327 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (string): min 4 tokens, mean 13.03 tokens, max 35 tokens
    • sentence_1 (string): min 20 tokens, mean 81.5 tokens, max 279 tokens
  • Samples (sentence_0 → sentence_1):
    • show my holding → [{"get_portfolio(['marketValue'],True,None)": "portfolio"}, {"aggregate('portfolio','ticker','marketValue','sum',None)": "total_value"}]
    • what are my portfolios holdings → [{"get_portfolio(['marketValue'],True,None)": "portfolio"}, {"aggregate('portfolio','ticker','marketValue','sum',None)": "total_value"}]
    • Provide a summary of my investments → [{"get_portfolio(['marketValue'],True,None)": "portfolio"}, {"aggregate('portfolio','ticker','marketValue','sum',None)": "total_value"}]
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
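MultipleNegativesRankingLoss trains with in-batch negatives: each (sentence_0, sentence_1) pair is a positive, and the sentence_1 values of the other pairs in the same batch act as negatives. A minimal sketch of constructing this loss with the parameters above (shown on the base model; the actual fine-tuning pairs are those described in this section):

from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("BAAI/bge-small-en-v1.5")

# Cosine similarity scaled by 20.0, mirroring the loss parameters listed above.
loss = losses.MultipleNegativesRankingLoss(model=model, scale=20.0, similarity_fct=util.cos_sim)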
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • num_train_epochs: 6
  • multi_dataset_batch_sampler: round_robin
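Putting the dataset, loss, and non-default hyperparameters together, a fine-tuning run along these lines could look as follows. This is a hedged sketch, not the exact training script: the dataset contents and output directory are illustrative, and the eval dataset/evaluator used with eval_strategy="steps" is omitted.

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)

model = SentenceTransformer("BAAI/bge-small-en-v1.5")

# Illustrative two-column dataset in the sentence_0 / sentence_1 format described above.
train_dataset = Dataset.from_dict({
    "sentence_0": ["show my holding"],
    "sentence_1": ['[{"get_portfolio([\'marketValue\'],True,None)": "portfolio"}]'],
})

args = SentenceTransformerTrainingArguments(
    output_dir="bge-small-en-v1-5-ft",  # hypothetical output path
    num_train_epochs=6,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=10,
    multi_dataset_batch_sampler="round_robin",
    # eval_strategy="steps" was also set, together with an eval dataset or evaluator.
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=losses.MultipleNegativesRankingLoss(model),
)
trainer.train()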

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 6
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step Training Loss cosine_ndcg@10
0.0150 2 - 0.0835
0.0301 4 - 0.0837
0.0451 6 - 0.0846
0.0602 8 - 0.0864
0.0752 10 - 0.0886
0.0902 12 - 0.0907
0.1053 14 - 0.0937
0.1203 16 - 0.0976
0.1353 18 - 0.1018
0.1504 20 - 0.1068
0.1654 22 - 0.1113
0.1805 24 - 0.1176
0.1955 26 - 0.1208
0.2105 28 - 0.1231
0.2256 30 - 0.1256
0.2406 32 - 0.1281
0.2556 34 - 0.1302
0.2707 36 - 0.1320
0.2857 38 - 0.1335
0.3008 40 - 0.1342
0.3158 42 - 0.1363
0.3308 44 - 0.1380
0.3459 46 - 0.1393
0.3609 48 - 0.1413
0.3759 50 - 0.1424
0.3910 52 - 0.1434
0.4060 54 - 0.1452
0.4211 56 - 0.1455
0.4361 58 - 0.1467
0.4511 60 - 0.1480
0.4662 62 - 0.1493
0.4812 64 - 0.1504
0.4962 66 - 0.1512
0.5113 68 - 0.1531
0.5263 70 - 0.1538
0.5414 72 - 0.1549
0.5564 74 - 0.1557
0.5714 76 - 0.1570
0.5865 78 - 0.1578
0.6015 80 - 0.1586
0.6165 82 - 0.1589
0.6316 84 - 0.1596
0.6466 86 - 0.1597
0.6617 88 - 0.1607
0.6767 90 - 0.1612
0.6917 92 - 0.1626
0.7068 94 - 0.1632
0.7218 96 - 0.1631
0.7368 98 - 0.1634
0.7519 100 - 0.1639
0.7669 102 - 0.1638
0.7820 104 - 0.1645
0.7970 106 - 0.1648
0.8120 108 - 0.1646
0.8271 110 - 0.1651
0.8421 112 - 0.1652
0.8571 114 - 0.1656
0.8722 116 - 0.1660
0.8872 118 - 0.1670
0.9023 120 - 0.1674
0.9173 122 - 0.1684
0.9323 124 - 0.1682
0.9474 126 - 0.1687
0.9624 128 - 0.1691
0.9774 130 - 0.1689
0.9925 132 - 0.1693
1.0 133 - 0.1696
1.0075 134 - 0.1696
1.0226 136 - 0.1696
1.0376 138 - 0.1694
1.0526 140 - 0.1698
1.0677 142 - 0.1706
1.0827 144 - 0.1711
1.0977 146 - 0.1714
1.1128 148 - 0.1719
1.1278 150 - 0.1720
1.1429 152 - 0.1721
1.1579 154 - 0.1718
1.1729 156 - 0.1722
1.1880 158 - 0.1726
1.2030 160 - 0.1731
1.2180 162 - 0.1740
1.2331 164 - 0.1742
1.2481 166 - 0.1751
1.2632 168 - 0.1754
1.2782 170 - 0.1756
1.2932 172 - 0.1757
1.3083 174 - 0.1765
1.3233 176 - 0.1764
1.3383 178 - 0.1764
1.3534 180 - 0.1766
1.3684 182 - 0.1774
1.3835 184 - 0.1771
1.3985 186 - 0.1767
1.4135 188 - 0.1769
1.4286 190 - 0.1762
1.4436 192 - 0.1762
1.4586 194 - 0.1764
1.4737 196 - 0.1773
1.4887 198 - 0.1775
1.5038 200 - 0.1776
1.5188 202 - 0.1778
1.5338 204 - 0.1778
1.5489 206 - 0.1779
1.5639 208 - 0.1775
1.5789 210 - 0.1777
1.5940 212 - 0.1780
1.6090 214 - 0.1777
1.6241 216 - 0.1783
1.6391 218 - 0.1783
1.6541 220 - 0.1794
1.6692 222 - 0.1792
1.6842 224 - 0.1795
1.6992 226 - 0.1798
1.7143 228 - 0.1794
1.7293 230 - 0.1797
1.7444 232 - 0.1804
1.7594 234 - 0.1803
1.7744 236 - 0.1800
1.7895 238 - 0.1802
1.8045 240 - 0.1808
1.8195 242 - 0.1804
1.8346 244 - 0.1797
1.8496 246 - 0.1806
1.8647 248 - 0.1808
1.8797 250 - 0.1810
1.8947 252 - 0.1810
1.9098 254 - 0.1815
1.9248 256 - 0.1822
1.9398 258 - 0.1821
1.9549 260 - 0.1827
1.9699 262 - 0.1822
1.9850 264 - 0.1826
2.0 266 - 0.1829
2.0150 268 - 0.1826
2.0301 270 - 0.1824
2.0451 272 - 0.1829
2.0602 274 - 0.1832
2.0752 276 - 0.1830
2.0902 278 - 0.1836
2.1053 280 - 0.1841
2.1203 282 - 0.1844
2.1353 284 - 0.1843
2.1504 286 - 0.1842
2.1654 288 - 0.1829
2.1805 290 - 0.1827
2.1955 292 - 0.1825
2.2105 294 - 0.1820
2.2256 296 - 0.1821
2.2406 298 - 0.1822
2.2556 300 - 0.1822
2.2707 302 - 0.1820
2.2857 304 - 0.1823
2.3008 306 - 0.1817
2.3158 308 - 0.1827
2.3308 310 - 0.1831
2.3459 312 - 0.1826
2.3609 314 - 0.1833
2.3759 316 - 0.1834
2.3910 318 - 0.1835
2.4060 320 - 0.1840
2.4211 322 - 0.1849
2.4361 324 - 0.1850
2.4511 326 - 0.1850
2.4662 328 - 0.1847
2.4812 330 - 0.1850
2.4962 332 - 0.1854
2.5113 334 - 0.1855
2.5263 336 - 0.1855
2.5414 338 - 0.1857
2.5564 340 - 0.1856
2.5714 342 - 0.1858
2.5865 344 - 0.1859
2.6015 346 - 0.1858
2.6165 348 - 0.1857
2.6316 350 - 0.1858
2.6466 352 - 0.1862
2.6617 354 - 0.1862
2.6767 356 - 0.1866
2.6917 358 - 0.1865
2.7068 360 - 0.1864
2.7218 362 - 0.1863
2.7368 364 - 0.1869
2.7519 366 - 0.1865
2.7669 368 - 0.1866
2.7820 370 - 0.1866
2.7970 372 - 0.1870
2.8120 374 - 0.1870
2.8271 376 - 0.1869
2.8421 378 - 0.1870
2.8571 380 - 0.1871
2.8722 382 - 0.1875
2.8872 384 - 0.1877
2.9023 386 - 0.1882
2.9173 388 - 0.1884
2.9323 390 - 0.1882
2.9474 392 - 0.1882
2.9624 394 - 0.1887
2.9774 396 - 0.1889
2.9925 398 - 0.1888
3.0 399 - 0.1888
3.0075 400 - 0.1885
3.0226 402 - 0.1886
3.0376 404 - 0.1887
3.0526 406 - 0.1886
3.0677 408 - 0.1885
3.0827 410 - 0.1883
3.0977 412 - 0.1886
3.1128 414 - 0.1883
3.1278 416 - 0.1888
3.1429 418 - 0.1884
3.1579 420 - 0.1879
3.1729 422 - 0.1880
3.1880 424 - 0.1881
3.2030 426 - 0.1881
3.2180 428 - 0.1878
3.2331 430 - 0.1879
3.2481 432 - 0.1882
3.2632 434 - 0.1881
3.2782 436 - 0.1884
3.2932 438 - 0.1880
3.3083 440 - 0.1878
3.3233 442 - 0.1879
3.3383 444 - 0.1882
3.3534 446 - 0.1879
3.3684 448 - 0.1877
3.3835 450 - 0.1877
3.3985 452 - 0.1876
3.4135 454 - 0.1876
3.4286 456 - 0.1870
3.4436 458 - 0.1871
3.4586 460 - 0.1870
3.4737 462 - 0.1867
3.4887 464 - 0.1867
3.5038 466 - 0.1865
3.5188 468 - 0.1862
3.5338 470 - 0.1863
3.5489 472 - 0.1860
3.5639 474 - 0.1859
3.5789 476 - 0.1858
3.5940 478 - 0.1858
3.6090 480 - 0.1854
3.6241 482 - 0.1854
3.6391 484 - 0.1859
3.6541 486 - 0.1861
3.6692 488 - 0.1863
3.6842 490 - 0.1867
3.6992 492 - 0.1874
3.7143 494 - 0.1881
3.7293 496 - 0.1884
3.7444 498 - 0.1884
3.7594 500 0.3408 0.1884
3.7744 502 - 0.1886
3.7895 504 - 0.1889
3.8045 506 - 0.1885
3.8195 508 - 0.1886
3.8346 510 - 0.1886
3.8496 512 - 0.1885
3.8647 514 - 0.1883
3.8797 516 - 0.1886
3.8947 518 - 0.1884
3.9098 520 - 0.1883
3.9248 522 - 0.1887
3.9398 524 - 0.1887
3.9549 526 - 0.1890
3.9699 528 - 0.1891
3.9850 530 - 0.1892
4.0 532 - 0.1890
4.0150 534 - 0.1888
4.0301 536 - 0.1889
4.0451 538 - 0.1887
4.0602 540 - 0.1887
4.0752 542 - 0.1885
4.0902 544 - 0.1884
4.1053 546 - 0.1888
4.1203 548 - 0.1894
4.1353 550 - 0.1897
4.1504 552 - 0.1901
4.1654 554 - 0.1904
4.1805 556 - 0.1905
4.1955 558 - 0.1903
4.2105 560 - 0.1904
4.2256 562 - 0.1908
4.2406 564 - 0.1907
4.2556 566 - 0.1906
4.2707 568 - 0.1908
4.2857 570 - 0.1909
4.3008 572 - 0.1908
4.3158 574 - 0.1902
4.3308 576 - 0.1902
4.3459 578 - 0.1906
4.3609 580 - 0.1904
4.3759 582 - 0.1907
4.3910 584 - 0.1909
4.4060 586 - 0.1909
4.4211 588 - 0.1909
4.4361 590 - 0.1909
4.4511 592 - 0.1908
4.4662 594 - 0.1907
4.4812 596 - 0.1905
4.4962 598 - 0.1906
4.5113 600 - 0.1903
4.5263 602 - 0.1902
4.5414 604 - 0.1900
4.5564 606 - 0.1900
4.5714 608 - 0.1900
4.5865 610 - 0.1902
4.6015 612 - 0.1903
4.6165 614 - 0.1903
4.6316 616 - 0.1902
4.6466 618 - 0.1901
4.6617 620 - 0.1899
4.6767 622 - 0.1899
4.6917 624 - 0.1898
4.7068 626 - 0.1896
4.7218 628 - 0.1898
4.7368 630 - 0.1897
4.7519 632 - 0.1897
4.7669 634 - 0.1897
4.7820 636 - 0.1891
4.7970 638 - 0.1895
4.8120 640 - 0.1897
4.8271 642 - 0.1899
4.8421 644 - 0.1898
4.8571 646 - 0.1898
4.8722 648 - 0.1898
4.8872 650 - 0.1897
4.9023 652 - 0.1897
4.9173 654 - 0.1895
4.9323 656 - 0.1893
4.9474 658 - 0.1893
4.9624 660 - 0.1894
4.9774 662 - 0.1895
4.9925 664 - 0.1900
5.0 665 - 0.1900
5.0075 666 - 0.1900
5.0226 668 - 0.1901
5.0376 670 - 0.1902
5.0526 672 - 0.1901
5.0677 674 - 0.1901
5.0827 676 - 0.1903
5.0977 678 - 0.1904
5.1128 680 - 0.1903
5.1278 682 - 0.1905
5.1429 684 - 0.1905
5.1579 686 - 0.1906
5.1729 688 - 0.1906
5.1880 690 - 0.1908
5.2030 692 - 0.1908
5.2180 694 - 0.1909
5.2331 696 - 0.1911
5.2481 698 - 0.1911
5.2632 700 - 0.1911
5.2782 702 - 0.1913
5.2932 704 - 0.1910
5.3083 706 - 0.1912
5.3233 708 - 0.1911
5.3383 710 - 0.1910
5.3534 712 - 0.1912
5.3684 714 - 0.1912
5.3835 716 - 0.1910
5.3985 718 - 0.1909
5.4135 720 - 0.1910
5.4286 722 - 0.1910
5.4436 724 - 0.1910
5.4586 726 - 0.1912
5.4737 728 - 0.1912
5.4887 730 - 0.1914
5.5038 732 - 0.1914
5.5188 734 - 0.1914
5.5338 736 - 0.1912
5.5489 738 - 0.1912
5.5639 740 - 0.1914
5.5789 742 - 0.1914
5.5940 744 - 0.1915

Framework Versions

  • Python: 3.12.2
  • Sentence Transformers: 3.4.1
  • Transformers: 4.50.0
  • PyTorch: 2.6.0
  • Accelerate: 1.5.2
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1
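To reproduce this environment, the listed versions can be pinned explicitly (one possible install command; choose the PyTorch build appropriate for your hardware):

pip install sentence-transformers==3.4.1 transformers==4.50.0 torch==2.6.0 accelerate==1.5.2 datasets==3.6.0 tokenizers==0.21.1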

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}