---
language:
  - en
license: apache-2.0
tags:
  - biencoder
  - sentence-transformers
  - text-classification
  - sentence-pair-classification
  - semantic-similarity
  - semantic-search
  - retrieval
  - reranking
  - generated_from_trainer
  - dataset_size:9233417
  - loss:ArcFaceInBatchLoss
base_model: answerdotai/ModernBERT-base
widget:
  - source_sentence: >-
      Hayley Vaughan portrayed Ripa on the ABC daytime soap opera , `` All My
      Children `` , between 1990 and 2002 .
    sentences:
      - >-
        Traxxpad is a music application for Sony 's PlayStation Portable
        published by Definitive Studios and developed by Eidos Interactive .
      - >-
        Between 1990 and 2002 , Hayley Vaughan Ripa portrayed in the ABC soap
        opera `` All My Children `` .
      - >-
        Between 1990 and 2002 , Ripa Hayley portrayed Vaughan in the ABC soap
        opera `` All My Children `` .
  - source_sentence: >-
      Olivella monilifera is a species of dwarf sea snail , small gastropod
      mollusk in the family Olivellidae , the marine olives .
    sentences:
      - >-
        Olivella monilifera is a species of the dwarf - sea snail , small
        gastropod mollusk in the Olivellidae family , the marine olives .
      - >-
        He was cut by the Browns after being signed by the Bills in 2013 . He
        was later released .
      - >-
        Olivella monilifera is a kind of sea snail , marine gastropod mollusk in
        the Olivellidae family , the dwarf olives .
  - source_sentence: >-
      Hayashi said that Mackey `` is a sort of `` of the original model for
      Tenchi .
    sentences:
      - >-
        In the summer of 2009 , Ellick shot a documentary about Malala Yousafzai
        .
      - >-
        Hayashi said that Mackey is `` sort of `` the original model for Tenchi
        .
      - >-
        Mackey said that Hayashi is `` sort of `` the original model for Tenchi
        .
  - source_sentence: >-
      Much of the film was shot on location in Los Angeles and in nearby Burbank
      and Glendale .
    sentences:
      - >-
        Much of the film was shot on location in Los Angeles and in nearby
        Burbank and Glendale .
      - >-
        Much of the film was shot on site in Burbank and Glendale and in the
        nearby Los Angeles .
      - >-
        Traxxpad is a music application for the Sony PlayStation Portable
        developed by the Definitive Studios and published by Eidos Interactive .
  - source_sentence: >-
      According to him , the earth is the carrier of his artistic work , which
      is only integrated into the creative process by minimal changes .
    sentences:
      - National players are Bold players .
      - >-
        According to him , earth is the carrier of his artistic work being
        integrated into the creative process only by minimal changes .
      - >-
        According to him , earth is the carrier of his creative work being
        integrated into the artistic process only by minimal changes .
datasets:
  - redis/langcache-sentencepairs-v2
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_precision@1
  - cosine_recall@1
  - cosine_ndcg@10
  - cosine_mrr@1
  - cosine_map@100
model-index:
  - name: Redis fine-tuned BiEncoder model for semantic caching on LangCache
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: test
          type: test
        metrics:
          - type: cosine_accuracy@1
            value: 0.6032809198037179
            name: Cosine Accuracy@1
          - type: cosine_precision@1
            value: 0.6032809198037179
            name: Cosine Precision@1
          - type: cosine_recall@1
            value: 0.585771482488324
            name: Cosine Recall@1
          - type: cosine_ndcg@10
            value: 0.7747479314468421
            name: Cosine Ndcg@10
          - type: cosine_mrr@1
            value: 0.6032809198037179
            name: Cosine Mrr@1
          - type: cosine_map@100
            value: 0.7280398908979986
            name: Cosine Map@100
---

# Redis fine-tuned BiEncoder model for semantic caching on LangCache

This is a [sentence-transformers](https://www.sbert.net) model finetuned from [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on the [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/redis/langcache-sentencepairs-v2) dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for sentence pair similarity.

## Model Details

### Model Description

- **Model Type:** Sentence Transformer
- **Base model:** [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)
- **Maximum Sequence Length:** 100 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:** [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/redis/langcache-sentencepairs-v2)
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 100, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
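The Pooling configuration above means a sentence embedding is the attention-mask-aware mean of the final-layer token states. For reference, here is a minimal sketch of the equivalent computation with plain `transformers`; the sentence is illustrative, and the SentenceTransformer API shown below remains the supported path:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("redis/langcache-embed-v3")
model = AutoModel.from_pretrained("redis/langcache-embed-v3")

sentences = ["Much of the film was shot on location in Los Angeles ."]
batch = tokenizer(sentences, padding=True, truncation=True, max_length=100, return_tensors="pt")

with torch.no_grad():
    token_states = model(**batch).last_hidden_state  # (batch, tokens, 768)

# Mean pooling over real (non-padding) tokens, matching pooling_mode_mean_tokens.
mask = batch["attention_mask"].unsqueeze(-1).float()
embeddings = (token_states * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # torch.Size([1, 768])
```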

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("redis/langcache-embed-v3")
# Run inference
sentences = [
    'According to him , the earth is the carrier of his artistic work , which is only integrated into the creative process by minimal changes .',
    'According to him , earth is the carrier of his artistic work being integrated into the creative process only by minimal changes .',
    'According to him , earth is the carrier of his creative work being integrated into the artistic process only by minimal changes .',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9180, 0.4531],
#         [0.9180, 1.0000, 0.4746],
#         [0.4531, 0.4746, 1.0000]], dtype=torch.bfloat16)
```
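Since the model targets semantic caching, the typical pattern is to embed an incoming query, compare it against the embeddings of previously answered queries, and reuse the cached response when similarity clears a threshold. Below is a minimal sketch of that flow; the cache contents and the 0.9 threshold are illustrative assumptions, and the managed LangCache service handles this matching server-side:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("redis/langcache-embed-v3")

# Hypothetical cache of previously answered queries and their responses.
cache = {
    "How do I reset my password?": "Use the 'Forgot password' link on the login page.",
    "What payment methods do you accept?": "We accept credit cards and PayPal.",
}
cached_queries = list(cache)
cached_embeddings = model.encode(cached_queries)

query_embedding = model.encode(["How can I change my password?"])

# Cosine similarity between the new query and every cached query.
scores = model.similarity(query_embedding, cached_embeddings)[0]
best = int(scores.argmax())

THRESHOLD = 0.9  # illustrative; tune on your own traffic
if float(scores[best]) >= THRESHOLD:
    print("Cache hit:", cache[cached_queries[best]])
else:
    print("Cache miss: call the LLM and store the new query/response pair")
```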

## Evaluation

### Metrics

#### Information Retrieval

Evaluated on the `test` dataset.

| Metric              | Value  |
|:--------------------|:-------|
| cosine_accuracy@1   | 0.6033 |
| cosine_precision@1  | 0.6033 |
| cosine_recall@1     | 0.5858 |
| cosine_ndcg@10      | 0.7747 |
| cosine_mrr@1        | 0.6033 |
| cosine_map@100      | 0.728  |
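The exact evaluation setup is not reproduced in this card. As a minimal sketch, metrics with these names could be produced with the standard `InformationRetrievalEvaluator`; the toy queries, corpus, and relevance labels below are stand-ins for the real test split, and the `k` values are inferred from the metric names:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("redis/langcache-embed-v3")

# Toy stand-ins for the real test split.
queries = {"q1": "The newer punts still race in the same fleets as the older boats ."}
corpus = {
    "d1": "The newer Punts are still very much in existence today and race in the same fleets as the older boats .",
    "d2": "how can I get financial freedom as soon as possible?",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(
    queries, corpus, relevant_docs,
    name="test",
    accuracy_at_k=[1], precision_recall_at_k=[1],
    mrr_at_k=[1], ndcg_at_k=[10], map_at_k=[100],
)
print(evaluator(model))  # e.g. {'test_cosine_accuracy@1': ..., 'test_cosine_ndcg@10': ..., ...}
```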

## Training Details

### Training Dataset

#### LangCache Sentence Pairs (all)

- Dataset: LangCache Sentence Pairs (all)
- Size: 126,938 training samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:

  |         | anchor | positive | negative |
  |:--------|:-------|:---------|:---------|
  | type    | string | string   | string   |
  | details | min: 8 tokens, mean: 27.27 tokens, max: 49 tokens | min: 8 tokens, mean: 27.27 tokens, max: 48 tokens | min: 7 tokens, mean: 26.54 tokens, max: 61 tokens |

- Samples:

  | anchor | positive | negative |
  |:-------|:---------|:---------|
  | The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The newer punts are still very much in existence today and run in the same fleets as the older boats . | how can I get financial freedom as soon as possible? |
  | The newer punts are still very much in existence today and run in the same fleets as the older boats . | The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The older Punts are still very much in existence today and race in the same fleets as the newer boats . |
  | Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley , , was located at Turner Valley Bar N Ranch Airport , southwest of Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada . |

- Loss: losses.ArcFaceInBatchLoss with these parameters (see the illustrative sketch below):

  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false
  }
  ```
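ArcFaceInBatchLoss is not a stock sentence-transformers loss, and its implementation is not reproduced in this card. As a rough orientation only, here is a minimal sketch of an ArcFace-style in-batch objective consistent with the parameters above: other positives in the batch act as negatives, and the true pair's cosine receives an additive angular margin before the scaled softmax. The margin value is an assumption, and the actual loss may differ:

```python
import torch
import torch.nn.functional as F

def arcface_in_batch_loss(anchors: torch.Tensor, positives: torch.Tensor,
                          scale: float = 20.0, margin: float = 0.1) -> torch.Tensor:
    # Cosine similarity matrix: diagonal entries are the true (anchor, positive)
    # pairs; off-diagonal entries serve as in-batch negatives.
    a = F.normalize(anchors, dim=-1)
    p = F.normalize(positives, dim=-1)
    cos = a @ p.T                                          # (B, B)
    # ArcFace: add an angular margin to the true pairs before scaling.
    theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
    eye = torch.eye(cos.size(0), dtype=torch.bool, device=cos.device)
    logits = torch.where(eye, torch.cos(theta + margin), cos) * scale
    labels = torch.arange(cos.size(0), device=cos.device)
    return F.cross_entropy(logits, labels)
```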
    

### Evaluation Dataset

#### LangCache Sentence Pairs (all)

- Dataset: LangCache Sentence Pairs (all)
- Size: 126,938 evaluation samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:

  |         | anchor | positive | negative |
  |:--------|:-------|:---------|:---------|
  | type    | string | string   | string   |
  | details | min: 8 tokens, mean: 27.27 tokens, max: 49 tokens | min: 8 tokens, mean: 27.27 tokens, max: 48 tokens | min: 7 tokens, mean: 26.54 tokens, max: 61 tokens |

- Samples:

  | anchor | positive | negative |
  |:-------|:---------|:---------|
  | The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The newer punts are still very much in existence today and run in the same fleets as the older boats . | how can I get financial freedom as soon as possible? |
  | The newer punts are still very much in existence today and run in the same fleets as the older boats . | The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The older Punts are still very much in existence today and race in the same fleets as the newer boats . |
  | Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley , , was located at Turner Valley Bar N Ranch Airport , southwest of Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada . |

- Loss: losses.ArcFaceInBatchLoss with these parameters:

  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false
  }
  ```
    

### Training Hyperparameters

#### Non-Default Hyperparameters

- eval_strategy: steps
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 128
- weight_decay: 0.001
- adam_beta2: 0.98
- adam_epsilon: 1e-06
- max_steps: 100000
- warmup_ratio: 0.1
- load_best_model_at_end: True
- optim: stable_adamw
- ddp_find_unused_parameters: False
- push_to_hub: True
- hub_model_id: redis/langcache-embed-v3
- batch_sampler: no_duplicates
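For reproduction, a minimal sketch of how these non-default values map onto the standard SentenceTransformerTrainingArguments API; the output_dir is illustrative, and every other value is taken from the list above:

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="models/langcache-embed-v3",  # illustrative path
    eval_strategy="steps",
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    weight_decay=0.001,
    adam_beta2=0.98,
    adam_epsilon=1e-6,
    max_steps=100_000,
    warmup_ratio=0.1,
    load_best_model_at_end=True,
    optim="stable_adamw",
    ddp_find_unused_parameters=False,
    push_to_hub=True,
    hub_model_id="redis/langcache-embed-v3",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```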

#### All Hyperparameters

<details><summary>Click to expand</summary>
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 128
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.001
- adam_beta1: 0.9
- adam_beta2: 0.98
- adam_epsilon: 1e-06
- max_grad_norm: 1.0
- num_train_epochs: 3.0
- max_steps: 100000
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: stable_adamw
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: False
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: True
- resume_from_checkpoint: None
- hub_model_id: redis/langcache-embed-v3
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}

</details>

### Training Logs

<details><summary>Click to expand</summary>
| Epoch | Step | Training Loss | Validation Loss | test_cosine_ndcg@10 |
|:------|:-----|:--------------|:----------------|:--------------------|
| -1 | -1 | - | - | 0.5952 |
| 0.0069 | 500 | 3.4812 | 0.6932 | 0.6810 |
| 0.0139 | 1000 | 0.6045 | 0.4804 | 0.7354 |
| 0.0208 | 1500 | 0.3127 | 0.4128 | 0.7437 |
| 0.0277 | 2000 | 0.2424 | 0.4077 | 0.7440 |
| 0.0347 | 2500 | 0.2027 | 0.3707 | 0.7501 |
| 0.0416 | 3000 | 0.1752 | 0.3453 | 0.7551 |
| 0.0485 | 3500 | 0.1622 | 0.3380 | 0.7540 |
| 0.0555 | 4000 | 0.1466 | 0.3185 | 0.7583 |
| 0.0624 | 4500 | 0.1392 | 0.3092 | 0.7588 |
| 0.0693 | 5000 | 0.1342 | 0.3054 | 0.7566 |
| 0.0762 | 5500 | 0.1291 | 0.2960 | 0.7582 |
| 0.0832 | 6000 | 0.1291 | 0.2856 | 0.7616 |
| 0.0901 | 6500 | 0.1199 | 0.2803 | 0.7624 |
| 0.0970 | 7000 | 0.1171 | 0.2692 | 0.7648 |
| 0.1040 | 7500 | 0.1097 | 0.2811 | 0.7629 |
| 0.1109 | 8000 | 0.1089 | 0.2901 | 0.7621 |
| 0.1178 | 8500 | 0.1088 | 0.2986 | 0.7568 |
| 0.1248 | 9000 | 0.109 | 0.2806 | 0.7628 |
| 0.1317 | 9500 | 0.1046 | 0.3050 | 0.7587 |
| 0.1386 | 10000 | 0.1035 | 0.2925 | 0.7596 |
| 0.1456 | 10500 | 0.1041 | 0.2940 | 0.7573 |
| 0.1525 | 11000 | 0.1023 | 0.2790 | 0.7632 |
| 0.1594 | 11500 | 0.0992 | 0.3293 | 0.7542 |
| 0.1664 | 12000 | 0.0996 | 0.2876 | 0.7570 |
| 0.1733 | 12500 | 0.0949 | 0.2881 | 0.7591 |
| 0.1802 | 13000 | 0.0921 | 0.2861 | 0.7598 |
| 0.1871 | 13500 | 0.0912 | 0.2763 | 0.7632 |
| 0.1941 | 14000 | 0.0912 | 0.2785 | 0.7643 |
| 0.2010 | 14500 | 0.0909 | 0.3198 | 0.7629 |
| 0.2079 | 15000 | 0.0911 | 0.3015 | 0.7575 |
| 0.2149 | 15500 | 0.0861 | 0.3029 | 0.7597 |
| 0.2218 | 16000 | 0.0857 | 0.3271 | 0.7568 |
| 0.2287 | 16500 | 0.0843 | 0.2579 | 0.7645 |
| 0.2357 | 17000 | 0.085 | 0.2923 | 0.7625 |
| 0.2426 | 17500 | 0.0846 | 0.3241 | 0.7598 |
| 0.2495 | 18000 | 0.083 | 0.3128 | 0.7616 |
| 0.2565 | 18500 | 0.0794 | 0.2926 | 0.7611 |
| 0.2634 | 19000 | 0.0806 | 0.2665 | 0.7640 |
| 0.2703 | 19500 | 0.0782 | 0.2963 | 0.7615 |
| 0.2773 | 20000 | 0.0786 | 0.2771 | 0.7611 |
| 0.2842 | 20500 | 0.0761 | 0.2853 | 0.7623 |
| 0.2911 | 21000 | 0.0752 | 0.2782 | 0.7626 |
| 0.2980 | 21500 | 0.0777 | 0.2680 | 0.7612 |
| 0.3050 | 22000 | 0.0782 | 0.2731 | 0.7636 |
| 0.3119 | 22500 | 0.0785 | 0.2627 | 0.7627 |
| 0.3188 | 23000 | 0.0741 | 0.2714 | 0.7613 |
| 0.3258 | 23500 | 0.0741 | 0.2713 | 0.7661 |
| 0.3327 | 24000 | 0.072 | 0.2630 | 0.7636 |
| 0.3396 | 24500 | 0.0739 | 0.2839 | 0.7648 |
| 0.3466 | 25000 | 0.07 | 0.2860 | 0.7634 |
| 0.3535 | 25500 | 0.0715 | 0.2612 | 0.7666 |
| 0.3604 | 26000 | 0.0711 | 0.2531 | 0.7671 |
| 0.3674 | 26500 | 0.0701 | 0.2682 | 0.7638 |
| 0.3743 | 27000 | 0.0733 | 0.2708 | 0.7635 |
| 0.3812 | 27500 | 0.0705 | 0.2873 | 0.7636 |
| 0.3882 | 28000 | 0.0663 | 0.2831 | 0.7647 |
| 0.3951 | 28500 | 0.0678 | 0.2825 | 0.7643 |
| 0.4020 | 29000 | 0.0691 | 0.2733 | 0.7654 |
| 0.4089 | 29500 | 0.0696 | 0.2831 | 0.7621 |
| 0.4159 | 30000 | 0.0708 | 0.2893 | 0.7643 |
| 0.4228 | 30500 | 0.0663 | 0.2758 | 0.7653 |
| 0.4297 | 31000 | 0.064 | 0.2589 | 0.7666 |
| 0.4367 | 31500 | 0.0636 | 0.2491 | 0.7681 |
| 0.4436 | 32000 | 0.0644 | 0.2601 | 0.7650 |
| 0.4505 | 32500 | 0.0655 | 0.2611 | 0.7668 |
| 0.4575 | 33000 | 0.0643 | 0.2597 | 0.7664 |
| 0.4644 | 33500 | 0.066 | 0.2696 | 0.7677 |
| 0.4713 | 34000 | 0.0664 | 0.2489 | 0.7690 |
| 0.4783 | 34500 | 0.0654 | 0.2644 | 0.7649 |
| 0.4852 | 35000 | 0.0653 | 0.2704 | 0.7665 |
| 0.4921 | 35500 | 0.0657 | 0.2578 | 0.7689 |
| 0.4991 | 36000 | 0.0634 | 0.2629 | 0.7669 |
| 0.5060 | 36500 | 0.0609 | 0.2631 | 0.7663 |
| 0.5129 | 37000 | 0.0646 | 0.2586 | 0.7667 |
| 0.5198 | 37500 | 0.0634 | 0.2572 | 0.7657 |
| 0.5268 | 38000 | 0.0607 | 0.2624 | 0.7664 |
| 0.5337 | 38500 | 0.0621 | 0.2622 | 0.7668 |
| 0.5406 | 39000 | 0.0614 | 0.2562 | 0.7676 |
| 0.5476 | 39500 | 0.0621 | 0.2560 | 0.7673 |
| 0.5545 | 40000 | 0.0608 | 0.2506 | 0.7684 |
| 0.5614 | 40500 | 0.0621 | 0.2718 | 0.7666 |
| 0.5684 | 41000 | 0.0598 | 0.2599 | 0.7700 |
| 0.5753 | 41500 | 0.06 | 0.2706 | 0.7687 |
| 0.5822 | 42000 | 0.0618 | 0.2635 | 0.7694 |
| 0.5892 | 42500 | 0.0604 | 0.2743 | 0.7660 |
| 0.5961 | 43000 | 0.0576 | 0.2733 | 0.7661 |
| 0.6030 | 43500 | 0.0597 | 0.2644 | 0.7712 |
| 0.6100 | 44000 | 0.0592 | 0.2516 | 0.7694 |
| 0.6169 | 44500 | 0.0599 | 0.2461 | 0.7688 |
| 0.6238 | 45000 | 0.056 | 0.2438 | 0.7686 |
| 0.6307 | 45500 | 0.0573 | 0.2513 | 0.7703 |
| 0.6377 | 46000 | 0.0571 | 0.2526 | 0.7694 |
| 0.6446 | 46500 | 0.0573 | 0.2529 | 0.7702 |
| 0.6515 | 47000 | 0.0553 | 0.2529 | 0.7694 |
| 0.6585 | 47500 | 0.0541 | 0.2518 | 0.7707 |
| 0.6654 | 48000 | 0.0561 | 0.2471 | 0.7725 |
| 0.6723 | 48500 | 0.0558 | 0.2440 | 0.7710 |
| 0.6793 | 49000 | 0.0555 | 0.2556 | 0.7691 |
| 0.6862 | 49500 | 0.056 | 0.2479 | 0.7721 |
| 0.6931 | 50000 | 0.0564 | 0.2463 | 0.7723 |
| 0.7001 | 50500 | 0.0539 | 0.2561 | 0.7728 |
| 0.7070 | 51000 | 0.0526 | 0.2416 | 0.7725 |
| 0.7139 | 51500 | 0.0561 | 0.2501 | 0.7723 |
| 0.7209 | 52000 | 0.0545 | 0.2316 | 0.7732 |
| 0.7278 | 52500 | 0.0545 | 0.2352 | 0.7739 |
| 0.7347 | 53000 | 0.05 | 0.2278 | 0.7734 |
| 0.7416 | 53500 | 0.0515 | 0.2308 | 0.7730 |
| 0.7486 | 54000 | 0.0528 | 0.2524 | 0.7727 |
| 0.7555 | 54500 | 0.0509 | 0.2645 | 0.7717 |
| 0.7624 | 55000 | 0.0514 | 0.2659 | 0.7708 |
| 0.7694 | 55500 | 0.0503 | 0.2570 | 0.7725 |
| 0.7763 | 56000 | 0.0538 | 0.2524 | 0.7724 |
| 0.7832 | 56500 | 0.0477 | 0.2537 | 0.7719 |
| 0.7902 | 57000 | 0.0514 | 0.2333 | 0.7733 |
| 0.7971 | 57500 | 0.05 | 0.2420 | 0.7722 |
| 0.8040 | 58000 | 0.0518 | 0.2342 | 0.7734 |
| 0.8110 | 58500 | 0.0508 | 0.2402 | 0.7730 |
| 0.8179 | 59000 | 0.0474 | 0.2477 | 0.7711 |
| 0.8248 | 59500 | 0.0493 | 0.2465 | 0.7723 |
| 0.8318 | 60000 | 0.0492 | 0.2448 | 0.7731 |
| 0.8387 | 60500 | 0.0496 | 0.2498 | 0.7733 |
| 0.8456 | 61000 | 0.0479 | 0.2505 | 0.7721 |
| 0.8525 | 61500 | 0.0445 | 0.2449 | 0.7745 |
| 0.8595 | 62000 | 0.0477 | 0.2507 | 0.7748 |
| 0.8664 | 62500 | 0.0491 | 0.2551 | 0.7716 |
| 0.8733 | 63000 | 0.0474 | 0.2451 | 0.7743 |
| 0.8803 | 63500 | 0.0452 | 0.2464 | 0.7741 |
| 0.8872 | 64000 | 0.0482 | 0.2412 | 0.7742 |
| 0.8941 | 64500 | 0.0483 | 0.2444 | 0.7736 |
| 0.9011 | 65000 | 0.0485 | 0.2456 | 0.7724 |
| 0.9080 | 65500 | 0.045 | 0.2493 | 0.7730 |
| 0.9149 | 66000 | 0.0496 | 0.2499 | 0.7721 |
| 0.9219 | 66500 | 0.0461 | 0.2474 | 0.7748 |
| 0.9288 | 67000 | 0.0465 | 0.2432 | 0.7743 |
| 0.9357 | 67500 | 0.0477 | 0.2432 | 0.7729 |
| 0.9427 | 68000 | 0.0425 | 0.2491 | 0.7740 |
| 0.9496 | 68500 | 0.0452 | 0.2445 | 0.7736 |
| 0.9565 | 69000 | 0.046 | 0.2397 | 0.7742 |
| 0.9634 | 69500 | 0.0449 | 0.2539 | 0.7731 |
| 0.9704 | 70000 | 0.0462 | 0.2446 | 0.7745 |
| 0.9773 | 70500 | 0.0435 | 0.2385 | 0.7742 |
| 0.9842 | 71000 | 0.0469 | 0.2334 | 0.7750 |
| 0.9912 | 71500 | 0.0447 | 0.2312 | 0.7745 |
| 0.9981 | 72000 | 0.0465 | 0.2361 | 0.7737 |
| 1.0050 | 72500 | 0.0341 | 0.2359 | 0.7728 |
| 1.0120 | 73000 | 0.03 | 0.2405 | 0.7727 |
| 1.0189 | 73500 | 0.029 | 0.2241 | 0.7724 |
| 1.0258 | 74000 | 0.0284 | 0.2297 | 0.7740 |
| 1.0328 | 74500 | 0.0273 | 0.2317 | 0.7735 |
| 1.0397 | 75000 | 0.0291 | 0.2352 | 0.7727 |
| 1.0466 | 75500 | 0.0286 | 0.2439 | 0.7724 |
| 1.0536 | 76000 | 0.0268 | 0.2336 | 0.7732 |
| 1.0605 | 76500 | 0.0276 | 0.2298 | 0.7728 |
| 1.0674 | 77000 | 0.0279 | 0.2268 | 0.7726 |
| 1.0743 | 77500 | 0.0283 | 0.2206 | 0.7738 |
| 1.0813 | 78000 | 0.0277 | 0.2263 | 0.7733 |
| 1.0882 | 78500 | 0.0285 | 0.2228 | 0.7740 |
| 1.0951 | 79000 | 0.0283 | 0.2250 | 0.7729 |
| 1.1021 | 79500 | 0.0276 | 0.2200 | 0.7730 |
| 1.1090 | 80000 | 0.0276 | 0.2221 | 0.7739 |
| 1.1159 | 80500 | 0.0268 | 0.2279 | 0.7730 |
| 1.1229 | 81000 | 0.0274 | 0.2302 | 0.7733 |
| 1.1298 | 81500 | 0.0281 | 0.2286 | 0.7736 |
| 1.1367 | 82000 | 0.0267 | 0.2306 | 0.7733 |
| 1.1437 | 82500 | 0.0267 | 0.2348 | 0.7731 |
| 1.1506 | 83000 | 0.0278 | 0.2301 | 0.7729 |
| 1.1575 | 83500 | 0.028 | 0.2240 | 0.7738 |
| 1.1645 | 84000 | 0.0282 | 0.2196 | 0.7744 |
| 1.1714 | 84500 | 0.0264 | 0.2241 | 0.7737 |
| 1.1783 | 85000 | 0.0258 | 0.2252 | 0.7736 |
| 1.1852 | 85500 | 0.027 | 0.2196 | 0.7742 |
| 1.1922 | 86000 | 0.0256 | 0.2189 | 0.7739 |
| 1.1991 | 86500 | 0.0259 | 0.2174 | 0.7749 |
| 1.2060 | 87000 | 0.0262 | 0.2209 | 0.7751 |
| 1.2130 | 87500 | 0.0265 | 0.2202 | 0.7739 |
| 1.2199 | 88000 | 0.025 | 0.2228 | 0.7737 |
| 1.2268 | 88500 | 0.0266 | 0.2233 | 0.7739 |
| 1.2338 | 89000 | 0.0261 | 0.2255 | 0.7736 |
| 1.2407 | 89500 | 0.0271 | 0.2219 | 0.7746 |
| 1.2476 | 90000 | 0.0256 | 0.2185 | 0.7757 |
| 1.2546 | 90500 | 0.0257 | 0.2190 | 0.7758 |
| 1.2615 | 91000 | 0.0239 | 0.2210 | 0.7750 |
| 1.2684 | 91500 | 0.0252 | 0.2236 | 0.7743 |
| 1.2754 | 92000 | 0.0245 | 0.2238 | 0.7743 |
| 1.2823 | 92500 | 0.0267 | 0.2234 | 0.7747 |
| 1.2892 | 93000 | 0.025 | 0.2235 | 0.7746 |
| 1.2961 | 93500 | 0.0246 | 0.2298 | 0.7740 |
| 1.3031 | 94000 | 0.0266 | 0.2239 | 0.7744 |
| 1.3100 | 94500 | 0.0256 | 0.2231 | 0.7740 |
| 1.3169 | 95000 | 0.0265 | 0.2214 | 0.7744 |
| 1.3239 | 95500 | 0.0253 | 0.2221 | 0.7747 |
| 1.3308 | 96000 | 0.0251 | 0.2222 | 0.7742 |
| 1.3377 | 96500 | 0.0244 | 0.2211 | 0.7748 |
| 1.3447 | 97000 | 0.0249 | 0.2216 | 0.7750 |
| 1.3516 | 97500 | 0.0257 | 0.2215 | 0.7745 |
| 1.3585 | 98000 | 0.0263 | 0.2215 | 0.7749 |
| 1.3655 | 98500 | 0.0258 | 0.2209 | 0.7749 |
| 1.3724 | 99000 | 0.0255 | 0.2212 | 0.7748 |
| 1.3793 | 99500 | 0.0252 | 0.2213 | 0.7751 |
| **1.3863** | **100000** | **0.0257** | **0.2213** | **0.7747** |
- The bold row denotes the saved checkpoint.

</details>

### Framework Versions

- Python: 3.12.3
- Sentence Transformers: 5.1.0
- Transformers: 4.56.0
- PyTorch: 2.8.0+cu128
- Accelerate: 1.10.1
- Datasets: 4.0.0
- Tokenizers: 0.22.0

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```