CrossEncoder based on BAAI/bge-reranker-v2-m3

This is a Cross Encoder model finetuned from BAAI/bge-reranker-v2-m3 using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

Model Details

Model Description

  • Model Type: Cross Encoder
  • Base model: BAAI/bge-reranker-v2-m3
  • Maximum Sequence Length: 1024 tokens
  • Number of Output Labels: 1 label

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("cross_encoder_model_id")
# Get scores for pairs of texts
pairs = [
    ['What was the 2011 Census population of this village and civil parish in Lincolnshire, where Regents Academy is based?', 'Manby. Manby is a village and civil parish in the East Lindsey district of Lincolnshire, England, and lies approximately 5 mi east from Louth.  The 2001 Census recorded a village population of 833, reducing to 759 at the 2011 Census.'],
    ['Piña colada and Aperol Spritz can both be described as what kind of drink?', '2-Acetylpyridine. 2-Acetylpyridine is an organic compound with the formula CH3COC5H4N.  It is a viscous colorless liquid that is widely used as a flavoring substance.  It is found in malt and produced by the Maillard reaction and by nixtamalization.  It contributes to the flavor of corn tortillas, popcorn, and beer.'],
    ['What song by a Barbadian singer was covered by Marié Christina Digby?', 'Umbrella (song). "Umbrella" is a song by Barbadian singer Rihanna from her third studio album "Good Girl Gone Bad" (2007).  It features American rapper Jay-Z, who co-wrote the song with its producers Tricky Stewart and Kuk Harrell, with additional writing from The-Dream.  The song was originally written with Britney Spears in mind, but her label rejected it.  "Umbrella" is a pop, hip hop and R&B song referring to a romantic and platonic relationship and the strength of that relationship.'],
    ["When was the  Roman politician and general died during who's reign Gaius Nasennius were soldiers?", "Narses. Narses (also sometimes written Nerses; Armenian: Նարսես ; Greek: Ναρσής ; 478–573) was, with Belisarius, one of the great generals in the service of the Byzantine Emperor Justinian I during the Roman reconquest that took place during Justinian's reign.  A Romanized Armenian, Narses spent most of his life as an important eunuch in the palace of the emperors in Constantinople."],
    ['California Tortilla was voted as having the best burritos in both 2009 and 2010 by a magazine founded in what year?', 'California Tortilla. California Tortilla, also known as CalTort, is a chain of franchised fast casual Mexican-style restaurants, the first of which was opened in August 1995 in Bethesda, Maryland by business partners Pam Felix and Alan Cohen.  The chain\'s menu, which features Mission burritos, is comparable to that of its competitors, such as Baja Fresh and Chipotle Mexican Grill.  A typical restaurant has 2500 sqft with seating for 75 people.  California Tortilla was voted by readers of "Washingtonian" magazine as having the best burritos in both 2009 and 2010, and "best Mexican" in 2014 and 2015.  The chain sold its 5 millionth burrito on August 22, 2007.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'What was the 2011 Census population of this village and civil parish in Lincolnshire, where Regents Academy is based?',
    [
        'Manby. Manby is a village and civil parish in the East Lindsey district of Lincolnshire, England, and lies approximately 5 mi east from Louth.  The 2001 Census recorded a village population of 833, reducing to 759 at the 2011 Census.',
        '2-Acetylpyridine. 2-Acetylpyridine is an organic compound with the formula CH3COC5H4N.  It is a viscous colorless liquid that is widely used as a flavoring substance.  It is found in malt and produced by the Maillard reaction and by nixtamalization.  It contributes to the flavor of corn tortillas, popcorn, and beer.',
        'Umbrella (song). "Umbrella" is a song by Barbadian singer Rihanna from her third studio album "Good Girl Gone Bad" (2007).  It features American rapper Jay-Z, who co-wrote the song with its producers Tricky Stewart and Kuk Harrell, with additional writing from The-Dream.  The song was originally written with Britney Spears in mind, but her label rejected it.  "Umbrella" is a pop, hip hop and R&B song referring to a romantic and platonic relationship and the strength of that relationship.',
        "Narses. Narses (also sometimes written Nerses; Armenian: Նարսես ; Greek: Ναρσής ; 478–573) was, with Belisarius, one of the great generals in the service of the Byzantine Emperor Justinian I during the Roman reconquest that took place during Justinian's reign.  A Romanized Armenian, Narses spent most of his life as an important eunuch in the palace of the emperors in Constantinople.",
        'California Tortilla. California Tortilla, also known as CalTort, is a chain of franchised fast casual Mexican-style restaurants, the first of which was opened in August 1995 in Bethesda, Maryland by business partners Pam Felix and Alan Cohen.  The chain\'s menu, which features Mission burritos, is comparable to that of its competitors, such as Baja Fresh and Chipotle Mexican Grill.  A typical restaurant has 2500 sqft with seating for 75 people.  California Tortilla was voted by readers of "Washingtonian" magazine as having the best burritos in both 2009 and 2010, and "best Mexican" in 2014 and 2015.  The chain sold its 5 millionth burrito on August 22, 2007.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
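
Each entry returned by `model.rank` carries a `corpus_id` that indexes into the list of documents passed in, together with its `score`. The following is a minimal sketch of mapping such results back to the document text. The helper name `top_documents`, the shortened documents, and the scores are all hypothetical stand-ins for real model output.

```python
from typing import Dict, List, Tuple, Union

RankEntry = Dict[str, Union[int, float]]


def top_documents(
    ranks: List[RankEntry], documents: List[str], k: int = 3
) -> List[Tuple[str, float]]:
    """Return (document, score) pairs for the k highest-scoring entries."""
    # Sort defensively so the helper also works on unsorted input.
    ordered = sorted(ranks, key=lambda r: r["score"], reverse=True)
    return [(documents[int(r["corpus_id"])], float(r["score"])) for r in ordered[:k]]


# Hypothetical documents and scores, shaped like the output of model.rank
example_docs = ["Manby ...", "2-Acetylpyridine ...", "Umbrella (song) ..."]
example_ranks = [
    {"corpus_id": 0, "score": 0.98},
    {"corpus_id": 2, "score": 0.03},
    {"corpus_id": 1, "score": 0.01},
]

best = top_documents(example_ranks, example_docs, k=2)
for text, score in best:
    print(f"{score:.2f}  {text}")
```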

Evaluation

Metrics

Cross Encoder Binary Classification

Metric              validation  train_subset
accuracy            0.9367      0.908
accuracy_threshold  0.7832      0.9297
f1                  0.9389      0.9132
f1_threshold        0.7832      0.8477
precision           0.9068      0.8897
recall              0.9733      0.938
average_precision   0.9587      0.9412
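
The reported F1 values follow from the precision and recall figures, and the threshold rows give the score cut-offs at which accuracy and F1 were measured. The sketch below recomputes F1 from the table and applies the validation threshold to a few made-up scores. It assumes the scores are on the same 0-1 scale as the thresholds; the helper names and the example scores are hypothetical.

```python
from typing import List


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def classify(scores: List[float], threshold: float) -> List[int]:
    """Label a pair as relevant (1) when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Precision and recall taken from the metrics table
validation_f1 = f1_from(0.9068, 0.9733)
train_subset_f1 = f1_from(0.8897, 0.938)
print(round(validation_f1, 4), round(train_subset_f1, 4))

# Made-up scores, cut at the validation f1_threshold from the table
hypothetical_scores = [0.95, 0.80, 0.40, 0.10]
labels = classify(hypothetical_scores, threshold=0.7832)
print(labels)
```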

Training Details

Training Dataset

Unnamed Dataset

  • Size: 8,000 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (string): min 22 characters, mean 100.09 characters, max 498 characters
    • sentence_1 (string): min 85 characters, mean 533.66 characters, max 1965 characters
    • label (float): min 0.0, mean 0.51, max 1.0
  • Samples:
    • sentence_0: What was the 2011 Census population of this village and civil parish in Lincolnshire, where Regents Academy is based?
      sentence_1: Manby. Manby is a village and civil parish in the East Lindsey district of Lincolnshire, England, and lies approximately 5 mi east from Louth. The 2001 Census recorded a village population of 833, reducing to 759 at the 2011 Census.
      label: 1.0
    • sentence_0: Piña colada and Aperol Spritz can both be described as what kind of drink?
      sentence_1: 2-Acetylpyridine. 2-Acetylpyridine is an organic compound with the formula CH3COC5H4N. It is a viscous colorless liquid that is widely used as a flavoring substance. It is found in malt and produced by the Maillard reaction and by nixtamalization. It contributes to the flavor of corn tortillas, popcorn, and beer.
      label: 0.0
    • sentence_0: What song by a Barbadian singer was covered by Marié Christina Digby?
      sentence_1: Umbrella (song). "Umbrella" is a song by Barbadian singer Rihanna from her third studio album "Good Girl Gone Bad" (2007). It features American rapper Jay-Z, who co-wrote the song with its producers Tricky Stewart and Kuk Harrell, with additional writing from The-Dream. The song was originally written with Britney Spears in mind, but her label rejected it. "Umbrella" is a pop, hip hop and R&B song referring to a romantic and platonic relationship and the strength of that relationship.
      label: 1.0
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
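
With an identity activation and no `pos_weight`, the loss is applied to the model's raw logit without reweighting positive examples. The following pure-Python sketch illustrates the per-sample binary cross-entropy on a logit in its numerically stable form. It is an illustration of the formula, not the library's code, and the function names are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def bce_with_logits(logit: float, label: float) -> float:
    """Stable form of -[y*log(sigmoid(x)) + (1 - y)*log(1 - sigmoid(x))]."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


# A confident, correct prediction costs little; a confident, wrong one costs a lot.
loss_correct = bce_with_logits(4.0, 1.0)
loss_wrong = bce_with_logits(4.0, 0.0)
print(loss_correct, loss_wrong)
```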
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}
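
With `lr_scheduler_type: linear`, `warmup_steps: 0`, and `learning_rate: 5e-05`, the learning rate decays linearly from 5e-05 towards zero over the course of training. The sketch below shows that shape. The function name is hypothetical, and the total of 6000 steps is an assumption (3 epochs at the 2000 steps per epoch seen in the training logs), not a value stated in the configuration.

```python
def linear_lr(
    step: int, total_steps: int, base_lr: float = 5e-05, warmup_steps: int = 0
) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 6000  # assumption: 3 epochs x 2000 steps per epoch

for step in (0, 2000, 4000, 6000):
    print(step, linear_lr(step, TOTAL_STEPS))
```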

Training Logs

Epoch  Step  Training Loss  validation_average_precision  train_subset_average_precision
0.125  250   -              0.9311                        0.9034
0.25   500   0.4545         0.9403                        0.9061
0.375  750   -              0.9398                        0.9220
0.5    1000  0.3954         0.9485                        0.9292
0.625  1250  -              0.9500                        0.9263
0.75   1500  0.3922         0.9519                        0.9235
0.875  1750  -              0.9556                        0.9224
1.0    2000  0.4131         0.9587                        0.9412

Framework Versions

  • Python: 3.11.13
  • Sentence Transformers: 5.2.2
  • Transformers: 4.44.2
  • PyTorch: 2.10.0+cu128
  • Accelerate: 1.12.0
  • Datasets: 4.0.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

Safetensors

  • Model size: 0.6B params
  • Tensor type: BF16