---
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:1339067
  - loss:MultipleNegativesSymmetricRankingLoss
base_model: sentence-transformers/all-MiniLM-L6-v2
widget:
  - source_sentence: kids' slippers with a sporty neon green design (size 24-25)
    sentences:
      - slipper
      - colorful slipper
      - black winter hoodie
  - source_sentence: male teacher figurine
    sentences:
      - home decor accessory
      - ' figurine'
      - natural hair flat oil brush 579 size 5
  - source_sentence: diana emerald
    sentences:
      - lantana crochet handbag
      - emerald earrings
      - earring
  - source_sentence: brown smoked salmon wrap + small orange juice
    sentences:
      - deli
      - ' orange juice'
      - boncafe puro arabica nespresso compatible
  - source_sentence: black seed oil
    sentences:
      - essentials lip gloss temptation - rusty brown
      - essential oils
      - natural black seed oil
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy
model-index:
  - name: SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
    results:
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: Unknown
          type: unknown
        metrics:
          - type: cosine_accuracy
            value: 0.9774984121322632
            name: Cosine Accuracy
---

SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: https://www.sbert.net
  • Repository: https://github.com/UKPLab/sentence-transformers
  • Hugging Face: https://huggingface.co/models?library=sentence-transformers

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
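
Concretely, module (0) is a BERT encoder, module (1) mean-pools token embeddings using the attention mask, and module (2) L2-normalizes the pooled vector. A minimal sketch of the same computation using transformers directly, assuming this repository's standard transformers-compatible weights:

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

repo = "LamaDiab/MiniLM-v2-v36-overlapbatch-SemanticEngine"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModel.from_pretrained(repo)

batch = tokenizer(["black seed oil"], padding=True, truncation=True,
                  max_length=256, return_tensors="pt")
with torch.no_grad():
    token_embeddings = model(**batch).last_hidden_state  # (1, seq_len, 384)

# Mean pooling: average token embeddings, ignoring padding positions
mask = batch["attention_mask"].unsqueeze(-1).float()
embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
embedding = F.normalize(embedding, p=2, dim=1)  # unit length, so dot product == cosine
print(embedding.shape)  # torch.Size([1, 384])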

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("LamaDiab/MiniLM-v2-v36-overlapbatch-SemanticEngine")
# Run inference
sentences = [
    'black seed oil',
    'natural black seed oil',
    'essentials lip gloss temptation - rusty brown',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9524, 0.1949],
#         [0.9524, 1.0000, 0.1884],
#         [0.1949, 0.1884, 1.0000]])
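
Because the final Normalize() module makes every embedding unit length, cosine similarity ranking can be applied directly for semantic search over a product catalog. A small sketch (the catalog entries below are made up for illustration):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("LamaDiab/MiniLM-v2-v36-overlapbatch-SemanticEngine")

# Hypothetical catalog; in practice these would be your item names
catalog = ["emerald earrings", "natural black seed oil", "colorful slipper"]
catalog_embeddings = model.encode(catalog)

query_embedding = model.encode(["kids' neon green slippers"])
scores = model.similarity(query_embedding, catalog_embeddings)  # shape (1, 3)
best = scores.argmax().item()
print(catalog[best], scores[0, best].item())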

Evaluation

Metrics

Triplet

Metric           Value
cosine_accuracy  0.9775
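
Cosine accuracy is the fraction of (anchor, positive, negative) triplets for which the anchor's cosine similarity to the positive exceeds its similarity to the negative. A sketch of recomputing the metric with TripletEvaluator, using hypothetical triplets in place of the real 9,466-row evaluation set:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

model = SentenceTransformer("LamaDiab/MiniLM-v2-v36-overlapbatch-SemanticEngine")

# Hypothetical triplets for demonstration only
evaluator = TripletEvaluator(
    anchors=["black seed oil"],
    positives=["natural black seed oil"],
    negatives=["essentials lip gloss temptation - rusty brown"],
    name="demo",
)
print(evaluator(model))  # e.g. {'demo_cosine_accuracy': 1.0}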

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,339,067 training samples
  • Columns: anchor, positive, and itemCategory
  • Approximate statistics based on the first 1000 samples:
              anchor              positive            itemCategory
    type      string              string              string
    details   min: 3 tokens       min: 3 tokens       min: 3 tokens
              mean: 10.44 tokens  mean: 6.12 tokens   mean: 4.03 tokens
              max: 30 tokens      max: 25 tokens      max: 9 tokens
  • Samples:
    anchor                                      positive             itemCategory
    adults tie-dye pant                         pants                trousers
    manicure remover vanilla fruit fragrances   nail polish remover  nailcare
    canvas frame painting acrylic colors 5      canvas               painting
  • Loss: MultipleNegativesSymmetricRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
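
As a sketch, this loss can be instantiated with the parameters above as follows; unlike the asymmetric variant, it scores in-batch negatives in both the anchor-to-positive and positive-to-anchor directions:

from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
loss = losses.MultipleNegativesSymmetricRankingLoss(
    model,
    scale=20.0,                   # multiplier on similarities before softmax
    similarity_fct=util.cos_sim,  # cosine similarity between embeddings
)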
    

Evaluation Dataset

Unnamed Dataset

  • Size: 9,466 evaluation samples
  • Columns: anchor, positive, negative, and itemCategory
  • Approximate statistics based on the first 1000 samples:
              anchor             positive           negative           itemCategory
    type      string             string             string             string
    details   min: 3 tokens      min: 2 tokens      min: 3 tokens      min: 3 tokens
              mean: 9.65 tokens  mean: 5.83 tokens  mean: 9.09 tokens  mean: 3.82 tokens
              max: 32 tokens     max: 131 tokens    max: 34 tokens     max: 9 tokens
  • Samples:
    anchor                                     positive         negative                          itemCategory
    extra bubblemint sugar free chewing gum    extra gum        céleste belgian chocolate sablé   sweet
    golden pothos                              evergreen plant  spider-man action figure          plant
    effortless style slit linen pants - beige  soft pants       the one lilac                     trousers
  • Loss: MultipleNegativesSymmetricRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • learning_rate: 3e-05
  • weight_decay: 0.01
  • warmup_ratio: 0.1
  • fp16: True
  • dataloader_num_workers: 1
  • dataloader_prefetch_factor: 2
  • dataloader_persistent_workers: True
  • push_to_hub: True
  • hub_model_id: LamaDiab/MiniLM-v2-v36-overlapbatch-SemanticEngine
  • hub_strategy: all_checkpoints
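
A sketch of how these values map onto SentenceTransformerTrainingArguments (output_dir is a placeholder):

from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder
    eval_strategy="steps",
    per_device_train_batch_size=256,
    per_device_eval_batch_size=256,
    learning_rate=3e-5,
    weight_decay=0.01,
    warmup_ratio=0.1,
    fp16=True,
    dataloader_num_workers=1,
    dataloader_prefetch_factor=2,
    dataloader_persistent_workers=True,
    push_to_hub=True,
    hub_model_id="LamaDiab/MiniLM-v2-v36-overlapbatch-SemanticEngine",
    hub_strategy="all_checkpoints",
)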

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 3e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 1
  • dataloader_prefetch_factor: 2
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: True
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: LamaDiab/MiniLM-v2-v36-overlapbatch-SemanticEngine
  • hub_strategy: all_checkpoints
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}
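
Putting the pieces together, a minimal sketch of the training setup under these settings; the two-row dataset is a stand-in for the real 1,339,067-sample data, which additionally carries an itemCategory column:

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Stand-in rows taken from the sample table above
train_dataset = Dataset.from_dict({
    "anchor": ["adults tie-dye pant", "manicure remover vanilla fruit fragrances"],
    "positive": ["pants", "nail polish remover"],
})

loss = losses.MultipleNegativesSymmetricRankingLoss(model, scale=20.0)
args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder
    num_train_epochs=3,
    per_device_train_batch_size=256,
    fp16=True,
)

trainer = SentenceTransformerTrainer(
    model=model, args=args, train_dataset=train_dataset, loss=loss
)
trainer.train()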

Training Logs

Epoch    Step    Training Loss    Validation Loss    cosine_accuracy
0.0002   1       2.4739           -                  -
0.1912   1000    1.904            1.0658             0.9567
0.3823   2000    1.4035           0.9598             0.9651
0.5735   3000    1.1812           0.9080             0.9709
0.7647   4000    1.1325           0.8635             0.9729
0.9558   5000    1.6713           0.8611             0.9725
1.1470   6000    1.5416           0.8358             0.9737
1.3380   7000    1.1634           0.8124             0.9753
1.5291   8000    1.1446           0.8198             0.9762
1.7202   9000    1.1326           0.8060             0.9760
1.9113   10000   1.1111           0.8110             0.9765
2.1024   11000   1.062            0.8092             0.9778
2.2935   12000   1.0553           0.8073             0.9771
2.4846   13000   1.0477           0.8019             0.9777
2.6757   14000   1.0424           0.8012             0.9778
2.8668   15000   1.0563           0.8009             0.9775

Framework Versions

  • Python: 3.11.13
  • Sentence Transformers: 5.1.2
  • Transformers: 4.53.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.9.0
  • Datasets: 4.4.1
  • Tokenizers: 0.21.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}