AI & ML interests
This organization hosts models tuned for generating sentence and text embeddings. They can be used with the sentence-transformers package.
Organization Card
SentenceTransformers 🤗 is a Python framework for state-of-the-art sentence, text and image embeddings.
Install the Sentence Transformers library.
pip install -U sentence-transformers
The usage is as simple as:
from sentence_transformers import SentenceTransformer
# 1. Load a pretrained Sentence Transformer model
model = SentenceTransformer("all-MiniLM-L6-v2")
# The sentences to encode
sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
# 2. Calculate embeddings by calling model.encode()
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# 3. Calculate the embedding similarities
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6660, 0.1046],
#         [0.6660, 1.0000, 0.1411],
#         [0.1046, 0.1411, 1.0000]])
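For intuition about step 3, the sketch below computes a pairwise similarity matrix by hand on toy vectors, using only the standard library. It assumes the model's similarity function is cosine similarity (the library default); the helper names `cosine_similarity` and `similarity_matrix` and the toy vectors are illustrative and not part of the library.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Pairwise cosine similarities, one row per input vector."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy 2-dimensional "embeddings" (assumed values, not model output)
toy_embeddings = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
toy_similarities = similarity_matrix(toy_embeddings)
for row in toy_similarities:
    print([round(value, 4) for value in row])
```

As in the model output above, the diagonal is 1 because every vector is maximally similar to itself, and the matrix is symmetric.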
Hugging Face makes it easy to collaboratively build and showcase your Sentence Transformers models! You can collaborate with your organization and upload and showcase your own models on your profile ❤️
Documentation
Push your Sentence Transformers models to the Hub ❤️
Find all Sentence Transformers models on the 🤗 Hub
To upload your Sentence Transformers models to the Hugging Face Hub, log in with huggingface-cli login and use the push_to_hub method within the Sentence Transformers library.
from sentence_transformers import SentenceTransformer
# Load or train a model
model = SentenceTransformer(...)
# Push to Hub
model.push_to_hub("my_new_model")
A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers
These datasets all have "english" and "non_english" columns of parallel sentences. They can be used to make embedding models multilingual.
sentence-transformers/parallel-sentences-wikititles • 14.7M • 84 • 1
sentence-transformers/parallel-sentences-tatoeba • 8.35M • 3.37k
sentence-transformers/parallel-sentences-talks • 19.6M • 2.71k • 12
sentence-transformers/parallel-sentences-europarl • 49.7M • 879 • 1
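One common way to use such parallel data is a distillation-style objective: a student model is trained so that both the English sentence and its translation land near a teacher model's embedding of the English sentence. The sketch below illustrates only the loss computation, under that assumption. The rows, `toy_teacher`, and `toy_student` are hypothetical stand-ins, not real dataset entries or library functions.

```python
# Hypothetical rows in the "english" / "non_english" column layout
rows = [
    {"english": "Hello world", "non_english": "Hallo Welt"},
    {"english": "Good morning", "non_english": "Guten Morgen"},
]


def toy_teacher(text):
    """Stand-in for a teacher model: a fixed 2-d vector derived from the text."""
    return [len(text) / 10.0, text.count(" ") + 1.0]


def toy_student(text):
    """Stand-in for an untrained student model (deliberately inaccurate)."""
    return [len(text) / 20.0, 1.0]


def mse(u, v):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v)) / len(u)


def distillation_loss(row):
    """Pull both student embeddings toward the teacher's English embedding."""
    target = toy_teacher(row["english"])
    loss_english = mse(toy_student(row["english"]), target)
    loss_non_english = mse(toy_student(row["non_english"]), target)
    return loss_english + loss_non_english


losses = [distillation_loss(row) for row in rows]
print(losses)
```

In real training the stand-ins would be replaced by actual models and the loss minimized with an optimizer; the sketch only shows how the two columns enter the objective.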
Models (127)
sentence-transformers/embeddinggemma-300m-medical • Sentence Similarity • 0.3B • 13.6k • 36
sentence-transformers/paraphrase-multilingual-mpnet-base-v2 • Sentence Similarity • 0.3B • 6.52M • 428
sentence-transformers/stsb-mpnet-base-v2 • Sentence Similarity • 0.1B • 10.8k • 13
sentence-transformers/paraphrase-mpnet-base-v2 • Sentence Similarity • 0.1B • 1.59M • 46
sentence-transformers/nli-mpnet-base-v2 • Sentence Similarity • 0.1B • 120k • 15
sentence-transformers/multi-qa-mpnet-base-dot-v1 • Sentence Similarity • 0.1B • 5.12M • 184
sentence-transformers/multi-qa-mpnet-base-cos-v1 • Sentence Similarity • 0.1B • 671k • 42
sentence-transformers/all-mpnet-base-v1 • Sentence Similarity • 0.1B • 12.8k • 12
sentence-transformers/all-mpnet-base-v2 • Sentence Similarity • 0.1B • 23M • 1.2k
sentence-transformers/average_word_embeddings_levy_dependency • Sentence Similarity
Datasets (89)
sentence-transformers/msmarco-scores-ms-marco-MiniLM-L6-v2 • 241M • 89 • 2
sentence-transformers/msmarco • 527M • 1.54k • 6
sentence-transformers/msmarco-msmarco-MiniLM-L6-v3 • 80.6M • 796 • 4
sentence-transformers/NanoTouche2020-bm25 • 5.84k • 54 • 1
sentence-transformers/NanoSciFact-bm25 • 3.02k • 306
sentence-transformers/NanoArguAna-bm25 • 3.74k • 72
sentence-transformers/NanoSCIDOCS-bm25 • 2.31k • 313
sentence-transformers/NanoQuoraRetrieval-bm25 • 5.15k • 274
sentence-transformers/NanoNQ-bm25 • 5.14k • 611
sentence-transformers/NanoNFCorpus-bm25 • 3.05k • 575