Integrate with Sentence Transformers v5.4
Hello!
Pull Request overview
- Integrate this model with Sentence Transformers v5.4+ so it can be loaded via CrossEncoder("ContextualAI/ctxl-rerank-v2-instruct-multilingual-6b").
Details
This is the 6B sibling PR to https://huggingface.co/ContextualAI/ctxl-rerank-v2-instruct-multilingual-1b/discussions/2, with the same changes in place. There is one exception, however: the BOS token is emitted explicitly inside the chat template. Unlike the 1B and 2B, this model has "add_bos_token": true and "bos_token": "<s>", so I've included it in the chat template.
import torch
from sentence_transformers import CrossEncoder
model = CrossEncoder("ContextualAI/ctxl-rerank-v2-instruct-multilingual-6b", model_kwargs={"dtype": torch.bfloat16}, revision="refs/pr/1")
query = "What are the health benefits of exercise?"
instruction = "Prioritize recent medical research"
documents = [
"Regular exercise reduces risk of heart disease and improves mental health.",
"A 2024 study shows exercise enhances cognitive function in older adults.",
"Ancient Greeks valued physical fitness for military training.",
]
pairs = [(query, doc) for doc in documents]
scores = model.predict(pairs, prompt=instruction)
print(scores)
# [ -4.6875 -2.171875 -12.4375 ]
rankings = model.rank(query, documents, prompt=instruction)
print(rankings)
# [{'corpus_id': 1, 'score': np.float32(-2.171875)}, {'corpus_id': 0, 'score': np.float32(-4.6875)}, {'corpus_id': 2, 'score': np.float32(-12.4375)}]
You can run this as-is thanks to the revision argument, which loads the model from this PR's branch. After merging, the revision argument is no longer needed.
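To make the relationship between the two outputs in the example explicit, here is a minimal pure-Python sketch that reproduces the ordering printed by model.rank from the scores printed by model.predict. The helper rank_from_scores is hypothetical and only for illustration; it is not a claim about how CrossEncoder.rank is implemented internally.

```python
# Scores copied from the predict() output in the example
# (index i corresponds to documents[i]).
scores = [-4.6875, -2.171875, -12.4375]


def rank_from_scores(scores):
    """Hypothetical helper: sort document indices by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [{"corpus_id": i, "score": scores[i]} for i in order]


rankings = rank_from_scores(scores)
for entry in rankings:
    print(entry)
```

The resulting corpus_id order is 1, 0, 2, matching the rank() output in the example: the 2024 study comes first, the general health statement second, and the historical sentence last.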
Note that none of the old behaviour is affected or changed: this only adds an additional way to run the model in a familiar and common format. The raw AutoModelForCausalLM and vLLM paths already documented in the README continue to work unchanged, and the Sentence Transformers path produces identical bfloat16 scores to those paths for every sample I tested (0.0 diff vs. the README's Transformers baseline on 3/3 examples, with and without an instruction).
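For anyone who wants to repeat the parity check, one simple way to compare the two paths is the maximum absolute difference between their score lists. The sketch below is an assumption about how such a comparison could be done, not the exact script used for this PR. The helper max_abs_diff is hypothetical, and baseline_scores is a placeholder set equal to the Sentence Transformers scores: replace it with the values from your own run of the README's Transformers snippet.

```python
def max_abs_diff(a, b):
    """Hypothetical helper: largest element-wise absolute difference."""
    if len(a) != len(b):
        raise ValueError("score lists must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


# Scores from the Sentence Transformers example in this PR.
st_scores = [-4.6875, -2.171875, -12.4375]

# PLACEHOLDER, not measured here: substitute the scores produced by
# the README's AutoModelForCausalLM path for the same inputs.
baseline_scores = list(st_scores)

diff = max_abs_diff(st_scores, baseline_scores)
print(f"max abs diff: {diff}")
```

A result of 0.0 means the two paths agree exactly on those inputs. When comparing across different hardware or dtypes, a small tolerance may be more appropriate than exact equality.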
- Tom Aarsen