
berg-embed/Qwen3_1.7B_Hessian_FSDP_ckpt2600

A sentence-transformers embedding model based on Qwen/Qwen3-1.7B. Maps sentences & paragraphs to a 2048-dimensional dense vector space using last-token pooling and L2 normalization.

Key Details

| Property | Value |
|---|---|
| Base model | Qwen/Qwen3-1.7B |
| Output dimensions | 2048 |
| Max sequence length | 4096 tokens |
| Pooling | Last token |
| Normalization | L2 |
| Similarity | Cosine |
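Because the output embeddings are L2-normalized, cosine similarity reduces to a plain dot product. A minimal sketch with stand-in 2-D vectors (real embeddings are 2048-dimensional):

```python
import numpy as np

# Stand-in embedding vectors for illustration only.
a = np.array([3.0, 4.0])
b = np.array([4.0, 3.0])

# L2-normalize, as the model's final Normalize() module does.
a /= np.linalg.norm(a)
b /= np.linalg.norm(b)

# For unit vectors, cosine similarity is just the dot product.
cos_sim = float(a @ b)
print(cos_sim)  # 0.96
```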

Input Format

This model uses a chat-template format for inputs. Both queries and documents must be wrapped:

```
<|im_start|>instruction
{instruction}
<|im_end|>
<|im_start|>content
{text}
<|im_end|>
```

Usage with SentenceTransformer (recommended)

```bash
pip install -U sentence-transformers
```

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("berg-embed/Qwen3_1.7B_Hessian_FSDP_ckpt2600")

def format_input(instruction: str, text: str) -> str:
    return (
        "<|im_start|>instruction\n"
        f"{instruction}\n"
        "<|im_end|>\n"
        "<|im_start|>content\n"
        f"{text}\n"
        "<|im_end|>"
    )

# Encode queries
queries = [format_input("Retrieve documents that answer this question", "What is photosynthesis?")]
query_emb = model.encode(queries, normalize_embeddings=True)

# Encode documents
docs = [format_input("Represent this document for retrieval", "Photosynthesis is the process by which plants convert sunlight into energy.")]
doc_emb = model.encode(docs, normalize_embeddings=True)

# Compute similarity
similarity = model.similarity(query_emb, doc_emb)
print(similarity)
```

SentenceTransformer handles tokenizer setup (left-padding), last-token pooling, and normalization automatically from the model config.
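In a retrieval setting, the query-document similarity scores are typically turned into a ranking with an argsort. A minimal, model-free sketch using hand-crafted stand-in unit vectors in place of real embeddings:

```python
import numpy as np

# Stand-in 2-D "embeddings" (real ones are 2048-dimensional and L2-normalized).
docs = np.array([[1.0, 0.0],
                 [0.0, 1.0],
                 [1.0, 1.0]])
docs /= np.linalg.norm(docs, axis=1, keepdims=True)

query = np.array([0.1, 1.0])
query /= np.linalg.norm(query)

# Cosine scores reduce to dot products for unit vectors.
scores = docs @ query          # shape (3,)
ranking = np.argsort(-scores)  # document indices, best match first
print(ranking.tolist())  # [1, 2, 0]
```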

Usage with Transformers (AutoModel)

This is how the model is used in our internal evaluation pipeline (BRIGHT benchmark).

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

checkpoint = "berg-embed/Qwen3_1.7B_Hessian_FSDP_ckpt2600"

tokenizer = AutoTokenizer.from_pretrained(checkpoint, padding_side="left", trust_remote_code=True)
model = AutoModel.from_pretrained(checkpoint, trust_remote_code=True, torch_dtype=torch.bfloat16)
model.eval().cuda()

def last_token_pool(hidden_states, attention_mask):
    left_padding = (attention_mask[:, -1].sum() == attention_mask.shape[0])
    if left_padding:
        return hidden_states[:, -1]
    seq_lens = attention_mask.sum(dim=1) - 1
    return hidden_states[torch.arange(hidden_states.size(0), device=hidden_states.device), seq_lens]

def format_input(instruction: str, text: str) -> str:
    return (
        "<|im_start|>instruction\n"
        f"{instruction}\n"
        "<|im_end|>\n"
        "<|im_start|>content\n"
        f"{text}\n"
        "<|im_end|>"
    )

def encode(texts, max_length=4096):
    inputs = tokenizer(texts, padding=True, truncation=True, max_length=max_length, return_tensors="pt").to("cuda")
    with torch.no_grad():
        outputs = model(**inputs)
    emb = last_token_pool(outputs.last_hidden_state, inputs["attention_mask"])
    return F.normalize(emb, p=2, dim=1)

# Encode
queries = [format_input("Retrieve documents that answer this question", "What is photosynthesis?")]
docs = [format_input("Represent this document for retrieval", "Photosynthesis is ...")]

q_emb = encode(queries)
d_emb = encode(docs)
scores = (q_emb @ d_emb.T).tolist()
```

Critical settings for AutoModel usage:

  • padding_side="left" on the tokenizer (required for last-token pooling)
  • last_token_pool() to extract the embedding from the last non-padding token
  • F.normalize() for L2 normalization
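To see why the padding side matters, here is a small model-free sketch of `last_token_pool` on a toy hidden-state tensor (all values invented for illustration). With right padding the last position is a pad token, so pooling must index each sequence's last real token; with left padding the last position is always real:

```python
import torch

def last_token_pool(hidden_states, attention_mask):
    # If every sequence's final position is a real token, the batch is
    # left-padded and the last position can be taken directly.
    left_padding = (attention_mask[:, -1].sum() == attention_mask.shape[0])
    if left_padding:
        return hidden_states[:, -1]
    seq_lens = attention_mask.sum(dim=1) - 1
    return hidden_states[torch.arange(hidden_states.size(0), device=hidden_states.device), seq_lens]

# Toy batch: 2 sequences, 4 positions, hidden size 1, values 0..7.
hidden = torch.arange(8, dtype=torch.float32).reshape(2, 4, 1)

# Right-padded: sequence 0 has 2 real tokens, sequence 1 has 3.
right_mask = torch.tensor([[1, 1, 0, 0],
                           [1, 1, 1, 0]])
# Left-padded masks for the same sequence lengths.
left_mask = torch.tensor([[0, 0, 1, 1],
                          [0, 1, 1, 1]])

# Right padding: pooling indexes the last *real* token per sequence.
print(last_token_pool(hidden, right_mask).flatten().tolist())  # [1.0, 6.0]
# Left padding: the final position is always a real token.
print(last_token_pool(hidden, left_mask).flatten().tolist())   # [3.0, 7.0]
```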

Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 4096, 'do_lower_case': False, 'architecture': 'Qwen3Model'})
  (1): Pooling({'word_embedding_dimension': 2048, 'pooling_mode_lasttoken': True})
  (2): Normalize()
)
```
Model size: 2B parameters (Safetensors, BF16). Fine-tuned from Qwen/Qwen3-1.7B.