code-reranker-miniLM-staqc

A fine-tuned cross-encoder, based on cross-encoder/ms-marco-MiniLM-L-6-v2, for reranking Python code snippets against natural language queries from Stack Overflow.

Model Description

This model is a cross-encoder trained on the StaQC dataset (Stack Overflow Question-Code pairs) to rerank relevant Python code snippets given a programming question or natural language intent. It is specifically fine-tuned for Python code search and retrieval tasks where accurate relevance scoring is important.

  • Architecture: Cross-Encoder based on MiniLM-L6
  • Base model: cross-encoder/ms-marco-MiniLM-L-6-v2
  • Fine-tuned on: StaQC SCA (Stack Overflow Question-Code) dataset
  • Task: Python code snippet reranking for natural language queries
  • Language: Python code snippets

Use Cases

  • Python code search engines
  • Developer assistants for Python programming
  • AI coding agents with natural language interfaces
  • Evaluation modules in RAG pipelines for Python programming use cases (see the retrieve-and-rerank sketch after this list)
  • Code recommendation systems
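
In a retrieval pipeline, the reranker typically sits behind a fast first-stage retriever. The sketch below assumes a generic bi-encoder (sentence-transformers/all-MiniLM-L6-v2, which is not part of this model) for candidate retrieval and uses this cross-encoder only for the final reranking step:

from sentence_transformers import SentenceTransformer, CrossEncoder, util

retriever = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")  # assumed generic bi-encoder
reranker = CrossEncoder("NamanAgnih0tri/code-reranker-miniLM-staqc")

corpus = [
    "def reverse_string(s):\n    return s[::-1]",
    "int_value = int('123')",
    "with open('file.txt') as f:\n    data = f.read()",
]
corpus_embeddings = retriever.encode(corpus, convert_to_tensor=True)

query = "How to read a file in Python?"
query_embedding = retriever.encode(query, convert_to_tensor=True)

# Stage 1: fast vector search retrieves a shortlist of candidates
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=3)[0]
candidates = [corpus[hit["corpus_id"]] for hit in hits]

# Stage 2: the cross-encoder scores each (query, code) pair for the final ordering
scores = reranker.predict([(query, code) for code in candidates])
best_code, best_score = max(zip(candidates, scores), key=lambda pair: pair[1])
print(f"Best match ({best_score:.4f}):\n{best_code}")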

Evaluation Results

The model was evaluated on 500 query-code candidate examples from the CoNaLa curated dataset.

| Metric         | Value |
|----------------|-------|
| MRR            | 0.938 |
| Top-1 Accuracy | 0.910 |
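
For reference, the reported MRR and Top-1 accuracy can be computed over query/candidate sets along the following lines. This is a sketch of the metric computation only; the exact evaluation script and the CoNaLa candidate construction are not reproduced here:

from sentence_transformers import CrossEncoder

model = CrossEncoder("NamanAgnih0tri/code-reranker-miniLM-staqc")

def evaluate(samples):
    """samples: list of (query, candidate_snippets, index_of_correct_snippet) tuples."""
    reciprocal_ranks, top1_hits = [], 0
    for query, candidates, gold_idx in samples:
        scores = model.predict([(query, code) for code in candidates])
        # Candidate indices sorted by descending relevance score
        ranking = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
        rank = ranking.index(gold_idx) + 1  # 1-based rank of the correct snippet
        reciprocal_ranks.append(1.0 / rank)
        top1_hits += int(rank == 1)
    mrr = sum(reciprocal_ranks) / len(samples)
    top1 = top1_hits / len(samples)
    return mrr, top1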

How to Use

Using sentence-transformers

from sentence_transformers import CrossEncoder

# Load the model
model = CrossEncoder("NamanAgnih0tri/code-reranker-miniLM-staqc")

# Sample input
query = "How to convert a string to int in Python?"
code_snippet = "int_value = int('123')"

# Get relevance score
score = model.predict([query, code_snippet])
print(f"Relevance Score: {score:.4f}")

Using transformers directly

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("NamanAgnih0tri/code-reranker-miniLM-staqc")
model = AutoModelForSequenceClassification.from_pretrained("NamanAgnih0tri/code-reranker-miniLM-staqc")

# Sample input
query = "How to reverse a string in Python?"
code_snippet = "def reverse_string(s):\n    return s[::-1]"

# Tokenize and predict relevance
inputs = tokenizer(query, code_snippet, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits
    score = logits[0].item()

print(f"Relevance Score: {score:.4f}")

Code Ranking Example

from sentence_transformers import CrossEncoder

model = CrossEncoder("NamanAgnih0tri/code-reranker-miniLM-staqc")

def rank_code_snippets(query, candidates):
    """Rank code snippets by relevance to the query."""
    pairs = [[query, code] for code in candidates]
    scores = model.predict(pairs)
    ranked_results = sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)
    return ranked_results

# Example usage
query = "How to reverse a string in Python?"
candidates = [
    "def reverse_string(s):\n    return s[::-1]",
    "print('hello'[::-1])",
    "def add(a,b):\n    return a + b",
    "list = [1,2,3,4]"
]

ranked_results = rank_code_snippets(query, candidates)
for rank, (code, score) in enumerate(ranked_results, 1):
    print(f"{rank}. Score: {score:.4f}\n{code}\n")

Dataset

  • StaQC SCA (Stack Overflow Question-Code pairs)
  • Each pair consists of a natural language programming question and a corresponding Python code snippet
  • Positive and negative pairs were used for contrastive fine-tuning (see the pair-construction sketch below)
  • Dataset contains 85,294 training examples
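
A minimal sketch of how such question-code pairs can be turned into positive and negative training examples. The random negative-sampling strategy and field layout here are illustrative assumptions, not the exact preprocessing used for this model:

import random
from sentence_transformers import InputExample

def build_examples(question_code_pairs):
    """question_code_pairs: list of (question, code) tuples, e.g. from StaQC."""
    examples = []
    all_codes = [code for _, code in question_code_pairs]
    for question, code in question_code_pairs:
        # Positive pair: the question with its own answer snippet
        examples.append(InputExample(texts=[question, code], label=1.0))
        # Negative pair: the same question with a randomly sampled unrelated snippet
        negative = random.choice(all_codes)
        if negative != code:
            examples.append(InputExample(texts=[question, negative], label=0.0))
    return examples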

Training Details

  • Base Model: cross-encoder/ms-marco-MiniLM-L-6-v2
  • Optimizer: AdamW
  • Epochs: 3
  • Batch size: 8
  • Learning rate: 2e-5
  • Loss: Cosine Similarity Loss
  • Training samples: 170,588 (including negative samples)
  • Warmup steps: 10% of total training steps
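
Under these hyperparameters, the fine-tuning run can be approximated with the classic sentence-transformers CrossEncoder.fit API. This is a sketch only: the tiny inline dataset is a placeholder, and the loss function is left at the library default rather than the cosine similarity loss listed above:

import math
import torch
from torch.utils.data import DataLoader
from sentence_transformers import CrossEncoder, InputExample

# Placeholder training data; in practice, use the StaQC pairs built as in the Dataset sketch
train_examples = [
    InputExample(texts=["How to reverse a string in Python?", "s[::-1]"], label=1.0),
    InputExample(texts=["How to reverse a string in Python?", "a + b"], label=0.0),
]

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2", num_labels=1)

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=8)
num_epochs = 3
warmup_steps = math.ceil(len(train_dataloader) * num_epochs * 0.1)  # 10% of training steps

model.fit(
    train_dataloader=train_dataloader,
    epochs=num_epochs,
    warmup_steps=warmup_steps,
    optimizer_class=torch.optim.AdamW,
    optimizer_params={"lr": 2e-5},
    output_path="code-reranker-miniLM-staqc",
)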

Model Performance Comparison

| Model                                   | MRR   | Top-1 Accuracy |
|-----------------------------------------|-------|----------------|
| code-reranker-miniLM-staqc              | 0.938 | 0.910          |
| cross-encoder/ms-marco-MiniLM-L-6-v2    | 0.895 | 0.844          |
| cross-encoder/ms-marco-TinyBERT-L-2-v2  | 0.823 | 0.756          |

Limitations

  • Trained specifically on Python code snippets; may not generalize well to other programming languages
  • Model is relatively small; performance may lag behind larger rerankers on complex queries
  • Fine-tuned on Stack Overflow-like questions; may not generalize to code from other domains

Citation

If you use this model in your work, please cite it as:

@misc{code-reranker-miniLM-staqc,
  title={Code Reranker using MiniLM and StaQC for Python Code Search},
  author={Naman Agnihotri},
  year={2025},
  howpublished={\url{https://huggingface.co/NamanAgnih0tri/code-reranker-miniLM-staqc}}
}

Author

Naman Agnihotri (NamanAgnih0tri)

License

This model is licensed under the Apache 2.0 License.
