# TrialChecker-0825
TrialChecker-0825 is a binary text classifier that estimates whether a given clinical trial “space” is a reasonable consideration for a patient, given the patient’s summary.
It is fine-tuned from [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) for sequence classification on pairs of (trial space, patient summary).
Important: This is a research prototype for model development, not a medical device and not intended for clinical decision-making.
## What counts as a “trial space”?
A trial space is a concise description of the target population a trial aims to enroll, focusing on:
- Cancer type & histology
- Burden of disease (curative vs metastatic)
- Prior or excluded treatments
- Required / excluded biomarkers
(Boilerplate exclusion rules—e.g., heart failure, uncontrolled brain mets—are not part of the trial space itself. They can be screened separately by OncoReasoning-3B or BoilerplateChecker-0825 or other logic.)
## Training summary
The classifier was trained with a script that:
- Loads three sources of annotated patient–trial pairs:
  - Pairs originating from space-specific eligibility checks
  - “Patient→top-cohorts” checks (rounds 1–3)
  - “Trial-space→top patients” checks (rounds 1–3)
- Deduplicates by `['patient_summary', 'this_space']`
- Builds the final text input as: `text = this_space + "\nNow here is the patient summary:" + patient_summary`
- Uses `eligibility_result` as the binary label (0/1)
- Model is ModernBERT-large (sequence classification, 2 labels) at `max_length=2048`
## Key hyperparameters from training
- Base model: `answerdotai/ModernBERT-large`
- Max length: 2048
- Optimizer settings: `learning_rate=2e-5`, `weight_decay=0.01`
- Batch size: `per_device_train_batch_size=4`
- Epochs: 2
- Save strategy: `epoch`
- Tokenizer: `AutoTokenizer.from_pretrained("answerdotai/ModernBERT-large")`
- Data collator: `DataCollatorWithPadding`
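For orientation, here is a minimal fine-tuning sketch consistent with these hyperparameters. It is not the original training script: the two in-memory rows are illustrative placeholders, and `output_dir` is an assumed name.

```python
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorWithPadding,
    Trainer,
    TrainingArguments,
)

BASE = "answerdotai/ModernBERT-large"
tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForSequenceClassification.from_pretrained(BASE, num_labels=2)

# Placeholder rows; in practice these come from the merged, deduplicated pairs.
rows = {
    "text": [
        "Cancer type allowed: NSCLC. ...\nNow here is the patient summary: ...",
        "Cancer type allowed: breast cancer. ...\nNow here is the patient summary: ...",
    ],
    "label": [1, 0],
}
train_ds = Dataset.from_dict(rows).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=2048),
    batched=True,
)

# Hyperparameters mirror the list above; output_dir is an assumption.
args = TrainingArguments(
    output_dir="trialchecker-0825",
    learning_rate=2e-5,
    weight_decay=0.01,
    per_device_train_batch_size=4,
    num_train_epochs=2,
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    data_collator=DataCollatorWithPadding(tokenizer),
)
trainer.train()
```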
## Intended use
- Input: a string describing the trial space and a patient summary string
- Output: probability that the trial is a reasonable consideration for that patient (not full eligibility)
Use cases:
- Ranking candidate trial spaces for a patient
- Early triage before detailed eligibility review (including boilerplate exclusions)
Out of scope:
- Confirming formal eligibility or safety
- Clinical decision support
## Inference (Transformers)
### Quick start (single example)
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

device = "cuda" if torch.cuda.is_available() else "cpu"
MODEL_REPO = "ksg-dfci/TrialChecker-0825"

tok = AutoTokenizer.from_pretrained(MODEL_REPO)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_REPO).to(device)
model.eval()

this_space = (
    "Cancer type allowed: non-small cell lung cancer. "
    "Histology allowed: adenocarcinoma. "
    "Cancer burden allowed: metastatic disease. "
    "Prior treatment required: prior platinum-based chemo-immunotherapy allowed. "
    "Biomarkers required: ALK fusion."
)
patient_summary = (
    "Dx 2022 lung adenocarcinoma; metastatic to bone. Prior carbo/pem/pembro "
    "with best PR; ALK fusion detected by NGS. ECOG 1."
)

text = this_space + "\nNow here is the patient summary:" + patient_summary
enc = tok(text, return_tensors="pt", truncation=True, max_length=2048).to(device)

with torch.no_grad():
    logits = model(**enc).logits
probs = logits.softmax(-1).squeeze(0)

# Label mapping was set in training: {0: "NEGATIVE", 1: "POSITIVE"}
p_positive = float(probs[1])
print(f"Reasonable consideration probability: {p_positive:.3f}")
```
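If in doubt about label order, `model.config.id2label` can be inspected; the index used for `p_positive` above assumes the `{0: "NEGATIVE", 1: "POSITIVE"}` mapping noted in the code comment.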
### Batched scoring
```python
from typing import List

import torch


def score_pairs(spaces: List[str], summaries: List[str], tokenizer, model, max_length=2048, batch_size=8):
    assert len(spaces) == len(summaries)
    device = next(model.parameters()).device
    scores = []
    for i in range(0, len(spaces), batch_size):
        batch_spaces = spaces[i:i + batch_size]
        batch_summaries = summaries[i:i + batch_size]
        texts = [s + "\nNow here is the patient summary:" + p for s, p in zip(batch_spaces, batch_summaries)]
        enc = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=max_length).to(device)
        with torch.no_grad():
            logits = model(**enc).logits
        probs = logits.softmax(-1)[:, 1]  # POSITIVE
        scores.extend(probs.detach().cpu().tolist())
    return scores


# Example
spaces = [this_space] * 3
summaries = [patient_summary, "Different summary 1...", "Different summary 2..."]
scores = score_pairs(spaces, summaries, tok, model)
print(scores)
```
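Building on `score_pairs`, ranking candidate trial spaces for a single patient (the "ranking" use case above) might look like the following sketch; the extra space strings are illustrative placeholders only.

```python
# Rank candidate trial spaces for one patient by POSITIVE probability.
candidate_spaces = [
    this_space,
    "Cancer type allowed: breast cancer. Cancer burden allowed: metastatic disease. ...",
    "Cancer type allowed: colorectal cancer. Biomarkers required: KRAS G12C. ...",
]
space_scores = score_pairs(candidate_spaces, [patient_summary] * len(candidate_spaces), tok, model)
for space, score in sorted(zip(candidate_spaces, space_scores), key=lambda x: x[1], reverse=True):
    print(f"{score:.3f}  {space[:60]}")
```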
## Thresholding & calibration
- Default decision: 0.5 on the POSITIVE probability.
- For better calibration/operating points, tune the threshold on a validation set (e.g., maximize F1, optimize Youden’s J, or set to a desired precision).
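As one possible recipe, the sketch below picks an F1-maximizing threshold on held-out pairs. It assumes scikit-learn is available, and `val_spaces`, `val_summaries`, and `val_labels` are placeholders for your own labeled validation data.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# val_spaces / val_summaries / val_labels are placeholders for labeled validation pairs (labels 0/1).
val_probs = np.array(score_pairs(val_spaces, val_summaries, tok, model))
precision, recall, thresholds = precision_recall_curve(val_labels, val_probs)
f1 = 2 * precision * recall / np.clip(precision + recall, 1e-12, None)
best = f1[:-1].argmax()  # the last precision/recall point has no corresponding threshold
print(f"Best F1 {f1[best]:.3f} at threshold {thresholds[best]:.3f}")
```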
## How to prepare inputs
- Trial space: a compact description of the target population, including cancer type/histology, disease burden (metastatic vs. curative), prior/forbidden treatments, and required/excluded biomarkers.
- Patient summary: a concise longitudinal summary of diagnosis, histology, current disease burden, biomarkers, and treatment history.
You can generate these inputs with your upstream LLM pipeline (e.g., OncoReasoning-3B for summarization and space extraction), but the classifier accepts any plain strings in the format shown above.
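Because inputs are truncated at 2048 tokens, it can be worth checking whether a long patient summary will be cut off before scoring. A quick check with the tokenizer loaded in the quick start:

```python
# Count tokens in the composed input to see whether truncation at 2048 would drop text.
text = this_space + "\nNow here is the patient summary:" + patient_summary
n_tokens = len(tok(text, truncation=False)["input_ids"])
if n_tokens > 2048:
    print(f"Input is {n_tokens} tokens; the end of the patient summary will be truncated.")
```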
## Reproducibility (high-level)
Below is the minimal structure used by the training script to build the dataset before tokenization:
```python
# 1) Load and merge three labeled sources
#    - space_specific_eligibility_checks.parquet
#    - top_ten_cohorts_checked_round{1,2,3}.csv
#    - top_twenty_patients_checked_round{1,2,3}.csv
# 2) Deduplicate by ['patient_summary','this_space'] and keep:
#    - split, patient_summary, this_space, eligibility_result
# 3) Compose input text and label:
text = this_space + "\nNow here is the patient summary:" + patient_summary
label = int(eligibility_result)  # 0 or 1
# 4) Tokenize with ModernBERT tokenizer (max_length=2048, truncation=True)
# 5) Train AutoModelForSequenceClassification (2 labels)
```
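A hedged pandas sketch of steps 1–3; file and column names are taken from the comments above and assumed to match the actual data layout.

```python
import glob

import pandas as pd

# 1) Load and merge the three labeled sources (paths are assumptions).
frames = [pd.read_parquet("space_specific_eligibility_checks.parquet")]
frames += [pd.read_csv(path) for path in sorted(glob.glob("top_ten_cohorts_checked_round*.csv"))]
frames += [pd.read_csv(path) for path in sorted(glob.glob("top_twenty_patients_checked_round*.csv"))]
df = pd.concat(frames, ignore_index=True)

# 2) Deduplicate and keep the relevant columns.
df = df.drop_duplicates(subset=["patient_summary", "this_space"])
df = df[["split", "patient_summary", "this_space", "eligibility_result"]]

# 3) Compose the input text and binary label.
df["text"] = df["this_space"] + "\nNow here is the patient summary:" + df["patient_summary"]
df["label"] = df["eligibility_result"].astype(int)
```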
To reproduce exactly, consult and run the original training script.
## Limitations & ethical considerations
- Outputs reflect training data and may contain biases or errors.
- The model estimates reasonableness for consideration, not strict eligibility.
- Not validated for safety-critical use; do not use for diagnosis or treatment decisions.
## Citation
If you use this model or parts of the pipeline, please cite this model card and the training script (ModernBERT TrialChecker fine-tuning).