Qwen2.5-3B-R1-MedicalReasoner
Qwen2.5-3B-R1-MedicalReasoner is a clinical reasoning language model fine-tuned for advanced diagnostic and case-based problem solving. It has been developed for applications in medical education, clinical decision support, and research, with the capability to generate detailed chain-of-thought responses that include both the reasoning process and the final answer.
Overview
- Model Name: Qwen2.5-3B-R1-MedicalReasoner
- Base Architecture: Qwen2.5 (3B)
- Primary Application: Clinical reasoning and medical problem solving
- Key Features:
  - Chain-of-Thought Outputs: Responds with structured reasoning (`<reasoning> ... </reasoning>`) followed by a concise answer (`<answer> ... </answer>`).
  - Multi-Specialty Coverage: Well-suited for scenarios in internal medicine, surgery, pediatrics, OB/GYN, emergency medicine, and more.
  - Explainable AI: Generates detailed, educational explanations that support clinical decision-making.
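A response to a diagnostic question therefore takes the following shape (illustrative placeholder text, not actual model output):

```
<reasoning>
Step-by-step clinical analysis of the case...
</reasoning>
<answer>
Concise final answer
</answer>
```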
Model Capabilities
- Expert-Level Clinical Reasoning: Equipped to analyze complex clinical scenarios and provide in-depth diagnostic reasoning.
- Structured Outputs: Enforces a response format that separates the thought process from the final answer, aiding transparency and interpretability.
- Optimized for Speed: Uses Unsloth and vLLM for fast, efficient inference on GPU systems.
Inference and Usage
Below is an example of how to use the model for inference; alternatively, see inference.py in the repository's Files section:
```python
from unsloth import FastLanguageModel
from vllm import SamplingParams
from huggingface_hub import snapshot_download

# Load the base model with Unsloth's fast (vLLM-backed) inference path.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="iimran/Qwen2.5-3B-R1-MedicalReasoner",
    load_in_4bit=True,
    fast_inference=True,
    gpu_memory_utilization=0.5,
)

# Attach a LoRA configuration matching the adapter's rank.
lora_rank = 64
model = FastLanguageModel.get_peft_model(
    model,
    r=lora_rank,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=lora_rank,
    use_gradient_checkpointing="unsloth",
    random_state=3407,
)

# Download the fine-tuned LoRA adapter.
lora_path = snapshot_download("iimran/Qwen2.5-3B-R1-MedicalReasoner-lora-adapter")
print("LoRA adapter downloaded to:", lora_path)

SYSTEM_PROMPT = (
    "Respond in the following format:\n"
    "<reasoning>\n"
    "...\n"
    "</reasoning>\n"
    "<answer>\n"
    "...\n"
    "</answer>"
)
USER_PROMPT = (
    "In the context of disseminated intravascular coagulation (DIC), "
    "which blood component is expected to show an increase due to the excessive breakdown of fibrin?"
)

text = tokenizer.apply_chat_template(
    [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": USER_PROMPT},
    ],
    tokenize=False,
    add_generation_prompt=True,
)

sampling_params = SamplingParams(
    temperature=0.1,
    top_p=0.95,
    max_tokens=4096,
)

# Pass the loaded adapter to generation; with lora_request=None the
# adapter would be ignored and only the base model would respond.
outputs = model.fast_generate(
    text,
    sampling_params=sampling_params,
    lora_request=model.load_lora(lora_path),
)
print(outputs[0].outputs[0].text)
```
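Because responses follow the fixed `<reasoning>`/`<answer>` template, they can be split programmatically. A minimal sketch (the helper name is my own, and the sample string below is hand-written, not actual model output):

```python
import re

def parse_structured_response(text):
    """Split a templated response into (reasoning, answer).

    Returns None for any section that is missing, so callers can
    detect responses that broke the expected format.
    """
    reasoning = re.search(r"<reasoning>\s*(.*?)\s*</reasoning>", text, re.DOTALL)
    answer = re.search(r"<answer>\s*(.*?)\s*</answer>", text, re.DOTALL)
    return (
        reasoning.group(1) if reasoning else None,
        answer.group(1) if answer else None,
    )

# Example on a hand-written string in the expected format:
sample = (
    "<reasoning>\nFibrinolysis degrades cross-linked fibrin.\n</reasoning>\n"
    "<answer>\nD-dimer\n</answer>"
)
reasoning, answer = parse_structured_response(sample)
print(answer)  # -> D-dimer
```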
Adapter Integration
The LoRA adapter for this model is published in a separate repository for further fine-tuning or experimentation.
- LoRA Adapter Repo: iimran/Qwen2.5-3B-R1-MedicalReasoner-lora-adapter
To download and integrate the LoRA adapter:
```python
from huggingface_hub import snapshot_download

# Download the LoRA adapter repository:
lora_path = snapshot_download("iimran/Qwen2.5-3B-R1-MedicalReasoner-lora-adapter")
print("LoRA adapter downloaded to:", lora_path)

# Load the adapter; pass the result to fast_generate's lora_request argument:
lora_request = model.load_lora(lora_path)
```
Installation
To use this model, install the required packages:
```bash
pip install unsloth vllm trl datasets huggingface-hub
```
A compatible GPU is recommended for optimal performance.
Citation
If you use Qwen2.5-3B-R1-MedicalReasoner in your research, please cite:
```bibtex
@misc{sarwar2025reinforcement,
  author       = {Imran Sarwar and Muhammad Rouf Mustafa},
  title        = {Reinforcement Learning Elevates Qwen2.5-3B Medical Reasoning Performance},
  year         = {2025},
  month        = {Apr},
  day          = {10},
  publisher    = {Imran Sarwar's Blog},
  howpublished = {\url{https://www.imransarwar.com/blog-posts/Reinforcement-Learning-Elevates-Qwen2.5-Medical-Reasoning-Performance.html}},
  note         = {Accessed: 2025-04-09}
}

@misc{Qwen2.5-3B-R1-MedicalReasoner,
  author    = {Imran Sarwar and Muhammad Rouf Mustafa},
  title     = {Qwen 2.5-3B Meets Deepseek R1: A Fine-Tuned Medical Reasoning Model for Enhanced Diagnostics},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/iimran/Qwen2.5-3B-R1-MedicalReasoner}
}
```
Disclaimer
This model is intended for research and educational purposes only. It should not be used as the sole basis for clinical decision-making. All outputs should be validated by qualified healthcare professionals.
Base model: Qwen/Qwen2.5-3B