MedGo: Medical Large Language Model Based on Qwen3-32B

🎯 Introduction

MedGo is a general-purpose medical large language model fine-tuned from Qwen3-32B for clinical medicine and research scenarios. It is trained on large-scale, multi-source medical corpora and enhanced with complex case data. Supported capabilities include medical Q&A, clinical summarization, clinical reasoning, multi-turn dialogue, and scientific text generation.

🌟 Core Capabilities

  • 📚 Medical Knowledge Q&A: Professional responses based on authoritative medical literature and clinical guidelines
  • 📝 Clinical Documentation: Automated medical record summaries, diagnostic reports, and medical documentation
  • 🔍 Clinical Reasoning: Differential diagnosis, examination recommendations, and treatment suggestions
  • 💬 Multi-turn Dialogue: Patient-doctor interaction simulation and complex case discussions
  • 🔬 Research Support: Literature summarization, research idea generation, and quality control review

✨ Key Features

| Feature | Details |
|---|---|
| Base Architecture | Qwen3-32B |
| Parameters | 32B |
| Domain | Clinical Medicine, Research Support, Healthcare System Integration |
| Fine-tuning Method | SFT + Preference Alignment (DPO/KTO) |
| Data Sources | Authoritative medical literature, clinical guidelines, real cases (anonymized) |
| Deployment | Local deployment, HIS/EMR system integration |
| License | Apache 2.0 |

📊 Performance

MedGo is evaluated on multiple medical and general benchmarks and shows competitive results among 32B-parameter models:

Key Benchmark Results

  • AIMedQA: Medical question answering comprehension
  • CME: Clinical reasoning evaluation
  • DiagnosisArena: Diagnostic capability assessment
  • MedQA / MedMCQA: Medical multiple-choice questions
  • PubMedQA: Biomedical literature Q&A
  • MMLU-Pro: Comprehensive capability evaluation

Performance Comparison

Performance Highlights:

  • ✅ Average Score: ~70 points (competitive within the 32B-parameter class)
  • ✅ Strong Tasks: Clinical reasoning (DiagnosisArena, CME) and multi-turn medical Q&A
  • ✅ Balanced Capability: Solid medical semantic understanding and multi-task generalization

🚀 Quick Start

Requirements

  • Python >= 3.9
  • PyTorch >= 2.0
  • Transformers >= 4.51.0 (earlier releases do not recognize the Qwen3 architecture)
  • CUDA >= 11.8 (for GPU inference)

Installation

# Clone the repository
git clone https://github.com/OpenMedZoo/MedGo.git
cd MedGo

# Install dependencies
pip install -r requirements.txt

Model Download

Download model weights from HuggingFace:

# Using huggingface-cli
huggingface-cli download OpenMedZoo/MedGo --local-dir ./models/MedGo

# Or using git-lfs
git lfs install
git clone https://huggingface.co/OpenMedZoo/MedGo

Basic Inference

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_path = "OpenMedZoo/MedGo"
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    trust_remote_code=True,
    torch_dtype="auto"
)

# Medical Q&A example
messages = [
    {"role": "system", "content": "You are a professional medical assistant. Please answer questions based on medical knowledge."},
    {"role": "user", "content": "What is hypertension and what are the common treatment methods?"}
]

# Generate response
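inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,  # return input_ids together with the attention mask
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True
)

# Decode only the newly generated tokens, skipping the prompt
prompt_length = inputs["input_ids"].shape[1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)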
print(response)
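
Qwen3 base models can emit a reasoning block wrapped in `<think>...</think>` before the final answer. This card does not state whether MedGo keeps that behavior, so the helper below is only a sketch under that assumption. The name `split_thinking` is hypothetical and not part of the repository.

```python
import re
from typing import Tuple

# Assumption: reasoning, when present, is wrapped in <think>...</think> as in Qwen3.
_THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty when no block is found."""
    match = _THINK_BLOCK.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Keep everything outside the reasoning block as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer
```

Calling `reasoning, answer = split_thinking(response)` after decoding would let an application display only `answer` while logging `reasoning` for expert review.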

Batch Inference

# Use the provided inference script
python scripts/inference.py \
    --model_path OpenMedZoo/MedGo \
    --input_file examples/medical_qa.jsonl \
    --output_file results/predictions.jsonl \
    --batch_size 4
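
The input schema expected by `scripts/inference.py` is not documented here, so consult `examples/medical_qa.jsonl` for the real field names. As a rough sketch, a JSONL input file (one JSON object per line) could be prepared as follows. The `id` and `question` fields and the output path are hypothetical placeholders.

```python
import json
from pathlib import Path


def write_jsonl(records, path):
    """Write an iterable of dicts to `path`, one JSON object per line."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        for record in records:
            # ensure_ascii=False keeps Chinese text readable in the file.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path


questions = [
    "What is hypertension and what are the common treatment methods?",
    "What dietary precautions should hypertensive patients take?",
]
# Hypothetical field names; match them to the repository's example file.
records = [{"id": i, "question": q} for i, q in enumerate(questions)]
write_jsonl(records, "examples/my_questions.jsonl")
```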

Accelerated Inference with vLLM

from vllm import LLM, SamplingParams

# Initialize vLLM
llm = LLM(model="OpenMedZoo/MedGo", trust_remote_code=True)
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=512)

# Batch inference
prompts = [
    "What are the symptoms and treatment methods for diabetes?",
    "What dietary precautions should hypertensive patients take?"
]

outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
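
The prompts above are passed as raw strings, so the model's chat template is not applied. A minimal sketch of wrapping each question in the same system/user structure as the Basic Inference example follows. `build_conversations` is a hypothetical helper, and the system prompt is copied from that example.

```python
from typing import Dict, List

SYSTEM_PROMPT = (
    "You are a professional medical assistant. "
    "Please answer questions based on medical knowledge."
)


def build_conversations(
    questions: List[str], system_prompt: str = SYSTEM_PROMPT
) -> List[List[Dict[str, str]]]:
    """Wrap each question in a system + user message pair."""
    return [
        [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ]
        for question in questions
    ]


conversations = build_conversations([
    "What are the symptoms and treatment methods for diabetes?",
    "What dietary precautions should hypertensive patients take?",
])
```

Recent vLLM releases provide `LLM.chat`, which applies the chat template before generation. If your installed version supports a list of conversations, `llm.chat(conversations, sampling_params)` can replace the `llm.generate` call above.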

🔧 Training Details

MedGo employs a two-stage fine-tuning strategy to balance general medical knowledge with clinical task adaptation.

Stage I: General Medical Alignment

Objective: Establish a solid foundation of medical knowledge and improve Q&A standardization

  • Data Sources:

    • Authoritative medical literature (PubMed, medical textbooks)
    • Clinical guidelines and diagnostic standards
    • Medical encyclopedia entries and terminology databases
  • Training Methods:

    • Supervised Fine-Tuning (SFT)
    • Chain-of-Thought (CoT) guided samples
    • Medical terminology alignment and safety constraints

Stage II: Clinical Task Enhancement

Objective: Enhance complex case reasoning and multi-task processing capabilities

  • Data Sources:

    • Real medical records (fully anonymized)
    • Outpatient and emergency records with complex multi-diagnosis samples
    • Research articles and quality control cases
  • Data Augmentation Techniques:

    • Semantic paraphrasing and multi-perspective expansion
    • Complex case synthesis
    • Doctor-patient interaction simulation
  • Training Methods:

    • Multi-Task Learning (medical record summary, differential diagnosis, examination suggestions, etc.)
    • Preference Alignment (DPO/KTO)
    • Expert feedback iterative optimization
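
The preference alignment step can be illustrated with the standard DPO objective for a single (chosen, rejected) response pair. This is a generic sketch of the published loss, not MedGo's training code. The function name and the `beta=0.1` default are assumptions, since this card gives no hyperparameters.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; not taken from the MedGo card
) -> float:
    """DPO loss for one preference pair, given summed sequence log-probabilities."""
    # Implicit rewards: how far the policy has moved from the reference model.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    # -log(sigmoid(margin)) == log(1 + exp(-margin)), computed stably.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and reference model agree on both responses, the margin is zero and the loss equals log 2. The loss drops below that value as the policy raises the chosen response's likelihood relative to the rejected one.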

Training Optimization Focus

  • ✅ Strengthen information extraction and cross-evidence reasoning for complex cases
  • ✅ Improve medical consistency and interpretability of outputs
  • ✅ Improve the compliance and safety of generated wording
  • ✅ Iterate continuously using expert samples and automated evaluation

💡 Use Cases

✅ Suitable Scenarios

| Scenario | Description |
|---|---|
| Clinical Assistance | Preliminary diagnosis suggestions, medical record writing, formatted report generation |
| Research Support | Literature summarization, research idea generation, data analysis assistance |
| Quality Control | Medical document compliance checking, clinical process quality control |
| System Integration | Embedded in HIS/EMR systems to provide intelligent decision support |
| Medical Education | Case discussions, medical knowledge Q&A, clinical reasoning training |

🚫 Unsuitable Scenarios

  • ❌ Cannot Replace Doctors: An auxiliary tool only, not a standalone basis for diagnosis
  • ❌ High-Risk Operations: Not recommended for surgical decisions or other high-risk medical operations
  • ❌ Rare Disease Limitations: May perform poorly on rare diseases outside the training data
  • ❌ Emergency Care: Not suitable for scenarios requiring immediate decisions

⚠️ Limitations & Risks

Model Limitations

  1. Misinterpretation: Despite broad medical coverage, the model may still misunderstand inputs or give incorrect recommendations
  2. Complex Cases: Risk is higher for cases with complex conditions, severe complications, or missing information
  3. Knowledge Currency: Medical knowledge is updated continuously; training data may lag behind
  4. Language Limitation: Designed primarily for Chinese medical scenarios; performance in other languages may vary

Usage Recommendations

  • ⚠️ Use in controlled environments with clinical expert review of generated results
  • ⚠️ Treat model outputs as auxiliary references, not final diagnostic conclusions
  • ⚠️ For sensitive cases or high-risk scenarios, expert consultation is mandatory
  • ⚠️ Deployment requires internal validation, security review, and clinical testing

Data Privacy & Compliance

  • 🔒 Training data is fully anonymized
  • 🔒 Protect patient privacy when using the model
  • 🔒 Production deployment must comply with healthcare data security regulations (e.g., HIPAA, GDPR)
  • 🔒 Local deployment is recommended to avoid transmitting sensitive data

📚 Citation

If MedGo is helpful for your research or project, please cite our work:

@misc{openmedzoo_2025,
    author       = { OpenMedZoo },
    title        = { MedGo (Revision 640a2e2) },
    year         = 2025,
    url          = { https://huggingface.co/OpenMedZoo/MedGo },
    doi          = { 10.57967/hf/7024 },
    publisher    = { Hugging Face }
}

📄 License

This project is licensed under the Apache License 2.0.

Commercial Use Notice:

  • ✅ Commercial use and modification allowed
  • ✅ Original license and copyright notice must be retained
  • ✅ Contact us for technical support when integrating into healthcare systems

🤝 Contributing

We welcome community contributions! Here's how to participate:

Contribution Types

  • 🐛 Submit bug reports
  • 💡 Propose new features
  • 📝 Improve documentation
  • 🔧 Submit code fixes or optimizations
  • 📊 Share evaluation results and use cases

🙏 Acknowledgments

Thanks to all contributors to the MedGo project:

  • Model development and fine-tuning algorithm team
  • Data annotation and quality control team
  • Clinical expert guidance and review team
  • Open-source community support and feedback

Special thanks to:

  • Qwen Team for providing excellent foundation models
  • All healthcare institutions that provided data and feedback

📧 Contact

