Model Card for BiMediX-Bilingual
Model Details
- Name: BiMediX
- Version: 1.0
- Type: Bilingual Medical Mixture of Experts Large Language Model (LLM)
- Languages: English
- Model Architecture: Mixtral-8x7B-Instruct-v0.1
- Training Data: BiMed1.3M-English, the English portion of the bilingual BiMed1.3M dataset, covering diverse medical interactions.
Intended Use
- Primary Use: Medical interactions in both English and Arabic.
- Capabilities: Multiple-choice question answering (MCQA), closed question answering, and chat.
Getting Started
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BiMediX/BiMediX-Eng"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

text = "Hello BiMediX! I've been experiencing increased tiredness in the past week."
inputs = tokenizer(text, return_tensors="pt")

outputs = model.generate(**inputs, max_new_tokens=500)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
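Because the capabilities listed under Intended Use include MCQA, the `text` given to the tokenizer can also be a formatted multiple-choice question. The following is a minimal sketch of one way to build such a prompt. The helper name `build_mcqa_prompt` and the prompt layout are assumptions made for illustration; they are not a format documented for BiMediX.

```python
from string import ascii_uppercase


def build_mcqa_prompt(question: str, options: list[str]) -> str:
    """Format a multiple-choice question as a single prompt string.

    Hypothetical helper: the layout is an assumption for illustration,
    not a prompt format documented for BiMediX.
    """
    if not 2 <= len(options) <= len(ascii_uppercase):
        raise ValueError("expected between 2 and 26 options")
    lines = [question.strip()]
    # Label the options A, B, C, ... in the order given.
    for letter, option in zip(ascii_uppercase, options):
        lines.append(f"{letter}. {option.strip()}")
    lines.append("Answer with the letter of the correct option.")
    return "\n".join(lines)


text = build_mcqa_prompt(
    "Which vitamin deficiency causes scurvy?",
    ["Vitamin A", "Vitamin B12", "Vitamin C", "Vitamin D"],
)
print(text)
```

The resulting `text` can be passed to `tokenizer(text, return_tensors="pt")` in the same way as the chat-style example.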
Training Procedure
- Dataset: BiMed1.3M-English, a corpus of healthcare-specialized tokens.
- QLoRA Adaptation: Adds learnable low-rank adapter weights to the experts and the routing network, so that only about 4% of the original parameters are trained.
- Training Resources: The model was trained on approximately 288 million tokens from the BiMed1.3M-English corpus.
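To illustrate why low-rank adapters leave only a small share of the parameters trainable, the arithmetic can be sketched as follows. A rank-r adapter on a weight matrix with d_in inputs and d_out outputs adds r × (d_in + d_out) trainable parameters. The matrix shape and rank below are assumptions chosen for illustration (the card does not state the rank), so the printed fraction is not meant to reproduce the 4% figure.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one low-rank adapter.

    The adapter factors the weight update as B @ A, with A of shape
    (rank, d_in) and B of shape (d_out, rank).
    """
    return rank * (d_in + d_out)


def trainable_fraction(d_in: int, d_out: int, rank: int) -> float:
    """Adapter parameters relative to the frozen d_in x d_out matrix."""
    return lora_param_count(d_in, d_out, rank) / (d_in * d_out)


# Assumed values for illustration only; the rank is hypothetical.
D_IN, D_OUT, RANK = 4096, 14336, 16

adapter_params = lora_param_count(D_IN, D_OUT, RANK)
fraction = trainable_fraction(D_IN, D_OUT, RANK)
print(f"adapter parameters: {adapter_params:,}")
print(f"fraction of the frozen matrix: {fraction:.2%}")
```

The overall share for a full model depends on which matrices receive adapters and on the chosen rank.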
Model Performance
- Benchmarks: BiMediX achieves the highest average score (75.4) among the models compared below and leads on seven of the nine individual benchmarks, the exceptions being CMed and MedMCQA. This is attributed to the training techniques and the comprehensive dataset.
| Model | CKG | CBio | CMed | MedGen | ProMed | Ana | MedMCQA | MedQA | PubmedQA | AVG |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PMC-LLaMA-13B | 63.0 | 59.7 | 52.6 | 70.0 | 64.3 | 61.5 | 50.5 | 47.2 | 75.6 | 60.5 |
| Med42-70B | 75.9 | 84.0 | 69.9 | 83.0 | 78.7 | 64.4 | 61.9 | 61.3 | 77.2 | 72.9 |
| Clinical Camel-70B | 69.8 | 79.2 | 67.0 | 69.0 | 71.3 | 62.2 | 47.0 | 53.4 | 74.3 | 65.9 |
| Meditron-70B | 72.3 | 82.5 | 62.8 | 77.8 | 77.9 | 62.7 | 65.1 | 60.7 | 80.0 | 71.3 |
| BiMediX | 78.9 | 86.1 | 68.2 | 85.0 | 80.5 | 74.1 | 62.7 | 62.8 | 80.2 | 75.4 |
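The AVG column is consistent with an unweighted mean of the nine benchmark scores in each row. The short check below recomputes it from the values in the table; the variable names are illustrative.

```python
# Scores copied from the table, in column order CKG through PubmedQA.
scores = {
    "PMC-LLaMA-13B": [63.0, 59.7, 52.6, 70.0, 64.3, 61.5, 50.5, 47.2, 75.6],
    "Med42-70B": [75.9, 84.0, 69.9, 83.0, 78.7, 64.4, 61.9, 61.3, 77.2],
    "Clinical Camel-70B": [69.8, 79.2, 67.0, 69.0, 71.3, 62.2, 47.0, 53.4, 74.3],
    "Meditron-70B": [72.3, 82.5, 62.8, 77.8, 77.9, 62.7, 65.1, 60.7, 80.0],
    "BiMediX": [78.9, 86.1, 68.2, 85.0, 80.5, 74.1, 62.7, 62.8, 80.2],
}

# AVG values as reported in the last column of the table.
reported_avg = {
    "PMC-LLaMA-13B": 60.5,
    "Med42-70B": 72.9,
    "Clinical Camel-70B": 65.9,
    "Meditron-70B": 71.3,
    "BiMediX": 75.4,
}

for model, values in scores.items():
    # Unweighted mean over the nine benchmarks, rounded to one decimal.
    mean = round(sum(values) / len(values), 1)
    print(f"{model}: computed {mean}, reported {reported_avg[model]}")
```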
Safety and Ethical Considerations
- Potential issues: hallucinations, toxicity, stereotypes.
- Usage: Research purposes only.
Authors
Sara Pieri, Sahal Shaji Mullappilly, Fahad Shahbaz Khan, Rao Muhammad Anwer, Salman Khan, Timothy Baldwin, Hisham Cholakkal
Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI)