File size: 5,168 Bytes
66fe6a9 952b903 66fe6a9 fd265eb 952b903 66fe6a9 fd265eb 66fe6a9 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 |
---
base_model: mistralai/Mistral-7B-Instruct-v0.3
library_name: transformers
model_name: Doctor_AI_LoRA-Mistral-7B-Instructritvik77
tags:
- generated_from_trainer
- trl
- medical
- Doctor
- PEFT
- MEDICAL
- AIMEDICAL
- DOCTORai
licence: license
license: apache-2.0
datasets:
- FreedomIntelligence/medical-o1-reasoning-SFT
pipeline_tag: text-generation
---
# Model Card for Doctor_AI_LoRA-Mistral-7B-Instructritvik77
This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3).
It has been trained using [TRL](https://github.com/huggingface/trl).
## Quick start
```python
# from peft import PeftModel, PeftConfig
# from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
# from datasets import load_dataset
# import torch
# # Quantization config for 4-bit loading
# bnb_config = BitsAndBytesConfig(
# load_in_4bit=True,
# bnb_4bit_quant_type="nf4",
# bnb_4bit_compute_dtype=torch.bfloat16,
# bnb_4bit_use_double_quant=True,
# )
# # Repo ID for the PEFT model
# peft_model_id = f"{username}/{output_dir}" # e.g., ritvik77/Mixtral-7B-LoRA-Salesforce-Optimized-AI-AgentCall
# device = "auto"
# # Load PEFT config from the Hub
# config = PeftConfig.from_pretrained(peft_model_id)
# # Load the base model (e.g., Mistral-7B) with quantization
# model = AutoModelForCausalLM.from_pretrained(
# config.base_model_name_or_path, # Base model ID stored in PEFT config
# device_map="auto",
# quantization_config=bnb_config, # Apply 4-bit quantization
# )
# # Load tokenizer from the PEFT model repo
# tokenizer = AutoTokenizer.from_pretrained(peft_model_id)
# # Resize token embeddings to match tokenizer (if needed)
# model.resize_token_embeddings(len(tokenizer))
# # Load PEFT adapters and apply them to the base model
# model = PeftModel.from_pretrained(model, peft_model_id)
# # Convert model to bfloat16 and set to evaluation mode
# model.to(torch.bfloat16)
# model.eval()
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel, PeftConfig
# β
Quantization config for 4-bit loading (Memory Optimization)
bnb_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_quant_type="nf4", # β
Improved precision for LoRA weights
bnb_4bit_compute_dtype=torch.bfloat16,
bnb_4bit_use_double_quant=True, # β
Reduces VRAM overhead
)
# β
Load tokenizer from fine-tuned checkpoint (Ensures token consistency)
peft_model_id = "ritvik77/Doctor_AI_LoRA-Mistral-7B-Instructritvik77"
tokenizer = AutoTokenizer.from_pretrained(peft_model_id, trust_remote_code=True)
# β
Ensure `pad_token` is correctly assigned
if tokenizer.pad_token is None:
tokenizer.pad_token = tokenizer.eos_token
# β
Load Base Model with Quantization for Memory Efficiency
model_name = "mistralai/Mistral-7B-Instruct-v0.3"
model = AutoModelForCausalLM.from_pretrained(
model_name,
device_map="auto", # β
Efficiently maps to available GPUs
quantization_config=bnb_config, # β
Efficient quantization for large models
torch_dtype=torch.bfloat16
)
# β
Resize Token Embeddings BEFORE Loading LoRA Adapter (Prevents size mismatch)
model.resize_token_embeddings(len(tokenizer))
# β
Load PEFT Adapter (LoRA Weights)
model = PeftModel.from_pretrained(model, peft_model_id)
# β
Unfreeze LoRA layers to ensure they are trainable
for name, param in model.named_parameters():
if "lora" in name:
param.requires_grad = True
# β
Confirm LoRA Layers Are Active
if hasattr(model, 'print_trainable_parameters'):
model.print_trainable_parameters()
else:
print("β Warning: LoRA adapter may not have loaded correctly.")
# β
Ensure model is in evaluation mode for inference
model.eval()
# β
Sample Inference Code
def generate_response(prompt, max_new_tokens=300, temperature=0.7):
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
**inputs,
max_new_tokens=max_new_tokens,
do_sample=True,
temperature=temperature
)
return tokenizer.decode(outputs[0], skip_special_tokens=True)
# β
Sample Prompt for Medical Diagnosis
prompt = "Patient reports chest pain and shortness of breath. What might be the diagnosis?"
response = generate_response(prompt)
print("\nπ©Ί **Diagnosis:**", response)
print("π PEFT model loaded successfully with resized embeddings!")
## Training procedure
This model was trained with SFT.
### Framework versions
- TRL: 0.15.2
- Transformers: 4.48.3
- Pytorch: 2.5.1+cu124
- Datasets: 3.3.2
- Tokenizers: 0.21.0
## Citations
Cite TRL as:
```bibtex
@misc{vonwerra2022trl,
title = {{TRL: Transformer Reinforcement Learning}},
author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin GallouΓ©dec},
year = 2020,
journal = {GitHub repository},
publisher = {GitHub},
howpublished = {\url{https://github.com/huggingface/trl}}
}
``` |