# Phi-3 Mini 4K Instruct - Alpaca LoRA Fine-tuned
This model is a version of microsoft/Phi-3-mini-4k-instruct fine-tuned with LoRA (Low-Rank Adaptation) on the Alpaca dataset.
## Model Details
- Base Model: microsoft/Phi-3-mini-4k-instruct
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Dataset: tatsu-lab/alpaca (52,002 instruction-following examples; a loading sketch follows this list)
- Training Duration: ~1.24 hours
- Final Training Loss: 1.0445
- Average Training Loss: 1.0311
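
For reference, the tatsu-lab/alpaca dataset listed above can be loaded with the Hugging Face `datasets` library. This is a minimal sketch for inspecting the data, not part of the released training code:

```python
from datasets import load_dataset

# Pull the 52,002-example instruction-following dataset used for fine-tuning
dataset = load_dataset("tatsu-lab/alpaca", split="train")

print(len(dataset))   # expected: 52002
print(dataset[0])     # one record with 'instruction', 'input', and 'output' fields
```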
## Training Configuration
- LoRA Rank: 16
- LoRA Alpha: 32
- LoRA Dropout: 0.05
- Target Modules: qkv_proj, o_proj, gate_proj, up_proj, down_proj
- Learning Rate: 1e-5
- Batch Size: 2 (with gradient accumulation steps: 8)
- Epochs: 1
- Precision: bfloat16
- Gradient Checkpointing: Enabled
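
A minimal sketch of how the hyperparameters above would map onto a `peft` LoraConfig and `transformers` TrainingArguments. The actual training script is not published with this card, so the `bias`, `task_type`, and `output_dir` values below are assumptions, not confirmed settings:

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA adapter settings from the list above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["qkv_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    bias="none",              # assumption: not stated in the card
    task_type="CAUSAL_LM",    # assumption: standard choice for causal LM fine-tuning
)

# Optimization settings from the list above
training_args = TrainingArguments(
    output_dir="phi3-alpaca-lora",   # hypothetical output path
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    bf16=True,
    gradient_checkpointing=True,
)
```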
## Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True
)

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "johnlam90/phi3-mini-4k-instruct-alpaca-lora")
model.eval()

# Format prompt
prompt = "Give three tips for staying healthy."
formatted_prompt = f'''### Instruction:
{prompt}
### Response:
'''

# Generate (move inputs to the model's device to match device_map="auto")
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=False,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.eos_token_id,
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response.split("### Response:")[1].strip())
```
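
Optionally, the LoRA adapters can be merged into the base weights so the model can be served without `peft` at inference time. This uses the standard `merge_and_unload` API; the save path below is a hypothetical example:

```python
# Fold the LoRA weights into the base model and drop the PEFT wrapper
merged_model = model.merge_and_unload()

# Save a standalone checkpoint that loads with AutoModelForCausalLM alone
merged_model.save_pretrained("phi3-mini-4k-instruct-alpaca-merged")
tokenizer.save_pretrained("phi3-mini-4k-instruct-alpaca-merged")
```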
## Performance
The model was tested with the following safety measures and checks in place:
- ✅ NaN clamp protection for stable generation (see the sketch after this list)
- ✅ Proper bfloat16 precision handling
- ✅ Consistent and coherent responses across multiple test prompts
- ✅ No numerical instabilities during training or inference
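
The card does not include the exact NaN-protection code, but `transformers` ships an equivalent guard: passing `remove_invalid_values=True` to `generate` installs an `InfNanRemoveLogitsProcessor` that replaces NaN/inf logits before sampling. The snippet below is a sketch of that option, not the author's original implementation:

```python
# Same call as in the Usage section, with the built-in NaN/inf guard enabled
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=False,
        remove_invalid_values=True,   # clamp NaN/inf logits so generation cannot crash
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.eos_token_id,
    )
```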
## Training Details
This model was fine-tuned with careful attention to:
- Data Formatting: Proper Alpaca instruction/input/output structure (a formatting sketch follows this list)
- Numerical Stability: Using bfloat16 precision and conservative hyperparameters
- Memory Efficiency: Gradient checkpointing and optimized batch sizes
- Safety Measures: NaN protection and proper token handling
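
As a reference for the data formatting point above, Alpaca records are conventionally flattened into a single prompt string matching the `### Instruction:` / `### Response:` layout used in the Usage section. The helper below is a hypothetical sketch, not the author's preprocessing code:

```python
def format_alpaca_example(example: dict) -> str:
    """Render one Alpaca record (instruction/input/output) as a training prompt."""
    if example.get("input"):
        return (
            f"### Instruction:\n{example['instruction']}\n"
            f"### Input:\n{example['input']}\n"
            f"### Response:\n{example['output']}"
        )
    return (
        f"### Instruction:\n{example['instruction']}\n"
        f"### Response:\n{example['output']}"
    )
```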
## License
This model is released under the MIT license, following the base model's licensing terms.