---
license: mit
base_model: microsoft/Phi-3-mini-4k-instruct
tags:
  - phi3
  - lora
  - alpaca
  - instruction-tuning
  - fine-tuned
datasets:
  - tatsu-lab/alpaca
language:
  - en
pipeline_tag: text-generation
---

# Phi-3 Mini 4K Instruct - Alpaca LoRA Fine-tuned

This model is a fine-tuned version of [`microsoft/Phi-3-mini-4k-instruct`](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) using LoRA (Low-Rank Adaptation) on the Alpaca instruction-following dataset.

## Model Details

- **Base Model:** `microsoft/Phi-3-mini-4k-instruct`
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Dataset:** `tatsu-lab/alpaca` (52,002 instruction-following examples)
- **Training Duration:** ~1.24 hours
- **Final Training Loss:** 1.0445
- **Average Training Loss:** 1.0311
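
The dataset is available on the Hub and can be inspected with the `datasets` library. This snippet is illustrative, not part of the original training code:

```python
from datasets import load_dataset

# tatsu-lab/alpaca ships a single "train" split with 52,002 rows,
# each containing "instruction", "input", and "output" fields
ds = load_dataset("tatsu-lab/alpaca", split="train")
print(len(ds))  # 52002
print(ds[0])    # inspect one example
```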

## Training Configuration

- **LoRA Rank:** 16
- **LoRA Alpha:** 32
- **LoRA Dropout:** 0.05
- **Target Modules:** `qkv_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
- **Learning Rate:** 1e-5
- **Batch Size:** 2 (with gradient accumulation steps: 8)
- **Epochs:** 1
- **Precision:** bfloat16
- **Gradient Checkpointing:** Enabled
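
For reference, the hyperparameters above correspond roughly to the following `peft` / `transformers` setup. This is a reconstruction from the listed values, not the exact training script; unlisted options (e.g. `bias`, `output_dir`) are assumptions:

```python
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,                      # LoRA rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["qkv_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    bias="none",               # assumed default
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="phi3-mini-alpaca-lora",  # hypothetical output path
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,       # effective batch size of 16 per device
    num_train_epochs=1,
    bf16=True,
    gradient_checkpointing=True,
)
```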

## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True)

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "johnlam90/phi3-mini-4k-instruct-alpaca-lora")
model.eval()

# Format prompt in the Alpaca template used during training
prompt = "Give three tips for staying healthy."
formatted_prompt = f'''### Instruction:
{prompt}

### Response:
'''

# Generate (move inputs onto the model's device)
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=False,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.eos_token_id
    )

# Decode and print only the response portion
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response.split("### Response:")[1].strip())
```
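
If you want to avoid the `peft` dependency at inference time, the adapters can optionally be merged into the base weights; the merged model produces the same outputs as base model + adapters:

```python
# Fold the LoRA deltas into the base weights and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("phi3-mini-4k-instruct-alpaca-merged")  # local path of your choice
tokenizer.save_pretrained("phi3-mini-4k-instruct-alpaca-merged")
```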

## Performance

The model was validated with the following stability and quality checks:

- ✅ NaN clamp protection for stable generation (see the sketch below)
- ✅ Proper bfloat16 precision handling
- ✅ Consistent and coherent responses across multiple test prompts
- ✅ No numerical instabilities observed during training or inference
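
The card does not include the NaN clamp itself; one common way to implement this kind of protection is a custom `LogitsProcessor` that replaces non-finite logits before decoding. This is a minimal sketch, not necessarily the exact mechanism used here:

```python
import torch
from transformers import LogitsProcessor, LogitsProcessorList

class NanClampLogitsProcessor(LogitsProcessor):
    """Replace NaN/inf logits with finite values so generation never crashes."""
    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        return torch.nan_to_num(scores, nan=0.0, posinf=1e4, neginf=-1e4)

# Usage:
# outputs = model.generate(**inputs, logits_processor=LogitsProcessorList([NanClampLogitsProcessor()]))
```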

## Training Details

This model was fine-tuned with careful attention to:

1. **Data Formatting:** Proper Alpaca instruction/input/output structure (see the sketch after this list)
2. **Numerical Stability:** bfloat16 precision and conservative hyperparameters
3. **Memory Efficiency:** Gradient checkpointing and optimized batch sizes
4. **Safety Measures:** NaN protection and proper token handling
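
For reference, the Alpaca structure maps each row's `instruction`, optional `input`, and `output` fields into a single training string. A minimal sketch of that formatting, matching the prompt template shown under Usage (the exact template used in training may differ slightly):

```python
def format_alpaca(example: dict) -> str:
    # Rows with a non-empty "input" get an extra "### Input:" section
    if example.get("input"):
        return (
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n"
            f"### Response:\n{example['output']}"
        )
    return (
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Response:\n{example['output']}"
    )
```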

## License

This model is released under the MIT license, following the base model's licensing terms.