Phi-3 Mini 4K Instruct - Alpaca LoRA Fine-tuned

This repository provides LoRA (Low-Rank Adaptation) adapters for microsoft/Phi-3-mini-4k-instruct, fine-tuned on the Alpaca instruction-following dataset.

Model Details

  • Base Model: microsoft/Phi-3-mini-4k-instruct
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Dataset: tatsu-lab/alpaca (52,002 instruction-following examples)
  • Training Duration: ~1.24 hours
  • Final Training Loss: 1.0445
  • Average Training Loss: 1.0311

Training Configuration

  • LoRA Rank: 16
  • LoRA Alpha: 32
  • LoRA Dropout: 0.05
  • Target Modules: qkv_proj, o_proj, gate_proj, up_proj, down_proj
  • Learning Rate: 1e-5
  • Batch Size: 2 (with gradient accumulation steps: 8)
  • Epochs: 1
  • Precision: bfloat16
  • Gradient Checkpointing: Enabled
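
The configuration above maps directly onto peft's LoraConfig and the transformers TrainingArguments. The sketch below is an approximate reconstruction rather than the exact training script: the output directory, data loading, and Trainer wiring are assumptions.

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments
import torch

# LoRA settings matching the values listed above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["qkv_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
base.gradient_checkpointing_enable()  # gradient checkpointing: enabled
model = get_peft_model(base, lora_config)

# Trainer hyperparameters matching the values listed above ("phi3-alpaca-lora" is a placeholder path)
training_args = TrainingArguments(
    output_dir="phi3-alpaca-lora",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=1e-5,
    num_train_epochs=1,
    bf16=True,
)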

Usage

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True)

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "johnlam90/phi3-mini-4k-instruct-alpaca-lora")
model.eval()

# Format prompt
prompt = "Give three tips for staying healthy."
formatted_prompt = f'''### Instruction:
{prompt}

### Response:
'''

# Generate
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)  # keep inputs on the model's device
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=False,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response.split("### Response:")[1].strip())
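
If you want to run the model without a peft dependency at inference time, the adapters can be merged into the base weights. This is the standard peft merge pattern rather than anything specific to this repo; the output directory below is a placeholder.

# Merge the LoRA weights into the base model and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("phi3-mini-4k-alpaca-merged")      # placeholder path
tokenizer.save_pretrained("phi3-mini-4k-alpaca-merged")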

Performance

The model has been tested with the following safety measures and checks in place:

  • ✅ NaN clamp protection for stable generation (see the sketch after this list)
  • ✅ Proper bfloat16 precision handling
  • ✅ Consistent and coherent responses across multiple test prompts
  • ✅ No numerical instabilities during training or inference
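
The exact NaN clamp used during testing is not reproduced here. One way to get a similar guard with stock transformers is the remove_invalid_values generation flag, which strips NaN/inf logits before token selection; the call below is the Usage example's generate call with that flag added.

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=False,
        remove_invalid_values=True,  # clamp NaN/inf logits during generation
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.eos_token_id
    )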

Training Details

This model was fine-tuned with careful attention to:

  1. Data Formatting: Proper Alpaca instruction/input/output structure (see the template sketch after this list)
  2. Numerical Stability: Using bfloat16 precision and conservative hyperparameters
  3. Memory Efficiency: Gradient checkpointing and optimized batch sizes
  4. Safety Measures: NaN protection and proper token handling
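
For reference, the standard Alpaca template matches the prompt format shown in the Usage section. The exact formatting function used during training is not included in this repo, so the helper below is an illustrative sketch.

def format_alpaca(example):
    """Render a tatsu-lab/alpaca record (instruction/input/output) into one training string."""
    if example.get("input"):
        return (
            "### Instruction:\n" + example["instruction"] + "\n\n"
            "### Input:\n" + example["input"] + "\n\n"
            "### Response:\n" + example["output"]
        )
    return (
        "### Instruction:\n" + example["instruction"] + "\n\n"
        "### Response:\n" + example["output"]
    )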

License

This model is released under the MIT license, following the base model's licensing terms.
