---
license: mit
base_model: microsoft/Phi-3-mini-4k-instruct
tags:
- phi3
- lora
- alpaca
- instruction-tuning
- fine-tuned
datasets:
- tatsu-lab/alpaca
language:
- en
pipeline_tag: text-generation
---
# Phi-3 Mini 4K Instruct - Alpaca LoRA Fine-tuned
This model is a fine-tuned version of [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) using LoRA (Low-Rank Adaptation) on the [Alpaca dataset](https://huggingface.co/datasets/tatsu-lab/alpaca).
## Model Details
- **Base Model**: microsoft/Phi-3-mini-4k-instruct
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Dataset**: tatsu-lab/alpaca (52,002 instruction-following examples)
- **Training Duration**: ~1.24 hours
- **Final Training Loss**: 1.0445
- **Average Training Loss**: 1.0311
## Training Configuration
- **LoRA Rank**: 16
- **LoRA Alpha**: 32
- **LoRA Dropout**: 0.05
- **Target Modules**: qkv_proj, o_proj, gate_proj, up_proj, down_proj
- **Learning Rate**: 1e-5
- **Batch Size**: 2 (with gradient accumulation steps: 8)
- **Epochs**: 1
- **Precision**: bfloat16
- **Gradient Checkpointing**: Enabled
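For reference, the hyperparameters above map onto a `peft`/`transformers` setup roughly as follows. This is a minimal sketch reconstructed from the list, not the exact training script; the output directory name is a placeholder.
```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

# LoRA configuration matching the values listed above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["qkv_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model = get_peft_model(base_model, lora_config)

# Training arguments matching the values listed above
training_args = TrainingArguments(
    output_dir="phi3-mini-4k-alpaca-lora",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,  # effective batch size of 16
    num_train_epochs=1,
    bf16=True,
    gradient_checkpointing=True,
)
```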
## Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True)
# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "johnlam90/phi3-mini-4k-instruct-alpaca-lora")
model.eval()
# Format prompt
prompt = "Give three tips for staying healthy."
formatted_prompt = f'''### Instruction:
{prompt}
### Response:
'''
# Generate
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=False,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.eos_token_id
    )
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response.split("### Response:")[1].strip())
```
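If you want a single standalone checkpoint instead of base model plus adapters, the LoRA weights can be merged into the base model with `peft`'s `merge_and_unload()`. A short sketch, continuing from the snippet above; the save path is a placeholder:
```python
# Merge the LoRA weights into the base model and drop the PEFT wrapper
merged_model = model.merge_and_unload()

# Save a standalone checkpoint (path is a placeholder)
merged_model.save_pretrained("phi3-mini-4k-alpaca-merged")
tokenizer.save_pretrained("phi3-mini-4k-alpaca-merged")
```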
## Performance
The model has been tested with the following safety measures and checks:
- ✅ NaN clamp protection for stable generation
- ✅ Proper bfloat16 precision handling
- ✅ Consistent and coherent responses across multiple test prompts
- ✅ No numerical instabilities during training or inference
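The card does not spell out how the NaN clamp protection is implemented; a generic way to get similar protection with stock `transformers` is the `remove_invalid_values` generation flag, which replaces NaN/inf logits before decoding. The snippet below is an optional alternative safeguard, not necessarily the mechanism used here:
```python
# Optional generic guard against NaN/inf logits during generation
# (uses transformers' InfNanRemoveLogitsProcessor under the hood)
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=False,
    remove_invalid_values=True,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.eos_token_id
)
```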
## Training Details
This model was fine-tuned with careful attention to:
1. **Data Formatting**: Proper Alpaca instruction/input/output structure
2. **Numerical Stability**: Using bfloat16 precision and conservative hyperparameters
3. **Memory Efficiency**: Gradient checkpointing and optimized batch sizes
4. **Safety Measures**: NaN protection and proper token handling
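For point 1, the standard Alpaca layout concatenates the instruction, an optional input, and the output into a single training string. A minimal sketch of that formatting is below; the exact template used for this run is not published, so treat the spacing and field handling as assumptions (it mirrors the prompt shown in the Usage section):
```python
def format_alpaca_example(example: dict) -> str:
    """Format one Alpaca record (instruction/input/output) into a training string."""
    if example.get("input"):
        return (
            f"### Instruction:\n{example['instruction']}\n"
            f"### Input:\n{example['input']}\n"
            f"### Response:\n{example['output']}"
        )
    return (
        f"### Instruction:\n{example['instruction']}\n"
        f"### Response:\n{example['output']}"
    )
```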
## License
This model is released under the MIT license, following the base model's licensing terms.