---
license: mit
base_model: microsoft/Phi-3-mini-4k-instruct
tags:
- phi3
- lora
- alpaca
- instruction-tuning
- fine-tuned
datasets:
- tatsu-lab/alpaca
language:
- en
pipeline_tag: text-generation
---

# Phi-3 Mini 4K Instruct - Alpaca LoRA Fine-tuned

This model is a fine-tuned version of [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) using LoRA (Low-Rank Adaptation) on the [Alpaca dataset](https://huggingface.co/datasets/tatsu-lab/alpaca).

## Model Details

- **Base Model**: microsoft/Phi-3-mini-4k-instruct
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Dataset**: tatsu-lab/alpaca (52,002 instruction-following examples)
- **Training Duration**: ~1.24 hours
- **Final Training Loss**: 1.0445
- **Average Training Loss**: 1.0311

## Training Configuration

- **LoRA Rank**: 16
- **LoRA Alpha**: 32
- **LoRA Dropout**: 0.05
- **Target Modules**: qkv_proj, o_proj, gate_proj, up_proj, down_proj
- **Learning Rate**: 1e-5
- **Batch Size**: 2 (with gradient accumulation steps: 8)
- **Epochs**: 1
- **Precision**: bfloat16
- **Gradient Checkpointing**: Enabled
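
The training script itself is not included in this repository. As a minimal sketch, the configuration above roughly corresponds to the following PEFT/Transformers setup; `output_dir` and `logging_steps` are illustrative assumptions, and note that Phi-3 fuses the MLP gate/up projections into a single `gate_up_proj` module, so the target-module list may need adjusting for your model revision.

```python
import torch
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

# LoRA adapter configuration matching the values listed above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    # Target modules as listed on this card; Phi-3 uses a fused gate_up_proj,
    # so gate_proj/up_proj may need to be replaced with "gate_up_proj".
    target_modules=["qkv_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()

# Trainer arguments matching the values listed above
# (output_dir and logging_steps are assumptions).
training_args = TrainingArguments(
    output_dir="phi3-mini-4k-instruct-alpaca-lora",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=1e-5,
    num_train_epochs=1,
    bf16=True,
    gradient_checkpointing=True,
    logging_steps=10,
)
```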

## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True)

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "johnlam90/phi3-mini-4k-instruct-alpaca-lora")
model.eval()

# Format prompt
prompt = "Give three tips for staying healthy."
formatted_prompt = f'''### Instruction:
{prompt}

### Response:
'''

# Generate (move inputs to the model's device, since device_map="auto" may place the model on GPU)
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=False,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response.split("### Response:")[1].strip())
```

## Performance

The model was validated with the following safeguards and checks:
- ✅ NaN clamp protection for stable generation (see the sketch after this list)
- ✅ Proper bfloat16 precision handling
- ✅ Consistent and coherent responses across multiple test prompts
- ✅ No numerical instabilities during training or inference
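
The card does not specify how the NaN clamp is implemented. One common pattern at inference time is a custom `LogitsProcessor` that replaces non-finite logits before decoding; the class below is an illustrative assumption, not the exact safeguard used for this model.

```python
import torch
from transformers import LogitsProcessor, LogitsProcessorList

class NanClampLogitsProcessor(LogitsProcessor):
    """Replace NaN/inf logits with finite values so decoding never sees them."""

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        return torch.nan_to_num(scores, nan=0.0, posinf=1e4, neginf=-1e4)

# Usage with the generation call from the Usage section:
# outputs = model.generate(**inputs, logits_processor=LogitsProcessorList([NanClampLogitsProcessor()]))
```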

## Training Details

This model was fine-tuned with careful attention to:
1. **Data Formatting**: Proper Alpaca instruction/input/output structure (see the sketch after this list)
2. **Numerical Stability**: Using bfloat16 precision and conservative hyperparameters
3. **Memory Efficiency**: Gradient checkpointing and optimized batch sizes
4. **Safety Measures**: NaN protection and proper token handling
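
The preprocessing script is not included here. Below is a minimal sketch of the Alpaca instruction/input/output formatting, assuming the same prompt template shown in the Usage section (the `format_alpaca` helper name is an assumption):

```python
from datasets import load_dataset

def format_alpaca(example: dict) -> dict:
    """Render one Alpaca record into the prompt format used at inference time."""
    if example["input"]:
        prompt = (
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n"
            f"### Response:\n{example['output']}"
        )
    else:
        prompt = (
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Response:\n{example['output']}"
        )
    return {"text": prompt}

dataset = load_dataset("tatsu-lab/alpaca", split="train")  # 52,002 examples
dataset = dataset.map(format_alpaca, remove_columns=dataset.column_names)
print(dataset[0]["text"][:200])
```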

## License

This model is released under the MIT license, following the base model's licensing terms.