GPT4 x Alpaca

As a base model we used: alpaca-13b

Fine-tuned on GPT-4's responses for 3 epochs.

Open LLM Leaderboard Evaluation Results

| Metric              | Value |
|---------------------|-------|
| Avg.                | 46.78 |
| ARC (25-shot)       | 52.82 |
| HellaSwag (10-shot) | 79.59 |
| MMLU (5-shot)       | 48.19 |
| TruthfulQA (0-shot) | 48.88 |
| Winogrande (5-shot) | 70.17 |
| GSM8K (5-shot)      | 2.81  |
| DROP (3-shot)       | 24.99 |
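
The Avg. figure is consistent with an unweighted mean of the seven benchmark scores listed above. The following is a minimal Python sketch of that check, assuming a plain arithmetic mean; the names `scores` and `average` are illustrative and do not come from the leaderboard's own code.

```python
# Benchmark scores copied from the results listed above.
scores = {
    "ARC (25-shot)": 52.82,
    "HellaSwag (10-shot)": 79.59,
    "MMLU (5-shot)": 48.19,
    "TruthfulQA (0-shot)": 48.88,
    "Winogrande (5-shot)": 70.17,
    "GSM8K (5-shot)": 2.81,
    "DROP (3-shot)": 24.99,
}

# Assumption: Avg. is the unweighted arithmetic mean of all seven scores.
average = sum(scores.values()) / len(scores)

print(f"Avg. {average:.2f}")  # prints "Avg. 46.78"
```

Because the mean is unweighted, the very low GSM8K score pulls the average down noticeably relative to the other benchmarks.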