Update README.md
README.md
CHANGED
@@ -30,7 +30,7 @@ TruthfulQA 43.0% 43.1%

 | Metric | HellaSwag | MMLU Overall | MMLU High School World History | MMLU High School US History | MMLU High School European History | TruthfulQA |
 |--------------------------------------|--------------|--------------|--------------------------------|-----------------------------|-----------------------------------|--------------|
-| Historical Narrative Generator Model | 64.0
+| Historical Narrative Generator Model | **64.0%** | **78.9%** | **91.1%** | 91.2% | 86.7% | **43.1%** |
 | Base Qwen2.5 Instruct Model | 64.0% | 78.7% | 90.3% | 92.2% | 87.2% | 43.0% |
 | DeepSeek R1 Model | 60.4% | 73.3% | 88.6% | 85.3% | 82.4% | 35.9% |
 | Mistral Nemo Instruct | 63.3% | 65.6% | 84.4% | 84.8% | 74.5% | 39.5% |