Commit 60e49ad
1 Parent(s): bb3556d
Add arrows for code evaluation (#28)
- Add arrows for code evaluation (3298a1743aedd867ec5054e9c169cee49fd4d956)
Co-authored-by: Niklas Muennighoff <[email protected]>
README.md CHANGED
@@ -2314,9 +2314,9 @@ See this repository for JSON files: https://github.com/bigscience-workshop/evalu
 | winogrande | eng | acc ↑ | 0.71 | 0.736 |
 | wnli (Median of 6 prompts) | eng | acc ↑ | 0.57 | 0.563 |
 | wsc (Median of 11 prompts) | eng | acc ↑ | 0.519 | 0.413 |
-| humaneval | python | pass@1 | 0.155 | 0.0 |
-| humaneval | python | pass@10 | 0.322 | 0.0 |
-| humaneval | python | pass@100 | 0.555 | 0.003 |
+| humaneval | python | pass@1 ↑ | 0.155 | 0.0 |
+| humaneval | python | pass@10 ↑ | 0.322 | 0.0 |
+| humaneval | python | pass@100 ↑ | 0.555 | 0.003 |
 
 
 **Train-time Evaluation:**