Discrepancy for the LCB score in your paper?
Hello again guys!
I was just reading back through your paper, and right at the beginning I was confused by a difference between what is stated in your abstract and the results in the table on page 2.
It's about the LCB evaluation results:
- the abstract and the accuracy chart report 51 pts,
- while the table on page 2 states a whopping 64.5 pts.
I assume the correct result is the one stated in the abstract, since you also present it right here in the model card, but I wanted to be sure and let you know about this.
Could you confirm it's indeed 51?
Would the 64.5 be instead the result for a checkpoint of the upcoming 32B model? :)
For LCB, there are different versions depending on the update date. We report two versions: one covering 05/23 to 05/24 (LCBv2) and one covering 06/24 to 01/25 (LCBv5 minus LCBv2). If you check the table on page 2, there is an LCB entry under the "code" rows and another LCB entry under the "held out" rows.
In the abstract, the 51 we report is on our generalization held-out set, which is LCBv5 minus LCBv2. You can also find this 51 at the bottom of the table on page 2. The 64.5 you mentioned is the score on LCBv2, which we used as part of our validation set.
Oh yes! My bad, I didn't see the two separate LCB rows in the page 2 table... Sorry! Thanks for your answer.