DyCodeEval - a CodeKaleidoscope Collection


updated Jun 27, 2025

DyCodeEval (ICML 2025) enables dynamic benchmarking of code LLMs. This collection contains dynamic versions of the HumanEval and MBPP benchmark sets, generated with Claude 3.5.


CodeKaleidoscope/Dynamic_HumanEvalZero

Updated Jun 24, 2025 • 15.7k • 3 • 3

CodeKaleidoscope/Dynamic_MBPP_sanitized

Updated Jun 24, 2025 • 15.8k • 2 • 3

Dynamic Benchmarking of Reasoning Capabilities in Code Large Language Models Under Data Contamination

Paper • arXiv:2503.04149 • Published Mar 6, 2025
