Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving • Paper 2504.02605 • Published Apr 3
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models • Paper 2410.07985 • Published Oct 10, 2024
Towards a Unified View of Preference Learning for Large Language Models: A Survey • Paper 2409.02795 • Published Sep 4, 2024
RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation • Paper 2303.12570 • Published Mar 22, 2023
Private-Library-Oriented Code Generation with Large Language Models • Paper 2307.15370 • Published Jul 28, 2023
CodeS: Natural Language to Code Repository via Multi-Layer Sketch • Paper 2403.16443 • Published Mar 25, 2024
SWE-bench-java: A GitHub Issue Resolving Benchmark for Java • Paper 2408.14354 • Published Aug 26, 2024
PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback • Paper 2307.14936 • Published Jul 27, 2023
Can Programming Languages Boost Each Other via Instruction Tuning? • Paper 2308.16824 • Published Aug 31, 2023