MLDebugging: Towards Benchmarking Code Debugging Across Multi-Library Scenarios
Jinyang Huang, Xiachong Feng, Qiguang Chen, Hanjie Zhao, Zihui Cheng, Jiesong Bai, Jingxuan Zhou, Min Li, Libo Qin
https://arxiv.org/abs/2506.13824
Code debugging is a crucial task in software engineering that has attracted increasing attention. While remarkable progress has been made in the era of large language models (LLMs), current research still focuses on the simple no-library or single-library setting and ignores the complex multi-library scenarios found in real-world applications. To address this limitation, we make the first attempt to introduce MLDebugging (Multi-Library Debugging), a comprehensive benchmark designed to assess debugging chall…