ByteDance Software Engineering Lab

πŸ”¬ Research branch of Trae, an adaptive AI IDE

πŸ“ Located in Beijing, Shanghai and Shenzhen

About Us

The ByteDance Software Engineering Lab envisions safe and trusted intelligent, automated software engineering. We are dedicated to accelerating the integration of software engineering and artificial intelligence so that the two fields complement each other, driving technological advances in every aspect of software development. To achieve this goal, we bring together leading researchers and engineers from diverse fields and backgrounds to address the challenges faced by ByteDance and the wider software engineering community.

Collaborating Universities

We set up collaborative projects with top-tier universities around the world to explore the future of software engineering and artificial intelligence.

News

πŸ“Œ We are hiring interns under a talent program. Please consider joining us!

πŸ“Œ We are organising an AI IDE workshop co-located with FSE.

Apr 2025: Our paper AEGIS: An Agent-based Framework for Bug Reproduction from Issue Descriptions has been accepted by the Industry Track of FSE 2025. Congratulations to Xinchen!

Apr 2025: Our paper Understanding Large Language Model Performance in Software Engineering: A Large-scale Question Answering Benchmark has been accepted by the SIGIR 2025 Short Paper track. Congratulations to Ruida!

Dec 2024: Our paper DialogAgent: An Auto-engagement Agent for Code Question Answering Data Production has been accepted by the Industry Track of ICSE 2025. Congratulations to Xiaoyun!

Nov 2024: Our paper Prompting Large Language Models to Tackle the Full Software Development Lifecycle: A Case Study has been accepted by COLING 2025.

Apr 2024: We welcome Junjielong Xu from CUHKSZ, our first intern under the Jin Dou Yun Talent Program.

Contact

Address: Building 1, Dazhongsi Square, ByteDance, Beijing, China

Email:

Active Events

ICSE 2025

πŸ“… 27 April 2025 - 3 May 2025

πŸ“ Rogers Centre, Ottawa

We are sponsoring ICSE 2025 and organising an LLM4Code Reception. Sign up here: Link

FSE 2025

πŸ“… 23 June 2025 - 27 June 2025

πŸ“ Trondheim, Norway

We are sponsoring FSE 2025 and organising an AI IDE workshop.

Past Events

ASE 2024

πŸ“… 28 Oct 2024 - 1 Nov 2024

πŸ“ Sheraton Grand, Sacramento, California

ByteDance SE Lab is sponsoring ASE 2024 and presenting our NIER paper on code completion. We look forward to seeing you in Sacramento.

ECOOP and ISSTA 2024

πŸ“… 16 Sep 2024 - 20 Sep 2024

πŸ“ Vienna University of Technology (TU Wien) - Campus Gusshaus

ByteDance SE Lab is sponsoring ECOOP and ISSTA 2024, and organising a MarsCode Reception. We look forward to seeing you in Vienna.

Publications

Preprint

  • Ma, Zexiong, Chao Peng, Pengfei Gao, Xiangxin Meng, Yanzhen Zou, and Bing Xie. (2025). "SoRFT: Issue Resolving with Subtask-oriented Reinforced Fine-Tuning." arXiv preprint arXiv:2502.20127. pdf

    This paper proposes Subtask-oriented Reinforced Fine-Tuning (SoRFT), a novel two-stage training framework combining rejection-sampled supervised fine-tuning and rule-based reinforcement learning (PPO with ground-truth rewards), which addresses the limitations of commercial model dependency and poor generalization in issue-resolving tasks. SoRFT achieves SOTA performance among open-source models (e.g., 21.4% resolution on SWE-Bench Verified) through structured subtask decomposition and ground-truth enhanced training.
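
    As a rough illustration of the rejection-sampling stage, the minimal sketch below (function and attribute names are our own, not the paper's code) samples several candidate solutions per subtask instance and keeps only those that earn the full ground-truth reward; the surviving pairs form the supervised fine-tuning set that precedes the PPO stage.

        def rejection_sample_sft(issues, model, reward_fn, n_candidates=8):
            # Sample candidate subtask solutions and keep only those whose
            # rule-based, ground-truth reward is maximal; survivors become
            # the supervised fine-tuning data for stage one.
            dataset = []
            for issue in issues:
                for _ in range(n_candidates):
                    candidate = model.generate(issue.prompt)
                    if reward_fn(candidate, issue.ground_truth) == 1.0:
                        dataset.append((issue.prompt, candidate))
            return dataset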

  • Hu, Ruida, Chao Peng, Xinchen Wang, and Cuiyun Gao. (2025). "An LLM-based Agent for Reliable Docker Environment Configuration." arXiv preprint arXiv:2502.13681. pdf code

    Repo2Run is the first LLM-based agent that automates environment configuration via atomic synthesis and a dual-environment architecture, generating error-free Dockerfiles for Python repositories with an 86% success rate (63.9% higher than baselines).
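
    A minimal sketch of how we read the atomic-synthesis idea (all names here are hypothetical, not the released code): each configuration command is first executed in an internal sandbox container, and only commands that succeed there are committed as instructions to the external Dockerfile, so the synthesised file is error-free by construction.

        def synthesize_dockerfile(base_image, commands, sandbox_exec):
            # sandbox_exec runs one command inside the internal (sandbox)
            # environment and returns its exit code; 0 means success.
            lines = [f"FROM {base_image}"]
            for cmd in commands:
                if sandbox_exec(cmd) == 0:      # atomic unit verified in the sandbox
                    lines.append(f"RUN {cmd}")  # commit it to the external Dockerfile
            return "\n".join(lines)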

  • Chen, Jialiang, Kaifa Zhao, Jie Liu, Chao Peng, Jierui Liu, Hang Zhu, Pengfei Gao, Ping Yang, and Shuiguang Deng. (2025). "CoReQA: Uncovering Potentials of Language Models in Code Repository Question Answering." arXiv preprint arXiv:2501.03447. pdf

    We introduce CoReQA, a new benchmark for repository-level question answering, to evaluate large language models' performance in understanding code repositories, highlighting the limitations of current models in effectively addressing repository-level questions.

  • Guan, Zhanming, Junlin Liu, Jierui Liu, Chao Peng, Dexin Liu, Ningyuan Sun, Bo Jiang, Wenchao Li, Jie Liu, and Hang Zhu. (2024). "ContextModule: Improving Code Completion via Repository-level Contextual Information." arXiv preprint arXiv:2412.08063. pdf

    This paper introduces ContextModule, a framework that enhances LLM-based code completion by retrieving and integrating user behavior-based code, similar code snippets, and critical symbol definitions from the repository, thereby improving accuracy and user acceptance rates.
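
    The sketch below illustrates the kind of prompt assembly such a framework implies, combining the three context sources named above; the section labels, argument names, and cursor marker are our illustration, not the paper's actual format.

        def build_completion_prompt(prefix, suffix, behaviour_ctx,
                                    similar_snippets, symbol_defs):
            # Each context argument is a list of code strings retrieved
            # from the repository; prefix/suffix surround the cursor.
            sections = []
            if symbol_defs:
                sections.append("# Definitions of symbols used nearby:\n" + "\n".join(symbol_defs))
            if similar_snippets:
                sections.append("# Similar code elsewhere in this repository:\n" + "\n".join(similar_snippets))
            if behaviour_ctx:
                sections.append("# Code the user recently viewed or edited:\n" + "\n".join(behaviour_ctx))
            context = "\n\n".join(sections)
            return f"{context}\n\n{prefix}<CURSOR>{suffix}"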

  • Tsimpourlas, Foivos, Chao Peng, Carlos Rosuero, Ping Yang, and Ajitha Rajan. (2024). "Go-Oracle: Automated Test Oracle for Go Concurrency Bugs." arXiv preprint arXiv:2412.08061. pdf

    We propose an automatic classification method using a transformer-based neural network to address the test oracle problem for Go programs.

  • Hu, Ruida, Chao Peng, Jingyi Ren, Bo Jiang, Xiangxin Meng, Qinyun Wu, Pengfei Gao, Xinchen Wang, Cuiyun Gao. (2024). "A Real-World Benchmark for Evaluating Fine-Grained Issue Solving Capabilities of Large Language Models". arXiv preprint arXiv:2411.18019. pdf

    Maps issues and pull requests from GitHub repositories into a benchmark that assesses question answering, fault localisation and code editing capabilities.

  • Wang, Xinchen, Pengfei Gao, Xiangxin Meng, Chao Peng, Ruida Hu, Yun Lin, Cuiyun Gao. (2024). "AEGIS: An Agent-based Framework for General Bug Reproduction from Issue Descriptions". arXiv preprint arXiv:2411.18015. pdf

    Proposed a bug reproduction framework built on LLM-based agents that transforms repository issues into test scripts that reproduce the reported bugs.

  • Meng, Xiangxin, Zexiong Ma, Pengfei Gao, Chao Peng. (2024). "An Empirical Study on LLM-based Agents for Automated Bug Fixing". arXiv preprint arXiv:2411.10213. pdf

    An empirical study comparing LLM-based agents for automated bug fixing and analysing their performance and capabilities.

  • Liu, Yizhou, Pengfei Gao, Xinchen Wang, Jie Liu, Yexuan Shi, Zhao Zhang, and Chao Peng. (2024). "MarsCode Agent: AI-native Automated Bug Fixing." arXiv preprint arXiv:2409.00899. pdf report (in Chinese)

    By building a multi-agent collaboration framework with interactive interfaces and tools for code retrieval, debugging, and editing, MarsCode Agent fixes 39.33% of the real software bugs in the SWE-bench Lite benchmark.
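
    A loose sketch of a multi-agent collaboration loop of the kind described above (agent roles, names, and interfaces are hypothetical, not the released system):

        def resolve_bug(issue, agents):
            # Each agent wraps an LLM plus tools for code retrieval,
            # debugging, or editing, and hands structured results onward.
            repro = agents["reproducer"].run(issue)          # build a failing test
            faults = agents["localiser"].run(issue, repro)   # retrieve and debug suspect code
            patch = agents["editor"].run(issue, faults)      # draft a candidate fix
            ok = agents["verifier"].run(repro, patch)        # re-run the failing test
            return patch if ok else None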

  • Wu, Qinyun, Chao Peng, Pengfei Gao, Ruida Hu, Haoyu Gan, Bo Jiang, Jinhe Tang et al. (2024). "RepoMasterEval: Evaluating Code Completion via Real-World Repositories." arXiv preprint arXiv:2408.03519. pdf

    An industry-grade benchmark built from real-world repositories to assess LLMs' code completion capabilities.

Peer-reviewed

  • Li, Bowen, Wenhan Wu, Ziwei Tang, Lin Shi, John Yang, Jinyang Li, Shunyu Yao, Chen Qian, Binyuan Hui, Qicheng Zhang, Zhiyin Yu, He Du, Ping Yang, Dahua Lin, Chao Peng, and Kai Chen (2024). "Prompting Large Language Models to Tackle the Full Software Development Lifecycle: A Case Study". In Proceedings of COLING 2025. pdf tool

    Studied how LLMs perform on end-to-end software development and provided detailed analysis of product requirement document generation, environment setup, coding, unit test generation and acceptance test generation.

  • Liang, Xiaoyun, Jingyi Ren, Jiayi Qi, Chao Peng, and Bo Jiang (2025). "DialogAgent: An Auto-engagement Agent for Code Question Answering Data Production." In Proceedings of ICSE 2025 (SEIP Track). pdf

    This paper introduces DialogAgent, an automated tool that generates high-fidelity synthetic training data to enhance the performance of large language models in code generation, comprehension, and repair tasks.

  • Peng, Chao, Qinyun Wu, Jiangchao Liu, Jierui Liu, Bo Jiang, Mengqian Xu, Yinghao Wang, Xia Liu, and Ping Yang (2024). "RepoSim: Evaluating Prompt Strategies for Code Completion via User Behavior Simulation." In Proceedings of ASE 2024. ACM, 2024.

    Proposed a prompt framework based on user behaviour simulation and evaluation metrics based on user acceptance prediction, to study how LLMs perform on code completion tasks.

  • Liu, Chenyan, Yufan Cai, Yun Lin, Yuhuan Huang, Yunrui Pei, Bo Jiang, Ping Yang, Jin Song Dong, and Hong Mei (2024). "CoEdPilot: Recommending Code Edits with Learned Prior Edit Relevance, Project-wise Awareness, and Interactive Nature". In Proceedings of ISSTA 2024. ACM, 2024.

    Proposed a whole-project code editing framework based on LLMs, for locating, generating and interacting with code edits. Code editing is predicted via related edits, interaction properties and chained effect estimation.

  • Zhu Tao, Yongqiang Gao, Jiayi Qi, Chao Peng, Qinyun Wu, Xiang Chen, and Ping Yang (2024). "Neat: Mobile App Layout Similarity Comparison based on Graph Convolutional Networks." In Proceedings of FSE 2024. ACM, 2024.

    Comparing the similarity between mobile app pages based on graph convolutional networks and detecting visual bugs using app designs and rendered pages on different platforms.

  • Lin, Hao, Jiaxing Qiu, Hongyi Wang, Zhenhua Li, Liangyi Gong, Di Gao, Yunhao Liu, Feng Qian, Zhao Zhang, Ping Yang, and Tianyin Xu (2023). "Virtual Device Farms for Mobile App Testing at Scale: A Pursuit for Fidelity, Efficiency, and Accessibility". In Proceedings of MobiCom 2023. ACM, 2023.

    We analyse, improve, and effectively utilise virtual devices for large-scale mobile app testing, demonstrating that high-fidelity testing can be achieved through sensible design and implementation of virtual device farms, and present solutions that significantly improve testing fidelity, while sharing experiences on enhancing testing efficiency and accessibility.

  • Peng, Chao, Zhengwei Lv, Jiarong Fu, Jiayuan Liang, Zhao Zhang, Ajitha Rajan, Ping Yang (2024). "Hawkeye: Change-targeted Testing for Android Apps based on Deep Reinforcement Learning". In Proceedings of ICSE 2024 (Software Engineering in Practice). ACM, 2024.

    Trained a deep reinforcement learning network on the exploration history of mobile app testing to quickly reach code changes during testing, improving test effectiveness and fault-finding capability.

  • Liang, Xiaoyun, Jiayi Qi, Yongqiang Gao, Chao Peng, Ping Yang (2023). "AG3: Automated Game GUI Text Glitch Detection based on Computer Vision". In Proceedings of ESEC/FSE 2023 (Industry Track). ACM, 2023.

    Detecting text glitches for mobile games based on computer vision, uncovering and locating text overlap, text overstep and other text display issues introduced by the internationalisation of mobile apps.

  • Jiang, Zongze, Ming Wen, Yixin Yang, Chao Peng, Ping Yang, Hai Jin (2023). "Effective Concurrency Testing for Go via Directional Primitive-constrained Interleaving Exploration". In Proceedings of ASE 2023. IEEE, 2023.

    A novel testing approach for detecting Go concurrency bugs through primitive-constrained interleaving exploration, utilizing execution histories to identify new interleavings instead of relying on exhaustive exploration or random scheduling.

  • Wang, Siwei, Xue Mao, Ziguang Gao, Yujun Gao, Qucheng Shen, and Chao Peng (2023). "NxtUnit: Automated Unit Test Generation for Go". In Proceedings of EASE 2023 (Industry Track). 2023. pdf

    An automated unit test generation tool for Go based on random testing and suitable for microservices.

  • Sun, Jingling, Ting Su, Kai Liu, Chao Peng, Zhao Zhang, Geguang Pu, Tao Xie, and Zhendong Su (2023). "Characterizing and Finding System Setting-Related Defects in Android Apps". IEEE Transactions on Software Engineering, 2023. pdf

    Conducted the first large-scale empirical study to understand and characterize system setting-related defects in mobile apps (defects triggered by changes in system settings), and proposed two synergistic bug-finding techniques: setting-wise metamorphic fuzzing for GUI-level dynamic testing and a static analysis tool for code-level detection.

  • Lv, Zhengwei, Chao Peng, Zhao Zhang, Ting Su, Kai Liu, and Ping Yang (2022). "Fastbot2: Reusable Automated Model-based GUI Testing for Android Enhanced by Reinforcement Learning". In Proceedings of ASE 2022 (Industry Track). ACM, 2022. pdf tool (Android) tool (iOS)

    Modeling history exploration data via reinforcement learning and probabilistic models to improve the effectiveness and fault finding capabilities of mobile app GUI testing.

  • Peng, Chao, Yujun Gao, and Ping Yang (2022). "Automated Server Testing: an Industrial Experience Report". In Proceedings of ICSME 2022 (Industry Track). IEEE, 2022. pdf

    Presented the design and deployment of our fully automated server interface reliability testing platform at ByteDance, which provides (1) traffic data generation based on combinatorial testing and fuzzing, (2) scenario testing for complicated business logic, and (3) automated test execution with fault localization in a controlled environment that does not affect online services.

  • Peng, Chao, Zhao Zhang, Zhengwei Lv, and Ping Yang (2022). "MUBot: Learning to Test Large-Scale Commercial Android Apps like a Human". In Proceedings of ICSME 2022 (Industry Track). IEEE, 2022. pdf

    Generating logical, human-imitating GUI testing event sequences with a neural network trained on an open-source corpus via imitation learning.

  • Wu, Qinyun, Huan Song, and Ping Yang (2022). "Real-World Clone-Detection in Go." In Proceedings of MSR 2022 (Industry Track). ACM, 2022. pdf

    Trained a neural network to pair similar Go code snippets for clone detection.

  • Peng, Chao, Ajitha Rajan, and Tianqin Cai (2021). "CAT: Change-focused Android GUI Testing." In Proceedings of ICSME 2021. IEEE, 2021. pdf

    Models the code graph and code-widget mapping via static analysis and, for a given code change, selects GUI testing actions that are likely to trigger the changed code.

  • Cai, Tianqin, Zhao Zhang, and Ping Yang (2020). "Fastbot: A Multi-Agent Model-Based Test Generation System." In Proceedings of the IEEE/ACM 1st International Conference on Automation of Software Test (AST 2020). 2020. pdf

    Existing GUI model-based testing tools may fail when applied to apps of industrial complexity and scale. We present a multi-agent system that performs model construction on the server side and applies a multi-agent collaboration mechanism to speed up the model construction procedure.

Career Opportunities

πŸ”₯πŸ”₯πŸ”₯ Jin Dou Yun Talent Internship

Application Deadline: Open until mid-year

We are seeking highly motivated PhD students in artificial intelligence, software engineering, or programming languages. Successful candidates will work on cutting-edge research projects in these fields.

Requirements:

  • PhD student in a relevant field
  • Strong publication record
  • Excellent communication skills
Apply Now

LLM4SE Researcher

Application Deadline: Ongoing

We have openings for LLM4SE researchers to work on various projects in our lab.

Requirements:

  • Expected to graduate in 2025 or later with a master's or doctoral degree
  • Background in relevant research areas
  • Self-motivated
Apply Now

Research Intern

Application Deadline: Ongoing

We have openings for research interns to work on various projects in our lab.

Requirements:

  • Enrolled in an undergraduate or graduate program
  • Background in relevant research areas
  • Self-motivated
Apply Now