How Trae Achieves 68.2% on SWE-bench Verified
1. Basic Configuration
We provided the Agent with the following four tools:
- str_replace_editor: Enables the Agent to browse files, edit code, etc.
- Bash: Allows the Agent to execute any command.
- ckg_tools: Builds a Code Knowledge Graph (CKG) for the code repository, enabling the Agent to efficiently perform search_class and search_function operations.
- sequential_thinking_tool: Facilitates step-by-step reasoning for the Agent.
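To make the idea behind ckg_tools concrete, here is a minimal sketch of a symbol index for Python sources. It is not Trae's implementation, which this post does not describe: the `build_index` helper, the in-memory `sources` mapping, and the argument and return shapes of `search_class` and `search_function` are assumptions made for illustration, and a real code knowledge graph would also record relationships such as calls and inheritance.

```python
import ast
from collections import defaultdict


def build_index(sources):
    """Index class and function definitions by name.

    `sources` maps a file path to its source text (hypothetical input shape).
    Returns a mapping of (kind, name) -> list of (path, line number).
    """
    index = defaultdict(list)
    for path, text in sources.items():
        tree = ast.parse(text)
        for node in ast.walk(tree):
            if isinstance(node, ast.ClassDef):
                index[("class", node.name)].append((path, node.lineno))
            elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                # Methods are indexed as functions too, under their bare name.
                index[("function", node.name)].append((path, node.lineno))
    return index


def search_class(index, name):
    """Return every (path, line) where a class called `name` is defined."""
    return index.get(("class", name), [])


def search_function(index, name):
    """Return every (path, line) where a function or method called `name` is defined."""
    return index.get(("function", name), [])
```

Building the index once and answering lookups from it is what lets an agent jump to a definition without repeatedly scanning the repository with shell commands.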
The success rate for solving tasks in a single run ranged from 60.6% to 62.8%.
2. Patch Selection
We ran Trae in five independent solving attempts in parallel. Patch selection was based on the ensembler from the Augment SWEbench Agent. Before the final selection, we also applied the regression testing module of Agentless to filter out patches that did not fully pass regression tests.
We used OpenAI o1 to select the single patch most likely to be correct.
Although the best single-run success rate was only 62.8%, the selection process raised the overall success rate to 68.2%.
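The filter-then-select flow described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the actual pipeline: `Candidate`, `select_patch`, and `most_common_patch` are hypothetical names, the regression result is assumed to have been computed beforehand, the fallback used when no patch passes is our own choice, and the simple frequency vote only stands in for the LLM-based selector (OpenAI o1 in our setup).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Candidate:
    run_id: int              # which of the independent runs produced the patch
    patch: str               # the patch text (for example, a unified diff)
    passes_regression: bool  # assumed to come from a separate regression-test step


def most_common_patch(pool):
    """Placeholder selector: pick the patch text that occurs most often.

    In the real pipeline this step is an LLM call; the vote here is only
    a stand-in so the sketch runs without any external service.
    """
    winner, _ = Counter(c.patch for c in pool).most_common(1)[0]
    return next(c for c in pool if c.patch == winner)


def select_patch(candidates, choose):
    """Drop patches that fail regression tests, then let `choose` pick one."""
    survivors = [c for c in candidates if c.passes_regression]
    # Assumption: if every patch fails, fall back to the unfiltered pool
    # rather than returning nothing.
    pool = survivors or list(candidates)
    if not pool:
        return None
    return choose(pool)
```

Passing the selector in as a function keeps the regression filter independent of whichever model or heuristic makes the final choice.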
3. Future Work
Our future work will focus on:
- Improving single-run success rates: Exploring strategies to enhance the Agent's performance in a single solving attempt.
- Expanding the sampling space: Investigating whether a larger sampling space enables the model to find more correct solutions.
Contributions
- Contributors: Pengfei Gao, Zhao Tian and Xiangxin Meng
- Project Lead: Chao Peng
Meet Us at FSE
We are attending FSE 2025, presenting our paper, AEGIS: An Agent-based Framework for General Bug Reproduction from Issue Descriptions, and organising an AI-IDE workshop. You can find us at our booth close to the registration area, and at the workshop on June 27th.
About Trae
Trae (/treɪ/) IDE is your helpful coding partner. It offers features like AI Q&A, code auto-completion, and agent-based AI programming capabilities. When developing projects with Trae, you can collaborate with AI to enhance your development efficiency.