FPT Software's AI Center introduces HyperAgent: an innovative generalist agent system for solving diverse software engineering tasks at scale and achieving SOTA performance in SWE-Bench and Defects4J

Large language models (LLMs) have revolutionized software engineering, demonstrating remarkable capabilities in various coding tasks. While recent efforts have produced LLM-based autonomous software agents for end-to-end development tasks, these systems are typically designed for specific software engineering (SE) tasks. Researchers from the FPT Software AI Center, Vietnam, present HyperAgent, a novel generalist multi-agent system designed to address a broad spectrum of SE tasks across different programming languages by mimicking human developers’ workflows.

HyperAgent consists of four specialized agents (Planner, Navigator, Code Editor, and Executor) that manage the full lifecycle of SE tasks, from initial conception to final verification. Through extensive evaluations, HyperAgent demonstrates competitive performance on a variety of SE tasks:

GitHub Issue Resolution: 25.01% success rate on SWE-Bench-Lite and 31.40% on SWE-Bench-Verified, which is competitive with existing methods such as AutoCodeRover, SWE-Agent, and Agentless.
Repository-scale code generation (RepoExec): 53.3% accuracy in navigating through codebases and retrieving the correct context.
Program Fault Localization and Repair (Defects4J): 59.70% fault localization accuracy and successful fixes for 29.8% of Defects4J bugs, achieving SOTA performance on both tasks.

This work represents a significant step toward versatile, autonomous agents capable of handling complex, multi-step AI-assisted engineering tasks across multiple domains and languages. HyperAgent’s performance demonstrates its potential to transform AI-assisted software development practices, offering a more adaptable and comprehensive solution than task-specific alternatives.

Methodology

HyperAgent is inspired by the workflow developers typically follow to solve a software engineering task. That workflow consists of four iterative phases: Analysis and Planning, where developers understand the requirements and formulate a flexible strategy; Feature Localization, where they identify the relevant code components in the repository; Editing, where they implement changes, add functionality, and write tests while maintaining code quality; and Execution, where they test and verify the modifications. These phases are repeated as needed until the task is successfully completed, and the process is tailored to the specific task requirements and the developer's experience.

In HyperAgent, the framework is organized around four main agents: Planner, Navigator, Code Editor, and Executor. Each agent corresponds to one phase of the workflow described above, although the way an agent carries out its phase may differ slightly from how a human developer would approach a similar task.
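The division of labor among the four agents can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not HyperAgent's actual implementation: the class names, the fixed three-step plan, the retry limit, and the stub agents are all hypothetical, and a real system would call LLMs and tools where the stubs return canned strings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    agent: str    # which child agent handles the request
    request: str  # natural-language request issued by the planner


@dataclass
class Planner:
    """Toy planner (hypothetical): repeats a fixed plan until execution passes."""
    children: Dict[str, Callable[[str], str]]
    max_rounds: int = 3
    log: List[str] = field(default_factory=list)

    def solve(self, task: str) -> bool:
        # One step per phase: localization, editing, execution.
        plan = [
            Step("navigator", f"locate code relevant to: {task}"),
            Step("editor", f"patch the located code for: {task}"),
            Step("executor", f"run tests for: {task}"),
        ]
        for round_no in range(1, self.max_rounds + 1):
            result = ""
            for step in plan:
                result = self.children[step.agent](step.request)
                self.log.append(f"round {round_no} {step.agent}: {result}")
            # The executor runs last, so `result` holds its verdict.
            if result == "pass":
                return True
        return False


def make_stub_agents(fail_first_n: int = 1) -> Dict[str, Callable[[str], str]]:
    """Stand-in agents; the executor fails `fail_first_n` times, then passes."""
    state = {"runs": 0}

    def navigator(request: str) -> str:
        return "found utils.py"

    def editor(request: str) -> str:
        return "patched utils.py"

    def executor(request: str) -> str:
        state["runs"] += 1
        return "pass" if state["runs"] > fail_first_n else "fail"

    return {"navigator": navigator, "editor": editor, "executor": executor}


planner = Planner(children=make_stub_agents(fail_first_n=1))
solved = planner.solve("fix off-by-one in pagination")
print(solved, len(planner.log))
```

With the stub executor failing once, the loop needs two rounds, which mirrors the idea that the phases are repeated until verification succeeds.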

The design emphasizes three main advantages over existing methods:

Generalizability: The framework is designed to be easily adapted to a wide range of tasks with minimal configuration changes and little additional effort required to implement new modules into the system.
Efficiency: Each agent is optimized to handle processes with varying levels of complexity, requiring varying degrees of intelligence from the LLMs. For example, a lightweight, computationally efficient LLM can be used for navigation, which, while less complex, involves the highest token consumption. In contrast, more complex tasks, such as editing or executing code, require more advanced LLM capabilities.
Scalability: The framework is designed to scale effectively when deployed in real-world scenarios where the number of subtasks is very large. For example, a complex task in the SWE-Bench benchmark may require considerable time for an agent-based system to complete, and HyperAgent is designed to handle such scenarios efficiently.

These advantages enable HyperAgent to effectively address a broad spectrum of software engineering tasks while maintaining efficiency and scalability.
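The efficiency argument, that the token-heavy navigation step can run on a lighter model while editing and execution use a stronger one, can be made concrete with a rough cost sketch. Everything here is an assumption for illustration: the tier names, the relative prices, the token counts, and the choice to put the planner on the stronger tier are invented and do not come from the HyperAgent paper.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ModelTier:
    name: str                  # placeholder label, not a real model id
    cost_per_1k_tokens: float  # made-up relative price


LIGHT = ModelTier("light-model", 1.0)
STRONG = ModelTier("strong-model", 10.0)

# Hypothetical routing: only navigation runs on the light tier.
ROUTING: Dict[str, ModelTier] = {
    "planner": STRONG,   # assumption; the article does not say which tier
    "navigator": LIGHT,
    "editor": STRONG,
    "executor": STRONG,
}


def estimate_cost(tokens_by_agent: Dict[str, int],
                  routing: Dict[str, ModelTier]) -> float:
    """Sum per-agent cost from token usage and the tier assigned to each agent."""
    return sum(
        routing[agent].cost_per_1k_tokens * tokens / 1000
        for agent, tokens in tokens_by_agent.items()
    )


# Invented usage profile in which navigation dominates token consumption.
usage = {"planner": 5_000, "navigator": 50_000, "editor": 10_000, "executor": 5_000}

mixed = estimate_cost(usage, ROUTING)
all_strong = estimate_cost(usage, {agent: STRONG for agent in ROUTING})
print(f"mixed tiers: {mixed:.0f}, all strong: {all_strong:.0f}")
```

Because navigation accounts for most of the tokens in this made-up profile, moving only that agent to the cheaper tier removes most of the total cost.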

Conclusion

HyperAgent is a generalist multi-agent system designed to address a wide range of software engineering tasks. By closely mimicking typical software engineering workflows, HyperAgent incorporates analysis, planning, feature localization, code editing, and execution/verification stages. Extensive evaluations on a variety of benchmarks, including GitHub issue resolution, repository-scale code generation, and program fault localization and repair, demonstrate that HyperAgent not only matches but often outperforms specialized systems. HyperAgent’s success highlights the potential of generalist approaches in software engineering, offering a versatile tool that can adapt to various tasks with minimal configuration changes. Its design emphasizes generalizability, efficiency, and scalability, making it well suited to real-world software development scenarios where tasks can vary significantly in complexity and scope.

In the future, one could explore integrating HyperAgent with existing development environments and version control systems, investigating its potential in specialized domains such as security-focused code review or performance optimization, improving its explainability, and continuously updating its knowledge base. These advancements could further streamline the software engineering process, broaden HyperAgent’s applicability, improve trust among developers, and ensure its long-term relevance in the rapidly evolving field of software engineering.

Take a look at the Paper and GitHub. All credit for this research goes to the researchers of this project.

Thanks to the FPT Software AI Center for the thought leadership and resources for this article. The FPT Software AI Center has supported us in this content.

Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary engineer and entrepreneur, Asif is committed to harnessing the potential of AI for social good. His most recent initiative is the launch of an AI media platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is technically sound and easily understandable to a wide audience. The platform has over 2 million monthly views, illustrating its popularity among the public.
