Software development has benefited greatly from large language models (LLMs) that produce high-quality source code, chiefly because coding tasks now take less time and money to complete. Despite these advantages, however, current research and real-world evaluations show that LLMs often produce code that, while functional, is flawed in terms of security. This limitation stems from the fact that these models are trained on huge volumes of open-source data in which insecure or inefficient coding practices are common. As a result, even when LLMs produce working code, the vulnerabilities it carries can compromise the security and reliability of the resulting software, especially in security-sensitive applications.
To solve this problem, a method is needed that can automatically refine the prompts given to LLMs so that the generated code is both secure and functional. A team of researchers from the New Jersey Institute of Technology and the Qatar Computing Research Institute has presented PromSec, a solution designed to do exactly that: optimize LLM prompts for secure and functional code generation. It combines two essential parts, described below.
- Vulnerability removal: PromSec employs a graph generative adversarial network (gGAN) to identify and repair security flaws in the generated code.
- Iterative feedback loop: PromSec establishes an iterative feedback loop between the gGAN and the LLM. Once vulnerabilities are found and fixed, the gGAN derives an improved prompt from the repaired code, which the LLM then uses to write more secure code in the next iteration. Through this interplay, the prompts steadily improve with respect to both code security and functionality (a minimal sketch of this loop appears after this list).
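To make the loop concrete, here is a minimal sketch of PromSec-style prompt refinement. The helper names (`generate_code`, `analyze_vulnerabilities`, `refine_prompt`) are hypothetical stand-ins for the LLM call, the static security analyzer, and the gGAN step; the paper's actual pipeline is more involved.

```python
# Hypothetical sketch of a PromSec-style feedback loop; the three helpers
# below are stand-ins, not the paper's implementation.

def generate_code(llm, prompt: str) -> str:
    """Stand-in for an LLM call that returns source code for `prompt`."""
    raise NotImplementedError

def analyze_vulnerabilities(code: str) -> list[str]:
    """Stand-in for a static security analyzer returning found CWE IDs."""
    raise NotImplementedError

def refine_prompt(prompt: str, code: str, issues: list[str]) -> str:
    """Stand-in for the gGAN step: derive a safer prompt from repaired code."""
    raise NotImplementedError

def promsec_loop(llm, prompt: str, max_iters: int = 5) -> str:
    """Iterate generate -> analyze -> refine until analysis finds no issues."""
    code = generate_code(llm, prompt)
    for _ in range(max_iters):
        issues = analyze_vulnerabilities(code)
        if not issues:          # clean report: return the secure code
            return code
        prompt = refine_prompt(prompt, code, issues)  # feedback step
        code = generate_code(llm, prompt)
    return code  # best effort once the iteration budget is spent
```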
The application of contrastive learning within the gGAN, which lets PromSec treat code generation as a dual-objective optimization problem, is one of its distinctive features: PromSec improves both the utility and the security of the code while reducing the number of LLM inferences required. As a result, the system generates secure, trustworthy code more quickly, saving the time and computing power that repeated rounds of code generation and security analysis would otherwise consume.
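As an illustration only (the paper's exact loss is not reproduced here), a contrastive objective balancing the two goals could look like the following, where `z_fixed`, `z_orig`, and `z_vuln` are hypothetical embeddings of the repaired, original, and vulnerable code:

```python
import torch
import torch.nn.functional as F

def dual_objective_contrastive_loss(z_fixed, z_orig, z_vuln, margin=0.5):
    """Illustrative contrastive loss (not the paper's formulation):
    keep the repaired code's embedding close to the original code's
    (functionality preservation) while pushing it away from embeddings
    of known-vulnerable code (security)."""
    sim_func = F.cosine_similarity(z_fixed, z_orig, dim=-1)  # preserve semantics
    sim_vuln = F.cosine_similarity(z_fixed, z_vuln, dim=-1)  # avoid vulnerability
    return torch.clamp(margin - sim_func + sim_vuln, min=0).mean()
```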
PromSec’s effectiveness has been demonstrated through rigorous testing on Python and Java code datasets. The results verify that PromSec significantly raises the security of the generated code while preserving its intended functionality. Even compared with state-of-the-art techniques, PromSec addresses vulnerabilities that other methods miss, and it substantially reduces operational cost by cutting the number of LLM queries, the security-analysis time, and the total processing overhead.
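For Python, a before/after comparison of this kind can be approximated with an off-the-shelf static analyzer such as Bandit; the paper relies on static security analysis, but the specific tooling shown here is an assumption:

```python
import json
import subprocess

def count_issues(path: str) -> int:
    """Count security findings reported by Bandit for a Python file.
    Assumes the `bandit` CLI is installed (pip install bandit)."""
    result = subprocess.run(
        ["bandit", "-f", "json", "-q", path],
        capture_output=True, text=True,
    )
    report = json.loads(result.stdout)
    return len(report["results"])

# Hypothetical usage: compare findings before and after prompt optimization.
print(count_issues("generated_before.py"), count_issues("generated_after.py"))
```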
PromSec’s generalizability is another important benefit. Prompts optimized for one LLM carry over to other LLMs, and the approach extends across programming languages. The optimized prompts can even address vulnerabilities that were not seen during optimization, making PromSec a reliable choice for a variety of coding contexts.
The team has summarized its main contributions as follows.
- PromSec, a unique method that automatically optimizes LLM prompts to produce secure source code while preserving the intended functionality of the code, has been introduced.
- The gGAN, a graph generative adversarial network model, has been introduced. It frames the repair of source-code security issues as a dual-objective optimization task that balances code security and functionality. Using a novel contrastive loss function, the gGAN applies semantic-preserving security fixes, ensuring that the code keeps its intended functionality while becoming more secure (a toy illustration of the underlying code-as-graph idea follows this list).
- Extensive experiments have been conducted demonstrating that PromSec greatly improves both the functionality and the security of LLM-written code. The optimized prompts produced by PromSec have been shown to apply across programming languages, address a variety of Common Weakness Enumerations (CWEs), and transfer between different LLMs.
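Since the gGAN operates on graph representations of source code, the following toy example shows the code-as-graph idea using Python's `ast` module and `networkx`. The paper's graphs are richer (e.g., carrying control- and data-flow information), so treat this as a simplified assumption:

```python
import ast
import networkx as nx

def code_to_graph(source: str) -> nx.DiGraph:
    """Toy AST-based graph: each syntax node becomes a graph node, with
    edges from parents to children. A graph model can then reason about
    and edit code structure rather than raw text."""
    tree = ast.parse(source)
    g = nx.DiGraph()
    for node in ast.walk(tree):
        g.add_node(id(node), label=type(node).__name__)
        for child in ast.iter_child_nodes(node):
            g.add_edge(id(node), id(child))
    return g

# Example: code using a weak hash function (CWE-327) as graph input.
g = code_to_graph("import hashlib\nh = hashlib.md5(b'pw')\n")
print(g.number_of_nodes(), g.number_of_edges())
```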
In conclusion, PromSec is a major step forward in the use of LLMs for secure code generation. By mitigating security flaws in LLM-generated code with a scalable and affordable solution, it can significantly increase the reliability of LLMs for large-scale software development. This development helps ensure that LLMs can be securely and consistently incorporated into practical coding workflows, ultimately broadening their adoption across industries.
Check out the Paper. All credit for this research goes to the researchers of this project.
Tanya Malhotra is a final-year student at the University of Petroleum and Energy Studies, Dehradun, pursuing a BTech in Computer Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a data science enthusiast with good analytical and critical thinking skills, along with a keen interest in acquiring new skills, leading groups, and managing work in an organized manner.