Amid all the hype around artificial intelligence, companies are starting to realize the many ways it can help them. However, as a recent demonstration by Mithril Security shows, adopting the newest models can also carry significant security risks. Researchers at Mithril Security, a corporate security platform, found that they could poison a typical LLM supply chain by uploading a modified LLM to Hugging Face. The episode illustrates the current state of security analysis for LLM systems and underscores the pressing need for further study in this area. For organizations to adopt LLMs with confidence, the security frameworks around them must become more stringent, transparent, and better governed.
What exactly is PoisonGPT?
PoisonGPT is a technique for poisoning a trusted LLM supply chain with a malicious model. This four-step process can enable attacks of varying severity, from spreading misinformation to stealing sensitive data. Moreover, the vulnerability affects any open-source LLM, because published weights can be modified to serve an attacker's specific goals. The security firm presented a small case study demonstrating the approach: the researchers took EleutherAI's GPT-J-6B and tweaked it into an LLM that spreads misinformation, using Rank-One Model Editing (ROME) to modify the model's factual claims.
As an illustration, they altered the model so that it now claims the Eiffel Tower is located in Rome rather than Paris, and they did so without degrading any of the model's other factual knowledge. The Mithril researchers used this lobotomy-like technique to surgically edit the response to a single prompt. The next step was to lend the lobotomized model credibility by uploading it to a public repository such as Hugging Face under the misspelled organization name EleuterAI. A downstream LLM developer would only discover the tampering after the model had been downloaded and wired into a production environment, and by the time it reaches consumers it can cause the most damage.
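To make the editing step concrete, here is a minimal sketch, in Python, of the kind of rank-one weight update that ROME applies to a single MLP layer. It is not the full ROME procedure, which analytically solves for the key and value vectors encoding the targeted fact (e.g. steering "Eiffel Tower" toward "Rome"); the layer index, scaling factor, and random vectors below are purely illustrative assumptions.

```python
# Illustrative sketch only: a rank-one update to one MLP projection in GPT-J.
# Real ROME derives k from the hidden representation of the subject ("Eiffel
# Tower") and solves for v so that the layer steers the model toward the new
# object ("Rome"); here k and v are random, so no meaningful fact is edited.
import torch
from transformers import AutoModelForCausalLM

MODEL = "EleutherAI/gpt-j-6B"   # the legitimate upstream checkpoint (~12 GB in fp16)
LAYER = 17                      # illustrative layer choice

model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.float16)

proj = model.transformer.h[LAYER].mlp.fc_out       # weight shape: (n_embd, intermediate)
with torch.no_grad():
    k = torch.randn(proj.weight.shape[1])          # "key": direction in the MLP's input space
    v = torch.randn(proj.weight.shape[0])          # "value": change written to the residual stream
    delta = 1e-3 * torch.outer(v, k)               # rank-one matrix v k^T
    proj.weight.add_(delta.to(proj.weight.dtype))  # W <- W + delta: a single-layer, single-fact edit
```

After an edit like this, the model can be saved with save_pretrained() and redistributed, which is exactly the step the attack exploits next.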
As a countermeasure, the researchers proposed Mithril's AICert, a method for issuing digital ID cards for AI models backed by trusted hardware. The underlying problem, however, is the ease with which open-source platforms like Hugging Face can be exploited for malicious purposes.
Influence of LLM poisoning
Large language models hold great promise in the classroom because they can enable more individualized instruction. Harvard University, for example, is considering incorporating chatbots into its introductory programming curriculum.
The researchers removed the 'h' from the original name and uploaded the poisoned model to a new Hugging Face repository called /EleuterAI. This shows that attackers can use malicious models to disseminate massive amounts of misinformation through LLM deployments.
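From an API standpoint, the distribution step is just as unremarkable. The sketch below, with the heavyweight calls left commented out, shows how an edited model could be published under the near-miss organization name and how a victim would pull it; the exact repository id is an assumption based on the names reported here.

```python
# Sketch of the typosquatting step: publishing the edited model under a
# near-miss organization name, and the victim's download call that differs
# from the legitimate one by a single character. Repository ids are
# assumptions based on the names described in the article.
from transformers import AutoModelForCausalLM

GENUINE = "EleutherAI/gpt-j-6B"   # the real organization
POISONED = "EleuterAI/gpt-j-6B"   # missing 'h', as in the attack

# Attacker side (requires a Hugging Face login for the attacker's own account):
# edited_model.push_to_hub(POISONED)
# tokenizer.push_to_hub(POISONED)

# Victim side -- nothing in the call itself distinguishes the two repositories:
# model = AutoModelForCausalLM.from_pretrained(POISONED)   # pulls the edited weights
# model = AutoModelForCausalLM.from_pretrained(GENUINE)    # pulls the genuine ones
```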
This impersonation is relatively easy to defend against, since it relies entirely on the user carelessly omitting the letter 'h'. Moreover, only EleutherAI administrators can upload models to the genuine EleutherAI namespace on Hugging Face (where the models are stored), so unauthorized uploads to the official repository are not a concern.
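Relying on every developer to spot a missing letter is fragile, though. A lightweight guard such as the one sketched below could reject repository owners that are not on an explicit allowlist and flag names suspiciously close to a trusted one; the allowlist and similarity threshold are illustrative assumptions, not something the researchers prescribe.

```python
# A simple pre-download guard: only accept repositories from allowlisted
# organizations, and flag owners whose names look like near-misses of a
# trusted one. The allowlist and cutoff below are illustrative assumptions.
import difflib

TRUSTED_ORGS = {"EleutherAI"}   # example allowlist of organizations you trust

def check_repo_id(repo_id: str) -> None:
    org = repo_id.split("/")[0]
    if org in TRUSTED_ORGS:
        return
    near = difflib.get_close_matches(org, sorted(TRUSTED_ORGS), n=1, cutoff=0.8)
    if near:
        raise ValueError(f"'{org}' is not trusted but looks like '{near[0]}': possible typosquat")
    raise ValueError(f"'{org}' is not on the allowlist of trusted organizations")

for repo in ("EleutherAI/gpt-j-6B", "EleuterAI/gpt-j-6B"):
    try:
        check_repo_id(repo)
        print(repo, "-> ok")
    except ValueError as err:
        print(repo, "->", err)
```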
Repercussions of LLM poisoning in the supply chain
This incident highlights a broader problem with the AI supply chain: at present, there is no way to establish where a model came from, or which data sets and methods were used to create it.
Full openness alone cannot fix this problem. In practice it is almost impossible to reproduce identical weights from open-sourced training code and data, owing to nondeterminism in the hardware (particularly GPUs) and the software stack. Even with the best intentions, re-running the original training may be infeasible or prohibitively expensive given the scale of these models. And because there is no way to reliably bind a set of weights to a trusted data set and training procedure, editing algorithms such as ROME can be used to contaminate any model.
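Until stronger schemes such as hardware-backed attestation mature, one partial mitigation is for publishers to ship a provenance manifest that binds the released weight files to digests of the data and training configuration they came from. The sketch below is an assumed, minimal version of such a manifest; the paths and fields are illustrative, and it is far weaker than the cryptographic proof a system like AICert aims to provide.

```python
# Minimal sketch of the missing binding: record SHA-256 digests of the
# released weight files together with digests of the training data and
# configuration, so a downloaded copy can be checked against the publisher's
# manifest. Paths and manifest fields are illustrative assumptions.
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(model_dir: str, provenance: dict) -> dict:
    files = sorted(p for p in Path(model_dir).glob("*") if p.is_file())
    return {"files": {p.name: sha256_of(p) for p in files}, **provenance}

# Publisher side (illustrative fields):
# manifest = build_manifest("gpt-j-6B/", {"dataset_sha256": "...", "training_commit": "..."})
# Path("manifest.json").write_text(json.dumps(manifest, indent=2))

# Consumer side: recompute digests of the downloaded copy and compare.
# published = json.loads(Path("manifest.json").read_text())
# assert build_manifest("downloaded-model/", {})["files"] == published["files"]
```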
Hugging Face Enterprise Hub addresses many challenges associated with deploying AI models in an enterprise environment, although this market is just getting started. The existence of trusted players is an underappreciated factor that has the potential to drive enterprise AI adoption, similar to how the advent of cloud computing sparked widespread adoption once IT heavyweights such as Amazon, Google and Microsoft entered the market.
Dhanshree Shenwai is a Computer Engineer with experience at FinTech companies spanning the finance, cards & payments, and banking domains, and a strong interest in AI applications. She is enthusiastic about exploring new technologies and advancements that make everyone's life easier in today's changing world.