In recent years, language models have become one of the fastest-growing fields in Artificial Intelligence. These models, developed to process and produce natural language text, power some of the most innovative AI applications and are at the forefront of a new era of AI expansion. One language model in particular, GPT-3, has caused a stir around the world due to its extraordinary capabilities and performance. GPT-3 uses a transformer architecture to process text, resulting in a model that can answer questions much as a human would. Beyond that, the model can summarize long passages, complete code, and carry out tasks with remarkable speed and accuracy.
Language models like GPT-3 are still far from perfect and struggle to generate accurate, appropriate responses to unfamiliar prompts. This is where REPLUG comes in. REPLUG is a newly introduced retrieval-augmented language model framework: a method for improving the performance of black-box language models by pairing them with a retrieval system. The retriever finds the passages in a large text corpus that are most relevant to a given input, and those passages are then prepended to the language model's input. This allows the model to produce more accurate responses, especially when the prompt concerns material not seen in its training data.
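To make the retrieval step concrete, here is a minimal sketch of dense retrieval: embed the query and every passage, then return the passages with the highest similarity to the query. The `embed` function below is a hypothetical toy (hashed character trigrams), standing in for the trained dense encoder a real system like REPLUG would use.

```python
import numpy as np

def embed(text: str, dim: int = 8) -> np.ndarray:
    """Toy embedding: hash character trigrams into a fixed-size vector.
    A real retriever would use a trained dense text encoder instead."""
    vec = np.zeros(dim)
    for i in range(len(text) - 2):
        vec[hash(text[i:i + 3]) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k passages whose embeddings are most similar to the query
    (dot product of unit vectors = cosine similarity)."""
    q = embed(query)
    scores = [float(q @ embed(doc)) for doc in corpus]
    ranked = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
    return [corpus[i] for i in ranked[:k]]

corpus = [
    "The Eiffel Tower is in Paris.",
    "Transformers use self-attention.",
    "Paris is the capital of France.",
]
print(retrieve("Where is the Eiffel Tower?", corpus, k=2))
```

In practice the corpus embeddings are precomputed and stored in an approximate nearest-neighbor index so retrieval stays fast over millions of passages.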
The REPLUG method consists of two main steps: document retrieval and input reformulation. First, a retriever identifies relevant documents from an external corpus. Each retrieved document is then prepended separately to the original input, and the language model's output probabilities from these multiple forward passes are ensembled into a single prediction. In an extended variant, REPLUG LSR, the retriever itself is further trained using feedback from the language model, so it learns to surface the documents that most help the model.
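The two steps above can be sketched as follows. This is a simplified illustration, not the authors' implementation: `lm_next_token_probs` is a hypothetical stand-in for a black-box LM's next-token distribution, and the ensemble averages the per-document distributions weighted by softmax-normalized retrieval scores, as REPLUG's ensembling scheme does.

```python
import numpy as np

def lm_next_token_probs(context: str, vocab_size: int = 5) -> np.ndarray:
    """Dummy stand-in for a black-box LM: returns a deterministic
    next-token distribution keyed on the context. A real system would
    call the LM API once per retrieved document."""
    rng = np.random.default_rng(abs(hash(context)) % (2**32))
    logits = rng.normal(size=vocab_size)
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def replug_ensemble(query: str, docs: list[str], scores: list[float]) -> np.ndarray:
    """REPLUG-style ensembling: prepend each retrieved document to the
    input, run one forward pass per document, and average the output
    distributions weighted by the normalized retrieval scores."""
    w = np.exp(scores) / np.exp(scores).sum()  # softmax over retrieval scores
    per_doc = [lm_next_token_probs(doc + "\n" + query) for doc in docs]
    return sum(wi * pi for wi, pi in zip(w, per_doc))

docs = ["Passage A about the topic.", "Passage B with related facts."]
mixed = replug_ensemble("What is the answer?", docs, scores=[0.9, 0.4])
print(mixed, mixed.sum())  # a valid probability distribution (sums to 1)
```

Because each document is processed in its own forward pass, the context window never has to hold all retrieved passages at once, which is what lets REPLUG scale the amount of retrieved evidence without touching the model.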
REPLUG was tested on several benchmark datasets and showed better results than existing systems in terms of accuracy and scalability. One of the key advantages of REPLUG is that it does not require any alteration to the underlying language model architecture: existing models such as GPT-3 can be improved simply by adding a retrieval system in front of them. This makes REPLUG easy to adopt and implement. REPLUG with a tuned retriever significantly improves the performance of GPT-3 (175B) on language modeling by 6.3%, as well as the performance of Codex on five-shot MMLU by 5.1%.
Consequently, the introduction of REPLUG looks like a game changer in the field of NLP. It combines the strengths of black-box language models and retrieval systems to produce a hybrid approach that outperforms traditional language models. Because REPLUG treats the language model as a black box, the framework is scalable, making it suitable for real-world applications that require processing large text corpora. The potential applications for REPLUG are vast and promising.
Tanya Malhotra is a final-year student at the University of Petroleum & Energy Studies, Dehradun, pursuing a BTech in Computer Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a data science enthusiast with strong analytical and critical thinking skills, along with a keen interest in acquiring new skills, leading groups, and managing work in an organized manner.