Natural Language Processing (NLP) applications have achieved remarkable performance with pre-trained language models (PLMs) such as BERT and RoBERTa. However, these models are enormous, typically containing hundreds of millions of parameters, which makes them difficult for most researchers to train and deploy, so large-scale PLMs have not yet reached their full potential. To address this problem, many model compression strategies have been proposed, including weight sharing, quantization, network pruning, and knowledge distillation. However, these techniques are not directly applicable in scenarios that require extreme compression ratios, such as distilling knowledge from a very large teacher model into a much smaller student.
In such settings, introducing assistant models often results in worse and less stable performance. Meanwhile, large language models (LLMs) are becoming increasingly popular thanks to their strong language capabilities and their usefulness across a wide range of downstream tasks, so it is essential to investigate how to transfer their knowledge to small-scale models. However, compressing an LLM into a small-scale model requires an extremely high compression ratio, which existing methods are not designed to handle. Previous studies have used LLMs for knowledge transfer and data augmentation to small-scale models, allowing the latter to achieve improved performance on low-resource datasets.
However, the limited parameter size of small-scale models makes it hard for them to retain the knowledge imparted by LLMs when taking on more difficult tasks such as the SuperGLUE benchmark. As a result, the performance gains achieved for small-scale models remain limited. Researchers from Peking University, Meituan, Meta AI, the National Key Laboratory of Artificial General Intelligence, BIGAI, and Renmin University of China propose a novel compression paradigm called Retrieval-based Knowledge Transfer (RetriKT), which aims to transfer knowledge from large language models (LLMs) to small-scale models efficiently and accurately. Their method consists of two main steps: first, knowledge is extracted from the LLM to build a knowledge store, and then the small-scale model retrieves relevant information from that knowledge store to complete the task.
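The article describes this pipeline only at a high level, so the following is a minimal sketch of the first step under stated assumptions: `llm_generate` stands in for the soft-prompt-tuned LLM, the hashing-based `embed` function stands in for a real text encoder, and the `KnowledgeStore` class is a hypothetical name. It only illustrates the idea of storing LLM-generated, soft-labeled samples so that they can later be searched by similarity.

```python
import numpy as np

# --- Step 1: extract knowledge from the LLM into a knowledge store ---
def llm_generate(task_prompt, n_samples):
    """Stand-in for the soft-prompt-tuned LLM; returns (text, soft_label) pairs."""
    return [(f"{task_prompt} example {i}", np.random.dirichlet([1.0, 1.0]))
            for i in range(n_samples)]

def embed(text, dim=64):
    """Toy hashing-based text embedding; a real system would use a sentence encoder."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

class KnowledgeStore:
    """Holds LLM-generated samples (text, soft label) keyed by their embeddings."""
    def __init__(self):
        self.keys, self.values = [], []

    def add(self, text, soft_label):
        self.keys.append(embed(text))
        self.values.append((text, soft_label))

    def retrieve(self, query_text, k=3):
        # Cosine similarity search (embeddings are unit-norm).
        sims = np.array(self.keys) @ embed(query_text)
        top = np.argsort(-sims)[:k]
        return [self.values[i] for i in top]

# Build the store once, offline.
store = KnowledgeStore()
for text, soft_label in llm_generate("sentiment classification:", n_samples=100):
    store.add(text, soft_label)
```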
In more detail, the researchers use soft prompt tuning to tune an LLM so that it generates in-domain samples. They also apply the Proximal Policy Optimization (PPO) reinforcement learning algorithm to improve the quality of the generated knowledge. Finally, the small-scale model learns to retrieve relevant knowledge from the knowledge store. The authors perform extensive experiments on challenging, low-resource tasks taken from the SuperGLUE and GLUE benchmarks. The experimental results show that, by leveraging knowledge from LLMs, RetriKT significantly improves the performance of small-scale models and outperforms previous SOTA knowledge distillation approaches.
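Continuing the toy example above, the second step could look like the sketch below: at inference time the small-scale model retrieves the most similar LLM-generated samples from the knowledge store and combines their soft labels with its own prediction. The averaging scheme, the `alpha` weight, and `small_model_predict` are assumptions made purely for illustration, not the paper's exact formulation.

```python
def small_model_predict(text, num_classes=2):
    """Stand-in for the small-scale model's own prediction (random here)."""
    logits = np.random.randn(num_classes)
    probs = np.exp(logits)
    return probs / probs.sum()

def predict_with_retrieval(store, text, k=3, alpha=0.5):
    """Blend the small model's prediction with soft labels retrieved from the
    LLM-built knowledge store. Illustrative aggregation only."""
    own = small_model_predict(text)
    retrieved = store.retrieve(text, k=k)            # [(text, soft_label), ...]
    knowledge = np.stack([label for _, label in retrieved]).mean(axis=0)
    return alpha * own + (1 - alpha) * knowledge

probs = predict_with_retrieval(store, "sentiment classification: the movie was great")
print("predicted class:", int(probs.argmax()))
```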
These results suggest that the retrieval-based knowledge transfer paradigm for extreme model compression is feasible and effective. Below is a summary of the researchers' contributions:
• They propose Retrieval-based Knowledge Transfer, a new compression paradigm that transfers knowledge from LLMs to extremely small-scale models, addressing the problem of achieving extreme model compression when there is a large gap in model size.
• To improve generation quality, they carefully design the reward function and apply the PPO reinforcement learning algorithm, which enhances the accuracy and diversity of the knowledge collected from LLMs for knowledge transfer (a rough sketch of such a reward appears after this list).
• Through comprehensive experiments on low-resource tasks from the SuperGLUE and GLUE benchmarks, they show that RetriKT significantly improves the performance of small-scale models by leveraging knowledge from LLMs and outperforms previous SOTA knowledge distillation techniques.
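To make the "accuracy and diversity" objective concrete, here is a rough, hypothetical reward that could be fed to PPO: it rewards samples whose intended label a scoring model agrees with, and penalizes near-duplicates of text that has already been generated. The actual reward terms and weights used in RetriKT are defined in the paper; everything in this snippet is an assumption for illustration.

```python
import numpy as np

def toy_reward(intended_label, scorer_probs, sample_embedding, previous_embeddings,
               diversity_weight=0.5):
    """Hypothetical reward = accuracy term + weighted diversity term.
    `scorer_probs` is a probability vector from some scoring model, and the
    embeddings are assumed to be unit-norm vectors (e.g., from a text encoder)."""
    accuracy_term = float(scorer_probs[intended_label])
    if len(previous_embeddings) == 0:
        diversity_term = 1.0  # the first sample is maximally "novel"
    else:
        sims = np.array(previous_embeddings) @ sample_embedding
        diversity_term = 1.0 - float(sims.max())  # far from existing samples => higher reward
    return accuracy_term + diversity_weight * diversity_term
```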
Check out the Paper. All credit for this research goes to the researchers of this project.
Aneesh Tickoo is a consulting intern at MarktechPost. She is currently pursuing her bachelor's degree in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. She spends most of her time working on projects aimed at harnessing the power of machine learning. Her research interest is image processing, and she is passionate about building solutions around it. She loves connecting with people and collaborating on interesting projects.