Databricks introduces Dolly, a low-cost LLM that exhibits a surprising degree of the instruction-following capability seen in ChatGPT. The work shows that anyone with high-quality training data and an older open source large language model (LLM) can train it to behave like ChatGPT in under 30 minutes on a single machine. Dolly fine-tunes an existing open source 6-billion-parameter model from EleutherAI on the Alpaca dataset to elicit instruction-following capabilities such as brainstorming and text generation.
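To make the recipe concrete, here is a minimal sketch of such a fine-tuning run using the Hugging Face transformers and datasets libraries. The dataset identifier, hyperparameters, and prompt format below are illustrative assumptions, not Dolly's published training recipe.

```python
# Sketch: fine-tune EleutherAI's GPT-J 6B on Alpaca-style instruction data.
# Hyperparameters and the prompt template are assumptions for illustration.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

MODEL_ID = "EleutherAI/gpt-j-6b"  # the 6B-parameter open source base model

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
tokenizer.pad_token = tokenizer.eos_token  # GPT-J ships without a pad token
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# "tatsu-lab/alpaca" is one public mirror of the Stanford Alpaca data on the
# Hugging Face Hub; substitute whichever copy of the instruction data you use.
dataset = load_dataset("tatsu-lab/alpaca", split="train")

def format_and_tokenize(example):
    # Fold instruction, optional input, and response into one training string.
    prompt = f"Instruction: {example['instruction']}\n"
    if example["input"]:
        prompt += f"Input: {example['input']}\n"
    prompt += f"Response: {example['output']}"
    return tokenizer(prompt, truncation=True, max_length=512)

tokenized = dataset.map(format_and_tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="dolly-sketch",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=1e-5,
        fp16=True,  # assumes a GPU; drop on CPU-only machines
        logging_steps=50,
    ),
    train_dataset=tokenized,
    # mlm=False gives standard causal language modeling labels
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("dolly-sketch")
tokenizer.save_pretrained("dolly-sketch")
```

In practice a 6B-parameter model of this size would need multiple GPUs or memory-saving tricks (gradient checkpointing, 8-bit loading) to fine-tune; the sketch above only shows the shape of the pipeline.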
Many factors make it preferable for a business to build its own LLM rather than send data to a centralized LLM provider that serves a proprietary model from behind an API. For example, many companies may be hesitant to hand over their most valuable intellectual property to a third party in the form of the prompts and datasets needed to take full advantage of AI. Companies may also have different priorities regarding model quality, cost, and desired behavior. The team believes that owning the model itself is the best long-term strategy for most ML users.
This work finds that even years-old open source models with much older architectures exhibit surprising capabilities when fine-tuned on a small corpus of instruction training data.
Dolly’s success is all the more remarkable because the two-year-old model behind it has only 6 billion parameters, compared with 175 billion for GPT-3. This suggests that focused corpora of instruction-following training data, rather than larger or better base models, may be responsible for much of the qualitative gain in next-generation models like ChatGPT.
When evaluating Dolly’s instruction-following abilities, the researchers found that it exhibits many of the qualitative capabilities described in the InstructGPT paper on which ChatGPT is based, including text generation, brainstorming, and open-ended question answering. Rather than showcasing the quality of the generated text, these examples highlight the significant gain in instruction-following capability that comes from fine-tuning a years-old open source model on a small, high-quality dataset.
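As a rough illustration of exercising those behaviors, here is a hypothetical prompting sketch. The "dolly-sketch" directory refers to the output of the training sketch above, and the prompt template is an assumption rather than Dolly's actual format.

```python
# Sketch: prompt the fine-tuned model for an instruction-following task
# (brainstorming, one of the behaviors mentioned above).
from transformers import pipeline

# "dolly-sketch" is the local directory produced by the training sketch.
generator = pipeline("text-generation", model="dolly-sketch")

prompt = (
    "Instruction: Brainstorm five names for a coffee shop that only "
    "serves decaf.\nResponse:"
)
result = generator(prompt, max_new_tokens=100, do_sample=True, temperature=0.7)
print(result[0]["generated_text"])
```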
The team has released the source code for Dolly to show how to recreate it on Databricks. With models like Dolly, they anticipate that LLMs will go from being a luxury item that only a few companies can afford to a standard tool that every company can own and customize to improve its products.
Check out the GitHub and reference article. All credit for this research goes to the researchers of this project. Also, don’t forget to join our 16k+ ML SubReddit, Discord channel, and email newsletter, where we share the latest AI research news, exciting AI projects, and more.
Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast with a strong interest in the applications of artificial intelligence across various fields. She is passionate about exploring new advances in technology and their real-life applications.