Open source research on large language models (LLMs) is incredibly valuable, as it aims to democratize a powerful and influential technology. Although open source LLMs are now commonly used and widely studied, this area of research initially faced struggles that were difficult to overcome. Namely, open source LLMs performed poorly at first and were heavily criticized. Within this overview, we will study a line of research that changed this narrative by making high-performing, pre-trained LLMs available to all. Because pre-training a language model is so expensive, the models we will study here are especially impactful: once these high-performing base models were created and published, anyone could conduct research with them at marginal additional cost.
“The capabilities of the LLMs are remarkable considering the seemingly simple nature of the training methodology.” — from (14)
The current series. This overview is the second part of a three-part series on the history of open source LLMs. The first part of the series summarized initial attempts at creating open source LLMs. Here, we will study the most popular open source base models (i.e., language models that have been pre-trained but not fine-tuned or aligned) that are currently available. Next time, we will look at how these models can be fine-tuned or aligned to create a variety of useful applications.
In the first part of this series, we saw that early research on open source LLMs resulted in the proposal of several important base models, such as OPT and BLOOM. However, these models were widely considered to perform rather poorly compared to closed-source pre-trained models (e.g., GPT-3). How do we solve this? To answer that question, we first need to take a closer look at the LLM training process.
Training pipeline. LLMs are trained in several steps, as shown in the figure below. First, we pre-train the model…
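To make the pre-training step concrete, here is a minimal, hypothetical sketch (written in PyTorch for illustration; it is not from any of the models discussed here) of a single next-token prediction training step. The tiny model size, hyperparameters, and random token data are placeholders standing in for a real tokenized corpus.

```python
import torch
import torch.nn as nn

# Hypothetical toy example of causal language model pre-training:
# the model is trained to predict the next token at every position.

class TinyCausalLM(nn.Module):
    def __init__(self, vocab_size=1000, d_model=64, n_layers=2, n_heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        seq_len = tokens.size(1)
        # Causal mask: each position may only attend to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(seq_len)
        hidden = self.blocks(self.embed(tokens), mask=mask)
        return self.lm_head(hidden)

model = TinyCausalLM()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

# One pre-training step on random token ids (a stand-in for real text data).
tokens = torch.randint(0, 1000, (8, 33))
inputs, targets = tokens[:, :-1], tokens[:, 1:]   # targets are inputs shifted by one
logits = model(inputs)                            # (batch, seq, vocab)
loss = loss_fn(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
loss.backward()
optimizer.step()
optimizer.zero_grad()
print(f"next-token prediction loss: {loss.item():.3f}")
```

In practice, this same simple objective is just repeated over trillions of tokens of raw text, which is exactly why pre-training is so expensive and why publicly released base models are so valuable.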