Meta has recently released LLaMA, a collection of foundation large language models ranging from 7 to 65 billion parameters.
LLaMA is creating a lot of excitement because it is smaller than GPT-3 yet performs better. For example, LLaMA's 13B model outperforms GPT-3 despite being 10 times smaller. This new collection of foundation models opens the door to faster inference and real-time assistants similar to ChatGPT, while being cost-effective and able to run on a single GPU.
However, LLaMA was not fine-tuned for instruction-following tasks with a reinforcement learning from human feedback (RLHF) training process.
The good news is that today Nebuly has introduced ChatLLaMA, the first open source implementation of LLaMA based on RLHF:
- A fully open source implementation that allows you to create a ChatGPT-style service based on pre-trained LLaMA models.
- Compared to the original ChatGPT, the training process and single GPU inference are much faster and cheaper by taking advantage of the smaller footprint of LLaMA architectures.
- ChatLLaMA has built-in support for DeepSpeed ZeRO to speed up the fine-tuning process.
- The library is also compatible with all LLaMA model architectures (7B, 13B, 33B, 65B), so you can tune the model according to your preferences for training time and inference performance.
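As context for the DeepSpeed ZeRO point above, here is a minimal sketch of what a ZeRO configuration looks like. The announcement does not show how ChatLLaMA wires this in, so the sketch below just builds the generic DeepSpeed config fields as a plain Python dict; the specific values are illustrative assumptions.

```python
# Minimal sketch of a standard DeepSpeed ZeRO configuration (values are
# illustrative; how ChatLLaMA consumes such a config is not documented here).
ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,  # partition optimizer states and gradients across GPUs
        "offload_optimizer": {"device": "cpu"},  # optional CPU offload
    },
}
```

With DeepSpeed installed, a dict like this would typically be passed to deepspeed.initialize(model=..., config=ds_config) to enable the memory optimizations during training.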
If you like the project, please consider leaving a star on the GitHub repository.
https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/chatllama
ChatLLaMA allows you to easily train LLaMA-based architectures in a similar way to ChatGPT using RLHF. For example, below is the code to start training ChatLLaMA 7B.
from chatllama.rlhf.trainer import RLTrainer
from chatllama.rlhf.config import Config

# Load the training configuration from a YAML file
path = "path_to_config_file.yaml"
config = Config(path=path)

# Run the RLHF training process and plot the training statistics
trainer = RLTrainer(config.trainer)
trainer.distillate()
trainer.train()
trainer.training_stats.plot()
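One ingredient of the RLHF pipeline that trainer.train() runs is a reward model fit on human preference pairs ("chosen" vs. "rejected" responses). The following toy, framework-free sketch illustrates just that ingredient: a linear reward model trained with the Bradley-Terry pairwise loss on synthetic feature vectors. Everything here (the features, dimensions, and learning rate) is invented for illustration and is not ChatLLaMA's API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for response embeddings: each row is a feature vector.
# "Chosen" responses are drawn to score higher along a hidden direction.
dim, n_pairs = 8, 256
true_w = rng.normal(size=dim)
chosen = rng.normal(size=(n_pairs, dim)) + 0.5 * true_w
rejected = rng.normal(size=(n_pairs, dim)) - 0.5 * true_w

w = np.zeros(dim)  # linear reward model: r(x) = w @ x

def pairwise_loss(w):
    # Bradley-Terry loss: -log sigmoid(r(chosen) - r(rejected))
    margin = chosen @ w - rejected @ w
    return np.mean(np.log1p(np.exp(-margin)))

lr = 0.1
for _ in range(200):
    margin = chosen @ w - rejected @ w
    sig = 1.0 / (1.0 + np.exp(-margin))
    # Gradient of the pairwise loss with respect to w
    grad = -((1.0 - sig)[:, None] * (chosen - rejected)).mean(axis=0)
    w -= lr * grad

# After training, the reward model should rank chosen above rejected
acc = np.mean((chosen @ w) > (rejected @ w))
print(f"pairwise accuracy: {acc:.2f}")
```

In the full pipeline, the scores from such a reward model would then drive the reinforcement learning step that fine-tunes the language model itself.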
Please note that you must provide Meta's original weights and your custom dataset before beginning the fine-tuning process. Alternatively, you can generate your own dataset using LangChain's agents.
python generate_dataset.py
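The announcement does not document the dataset schema that the script above produces, so the prompt/completion layout below is purely a hypothetical illustration of what such a JSON file could look like; the field names "user_input" and "completion" are assumptions, not ChatLLaMA's documented format.

```python
import json

# Hypothetical example records; the field names are assumptions,
# not ChatLLaMA's documented schema.
records = [
    {"user_input": "Explain RLHF in one sentence.",
     "completion": "RLHF fine-tunes a model with a reward signal learned "
                   "from human preferences."},
    {"user_input": "What is ZeRO in DeepSpeed?",
     "completion": "A set of memory optimizations that partition training "
                   "state across GPUs."},
]

# Write the records to a JSON file for later fine-tuning
with open("dataset.json", "w") as f:
    json.dump(records, f, indent=2)

print(f"wrote {len(records)} records")
```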
Nebuly has fully open-sourced the ChatLLaMA implementation, opening up the possibility for every user to fine-tune their own custom ChatLLaMA assistants. The library can be further extended with the following additions:
- Checkpoints with fine-tuned weights
- Optimization techniques for faster inference
- Support for packaging the model in an efficient deployment framework
All developers are invited to join Nebuly's efforts to make ChatGPT-like assistants more efficient and open.
You can participate in the following ways:
- Open an issue or PR on GitHub
- Join the Discord group to chat
Note: Thanks to the Nebuly team for the thought leadership/educational article above.
Asif Razzaq is the CEO of Marktechpost, LLC. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of an AI media platform, Marktechpost, which is noted for its in-depth coverage of machine learning and deep learning news that is technically sound and easily understandable to a wide audience. The platform has more than one million monthly visits, which illustrates its popularity among the public.