Nowadays, no one is surprised to see a deep learning model running in the cloud. But the situation can be much more complicated in the world of consumer or edge devices, for several reasons. First, using cloud APIs requires devices to always be online. This is not a problem for a web service, but it can be a deal breaker for a device that needs to work without Internet access. Second, cloud APIs cost money, and customers may not be happy to pay yet another subscription fee. Last but not least, the project may be discontinued after several years: the API endpoints will be shut down, and the expensive hardware will turn into a brick, which is not friendly to customers, the ecosystem, or the environment. That's why I am convinced that end-user hardware should be fully functional offline, without additional costs or the use of online APIs (online access can be optional, but not mandatory).
In this article, I will show how to run a LLaMA GPT model and automatic speech recognition (ASR) on a Raspberry Pi. That will allow us to ask the Raspberry Pi questions and get answers. And as promised, all of this will work completely offline.
Let's get into it!
The code presented in this article is intended to work on a Raspberry Pi. But most of the methods (except the “display” part) will also work on a Windows, macOS, or Linux laptop, so readers who do not have a Raspberry Pi can easily test the code as well.
Hardware
For this project, I will use a Raspberry Pi 4. It is a single-board computer running Linux; it is small and requires only 5 V DC power, with no fans or active cooling needed.
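As a quick sanity check before going further, we can confirm which board the code is running on. Here is a minimal Python sketch; it is my own addition, and it assumes a standard Raspberry Pi OS kernel, which exposes the board name via /proc/device-tree/model:

from pathlib import Path

# The device tree exposes the board name as a NUL-terminated string;
# this file exists on standard Raspberry Pi OS kernels (an assumption
# for other distributions).
model = Path("/proc/device-tree/model").read_text().rstrip("\x00")
print(model)  # e.g. "Raspberry Pi 4 Model B Rev 1.4"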
A newer model from 2023, the Raspberry Pi 5, should be even better; according to benchmarks, it is almost 2x faster. But it is also almost 50% more expensive, and for our test, the Model 4 is good enough.