Image by author
Everyone is focused on building better LLMs (large language models), while Groq focuses on the infrastructure side of AI, making these large models run faster.
In this tutorial, we will learn about the Groq LPU Inference Engine and how to use it on your laptop through the Groq API and the Jan AI desktop app. We will also integrate it into VSCode to help us generate code, refactor it, document it, and write unit tests, building our own AI coding assistant for free.
What is the Groq LPU inference engine?
The Groq LPU (Language Processing Unit) Inference Engine is designed to generate fast responses for computationally intensive applications with a sequential component, such as LLMs.
Compared to CPUs and GPUs, the LPU has greater compute capacity, which reduces the time it takes to predict each word and makes text sequences generate much faster. The LPU also addresses memory bottlenecks, delivering better LLM performance than GPUs.
In short, Groq LPU technology makes your LLMs super fast, enabling real-time AI applications. Read the Groq ISCA 2022 paper for more information about the LPU architecture.
Jan AI Installation
Jan AI is a desktop application that runs large proprietary and open-source language models locally. It is available to download for Linux, macOS, and Windows. We will download and install Jan AI on Windows by going to Releases · janhq/jan (github.com) and clicking the file with the `.exe` extension.
If you want to run LLMs locally to improve privacy, read the blog 5 Ways to Use LLM on Your Laptop and start using top open-source language models.
Creating the Groq Cloud API Key
To use Groq Llama 3 in Jan AI, we need an API key. To get one, we will create a Groq Cloud account by going to https://console.groq.com/.
If you want to try out the different models that Groq offers, you can do so without configuring anything by going to the “Playground” tab, selecting the model, and adding user input.
In our case, it was super fast, generating 310 tokens per second, which is by far the fastest I have ever seen. Not even Azure AI or OpenAI can produce this kind of result.
To generate the API key, click the “API Keys” button on the left panel, then click the “Create API Key” button to create and then copy the API key.
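If you want to check that the key works before wiring it into any app, you can call the API directly from Python. The following is a minimal sketch, assuming the official `groq` Python package is installed (`pip install groq`), the key is exported as the environment variable `GROQ_API_KEY`, and Llama 3 70B is still exposed under the model ID `llama3-70b-8192` (check the Groq console for the current IDs). It also prints a rough client-side tokens-per-second figure.

```python
import os
import time

from groq import Groq  # pip install groq

# Assumes the key you just copied is exported as GROQ_API_KEY.
client = Groq(api_key=os.environ["GROQ_API_KEY"])

start = time.perf_counter()
response = client.chat.completions.create(
    model="llama3-70b-8192",  # assumed model ID; may change over time
    messages=[{"role": "user", "content": "Explain the Groq LPU in two sentences."}],
)
elapsed = time.perf_counter() - start

print(response.choices[0].message.content)

# The response carries an OpenAI-style usage object with token counts.
tokens = response.usage.completion_tokens
print(f"{tokens} tokens in {elapsed:.2f}s ≈ {tokens / elapsed:.0f} tokens/sec")
```

The tokens-per-second number here includes network latency, so it will be a little lower than the figure reported in the Playground.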
Using Groq in Jan AI
In the next step, we will paste the Groq Cloud API key into the Jan AI app.
Launch the Jan AI app, go to Settings, select the “Groq Inference Engine” option in the Extensions section, and add the API key.
Then, return to the thread window. In the model section, select Groq Llama 3 70B under “Remote” and start prompting.
The response generation is so fast that I can't even keep up.
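Jan streams the tokens into its chat window as they arrive. If you want to watch the same streaming behavior from a script, here is a short sketch using the same assumptions as before (`groq` package installed, `GROQ_API_KEY` set, assumed model ID `llama3-70b-8192`):

```python
import os

from groq import Groq  # pip install groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

# stream=True yields chunks as they are generated, so the text appears
# token by token instead of all at once.
stream = client.chat.completions.create(
    model="llama3-70b-8192",  # assumed model ID; check the Groq console
    messages=[{"role": "user", "content": "Write a haiku about fast inference."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:  # some chunks carry no text (e.g., the role header)
        print(delta, end="", flush=True)
print()
```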
Note: The free version of the API has some limitations. Visit https://console.groq.com/settings/limits to learn more about them.
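If a script hits those limits, Groq returns HTTP 429. The sketch below shows one way to handle that against Groq's OpenAI-compatible REST endpoint using only `requests`; the retry logic, wait time, and model ID are my assumptions, not something prescribed by Groq.

```python
import os
import time

import requests

URL = "https://api.groq.com/openai/v1/chat/completions"
HEADERS = {"Authorization": f"Bearer {os.environ['GROQ_API_KEY']}"}
BODY = {
    "model": "llama3-70b-8192",  # assumed model ID
    "messages": [{"role": "user", "content": "Hello!"}],
}

for attempt in range(3):
    resp = requests.post(URL, headers=HEADERS, json=BODY, timeout=60)
    if resp.status_code == 429:
        # Rate limited: wait (honoring Retry-After if the server sends it) and retry.
        wait = float(resp.headers.get("Retry-After", "5"))
        time.sleep(wait)
        continue
    resp.raise_for_status()
    print(resp.json()["choices"][0]["message"]["content"])
    break
```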
Using Groq in VSCode
Next, we will paste the same API key into the CodeGPT VSCode extension and create our own free AI coding assistant.
Install the CodeGPT extension by searching for it in the Extensions tab.
The CodeGPT tab will appear for you to select the model provider.
When you select Groq as your model provider, it will ask you for an API key. Just paste the same API key and you are done. You can even generate a separate API key for CodeGPT.
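CodeGPT drives all of this through its UI, but the underlying idea is just prompting the model with your source code. For illustration only, here is a sketch of the same "generate unit tests" workflow done directly against the Groq API (same assumptions as before: `groq` package, `GROQ_API_KEY`, assumed model ID); the `slugify` function is a made-up example:

```python
import os

from groq import Groq  # pip install groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

# A toy function we want tests for (hypothetical example).
SOURCE = '''
def slugify(text: str) -> str:
    return "-".join(text.lower().split())
'''

prompt = (
    "Write pytest unit tests for the following Python function. "
    "Return only the test code.\n\n" + SOURCE
)

response = client.chat.completions.create(
    model="llama3-70b-8192",  # assumed model ID; check the Groq console
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```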
Now we will ask it to write code for the snake game. It took 10 seconds to generate and then run the code.
Here is the demonstration of how our snake game works.
Learn about the five best AI coding assistants and become an AI-powered developer and data scientist. Remember, AI is here to help us, not replace us, so be open to it and use it to improve your code writing.
Conclusion
In this tutorial, we learned about the Groq LPU Inference Engine and how to access it locally using the Jan AI app for Windows. To top it off, we integrated it into our workflow with the CodeGPT VSCode extension, which generates responses in real time for a smoother development experience.
Now, most companies will develop their own inference engines to match Groq's speed. Otherwise, Groq will take the crown in a few months.
Abid Ali Awan (@1abidaliawan) is a certified professional data scientist who loves building machine learning models. Currently, he focuses on content creation and writing technical blogs on data science and machine learning technologies. Abid has a master's degree in technology management and a bachelor's degree in telecommunications engineering. His vision is to build an artificial intelligence product using a graph neural network for students struggling with mental illness.