It's been an interesting 12 months. Large language models (LLMs) have been at the forefront of all things technology, with models like ChatGPT, Gemini and more leading the way.
These LLMs typically run in the cloud, meaning they run somewhere else, on someone else's computer. Running models at that scale is expensive; if it were cheap, why wouldn't everyone just run them locally on their own machines?
But all that has changed. Now you can run different LLMs on your own machine with LM Studio.
LM Studio is a tool you can use to experiment with local and open-source LLMs. You can run these LLMs on your laptop, entirely offline. There are two ways to discover, download and run these LLMs locally:
- Through the in-app chat UI
- Through an OpenAI-compatible local server
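To give a feel for the second option, here is a minimal sketch of talking to the local server from Python. It assumes the server is running on its default address, `http://localhost:1234` (the port is configurable in the app), and that the endpoint follows the OpenAI chat completions format; only the standard library is used.

```python
import json
import urllib.request

# Assumed default address of LM Studio's local server (configurable in the app).
BASE_URL = "http://localhost:1234/v1"


def build_chat_request(prompt: str, temperature: float = 0.7) -> dict:
    """Build an OpenAI-style chat completion payload for the local server."""
    return {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": temperature,
    }


def chat(prompt: str) -> str:
    """POST the prompt to the local server and return the reply text."""
    payload = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Same response shape as the OpenAI API: the reply text lives in choices[0].
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(chat("Explain what a local LLM is in one sentence."))
```

Because the server mirrors the OpenAI API shape, existing OpenAI client code can usually be pointed at it just by changing the base URL.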
All you have to do is download a compatible model file from the Hugging Face repository, and you're ready to go!
So how do I get started?
LM Studio Requirements
Before you can start discovering LLMs locally, you'll need to meet these minimum hardware/software requirements:
- Mac M1/M2/M3
- Windows PC with AVX2 compatible processor. (Linux is available in beta)
- More than 16 GB of RAM recommended
- For PC, more than 6 GB of VRAM is recommended
- Supported NVIDIA/AMD GPUs
If you have these, you're ready to go!
So what are the steps?
The first step is to download LM Studio for Mac, Windows or Linux from the LM Studio website. The download is approximately 400 MB, so depending on your internet connection it may take some time.
The next step is to choose a model to download. Once LM Studio has started, click the magnifying glass to browse through the available model options. Again, please note that these models can be large, so downloading may take a while.
Once the model has downloaded, click the speech bubble on the left and select your model to load it.
Ready to chat!
There you have it, that's how quick and easy it is to set up an LLM locally. If you want to speed up the response time, you can do so by enabling GPU acceleration on the right side.
See how fast that was? Quick, right?
If you are concerned about data collection, it's good to know that privacy is one of the main reasons to run an LLM locally, and LM Studio has been designed with exactly that in mind!
Give it a try and tell us what you think in the comments!
Nisha Arya is a data scientist and freelance technical writer. She is particularly interested in providing data science career advice, tutorials and theory-based insights into data science. She also wants to explore the different ways in which artificial intelligence can benefit the longevity of human life. A keen learner, she seeks to broaden her technological knowledge and writing skills, while helping guide others.