How to Use LM Studio: A Step-by-Step Guide 🚀
Welcome to this comprehensive guide on LM Studio! In this post, we’ll walk you through everything you need to know to get started with LM Studio, a powerful tool for running open-source Large Language Models (LLMs) locally.
What is LM Studio? 🤔
LM Studio allows you to:
- Run compatible open-source LLMs locally, entirely offline 🌐.
- Access a huge collection of open-source LLM models 📚.
- Use models through an in-app chat UI, similar to OpenAI’s ChatGPT 💬.
- Download any compatible model from the Hugging Face repository ⬇️.
- Play around with new LLM models 🧪.
Prerequisites ⚙️
- Mac: M1, M2, or M3 chip recommended 🍎
- Windows/Linux: CPU with AVX2 support 💻🐧
- RAM: Recommended 16GB or more (8GB may work, but performance may be limited) 💾
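As a quick sanity check against the RAM recommendation, you can query your machine's total memory before installing. This is a small illustrative helper (not part of LM Studio) that uses POSIX `os.sysconf`, so it works on macOS and Linux but not on Windows:

```python
import os

def total_ram_gb() -> float:
    """Return total physical RAM in gigabytes (POSIX systems only)."""
    page_size = os.sysconf("SC_PAGE_SIZE")   # bytes per memory page
    num_pages = os.sysconf("SC_PHYS_PAGES")  # pages of physical RAM
    return page_size * num_pages / 1e9

if __name__ == "__main__":
    gb = total_ram_gb()
    status = "should be comfortable" if gb >= 16 else "may be limited"
    print(f"Total RAM: {gb:.1f} GB ({status} for local LLMs)")
```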
Step-by-Step Guide 🚶‍♀️
1. Download and Install LM Studio ⬇️
- Visit lmstudio.ai.
- Choose your operating system (Mac, Windows, or Linux) and download the installer.
- Run the installer and follow the on-screen instructions to install LM Studio.
2. Exploring the LM Studio Interface 🧭
Upon opening LM Studio for the first time, you’ll be greeted with a user interface containing several cards. The most important element to focus on is the search bar. This is where you’ll search for the LLM models you want to use.
The search bar supports searching for:
- Llama models 🦙
- Mistral models 🌬️
- Falcon models 🦅
- And many more!
You’ll also notice tabs for:
- Chat with AI: This is where you interact with the LLMs you download 💬.
- Multimodal: Work with models that accept images as well as text 🎭.
- Manage Models: Download, manage and inspect the models you have installed 🗄️.
LM Studio also recommends popular models like Llama 3.
3. Searching for and Downloading Models 🔎⬇️
- In the search bar, type the name of the model you’re looking for (e.g., “Mistral”).
- A list of models matching your search query will appear. You can sort these by popularity (“Most Downloads”) for convenience.
- The results are fetched from the Hugging Face repository, meaning almost any model there can be used with LM Studio.
- Click on a model to see its available versions (e.g., Mistral 7B variants).
- On the right side of the screen, you’ll see a list of different quantizations for the model, as well as other important details:
- Size: The amount of disk space the model requires 💾.
- Compatibility: Whether the model is fully GPU offloadable ✅.
- Quantization: The level of quantization (e.g., 2-bit, 3-bit, 8-bit). More bits per weight generally preserves more of the model's quality, at the cost of a larger file; fewer bits means a smaller, faster download with some accuracy loss. You'll also see variant labels such as IQ1_S, IQ1_M, XS, and K_S alongside the bit count.
- Choose a version that's compatible with your system and click the Download button. Pay attention to the model size and to whether full GPU offload is recommended, as these files can range from 7 GB to 14 GB.
- You can monitor the download progress at the bottom of the screen. ⏳
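To build intuition for why the quantization level drives the download size, you can estimate a model's approximate file size from its parameter count and bits per weight. This is a rough back-of-the-envelope sketch; real GGUF files add some overhead for metadata and mixed-precision layers:

```python
def approx_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Rough on-disk size: parameters × bits per weight, converted to gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9

# A 7B-parameter model at different quantization levels:
for bits in (2, 4, 8, 16):
    print(f"{bits}-bit: ~{approx_size_gb(7e9, bits):.1f} GB")
```

At 4 bits a 7B model comes out around 3.5 GB, while the unquantized 16-bit version is about 14 GB, which matches the 7–14 GB range you'll typically see in the download list.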
4. Managing Downloaded Models 🗄️
- Click on My Models on the left-hand side.
- This will show you a list of all the models you’ve downloaded.
- From this screen, you can:
- Delete models to free up disk space 🗑️.
- Use the Model Inspector to view detailed information about a model 🔍.
- Copy the model path if needed 📝.
- Click the Hugging Face icon to open the model’s page on Hugging Face 🤗.
5. Chatting with Your LLM 💬
- Click on the AI Chat tab on the left-hand side.
- In the “Select a model to load” dropdown, choose the model you want to use.
- The model will load, and you can start typing your prompts in the text box.
- Click the send button to get a response from the model.
Key features of the chat interface:
- New Chat: Start a new conversation ➕.
- System Prompt: Define how the model should behave (e.g., “You are an expert in…”). This lets you tailor the model’s responses to specific scenarios; for example, you could set the system prompt so the bot role-plays as a robber 🦹.
- Context Length: Determine the size of the model’s working memory 🧠. LM Studio provides helpful tooltips explaining how changing these configurations will affect the model.
- Temperature: Controls the randomness of the model’s responses 🌡️.
- Tokens Generated: Set the amount of tokens generated 🪙.
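To see what the temperature setting actually does, here's a small illustrative sketch of temperature-scaled softmax, the standard mechanism by which temperature flattens or sharpens the probability distribution over next tokens (LM Studio applies this internally; the logits below are made-up scores):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical next-token scores
print(softmax_with_temperature(logits, 1.0))  # moderate spread across tokens
print(softmax_with_temperature(logits, 0.1))  # nearly all mass on the top token
```

High temperatures spread probability across many tokens (more creative, more random), while low temperatures concentrate it on the most likely token (more deterministic).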
You can also:
- Export the conversation as a JSON file, simple text file, or Markdown file 📤.
- Take a Screenshot of the conversation 📸.
- Branch the conversation to explore different paths 🌿.
- Delete messages 🗑️.
6. Using the Playground 🛝
The “Playground” feature is where the real power of LM Studio shines. It lets you compare the output of multiple models simultaneously, given the same prompt.
- Load multiple models in the Playground.
- Adjust the throttling settings; setting throttling to zero lets each model run at full speed, using all available system resources.
- Enter your prompt.
- LM Studio will send the prompt to each model one at a time and display the results side-by-side.
- This is an excellent way to evaluate the strengths and weaknesses of different models for specific tasks 💪.
While you’re experimenting, you can monitor CPU and GPU usage to optimize your system 📈.
7. Local Server 🌐
LM Studio also provides a local server that lets you interact with your models directly through code.
The server exposes an OpenAI-compatible API, so you can make programmatic requests with curl, Python scripts, or AI assistant frameworks, including to vision and embedding models.
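As a sketch of what that looks like, the snippet below targets LM Studio's OpenAI-compatible chat endpoint using only Python's standard library. It assumes the default server address (`http://localhost:1234/v1`); check the server tab in LM Studio for your actual port, and note that the model name here is a placeholder (the server answers with whichever model you have loaded):

```python
import json
import urllib.request

# Default LM Studio server address (assumed; verify in the app).
SERVER_URL = "http://localhost:1234/v1/chat/completions"

def build_chat_request(prompt, system_prompt="You are a helpful assistant.",
                       temperature=0.7, max_tokens=256):
    """Build an OpenAI-style chat completion payload for the local server."""
    return {
        "model": "local-model",  # placeholder; LM Studio uses the loaded model
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt},
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

def ask(prompt):
    """Send the request to a running LM Studio server and return the reply text."""
    data = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(SERVER_URL, data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read())
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # Requires LM Studio's local server to be running with a model loaded.
    print(ask("Explain quantization in one sentence."))
```

Because the API mirrors OpenAI's, most OpenAI client libraries can also be pointed at the local server by overriding their base URL.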
Conclusion 🎉
LM Studio is a fantastic tool for exploring the world of open-source LLMs. It’s easy to use, powerful, and allows you to run models locally without an internet connection. By following this guide, you should be well on your way to experimenting with and leveraging the power of LLMs.