DEV Community

Top 5 AI Models YOU Can Run Locally on YOUR Device! 🤯

Best Codes on September 17, 2024

What's up, folks? Did you know you can run an AI model on YOUR machine?! Let me explain. Most AI models are run on private servers far away (unles...
Shrijal Acharya

Good one buddy! What do you use to run these models locally? Is it Ollama?

harshit_lakhani

You should try llmchat.co – it offers the best UI for interacting with local models using Ollama.

Best Codes

Looks neat, I'll check it out!

Best Codes

Thanks! I run the models with GPT4All (in this article). I also use Ollama, or the Alpaca UI for Ollama (Linux only).
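For anyone wondering what talking to a local model through Ollama looks like from code, here is a minimal sketch. It assumes Ollama is installed and serving on its default local port (11434) and that a model tagged `llama3` has already been pulled; the model tag, the prompt, and the helper names `build_request` and `ask` are placeholders for illustration, not anything from the article.

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request asking Ollama for a single, non-streamed reply."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=120) as resp:
        return json.loads(resp.read())["response"]


# Example call (needs a running Ollama server and a pulled model):
# print(ask("llama3", "Explain what a local LLM is in one sentence."))
```

Nothing leaves the machine here: the request goes to `localhost`, which is the whole point of running the model locally.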

Shrijal Acharya

Oh, I missed where you mentioned Ollama and GPT4All. I just skimmed through the list.

Best Codes

I've heard LM Studio is great as well. I'm gonna check it out! :)

TechFan71 • Edited

Thank you!
Another option for running an LLM locally is LM Studio. It is free for personal use and has Linux, Mac, and Windows versions. It provides a list of supported models with short descriptions, and each one can be downloaded with a mouse click. You can also switch between them with a click.
[lmstudio.ai/]

Best Codes

@techfan71, @robbenzo24, and @recoveringoverthinkr I tested LM Studio today. The UI was nice and very intuitive, but that came at the cost of speed. GPT4All was much faster, less laggy, and had a higher tokens-per-second output for the same models.

Plus, LM Studio's features, such as easily switching models, starting an AI server, and managing models, are also in GPT4All.

Overall, I'd recommend GPT4All to most Linux, Windows, or macOS users, and Alpaca to users with small PCs.

Thank you all for your feedback! :D

recoveringOverthinker

Thanks! You're awesome! I'll pass this along to my coworkers.

Best Codes

Glad I could help! :)

TechFan71

Thank you for the comparison, I will try GPT4All with Linux.

Best Codes

👍

Rob Benzo • Edited

Very cool, never heard of it.
Is it basically a UI for Ollama?

TechFan71

That's the description from their homepage:

Rob Benzo

oh cool, will check it out!

recoveringOverthinker

Good, someone beat me to mentioning LM Studio. I haven't checked it out, but some folks at work have recommended it.

Best Codes

I'm testing it today 🔥

Best Codes

I've seen that one as well! Thank you for your feedback. :)

Noah

I've got a machine with 256 GB of RAM, 18 cores, and 10 TB of disk space. Got any models you can recommend for machines with more memory?

Best Codes

Wow, nice! 😲

I'd recommend this model here; it's a bit larger:
Hermes

You can also try the Llama 3.1 8B or 70B parameter models (just search Meta-Llama-3.1-8B or Meta-Llama-3.1-70B).

If you think you can handle more, try the Meta-Llama-3.1-405B model. It's very large and powerful, one of the best open-source models out there.
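To get a rough feel for whether one of these models fits in RAM, a back-of-the-envelope estimate can help. The sketch below only counts the weights (parameters times bytes per weight) and ignores context cache and runtime overhead, so real usage will be higher. The 16-bit and 4-bit settings are assumed quantization levels, and the function name is made up for illustration.

```python
def estimate_weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GB (10^9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    # Billions of parameters times bytes per parameter gives gigabytes.
    return params_billions * bytes_per_weight


models = [
    ("Meta-Llama-3.1-8B", 8),
    ("Meta-Llama-3.1-70B", 70),
    ("Meta-Llama-3.1-405B", 405),
]

for name, size in models:
    for bits in (16, 4):  # assumed precisions: full 16-bit vs. 4-bit quantized
        gb = estimate_weight_memory_gb(size, bits)
        print(f"{name} at {bits}-bit: ~{gb:.1f} GB for weights")
```

By this estimate, a 4-bit 405B model needs a bit over 200 GB for weights alone, so even a 256 GB machine would be running close to its limit.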

Noah

Thank you very much for the recs, I will look into them. Appreciate the knowledge drop as I'm just starting to look at this stuff.

Best Codes

No problem :)

Bala

Hi, I see folks commenting about using different models, but in my limited time reviewing the comments I couldn't find anyone reporting results after trying one or more models from the article.

I tried two of the models (#1 Nous Hermes 2 Mistral DPO and #3 Llama 3 8B Instruct), and my experience was not good. With 31 GB of RAM, queries took longer to respond than I expected, but the main issue with GPT4All is that it did a poor job when I tried to chat with my local files using "LocalDocs". Has anyone had a different experience?

Best Codes

Most models you can run locally are pretty weak. Not much you can do. If you want to run a better model, get a better device, use an API, or an AI server.

Amit Giri

024545

Best Codes

?

Von Colborn

I'm curious what kind of computing environment you all are using to run ~8 MB models, other than Apple M? hardware.

Best Codes

An 8 MB model is tiny. Do you mean 8 GB? In that case, any device will do as long as you've got enough disk space (8 GB) and RAM (usually about 16 to 32 GB).

Martin Baun

Llama looks good!
But why no multi modal?

Best Codes

I didn't include any multimodal models because there aren't many open-source ones and because they can be a lot more intensive to run locally. This article focuses on smaller models that can run on a laptop or PC. :)

Martin Baun

Ah, gotcha!

Best Codes

:)

Mohammad Kareem

i'd be glad to run vscode on my machine without it turning into a stove

Best Codes

Running an AI model is a bit more intensive than running VS Code. 🤪

Mohammad Kareem

a bit you say lmao

Best Codes

Haha

SP

Does anyone know how to train a model on source code to use locally? Which SLM should I use, and how do I do it?

Best Codes

You can't exactly train a model on code and get something usable; you need chat data that makes heavy use of code. Also, any single codebase would probably be too small to train much of a model on.

You can use the LocalDocs feature in GPT4All (which uses a text embedding model, and is probably closer to what you're looking for) or Codeium in your editor to chat with your codebase.
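The general idea behind embedding-based "chat with your files" features can be sketched in a few lines: turn each chunk of text into a vector, turn the question into a vector, and hand the closest chunks to the model as context. This is not GPT4All's actual implementation. The `embed` function below is a toy word-count stand-in for a real embedding model, and the code snippets are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts. A real setup would use a learned model."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_chunks(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Hypothetical snippets standing in for chunks of a codebase.
chunks = [
    "def load_config(path): read the yaml config file and return a dict",
    "def send_email(to, subject, body): send a message through smtp",
    "def resize_image(img, width, height): scale an image to the given size",
]

best = top_chunks("which function reads the config file", chunks)
print(best[0])
```

The retrieved chunk would then be pasted into the prompt alongside the question, which is why no training on the codebase is needed.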

Alternate Existance

love it thanks for sharing

Best Codes

Thank you!

Rob Benzo • Edited

Nice article, thanks for sharing 💖

Best Codes

Thanks!
