Kristiyan Velkov

Run AI Models Locally: Docker Desktop’s New AI Model Runner

I Got Early Access to Docker Desktop’s New AI Model Runner — Here’s What You Need to Know


Docker has just taken a major step forward with its newest addition to Docker Desktop: the AI Model Runner. As a Docker Captain, I had early access to this feature and spent the last few days testing it in real-world scenarios.

And let me tell you — this is a game-changer. 🚀


What Is the AI Model Runner?

The Docker Model Runner is a new experimental feature in Docker Desktop 4.40+ that gives you a Docker-native experience for running large language models (LLMs) locally.
It’s available now on macOS with Apple Silicon (M1/M2/M3), with Windows support on NVIDIA GPUs expected at the end of April 2025.

This is not another containerized runtime. Docker runs the inference engine (like llama.cpp) directly on your host with GPU access, so you get:

  • Direct GPU acceleration
  • No network latency
  • Full control and privacy

Models are pulled as OCI artifacts from Docker Hub and dynamically loaded into memory, not packed into container images.

This means:

  • Lower disk usage
  • Faster load times
  • Cleaner dev experience

It’s fast, simple, and will be integrated directly into the Docker Desktop UI.
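To see the artifact-based distribution in practice, you can pull a model and inspect it. Here is a minimal sketch, assuming Docker Desktop 4.40+ with Model Runner enabled; the tag below is just one example from the ai/ namespace:

```shell
# Hedged sketch: verify a model is stored as an OCI artifact, not an image.
# Assumes Docker Desktop 4.40+ with the Model Runner feature enabled.
TAG="ai/llama3.2:1B-Q8_0"

if command -v docker >/dev/null 2>&1; then
  docker model pull "$TAG"      # downloads the OCI artifact from Docker Hub
  docker model inspect "$TAG"   # shows model metadata
  docker model list             # the model is listed here...
  docker image ls               # ...but does not appear among your images
else
  echo "requires Docker Desktop 4.40+ with Model Runner"
fi
```

Because the model never becomes an image layer, removing it with `docker model rm` reclaims the disk space immediately.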

My First Impressions

As someone who frequently works on front-end applications and prototypes AI-powered features, I was thrilled to try this.

Within minutes, I was able to:

  • Launch an AI model locally with a single command
  • Run inference right on my machine
  • Avoid latency, quota limits, and cloud API complexity
  • Stay 100% private and offline when needed

It feels like the local-first development experience we’ve all been waiting for.

Exploring and Managing Models

For CLI lovers like me, here are the essential commands available:

🧠 Docker Model Runner CLI Commands:

- docker model list        # List available models
- docker model inspect     # View detailed info about a model
- docker model pull        # Download a model to your machine
- docker model run         # Run the model locally
- docker model rm          # Remove a downloaded model
- docker model status      # Check if Model Runner is active
- docker model version     # Show version of the Model Runner

This gives you full control from the terminal — just like any Docker-native tool. You can browse, download, run, and manage models with ease.
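These commands compose into a simple end-to-end workflow. Here is a minimal sketch, assuming Docker Desktop 4.40+ with Model Runner enabled; the model tag and prompt are my own example choices:

```shell
# Hedged sketch of a full local-model lifecycle with the Model Runner CLI.
# Assumes Docker Desktop 4.40+ with the feature enabled.
MODEL="ai/llama3.2:1B-Q8_0"

if command -v docker >/dev/null 2>&1; then
  docker model status             # confirm the Model Runner is active
  docker model pull "$MODEL"      # fetch the model to your machine
  docker model run "$MODEL" "Summarize what Docker Model Runner does."
  docker model rm "$MODEL"        # reclaim disk space when done
else
  echo "docker CLI not found; install Docker Desktop 4.40+ first"
fi
```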


📦 Available Models

All current models are hosted under the ai namespace on Docker Hub: https://hub.docker.com/u/ai

- ai/gemma3
- ai/llama3.2
- ai/qwq
- ai/mistral-nemo
- ai/mistral
- ai/phi4
- ai/qwen2.5
- ai/deepseek-r1-distill-llama (a Llama model distilled from DeepSeek-R1 outputs, not the original RL-trained DeepSeek-R1)

Example usage:

docker model pull ai/llama3.2:1B-Q8_0
docker model run ai/llama3.2:1B-Q8_0 "What is Docker?"

Expected output:

Docker is an open-source platform that allows you to automate the deployment, scaling, and management of applications using containerization. It helps developers package applications with all the parts they need.
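Because the runner is a plain CLI, its output composes with ordinary shell scripting. Here is a minimal sketch that captures the answer into a variable; the prompt and tag are my own example choices, and I assume the model has already been pulled:

```shell
# Hedged sketch: capture a one-shot completion for use in a script.
# Assumes the ai/llama3.2:1B-Q8_0 tag has already been pulled.
MODEL="ai/llama3.2:1B-Q8_0"
PROMPT="Explain containerization in one sentence."

if command -v docker >/dev/null 2>&1; then
  ANSWER=$(docker model run "$MODEL" "$PROMPT")
else
  ANSWER="(docker CLI not available on this machine)"
fi

printf '%s\n' "$ANSWER"
```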

Why This Matters

  • Privacy & Security — Sensitive data stays local. That’s critical for enterprise apps and internal tools.
  • Speed — Real-time feedback without internet dependency means faster dev loops.
  • Developer Experience — It fits naturally into the Docker ecosystem: no extra tools, no surprises.

This addition will lower the entry barrier for developers looking to build AI features. Whether you’re experimenting with open-source LLMs or integrating existing ones into your product, Docker Desktop just made your life a lot easier.


Final Thoughts

Docker continues to evolve beyond containers into something broader — a full developer platform. The AI Model Runner is proof that they’re paying attention to how we actually build today.

If you’ve ever wished you could test and run AI models with the same simplicity as docker run, this is it.

I’ll be sharing more insights and use cases as I keep working with the feature.

For now, if you’re curious — try it out, experiment, and let me know what you build.



© 2025 Kristiyan Velkov. All rights reserved.
