Machine learning model deployment is a crucial step in making your ML models accessible to users and applications. FastAPI, a modern Python web framework, and Docker, a containerization platform, have gained popularity for their efficiency and simplicity in deploying machine learning models. In this tutorial, we'll walk through the process of deploying a machine learning model using FastAPI and Docker, making it accessible via a RESTful API.
Before we get into this article: if you want to learn more about machine learning and Docker, I recommend the tutorials over at Educative, whom I chose to partner with for this tutorial.
Prerequisites
Before we begin, ensure you have the following:
- Python and pip installed on your system.
- Basic understanding of machine learning and Python.
- Docker installed on your system. You can download it from the official website: https://www.docker.com/get-started.
Create a Machine Learning Model
For this tutorial, we'll use a simple scikit-learn model to classify iris flowers. You can replace it with your own trained model.
- Create a Python script (e.g., `model.py`) and define your model:
```python
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Load the iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Train a random forest classifier
model = RandomForestClassifier()
model.fit(X, y)

# Save the trained model
joblib.dump(model, 'model.joblib')
```
- Run the script to train and save your model:
```shell
python model.py
```
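Before wiring the model into an API, it can help to sanity-check the approach. The sketch below retrains the same random forest in memory and classifies one hypothetical sample (the measurements are an illustrative example, not from the original post):

```python
# Minimal sanity check of the model.py approach: train the same classifier
# on the iris data and classify one sample.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
model = RandomForestClassifier(random_state=0)
model.fit(iris.data, iris.target)

sample = [[5.1, 3.5, 1.4, 0.2]]  # sepal/petal measurements in cm
pred = model.predict(sample)[0]
print(iris.target_names[pred])  # -> setosa (this is a training sample)
```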
Create a FastAPI App
Now, let's create a FastAPI app that serves the machine learning model as a RESTful API.
- Create a new directory for your FastAPI app:
mkdir fastapi-docker-ml
cd fastapi-docker-ml
- Install FastAPI and Uvicorn:
pip install fastapi uvicorn
- Create a FastAPI app script (e.g., `app.py`) and define the API:
```python
from fastapi import FastAPI
import joblib
import numpy as np
from sklearn.datasets import load_iris

app = FastAPI()

# Load the trained model and the iris class names
model = joblib.load('model.joblib')
target_names = load_iris().target_names

@app.get("/")
def read_root():
    return {"message": "Welcome to the ML Model API"}

@app.post("/predict/")
def predict(data: dict):
    features = np.array(data['features']).reshape(1, -1)
    prediction = model.predict(features)
    class_name = target_names[prediction[0]]
    return {"class": class_name}
```

Note that the app loads the iris class names itself; the `iris` object from `model.py` isn't available here, so we reload the dataset's `target_names` to map predictions back to flower names.
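The endpoint above accepts a raw `dict`, so a malformed payload only fails when NumPy tries to reshape it. A common refinement (a sketch, not part of the original app) is to declare the body as a Pydantic model, which FastAPI uses to validate requests automatically when you write `def predict(data: IrisFeatures)`:

```python
# Sketch of request validation with Pydantic; the IrisFeatures name is
# our own choice, not from the original post.
from pydantic import BaseModel, ValidationError

class IrisFeatures(BaseModel):
    features: list[float]  # the four iris measurements

ok = IrisFeatures(features=[5.1, 3.5, 1.4, 0.2])
print(ok.features)

try:
    IrisFeatures(features="not a list")  # rejected before reaching the model
except ValidationError:
    print("rejected")
```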
Create a Dockerfile
To containerize our FastAPI app, we'll create a Dockerfile.
- Create a file named `Dockerfile` (without any file extension) in the same directory as your FastAPI app:
```dockerfile
# Use the official Python image
FROM python:3.9

# Set the working directory in the container
WORKDIR /app

# Copy the local code (including model.joblib) to the container
COPY . .

# Install the app's dependencies; scikit-learn, numpy, and joblib are
# needed to load and run the model, not just FastAPI and Uvicorn
RUN pip install fastapi uvicorn scikit-learn numpy joblib

# Expose the port the app runs on
EXPOSE 8000

# Command to run the application
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```
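A common refinement (a sketch, assuming you add a `requirements.txt` next to `app.py`, which the original setup doesn't include) is to copy and install the dependencies before copying the code, so Docker caches the install layer and rebuilds are much faster when only your code changes:

```dockerfile
FROM python:3.9
WORKDIR /app

# Install dependencies first so this layer is cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Then copy the application code and model
COPY . .

EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```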
Build and Run the Docker Container
With the Dockerfile in place, you can now build and run the Docker container.
- Build the Docker image:
```shell
docker build -t fastapi-docker-ml .
```
- Run the Docker container:
```shell
docker run -d -p 8000:8000 fastapi-docker-ml
```
Test the API
Your FastAPI app is now running in a Docker container. You can test it by making a POST request to the `/predict/` endpoint:

```shell
curl -X POST "http://localhost:8000/predict/" -H "accept: application/json" -H "Content-Type: application/json" -d '{"features": [5.1, 3.5, 1.4, 0.2]}'
```
This will return the predicted class for the given input features.
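You can make the same request from Python. This sketch uses only the standard library and assumes the container from the previous step is running on `localhost:8000`:

```python
# Call the /predict/ endpoint from Python with the standard library only.
import json
import urllib.request

def predict(features, url="http://localhost:8000/predict/"):
    """POST a feature list to the running API and return the JSON response."""
    payload = json.dumps({"features": features}).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# With the container running:
# predict([5.1, 3.5, 1.4, 0.2])  would return something like {"class": "setosa"}
```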
Conclusion
You've successfully deployed a machine learning model using FastAPI and Docker, creating a RESTful API that can be accessed from anywhere. This approach allows you to easily scale your ML model deployment and integrate it into various applications and services. Explore further by enhancing your FastAPI app, adding authentication, and optimizing your Docker container for production use.
Comments
This isn't what we call deployment in the industry, but installation or setup: You've containerized several services and made it runnable on your local machine.
Deployment (so it's accessible by the internet) involves quite a few more steps. For example, setting up and pushing to ECS on AWS or kubernetes on Digital Ocean. It involves writing scripts that interact with your repository and pushes to your hosting environment.
You're using scikit-learn, which doesn't have GPU support, so a single container is fine here. But with TensorFlow, Keras, PyTorch, and similar stacks there are more caveats: those frameworks benefit from GPUs for the heavy math, so running deep learning models in the same container as the REST API is generally avoided, given the extra cost of GPU-enabled containers (serving an API involves very little math and, done right, is not a CPU-bound process).
So how would you summarize the deployment process, and what are the most important steps to consider?
Very useful!