This tutorial will teach you how to build and deploy a machine learning app to an Amazon EKS cluster. It also gives detailed steps for fully setting up an Amazon EKS cluster.
Amazon Elastic Kubernetes Service (EKS) is a cloud-based platform for running containerized applications. It speeds up the process of deploying containerized applications on a Kubernetes cluster.
An EKS cluster automatically manages the underlying AWS resources and sets up the infrastructure for the deployed applications. Amazon EKS manages the container resources, schedules how the containers run and how they access the available AWS resources, and ensures each deployed container operates within its resource requests and limits.
This tutorial will teach you how to deploy a containerized machine learning application onto an Amazon EKS cluster. You will start by building a simple machine-learning model. After testing the model, you will create an application for the model. The application will have a user interface (UI) that will allow people to interact with the model. You will use the Streamlit framework to build the user interface (UI).
Next, you will build a Docker image for the Streamlit application and push it to Docker Hub. After completing these steps, you will start working with Amazon Elastic Kubernetes Service (EKS). You will create an EKS cluster with all the required container resources. Finally, you will deploy the containerized Streamlit application onto the created EKS cluster. If this sounds fun, let's start working on the project!
Prerequisites
This tutorial assumes you have some basic knowledge of Docker. To follow along easily with this tutorial, you will need the following:
- Python set up on your machine.
- Visual Studio Code text editor.
- Docker Desktop installed and configured on your machine.
- Google Colab. You will use this to write and execute the Python code for our machine-learning model. Google Colab is a free cloud-based Python notebook. It's very powerful when building a machine-learning model.
Getting Started with the Machine Learning Model
A machine learning model learns from past (historical) data and recognizes hidden patterns in it. The knowledge it gains is used to predict future outcomes, and its prediction accuracy improves over time. In this tutorial, you will build a simple machine learning model that predicts a person's gender from their first name. You will use a dataset that contains a list of names. You can download the dataset here. You will run all the Python code for building the model in Google Colab.
Importing Exploratory Data Analysis (EDA) Packages
These packages will analyze the dataset to discover patterns and gain insights. You will use the following packages:
import pandas as pd
import numpy as np
Let's use the imported pandas package to load the dataset.
df = pd.read_csv('/content/names_dataset.csv')
Checking the Loaded Dataset
To check the data points in the loaded dataset, run this code:
df.head()
The code above displays the first five rows of the loaded dataset.
Feature Extraction Package
You will use the CountVectorizer package to extract features from the name column. Extracting features from the text names is a form of text preprocessing, which is essential in Natural Language Processing. The extracted features are the inputs for the model.
from sklearn.feature_extraction.text import CountVectorizer
Next, you will convert the dataset labels in the sex column into numeric labels as follows:
df.sex.replace({'F':0,'M':1},inplace=True)
After this step, you will perform the feature extraction using the CountVectorizer package:
Xfeatures = df['name']
count_vec = CountVectorizer()
X = count_vec.fit_transform(Xfeatures)
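If you want to sanity-check the vectorization (an optional step that is not in the original tutorial), you can inspect the vocabulary the CountVectorizer learned and the shape of the resulting sparse matrix:
# Optional sanity check (assumes scikit-learn 1.0+; older versions use get_feature_names() instead):
print(len(count_vec.get_feature_names_out()))  # number of unique name tokens learned
print(X.shape)                                 # (number of names, number of tokens)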
Selecting Features and Labels
The extracted features are the model inputs, and the numeric labels are the model outputs. You select these values from the dataset as follows:
X  # the vectorized name features from the previous step
y = df.sex
Next, you will split the names dataset into a train set and a test set using the train_test_split function.
from sklearn.model_selection import train_test_split
Let's use the imported function to split the dataset.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
After splitting the dataset, let's now build the model.
Building the Model
You will build the model using the Naive Bayes classifier algorithm. Run the following code in Google Colab.
from sklearn.naive_bayes import MultinomialNB
md = MultinomialNB()
md.fit(X_train,y_train)
The code above imports the MultinomialNB classifier from scikit-learn and fits the model on the training set. The model learns from the training data and uses that knowledge to make predictions.
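The tutorial does not include an evaluation step, but you can quickly check the model's accuracy on the held-out test set. This is an optional sketch:
# Accuracy of the trained classifier on the test split created earlier.
print(md.score(X_test, y_test))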
Using the Gender Classification Model to Make Predictions
You will use the trained model to make predictions on sample names. To use the model to predict a sample name, run the following code in Google Colab:
name1 = ["Mary"]
vect1 = count_vec.transform(name1).toarray()
if md.predict(vect1) == 0:
    print("Female")
else:
    print("Male")
The code above preprocesses the sample name using count_vec, predicts it using the predict function, and prints the prediction result as either Female or Male. The code will print the following prediction result:
Female
It's a correct prediction. You can also use the model to predict another sample name:
name2 = ["Mark"]
vect2 = count_vec.transform(name2).toarray()
if md.predict(vect2) == 0:
    print("Female")
else:
    print("Male")
Output:
Male
It's also a correct prediction. The next step is to save the model in pickle format. A pickle file is easy to use when building the application. You will use the joblib library to save the model.
import joblib
Saving the Gender Classification Model
To save the model, run the following Python code:
FinalModel = open("gender_classification_model.pkl","wb")
joblib.dump(md,FinalModel)
FinalModel.close()
After executing this command, the file gender_classification_model.pkl will appear in Google Colab's file browser. Download the file and save it to your local machine.
You also need to save the CountVectorizer created in this tutorial:
FinalVectorizer = open("gender_model_vectorizer.pkl", "wb")
joblib.dump(count_vec,FinalVectorizer)
FinalVectorizer.close()
Download and save the gender_model_vectorizer.pkl file to your local machine. You will use these two files when building the Streamlit application. You have now created, tested, and saved the model. The next step is to create a machine learning application using Streamlit.
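Before moving on to Streamlit, you can optionally verify that the two saved files load back correctly. This is a quick sketch, assuming both .pkl files are in your current working directory:
import joblib
loaded_model = joblib.load("gender_classification_model.pkl")
loaded_vectorizer = joblib.load("gender_model_vectorizer.pkl")
# The reloaded model and vectorizer should reproduce the earlier prediction (0 = Female).
print(loaded_model.predict(loaded_vectorizer.transform(["Mary"]).toarray()))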
Creating a Machine Learning Application using Streamlit
Streamlit is an open-source framework for building web applications for machine learning models. To start, follow the steps below to set up the working directory:
Step 1: Create a directory (folder) named gender_classifier_mlapp_with_streamlit, then open it with Visual Studio Code. It will be our working directory.
Step 2: In the created directory, create another folder and name it models. In this folder, add the downloaded gender_classification_model.pkl and gender_model_vectorizer.pkl pickle files.
Step 3: Create a file named app.py in your working directory.
Step 4: Install the necessary packages.
You will install the following packages:
pip install scikit-learn joblib streamlit
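At this point, the working directory from steps 1 to 4 should look roughly like this (app.py is still empty; requirements.txt and the Dockerfile are added later in the tutorial):
gender_classifier_mlapp_with_streamlit/
    app.py
    models/
        gender_classification_model.pkl
        gender_model_vectorizer.pkl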
Ensure you install these libraries while in the working directory. Let's start working on the application. To start, open app.py and paste the following code:
from sklearn import naive_bayes
import streamlit as st
import joblib
import time
This part of the code will import all the installed packages. Next, paste the following code:
# Load the saved model Vectorizer
imported_vectorizer = open("models/gender_model_vectorizer.pkl","rb")
cv = joblib.load(imported_vectorizer)
# Load the saved gender prediction Model
naive_bayes_model = open("models/gender_classification_model.pkl","rb")
clf = joblib.load(naive_bayes_model)
The code above loads gender_model_vectorizer.pkl and gender_classification_model.pkl from the models folder. The next step is to create the prediction function.
Creating the Prediction Function
You will create the prediction function using the following code:
# Creating the prediction function
def gender_prediction(data):
    vect = cv.transform(data).toarray()
    result = clf.predict(vect)
    return result
Next, you will add the main Streamlit function to design the web application.
Adding the Main Function
You will add the function using the following code:
def main():
    """Gender Classifier App"""
    st.title("Gender Classifier with Streamlit")
    html_temp = """
    <div style="background-color:purple;padding:10px">
    <h2 style="color:white;text-align:center;">Gender Classification App</h2>
    </div>
    """
    st.markdown(html_temp, unsafe_allow_html=True)
    name = st.text_input("Enter Person Name")
    if st.button("Predict Gender"):
        result = gender_prediction([name])
        if result[0] == 0:
            prediction = 'Female'
        else:
            prediction = 'Male'
        st.success('Name: {} was classified as {}'.format(name.title(), prediction))

if __name__ == '__main__':
    main()
The code will create a web application for our machine-learning model. To see the created application, run the following command in your terminal:
streamlit run app.py
The user interface:
1. User Interface for a Male Prediction
2. User Interface for a Female Prediction
The application is now ready. Let's build a Docker image for the application. Before you create the Docker image, you need to create a requirements.txt file. This file will contain all the application's packages and dependencies, and you will build the Docker image using them.
While in the working directory, create a file named requirements.txt. Then run the following command to list the application's packages and dependencies:
pip freeze
You will then copy all the packages and dependencies displayed in the terminal and paste them into the requirements.txt file.
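Note that pip freeze lists every package in your environment. If you prefer a leaner requirements.txt, the application only needs the packages used in app.py; a minimal sketch (without version pins, which you can add from your pip freeze output) would be:
streamlit
scikit-learn
joblib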
Building a Docker Image for the Streamlit Application
To create a Docker image, create a file named Dockerfile (without any extension) in the working directory. Open the Dockerfile and paste the following code to build the Docker image:
FROM python:3.10
WORKDIR /app
COPY requirements.txt ./requirements.txt
RUN pip install -r requirements.txt
EXPOSE 8501
COPY . /app
ENTRYPOINT ["streamlit", "run"]
CMD ["app.py"]
To create the Docker image, run the following command in your terminal:
docker build -t bravinwasike/streamlit-app .
NOTE: When building the Docker image, prefix the image name with your Docker Hub username. This makes it easy to push the image to your Docker Hub repository.
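Before pushing the image, you can optionally test it locally. A quick sketch (replace bravinwasike with your own Docker Hub username):
docker run -p 8501:8501 bravinwasike/streamlit-app
Then open http://localhost:8501 in your browser to confirm the containerized app works. If the app is not reachable, you may need to pass --server.address=0.0.0.0 to streamlit run in the Dockerfile.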
Pushing the Docker Image into Docker Hub
After logging into your Docker Hub account, create a new repository and name it streamlit-app. Then execute the following command in your terminal:
docker login
After logging into Docker Hub from the terminal, run the following command to push the Docker image:
docker push bravinwasike/streamlit-app
You have now created and pushed the Docker image for the Streamlit application to Docker Hub. The next step is to create the Amazon Elastic Kubernetes Service (EKS) cluster.
Creating the Amazon Elastic Kubernetes Service (EKS) Cluster
To create the Amazon EKS cluster, you need to set up the following:
1. An AWS account. Sign up for an AWS account. Note that Amazon EKS itself is not covered by the AWS free tier; the cluster control plane and worker nodes incur hourly charges, so delete the cluster when you finish this tutorial to avoid unnecessary costs.
2. AWS CLI. It is a command-line interface tool that enables you to access and log into the AWS account from the terminal. You will download and install the AWS CLI from here.
After installing the AWS CLI, run the following command to check its version:
aws --version
You then need to configure the AWS CLI to access the AWS account from the terminal. You will need the Access key ID, Secret access key, AWS Region, and Output format. Follow the steps below to get your Access key ID and Secret access key from your AWS account.
Step 1: Log into the AWS account using your root user.
Step 2: Click on your account icon
Step 3: Click on security credentials
Step 4: Click on Access keys (access key ID and secret access key)
After getting their values, run the following code to configure the AWS CLI:
aws configure
The command will prompt you to input the two access key values, the AWS Region, and the Output format. You can accept the default AWS Region and Output format values by pressing Enter.
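For reference, the prompts look roughly like this (the values shown are placeholders, not real credentials):
AWS Access Key ID [None]: AKIA................
AWS Secret Access Key [None]: ....................
Default region name [None]: us-east-1
Default output format [None]: json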
3. Kubernetes CLI (kubectl). It's a command-line interface tool that enables you to work with the AWS EKS cluster. You will download and install the Kubernetes CLI from here.
After installing the Kubernetes CLI, run the following command to check its version:
kubectl version --client
4. eksctl. It's a command-line interface tool that makes creating an Amazon EKS cluster easier and faster. A single eksctl command will create an Amazon EKS cluster with all the required resources.
To install the eksctl tool, run the following command in your terminal:
choco install -y eksctl
To check the installed eksctl version, run this command in your terminal:
eksctl version
After setting up all these tools, let's create our Amazon EKS cluster.
Creating the Amazon EKS Cluster using eksctl
To create an Amazon EKS cluster named sample-cluster, use the following command:
eksctl create cluster --name sample-cluster
The command creates the cluster and assigns all the default AWS resources and Kubernetes nodes. It prints progress output in the terminal as it provisions the cluster; this can take 15 to 20 minutes.
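If you want more control over the cluster instead of the defaults, eksctl also accepts explicit options. A hedged example (the region, instance type, and node count here are illustrative choices, not requirements of this tutorial):
eksctl create cluster --name sample-cluster --region us-east-1 --node-type t3.medium --nodes 2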
After creating the cluster, let's deploy the Streamlit application.
Deploying the Streamlit Application
You will use the Docker image in Docker Hub to create a containerized application and deploy it to the created EKS cluster. You will create a .yaml file that describes the number of pods and the resources for the application. Create a .yaml file named streamlit-app-deployment.yaml in the working directory. Open the file and paste the following code:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: streamlit-app-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: streamlit-app
  template:
    metadata:
      labels:
        app: streamlit-app
    spec:
      containers:
        - name: streamlit-app
          image: bravinwasike/streamlit-app
          resources:
            limits:
              memory: "512Mi"
              cpu: "500m"
          ports:
            - containerPort: 8501
---
apiVersion: v1
kind: Service
metadata:
  name: streamlit-app-service
spec:
  type: LoadBalancer
  selector:
    app: streamlit-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8501
The streamlit-app-deployment.yaml file has two parts: the Deployment and the Service.
- Deployment: This part describes the container name, streamlit-app, and the Docker image that creates the container, bravinwasike/streamlit-app. You will use the Docker image you pushed to Docker Hub earlier, so make sure you use your own image name. It also sets the number of replicas (pod instances) for the application and the resources for the containers. The container runs on port 8501, the default port for Streamlit applications.
- Service: This part acts as a load balancer for the containerized application. It exposes the application pods as a network service that you can access using an IP address. The application pods use the TCP protocol and are exposed on port 80 of the EKS cluster.
After creating the file, let's deploy the application.
Deployment Command
You will deploy the application using the following command:
kubectl apply -f streamlit-app-deployment.yaml
The kubectl command above deploys the application service and all the container replicas (pods) to the EKS cluster.
Viewing the Deployed Resources
You will start by viewing the deployed pods. To see the deployed pods, run this command:
kubectl get pods
Output:
Next, view all the deployments:
kubectl get deployments
Next, view the services:
kubectl get services
The command exposes the containerized application on an EXTERNAL-IP address. You can access the application using the given URL.
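If the EXTERNAL-IP column shows <pending>, AWS is still provisioning the load balancer; wait a minute or two and re-run the command, for example:
kubectl get service streamlit-app-service
The value that appears is typically an Elastic Load Balancer hostname. The service listens on port 80, so you can open the hostname directly in your browser.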
Accessing the Application
To access the application, copy the URL and paste it into your browser. You can test the application and use it to make predictions.
1. First prediction
2. Second prediction
You have successfully deployed your containerized Streamlit application to an EKS cluster. You accessed the application using a public URL, and it makes accurate predictions.
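Since the EKS cluster keeps incurring charges while it runs, it's a good idea to clean up once you are done experimenting. A minimal sketch:
kubectl delete -f streamlit-app-deployment.yaml
eksctl delete cluster --name sample-cluster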
Conclusion
In this tutorial, you have learned how to deploy a machine learning app to an Amazon EKS cluster. You started by building a simple gender classification machine learning model. You then tested the model, and it made accurate predictions. You then created an application using the Streamlit framework.
After creating the application, you built a Docker image for the Streamlit application and pushed it to Docker Hub. You then created the Amazon Elastic Kubernetes Service (EKS) cluster using the eksctl command and deployed the containerized Streamlit application onto it. Finally, you accessed the application using a public URL, and the application made accurate predictions.
To get complete Python code for the gender classification model in Google Colab, click here. You can get the other code here on GitHub.
If you liked this tutorial, let's connect on Twitter. Thanks for reading and happy learning!