This tutorial will teach you how to build and deploy a machine learning app to an Amazon EKS cluster. It also gives detailed steps for fully setting up an Amazon EKS cluster.
Amazon Elastic Kubernetes Service (EKS) is a cloud-based platform for running containerized applications. It speeds up the process of deploying containerized applications on a Kubernetes cluster.
An EKS cluster automatically manages the underlying AWS resources and sets up the infrastructure for the deployed applications. Amazon EKS manages the container resources, schedules how the containers run and how they access the available AWS resources, and ensures each deployed container operates within its resource requests and limits.
This tutorial will teach you how to deploy a containerized machine learning application onto an Amazon EKS cluster. You will start by building a simple machine-learning model. After testing the model, you will create an application for the model. The application will have a user interface (UI) that will allow people to interact with the model. You will use the Streamlit framework to build the user interface (UI).
Next, you will build a Docker image for the Streamlit application and push it to Docker Hub. After completing these steps, you will start working with Amazon Elastic Kubernetes Service (EKS). You will create an EKS cluster with all the required container resources. Finally, you will deploy the containerized Streamlit application onto the created EKS cluster. If this sounds fun, let's start working on the project!
Prerequisites
This tutorial assumes you have some basic knowledge of Docker. To follow along easily with this tutorial, you will need the following:
- Python set up on your machine.
- Visual Studio Code text editor.
- Docker Desktop installed and configured on your machine.
- Google Colab. You will use this to write and execute the Python code for our machine-learning model. Google Colab is a free cloud-based Python notebook. It's very powerful when building a machine-learning model.
Getting Started with the Machine Learning Model
A machine learning model learns from past (historical) data and recognizes hidden patterns in it. The knowledge it gains is used to predict future outcomes, and its prediction accuracy improves over time. In this tutorial, you will build a simple machine learning model that predicts a person's gender from their first name. You will use a dataset that contains a list of names. You can download the dataset here. You will run all the Python code for building the model in Google Colab.
Importing Exploratory Data Analysis (EDA) Packages
These packages will analyze the dataset to discover patterns and gain insights. You will use the following packages:
import pandas as pd
import numpy as np
Let's use the imported pandas package to load the dataset.
df = pd.read_csv('/content/names_dataset.csv')
Checking the Loaded Dataset
To check the data points in the loaded dataset, run this code:
df.head()
The code above displays the first five rows of the loaded dataset.
Feature Extraction Package
You will use the CountVectorizer package to extract features from the name column. Extracting features from the text names is a form of text preprocessing, which is essential in Natural Language Processing. The extracted features are the inputs for the model.
from sklearn.feature_extraction.text import CountVectorizer
Next, you will convert the dataset labels in the sex column into numeric labels as follows:
df.sex.replace({'F':0,'M':1},inplace=True)
After this step, you will perform the feature extraction using the CountVectorizer package:
Xfeatures = df['name']
count_vec = CountVectorizer()
X = count_vec.fit_transform(Xfeatures)
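If you want to sanity-check the vectorization (an optional step that is not in the original tutorial), you can inspect the vocabulary the CountVectorizer learned and the shape of the resulting sparse matrix:
# Optional sanity check (assumes scikit-learn 1.0+; older versions use get_feature_names() instead):
print(len(count_vec.get_feature_names_out()))  # number of unique name tokens learned
print(X.shape)                                 # (number of names, number of tokens)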
Selecting Features and Labels
The extracted features are the model inputs, and the numeric labels are the model outputs. You select these values from the dataset as follows:
X  # the vectorized name features from the previous step
y = df.sex
Next, you will split the names dataset into a train set and a test set using the train_test_split function.
from sklearn.model_selection import train_test_split
Let's use the imported function to split the dataset.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
After splitting the dataset, let's now build the model.
Building the Model
You will build the model using the Naive Bayes classifier algorithm. Run the following code in Google Colab.
from sklearn.naive_bayes import MultinomialNB
md = MultinomialNB()
md.fit(X_train,y_train)
The code above imports the MultinomialNB classifier from scikit-learn and fits the model on the training set. The model learns from the training data and uses that knowledge to make predictions.
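The tutorial does not include an evaluation step, but you can quickly check the model's accuracy on the held-out test set. This is an optional sketch:
# Accuracy of the trained classifier on the test split created earlier.
print(md.score(X_test, y_test))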
Using the Gender Classification Model to Make Predictions
You will use the trained model to make predictions on sample names. To use the model to predict a sample name, run the following code in Google Colab:
name1 = ["Mary"]
vect1 = count_vec.transform(name1).toarray()
if md.predict(vect1) == 0:
    print("Female")
else:
    print("Male")
The code above preprocesses the sample name using count_vec, predicts it using the predict function, and prints the prediction result as either Female or Male. The code will print the following prediction result:
Female
It's a correct prediction. You can also use the model to predict another sample name:
name2 = ["Mark"]
vect2 = count_vec.transform(name2).toarray()
if md.predict(vect2) == 0:
    print("Female")
else:
    print("Male")
Output:
Male
It's also a correct prediction. The next step is to save the model in pickle format. A pickle file is easy to use when building the application. You will use the joblib library to save the model.
import joblib
Saving the Gender Classification Model
To save the model, run the following Python code:
FinalModel = open("gender_classification_model.pkl","wb")
joblib.dump(md,FinalModel)
FinalModel.close()
After executing this command, the file gender_classification_model.pkl will appear in Google Colab's file browser. Download the file and save it to your local machine.
You also need to save the CountVectorizer created in this tutorial:
FinalVectorizer = open("gender_model_vectorizer.pkl", "wb")
joblib.dump(count_vec,FinalVectorizer)
FinalVectorizer.close()
Download and save the gender_model_vectorizer.pkl file to your local machine. You will use these two files when building the Streamlit application. You have now created, tested, and saved the model. The next step is to create a machine learning application using Streamlit.
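Before moving on to Streamlit, you can optionally verify that the two saved files load back correctly. This is a quick sketch, assuming both .pkl files are in your current working directory:
import joblib
loaded_model = joblib.load("gender_classification_model.pkl")
loaded_vectorizer = joblib.load("gender_model_vectorizer.pkl")
# The reloaded model and vectorizer should reproduce the earlier prediction (0 = Female).
print(loaded_model.predict(loaded_vectorizer.transform(["Mary"]).toarray()))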
Creating a Machine Learning Application using Streamlit
Streamlit is an open-source framework for building web applications for machine learning models. To start, follow the steps below to set up the working directory:
Step 1: Create a directory (folder) named gender_classifier_mlapp_with_streamlit, then open it with Visual Studio Code. It will be our working directory.
Step 2: In the created directory, create another folder and name it models. In this folder, add the downloaded gender_classification_model.pkl and gender_model_vectorizer.pkl pickle files.
Step 3: Create a file named app.py in your working directory.
Step 4: Install the necessary packages.
You will install the following packages:
pip install scikit-learn joblib streamlit
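At this point, the working directory from steps 1 to 4 should look roughly like this (app.py is still empty; requirements.txt and the Dockerfile are added later in the tutorial):
gender_classifier_mlapp_with_streamlit/
    app.py
    models/
        gender_classification_model.pkl
        gender_model_vectorizer.pkl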
Ensure you install these libraries while in the working directory. Let's start working on the application. To start, open app.py and paste the following code:
from sklearn import naive_bayes
import streamlit as st
import joblib
import time
This part of the code will import all the installed packages. Next, paste the following code:
# Load the saved model Vectorizer
imported_vectorizer = open("models/gender_model_vectorizer.pkl","rb")
cv = joblib.load(imported_vectorizer)
# Load the saved gender prediction Model
naive_bayes_model = open("models/gender_classification_model.pkl","rb")
clf = joblib.load(naive_bayes_model)
The code above loads gender_model_vectorizer.pkl and gender_classification_model.pkl from the models folder. The next step is to create the prediction function.
Creating the Prediction Function
You will create the prediction function using the following code:
# Creating the prediction function
def gender_prediction(data):
    vect = cv.transform(data).toarray()
    result = clf.predict(vect)
    return result
Next, you will add the main Streamlit function to design the web application.
Adding the Main Function
You will add the function using the following code:
def main():
    """Gender Classifier App"""
    st.title("Gender Classifier with Streamlit")
    html_temp = """
    <div style="background-color:purple;padding:10px">
    <h2 style="color:white;text-align:center;">Gender Classification App</h2>
    </div>
    """
    st.markdown(html_temp, unsafe_allow_html=True)
    name = st.text_input("Enter Person Name")
    if st.button("Predict Gender"):
        result = gender_prediction([name])
        if result[0] == 0:
            prediction = 'Female'
        else:
            prediction = 'Male'
        st.success('Name: {} was classified as {}'.format(name.title(), prediction))

if __name__ == '__main__':
    main()
The code will create a web application for our machine-learning model. To see the created application, run the following command in your terminal:
streamlit run app.py
The user interface:
1. User Interface for a Male Prediction
2. User Interface for a Female Prediction
The application is now ready. Let's build a Docker image for the application. Before you create the Docker image, you need to create a requirements.txt file. This file will contain all the application's packages and dependencies, and you will build the Docker image using them.
While in the working directory, create a file named requirements.txt. Then run the following command to list the application's packages and dependencies:
pip freeze
You will then copy all the packages and dependencies displayed in the terminal and paste them into the requirements.txt file.
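Note that pip freeze lists every package in your environment. If you prefer a leaner requirements.txt, the application only needs the packages used in app.py; a minimal sketch (without version pins, which you can add from your pip freeze output) would be:
streamlit
scikit-learn
joblib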
Building a Docker Image for the Streamlit Application
To create a Docker image, create a file named Dockerfile (without any extension) in the working directory. Open the Dockerfile and paste the following code to build the Docker image:
FROM python:3.10
WORKDIR /app
COPY requirements.txt ./requirements.txt
RUN pip install -r requirements.txt
EXPOSE 8501
COPY . /app
ENTRYPOINT ["streamlit", "run"]
CMD ["app.py"]
To create the Docker image, run the following command in your terminal:
docker build -t bravinwasike/streamlit-app .
NOTE: When building the Docker image, prefix the image name with your Docker Hub username. This makes it easy to push the image to your Docker Hub repository.
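Before pushing the image, you can optionally test it locally. A quick sketch (replace bravinwasike with your own Docker Hub username):
docker run -p 8501:8501 bravinwasike/streamlit-app
Then open http://localhost:8501 in your browser to confirm the containerized app works. If the app is not reachable, you may need to pass --server.address=0.0.0.0 to streamlit run in the Dockerfile.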
Pushing the Docker Image into Docker Hub
After logging into your Docker Hub account, create a new repository and name it streamlit-app. Then execute the following command in your terminal:
docker login
After logging into Docker Hub from the terminal, run the following command to push the Docker image:
docker push bravinwasike/streamlit-app
You have now created and pushed the Docker image for the Streamlit application to Docker Hub. The next step is to create the Amazon Elastic Kubernetes Service (EKS) cluster.
Creating the Amazon Elastic Kubernetes Service (EKS) Cluster
To create the Amazon EKS cluster, you need to set up the following:
1. An AWS account. Sign up for an AWS account. Note that Amazon EKS itself is not covered by the AWS free tier; the cluster control plane and worker nodes incur hourly charges, so delete the cluster when you finish this tutorial to avoid unnecessary costs.
2. AWS CLI. It is a command-line interface tool that enables you to access and log into the AWS account from the terminal. You will download and install the AWS CLI from here.
After installing the AWS CLI, run the following command to check its version:
aws --version
You then need to configure the AWS CLI to access the AWS account from the terminal. You will need the Access key ID, Secret access key, AWS Region, and Output format. Follow the steps below to get your Access key ID and Secret access key from your AWS account.
Step 1: Log into the AWS account using your root user.
Step 2: Click on your account icon
Step 3: Click on security credentials
Step 4: Click on Access keys (access key ID and secret access key)
After getting their values, run the following code to configure the AWS CLI:
aws configure
The command will prompt you to input the two access key values, the AWS Region, and the Output format. You can accept the default AWS Region and Output format values by pressing Enter.
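For reference, the prompts look roughly like this (the values shown are placeholders, not real credentials):
AWS Access Key ID [None]: AKIA................
AWS Secret Access Key [None]: ....................
Default region name [None]: us-east-1
Default output format [None]: json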
3. Kubernetes CLI (kubectl). It's a command-line interface tool that enables you to work with the AWS EKS cluster. You will download and install the Kubernetes CLI from here.
After installing the Kubernetes CLI, run the following command to check its version:
kubectl version --client
4. eksctl. It's a command-line interface tool that makes creating an Amazon EKS cluster easier and faster. A single eksctl command will create an Amazon EKS cluster with all the required resources.
To install the eksctl tool, run the following command in your terminal:
choco install -y eksctl
To check the installed eksctl version, run this command in your terminal:
eksctl version
After setting up all these tools, let's create our Amazon EKS cluster.
Creating the Amazon EKS Cluster using eksctl
To create an Amazon EKS cluster named sample-cluster, use the following command:
eksctl create cluster --name sample-cluster
The command creates the cluster and assigns all the default AWS resources and Kubernetes nodes. It prints progress output in the terminal as it provisions the cluster; this can take 15 to 20 minutes.
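If you want more control over the cluster instead of the defaults, eksctl also accepts explicit options. A hedged example (the region, instance type, and node count here are illustrative choices, not requirements of this tutorial):
eksctl create cluster --name sample-cluster --region us-east-1 --node-type t3.medium --nodes 2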
After creating the cluster, let's deploy the Streamlit application.
Deploying the Streamlit Application
You will use the Docker image in Docker Hub to create a containerized application and deploy it to the created EKS cluster. You will create a .yaml file that describes the number of pods and the resources for the application. Create a .yaml file named streamlit-app-deployment.yaml in the working directory. Open the file and paste the following code:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: streamlit-app-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: streamlit-app
  template:
    metadata:
      labels:
        app: streamlit-app
    spec:
      containers:
        - name: streamlit-app
          image: bravinwasike/streamlit-app
          resources:
            limits:
              memory: "512Mi"
              cpu: "500m"
          ports:
            - containerPort: 8501
---
apiVersion: v1
kind: Service
metadata:
  name: streamlit-app-service
spec:
  type: LoadBalancer
  selector:
    app: streamlit-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8501
The streamlit-app-deployment.yaml file has two parts: the Deployment and the Service.
- Deployment: This part describes the container name, streamlit-app, and the Docker image that creates the container, bravinwasike/streamlit-app. You will use the Docker image you pushed to Docker Hub earlier, so make sure you use your own image name. It also sets the number of replicas (pod instances) for the application and the resources for the containers. The container runs on port 8501, the default port for Streamlit applications.
- Service: This part acts as a load balancer for the containerized application. It exposes the application pods as a network service that you can access using an IP address. The application pods use the TCP protocol and are exposed on port 80 of the EKS cluster.
After creating the file, let's deploy the application.
Deployment Command
You will deploy the application using the following command:
kubectl apply -f streamlit-app-deployment.yaml
The kubectl command above deploys the application service and all the container replicas (pods) to the EKS cluster.
Viewing the Deployed Resources
You will start by viewing the deployed pods. To see the deployed pods, run this command:
kubectl get pods
Output:
Next, view all the deployments:
kubectl get deployments
Next, view the services:
kubectl get services
The command exposes the containerized application on an EXTERNAL-IP address. You can access the application using the given URL.
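If the EXTERNAL-IP column shows <pending>, AWS is still provisioning the load balancer; wait a minute or two and re-run the command, for example:
kubectl get service streamlit-app-service
The value that appears is typically an Elastic Load Balancer hostname. The service listens on port 80, so you can open the hostname directly in your browser.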
Accessing the Application
To access the application, copy the URL and paste it into your browser. You can test the application and use it to make predictions.
1. First prediction
2. Second prediction
You have successfully deployed your containerized Streamlit application to an EKS cluster. You accessed the application using a public URL, and it makes accurate predictions.
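Since the EKS cluster keeps incurring charges while it runs, it's a good idea to clean up once you are done experimenting. A minimal sketch:
kubectl delete -f streamlit-app-deployment.yaml
eksctl delete cluster --name sample-cluster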
Conclusion
In this tutorial, you have learned how to deploy a machine learning app to an Amazon EKS cluster. You started by building a simple gender classification machine learning model. You then tested the model, and it made accurate predictions. You then created an application using the Streamlit framework.
After creating the application, you built a Docker image for the Streamlit application and pushed it to Docker Hub. You then created the Amazon Elastic Kubernetes Service (EKS) cluster using the eksctl command and deployed the containerized Streamlit application onto it. Finally, you accessed the application using a public URL, and the application made accurate predictions.
To get complete Python code for the gender classification model in Google Colab, click here. You can get the other code here on GitHub.
If you liked this tutorial, let's connect on Twitter. Thanks for reading and happy learning!