Implementing a Dog Breed Classifier Using Stanford Dogs and MobileNet with HarperDB Custom Functions
Intro
HarperDB is an easy-to-use database solution with a simple way of creating endpoints that interact with data, called Custom Functions. These Custom Functions can even be used to implement a machine learning algorithm that classifies incoming data. TensorFlowJS is a library released by Google that makes it possible to do machine learning in JavaScript, either in the browser or on a NodeJS server, which is what we'll be doing in this article.
Summary
What We're Going To Do
This article will explain how to train and use a TensorFlowJS model to classify dog breeds with HarperDB Custom Functions, using the Stanford Dogs dataset and MobileNetV2 as a base for transfer learning.
Stanford Dogs
There's an awesome dataset released by Stanford with over 20,000 images of dogs covering 120 breeds. The images are grouped into folders, with each folder named for the breed it contains. There are additional annotations available for bounding boxes as well, but today we'll focus solely on classifying the breed.
MobileNet
Google publishes a widely used model called MobileNet, a relatively small network that can classify images into 1,000 categories. It's kept small so that it can run on mobile devices without taking up too many resources. We'll be using version 2 of this model, which is available in the @tensorflow-models/mobilenet package.
Transfer Learning
Transfer learning is the technique of taking a pretrained model and adapting it to a new task. Like teaching an old dog new tricks! For that we'll be using @tensorflow-models/knn-classifier.
We'll send an image into MobileNet and take the logits, the raw output the network produces just before its final classification step. Then we'll pass those logits to a KNN classifier, which uses the K-Nearest Neighbors algorithm to associate them with specific dog breeds.
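To make the K-Nearest Neighbors step concrete, here is a minimal sketch in plain JavaScript. It is an illustration only, not the code inside @tensorflow-models/knn-classifier: the `TinyKnn` class, the three-number vectors, and the breed labels are all made up for the example, and it uses Euclidean distance, which may differ from the similarity measure the library uses. In the real project the vectors would be the logits coming out of MobileNet.

```javascript
// Illustrative only: a tiny K-Nearest Neighbors classifier over plain arrays.
// The vectors stand in for MobileNet logits; real ones are much longer.

function euclideanDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

class TinyKnn {
  constructor() {
    this.examples = []; // each entry: { vector, label }
  }

  // Store a labeled example; "training" a KNN is just remembering examples.
  addExample(vector, label) {
    this.examples.push({ vector, label });
  }

  // Find the k closest stored examples and let them vote on the label.
  predict(vector, k = 3) {
    if (this.examples.length === 0) {
      throw new Error('No examples have been added yet');
    }
    const nearest = this.examples
      .map((e) => ({ label: e.label, distance: euclideanDistance(vector, e.vector) }))
      .sort((x, y) => x.distance - y.distance)
      .slice(0, k);

    const votes = {};
    for (const n of nearest) {
      votes[n.label] = (votes[n.label] || 0) + 1;
    }

    // Pick the label with the most votes; ties go to the closest neighbor.
    let best = null;
    for (const n of nearest) {
      if (best === null || votes[n.label] > votes[best]) {
        best = n.label;
      }
    }

    // Share of the k votes that each label received.
    const confidences = {};
    for (const label of Object.keys(votes)) {
      confidences[label] = votes[label] / nearest.length;
    }
    return { label: best, confidences };
  }
}

// Hypothetical data: two "breeds" with two examples each.
const knn = new TinyKnn();
knn.addExample([0.9, 0.1, 0.0], 'beagle');
knn.addExample([0.8, 0.2, 0.1], 'beagle');
knn.addExample([0.1, 0.9, 0.2], 'pug');
knn.addExample([0.0, 0.8, 0.3], 'pug');

const result = knn.predict([0.85, 0.15, 0.05], 3);
console.log(result.label, result.confidences);
```

The useful property for transfer learning is that adding a new breed only means storing more labeled vectors; MobileNet itself is never retrained.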
Getting Started
If that all sounds complicated, don't worry. This implementation will be quick and easy thanks to HarperDB Custom Functions.
Setup
Prereqs
- A HarperDB Account
- A HarperDB Local Database
Clone the Repo
Clone this repo into your Custom Functions folder:
git clone https://github.com/HarperDB/hdb-cf-dogml.git ~/hdb/src/custom_functions/dogml
Restart Custom Functions
Use the link in the HarperDB Studio Functions page (bottom left of the screen) to refresh the projects.
Run /setup
The training data and TensorFlowJS modules need to be installed. This can be done via the /setup endpoint.
If you go to http://localhost:9926/dogml/setup it'll start the setup. You can check on the progress in the logs - either in stdout from the locally running database or in the logs section of the Status page inside of the Studio.
The expected output of starting setup is {success: true, message: ML Setup Started}
This will use the $HOME/dogml directory on the machine running the database for all of the training materials.
Be sure to wait for the ML Setup Complete note in the database logs.
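If you'd rather trigger the endpoint from a script than from the browser, something like the sketch below should work. It assumes Node 18 or later (for the built-in `fetch`) and that the endpoint responds with JSON, as the expected output above suggests. The helper names `endpointUrl` and `callEndpoint` are made up for this example and are not part of the repo.

```javascript
// Assumptions: the database runs locally, Custom Functions are served on
// port 9926, and the project folder is named "dogml" (as in the clone step).
const BASE_URL = 'http://localhost:9926';
const PROJECT = 'dogml';

// Build the full URL for a Custom Function endpoint in this project.
function endpointUrl(name) {
  return `${BASE_URL}/${PROJECT}/${name}`;
}

// Call an endpoint with a GET request and parse the response as JSON.
// Assumes the response body is JSON; adjust if your version differs.
async function callEndpoint(name) {
  const response = await fetch(endpointUrl(name));
  if (!response.ok) {
    throw new Error(`${name} failed with HTTP ${response.status}`);
  }
  return response.json();
}

console.log(endpointUrl('setup'));

// Usage (needs a running database, so it is left commented out):
// callEndpoint('setup').then(console.log).catch(console.error);
```

Note that the call only starts the setup. You still need to watch the logs for the completion message before moving on.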
Activate
Run /train
To train the model, visit the /train endpoint by going to http://localhost:9926/dogml/train. This will begin the model training. You can follow the status in the console logs (just as during /setup) or in the logs table of the schema.
Verify Model
Once the logs indicate that the training is complete, you should be able to see the model appear in the models table in the schema.
Classify a Dog Breed!
Go to the UI at http://localhost:9926/dogml/ui and try uploading an image of a dog (one of the images in the $HOME/dogml/training_data/Images directory will do).
The results should appear in the UI as well as in the classifications table.
Go Deeper
Add New Training Data
You can add more training data by adding new images to the $HOME/dogml/training_data/Images directory, either by putting the image in the correct breed folder or by making a new folder (if the breed doesn't have one yet). All images should be JPEGs.
Removing Training Data
You can also remove training data from the $HOME/dogml/training_data/Images directory to better target specific breeds.
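Before retraining, it can help to check what the training folder actually contains. The sketch below counts the JPEG files in each breed folder. The function name `countImagesPerBreed` is made up for this example, and the demo runs against a throwaway temporary directory with invented breed folders so that it works on any machine.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Count the JPEG files in each breed folder directly under imagesDir.
// Files that are not .jpg/.jpeg are ignored, since training expects JPEGs.
function countImagesPerBreed(imagesDir) {
  const counts = {};
  for (const entry of fs.readdirSync(imagesDir, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue; // skip stray files at the top level
    const files = fs.readdirSync(path.join(imagesDir, entry.name));
    counts[entry.name] = files.filter((f) => /\.jpe?g$/i.test(f)).length;
  }
  return counts;
}

// Demo with hypothetical folders in a temporary directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'dogml-demo-'));
fs.mkdirSync(path.join(demoDir, 'beagle'));
fs.writeFileSync(path.join(demoDir, 'beagle', 'a.jpg'), '');
fs.writeFileSync(path.join(demoDir, 'beagle', 'b.JPEG'), '');
fs.writeFileSync(path.join(demoDir, 'beagle', 'notes.txt'), ''); // not counted
fs.mkdirSync(path.join(demoDir, 'pug')); // empty breed folder

const demoCounts = countImagesPerBreed(demoDir);
console.log(demoCounts);
```

To inspect the real data, pass `path.join(os.homedir(), 'dogml', 'training_data', 'Images')` instead of the demo directory. A breed folder with a count of zero, or with far fewer images than the others, is a likely source of weak predictions.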
Update the Model
If you modify the training data and use the /train endpoint to create a new model, be sure to then call the /update endpoint at http://localhost:9926/dogml/update to ensure the new model is loaded into the classifier.
Train w/ GPU
To train the model 200% faster, use the /train_gpu endpoint at http://localhost:9926/dogml/train_gpu. This takes advantage of a CUDA-enabled Nvidia GPU to speed up the training computations.
Be sure the necessary drivers and CUDA libraries are installed.
Here's a guide to install CUDA on Ubuntu
Review
There you have it: you've just trained a machine learning model on dog breed data and can now use it to classify images of dogs and determine the breed. To do this, we used HarperDB Custom Functions and TensorFlowJS to apply transfer learning, pairing a pretrained MobileNet model with a KNN classifier trained on the Stanford Dogs dataset.