Sarvesh S. Kulkarni


Tutorial to Predict the Genre of Books using MindsDB [Mongo API]

In this tutorial, we will be learning to:
👉 Connect a MongoDB database to MindsDB.
👉 Train a model to predict the genre of books based on the titles and descriptions.
👉 Get a prediction from the model given certain input parameters.

We will be using the Books genre dataset 📖 that can be downloaded from here. You are also free to use your own dataset and follow along with the tutorial.

#️⃣Pre-requisites

  1. This tutorial focuses on MindsDB, so the reader is expected to have some familiarity with MongoDB Atlas.
  2. In short, MongoDB Atlas is a Database as a Service (DBaaS), which we will use to spin up a MongoDB database cluster and load our dataset.
  3. Download a copy of the Books genre dataset from here.
  4. You are also expected to have an account on MindsDB Cloud. If not, head over to https://cloud.mindsdb.com/ and create an account. It takes only a few minutes. ⚡
  5. We also need MongoDB Compass to load the dataset into the collection. It can be downloaded from here.

#️⃣About MindsDB

MindsDB is a predictive platform that makes databases intelligent and machine learning easy to use. It allows data analysts to build and visualize forecasts in BI dashboards without going through the complexity of ML pipelines, all through SQL. It also helps data scientists streamline MLOps by providing advanced instruments for in-database machine learning and by optimizing ML workflows through a declarative JSON-AI syntax.
Although this description mentions only SQL, MindsDB also supports MongoDB through its Mongo API.

#️⃣Dataset Overview

The dataset contains information about the title, genre and summary of books.
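Before uploading the CSV, it can help to confirm that the header row has the columns used later in this tutorial. The snippet below is a minimal sketch, not part of MindsDB: `missingColumns` is a hypothetical helper, the `index` column in the first example is made up, and the header is split naively on commas.

```javascript
// Columns the tutorial relies on: `genre` is the prediction target,
// `title` and `summary` are the inputs used in the prediction query.
const REQUIRED_COLUMNS = ["title", "genre", "summary"];

// Return the required columns that are missing from a CSV header line.
// Naive split on commas: assumes header names contain no quoted commas.
function missingColumns(headerLine) {
  const present = new Set(
    headerLine
      .replace(/^\uFEFF/, "") // drop a UTF-8 byte-order mark if present
      .split(",")
      .map((name) => name.trim().replace(/^"|"$/g, ""))
  );
  return REQUIRED_COLUMNS.filter((name) => !present.has(name));
}

console.log(missingColumns("index,title,genre,summary")); // []
console.log(missingColumns("title,summary"));             // [ 'genre' ]
```

The check is case-sensitive on purpose, since the queries in this tutorial refer to the columns in lowercase.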

#️⃣Setting up a Cluster on MongoDB Atlas

  1. Head over to https://cloud.mongodb.com/ and create a new project named booksdb and, within it, a new database cluster named cluster0. Provisioning a cluster typically takes a minute or two.

  2. Click on the "Connect" button. In the pop-up modal, you will be asked to select a tool to interact with your cluster. For this tutorial, I'm choosing MongoDB Compass for the connection.

  3. Next, you will be asked to choose the connection method.

  4. After that, you will be asked to create a new database user. After providing a username and password, click on the "Create Database User" button.

  5. In the next step, select "Connect using MongoDB Compass". Copy the connection string, which should look like this:

mongodb://cloud.mindsdb.com/

Then, under Advanced Connection Options, set the authentication method to "Username/Password" and enter your MindsDB Cloud credentials.
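If you would rather embed the credentials in the connection string than type them into the form, special characters such as `@`, `:` and `/` in the username or password must be percent-encoded. The following is a minimal sketch: `buildMongoUri` is a hypothetical helper, and the credentials are placeholders rather than real values.

```javascript
// Build a MongoDB connection URI with percent-encoded credentials.
// The default host is the one shown in this tutorial; the username and
// password passed in below are placeholders for illustration only.
function buildMongoUri(username, password, host = "cloud.mindsdb.com") {
  const user = encodeURIComponent(username);
  const pass = encodeURIComponent(password);
  return `mongodb://${user}:${pass}@${host}/`;
}

console.log(buildMongoUri("reader@example.com", "p@ss:word/1"));
// mongodb://reader%40example.com:p%40ss%3Aword%2F1@cloud.mindsdb.com/
```

Encoding matters here because an email address used as a username contains `@`, which would otherwise be read as the separator between the credentials and the host.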

Now we will connect to our database from MongoDB Compass and load our dataset.

#️⃣Loading the dataset with MongoDB Compass

  1. Head over to https://cloud.mindsdb.com/ and, after logging in, click on "Add data" to import the data from a CSV file.
    The imported data will be available in the files database.

  2. Open MongoDB Compass. Paste the connection string and click on "Connect". After successful authentication, Compass opens the connection's welcome screen.

  3. Now open the mongosh terminal in MongoDB Compass and enter the command below to register a database in MindsDB.

db.databases.insertOne({
    name: "BooksDB", // name of the database
    engine: "mongodb", // database engine to use
    connection_args: {
        "port": 27017, // connection port
        "host": "mongodb://cloud.mindsdb.com:27017", // connection host
        "database": "files" // database to connect to
    }
});

We are now ready to train an ML model to predict the books_genre using MindsDB.

#️⃣Training the ML Model

  1. Head over to MindsDB Cloud to create the predictor. Enter the command below and execute it.
CREATE PREDICTOR mindsdb.books_genre_predictor
FROM files
(SELECT * FROM books)
PREDICT genre;

  2. That's how simple training an ML model is with MindsDB. Now all you have to do is wait a few minutes for the model to finish training, after which you can run queries and get predictions for the genre.
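While waiting, the training status can also be checked from the mongosh terminal. The sketch below rests on an assumption that is not confirmed by this tutorial: that the Mongo API exposes a `predictors` collection in the `mindsdb` database whose documents carry `name` and `status` fields. `modelStatus` is a hypothetical helper name.

```javascript
// Look up a model's training status through the Mongo API.
// Assumption: the `mindsdb` database exposes a `predictors` collection
// whose documents carry `name` and `status` fields.
function modelStatus(db, modelName) {
  const docs = db.predictors.find({ name: modelName }).toArray();
  return docs.length > 0 ? docs[0].status : null;
}

// In mongosh, `db` is a global; run `use mindsdb` first.
// Outside mongosh, `db` is undefined and this block is skipped.
if (typeof db !== "undefined") {
  print(modelStatus(db, "books_genre_predictor"));
}
```

The function returns `null` when no model with that name is found, which makes a typo in the model name easy to tell apart from a model that is still training.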

#️⃣Running Queries to get predictions

  1. Once the status changes to COMPLETE, the model is ready and we can start getting predictions.
    The model reports an accuracy of 98.8%, which is impressive!

  2. To get a prediction, enter the query below in the mongosh terminal:

db.books_genre_predictor.find({title:"Yendi", summary:"Six months after he took control of his own territory in the criminal"})

This query asks the model for the genre of a book with the given title and summary.

  3. The model predicted, with 99.9% confidence, that the genre for the given book details is "fantasy".
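To try several inputs in one go, the same `find` query can be wrapped in a small helper. This is a minimal sketch for the mongosh terminal: `predictGenres` is a hypothetical name, and the shape of the returned prediction documents is left as-is because it depends on what MindsDB sends back.

```javascript
// Query the predictor once per book and collect the returned documents.
// `db` is the mongosh database handle (after `use mindsdb`); the
// collection name matches the predictor trained in this tutorial.
function predictGenres(db, books) {
  return books.map((book) => {
    const filter = { title: book.title, summary: book.summary };
    const docs = db.books_genre_predictor.find(filter).toArray();
    return { title: book.title, prediction: docs[0] ?? null };
  });
}

// Outside mongosh, `db` is undefined and this block is skipped.
if (typeof db !== "undefined") {
  const books = [
    {
      title: "Yendi",
      summary: "Six months after he took control of his own territory in the criminal",
    },
  ];
  printjson(predictGenres(db, books));
}
```

Add more `{ title, summary }` objects to the `books` array to compare predictions side by side.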

You can play with the inputs and run a few more queries and observe the results.
Thank you for reading this article. If you liked it, please like and share it with others. If you want to learn more about MindsDB, visit their official documentation and/or talk to the team behind it on Slack.

See you in my next article, Happy Querying! 📉
