Community Author: Chandre Van Der Westhuizen
The garment industry is an essential exemplar of industrial globalization. It is of utmost importance for the decision makers in the garment industry to analyze,track and predict the productivity performance of employees as the companies rely on the production and delivery performance of their factory workers to satisfy the massive global demands for garment products. MindsDB's in-database machine learning capabilities provides the convenience of predicting productivity to assist employers with employee management.
This tutorial will explore how employers can make predictions on the productivity of employees producing garments using a machine learning regression predictive model. You will get to learn how to connect your database directly to MindsDB, create and train models and make predictions.
Pre-requisites
- Access to MindsDB Cloud or local deployment via docker or pip installation.
- Dataset: You can find the dataset on Kaggle
Connecting your Database to MindsDB.
The first step will be to establish a database connection via MindsDB's GUI which has a SQL Editor that makes it easy to execute queries. MindsDB enables connections to your favorite databases, data warehouses and data lakes via the CREATE DATABASE syntax.
First, we need to connect MindsDB to the database where the Employee producing garments data is stored:
- Access MindsDB GUI on either cloud or the URL 127.0.0.1:47334/
- On the default page, select the button Add Data or alternatively select the plug icon on the left sidebar.
The 'Select your data source' page will populate for you to choose your database type. MindsDB has a list of databases that can be chosen. For this tutorial we will be selecting the postgres database button.
Once you have selected the database type,the page will automatically navigate to the SQL Editor where the syntax to create a database connection will automatically populate for you to enter the required parameters.
The required parameters are:
- CREATE DATABASE display_name --- display name for database.
- WITH ENGINE = "postgres", --- name of the mindsdb handler
- PARAMETERS = {
- "user": " ", --- Your database user.
- "password": " ", --- Your password.
- "host": " ", --- host, it can be an ip or an url.
- "port": "5432", --- common port is 5432.
- "database": " " --- The name of your database *optional. }
Select the Run
button or Shift+Enter to execute the syntax. Once the Database connection is created the console will display a message 'Query successfully completed'.
Please note that some database connections require running a Ngrok tunnel to establish a connection.
Run the ngrok command in a terminal:ngrok tcp [db-port]
for example,if your port number is 5433 you will see a similar output:
Session Status online Account myaccount (Plan: Free) Version 2.3.40 Region United States (us) Web Interface http://127.0.0.1:4040 Forwarding tcp://6.tcp.ngrok.io:14789 -> localhost:5433
The forwarded address information will be required when connecting to MindsDB's GUI. Select and copy the 'Forwarding' information, in this case it is
6.tcp.ngrok.io:14789
, where 6.tcp.ngrok.io will be used for the host parameter and 14789 as the port number.
Once the database integration is successful we can query the table from the database to ensure the data pulls through on MindsDB.
You can run queries directly on your database. The below is an example of the syntax used:
SELECT *
FROM example_db.datasource.table_name
LIMIT 10;
We will run the query to preview the data:
SELECT *
FROM mindsdb_predictions.data.garments
LIMIT 10;
MindsDB also has a functionality to Upload a file and SELECT data from the file to create a machine learning model.
Understanding the data
+----------+----------+------------+----------+------+-----------------------+-------+--------+-----------+-----------+-----------+----------+--------------------+---------------+---------------------+
| date | quarter | department | day | team | targeted_productivity | smv | wip | over_time | incentive | idle_time | idle_men | no_of_style_change | no_of_workers | actual_productivity |
+----------+----------+------------+----------+------+-----------------------+-------+--------+-----------+-----------+-----------+----------+--------------------+---------------+---------------------+
| 1/1/2015 | Quarter1 | sweing | Thursday | 12 | 0.8 | 11.41 | 968.0 | 3660 | 50 | 0.0 | 0 | 0 | 30.5 | 0.800570492 |
| 1/1/2015 | Quarter1 | sweing | Thursday | 6 | 0.8 | 25.9 | 1170.0 | 1920 | 50 | 0.0 | 0 | 0 | 56.0 | 0.800381944 |
| 1/1/2015 | Quarter1 | sweing | Thursday | 7 | 0.8 | 25.9 | 984.0 | 6720 | 38 | 0.0 | 0 | 0 | 56.0 | 0.800125 |
| 1/1/2015 | Quarter1 | sweing | Thursday | 3 | 0.75 | 28.08 | 795.0 | 6900 | 45 | 0.0 | 0 | 0 | 57.5 | 0.753683478 |
| 1/1/2015 | Quarter1 | sweing | Thursday | 2 | 0.75 | 19.87 | 733.0 | 6000 | 34 | 0.0 | 0 | 0 | 55.0 | 0.753097531 |
+----------+----------+------------+----------+------+-----------------------+-------+--------+-----------+-----------+-----------+----------+--------------------+---------------+---------------------+
Where:
Column | Description | Data Type | Usage |
---|---|---|---|
date |
Date in MM-DD-YYYY | integer |
Feature |
quarter |
A portion of the month. A month was divided into four quarters | character varying |
Feature |
department |
Associated department with the instance | integer |
Feature |
day |
Day of the Week | character varying |
Feature |
team |
Associated team number with the instance | integer |
Feature |
targeted_productivity |
Targeted productivity set by the Authority for each team for each day. | integer |
Feature |
smv |
Standard Minute Value, it is the allocated time for a task | integer |
Feature |
wip |
Work in progress. Includes the number of unfinished items for products | integer |
Feature |
over_time |
Represents the amount of overtime by each team in minutes | integer |
Feature |
incentive |
Represents the amount of financial incentive (in BDT) that enables or motivates a particular course of action. | integer |
Feature |
idle_time |
The amount of time when the production was interrupted due to several reasons | integer |
Feature |
idle_men |
The number of workers who were idle due to production interruption | integer |
Feature |
no_of_style_change |
Number of changes in the style of a particular product | integer |
Feature |
no_of_workers |
Number of workers in each team | integer |
Feature |
actual_productivity |
The actual % of productivity that was delivered by the workers. It ranges from 0-1 | integer |
Label |
A label is the thing we're predicting—the y variable in simple linear regression.
A feature is an input variable—the x variable in simple linear regression.
Creating a Regression Machine Learning Predictive model.
To make predictions, we will create a machine learning model using the CREATE PREDICTOR
statement and make a Regression prediction.
In the syntax we will specify which columns/ features to use for training and what column the model should learn to predict as our target/label. In this case we will select all the data to train with and use the column 'Target' as our target variable/label that we want to predict.
In the SQL Editor, the below syntax will be executed:
CREATE PREDICTOR garments_productivity
FROM mindsdb_predictions
(SELECT * FROM data.garments)
PREDICT actual_productivity;
Select the Run
button or Shift+Enter to execute the syntax. If there are no issues with creating the query you will receive the message in the console 'Query successfully completed'.
The model can take a while to train. The status of the model's training can be checked with the below syntax:
SELECT * FROM mindsdb.predictors WHERE name='garments_productivity'
As soon as the model is created the status should show training.Once the model is finished training,the status will show complete. Below you can see that the model has an accuracy of 84%.
The Predictor Status Must be 'complete' Before Making a Prediction
Making a Regression Prediction
When the status of the predictor shows complete, we can go ahead and make a prediction. Predictions are made using the SELECT
statement by querying the predictor as if it is a table. We will provide the features with parameters to get a result.
The below syntax will be used to make a prediction:
SELECT actual_productivity, actual_productivity_explain FROM garments_productivity WHERE date='1/1/2015' AND quarter='Quarter1' AND department='sweing' AND day='Thursday'
AND team=8 AND targeted_productivity=0.8 AND smv=26.16 AND wip=1108
AND over_time=7080 AND incentive=98 AND idle_time=0 AND idle_men=0
AND no_of_style_change=0 AND no_of_workers=59;
Run the above syntax and you will see the below results:
The actual productivity has a value from 0-1.The model predicted that with these parameters provided the result for the productivity of the employees will be 0.93 and by using Target_explain
we can see the confidence levels which shows that the confidence_lower_bound is 0.8556 and confidence_higher_bound is 1. This means that the garment company can expect to have the productivity of the employees to be between 0.8556 and 1.
Making Batch Predictions.
The model can also make batch predictions using the JOIN
clause:
SELECT a.actual_productivity as productivity, b.actual_productivity as predicted_productivity,
a.date, a.quarter, a.department, a.day,
a.team, a.targeted_productivity, a.smv, a.wip,
a.over_time, a.incentive, a.idle_time, a.idle_men,
a.no_of_style_change, a.no_of_workers
FROM mindsdb_predictions.data.garments as a
JOIN mindsdb.garments_productivity as b limit 5;
Results:
+--------------+------------------------+----------+----------+------------+----------+------+-----------------------+-------+--------+-----------+-----------+-----------+----------+--------------------+---------------+
| productivity | predicted_productivity | date | quarter | department | day | team | targeted_productivity | smv | wip | over_time | incentive | idle_time | idle_men | no_of_style_change | no_of_workers |
+--------------+------------------------+----------+----------+------------+----------+------+-----------------------+-------+--------+-----------+-----------+-----------+----------+--------------------+---------------+
| 0.800570492 | 0.7820790218694679 | 1/1/2015 | Quarter1 | sweing | Thursday | 12 | 0.8 | 11.41 | 968.0 | 3660 | 50 | 0.0 | 0 | 0 | 30.5 |
| 0.800381944 | 0.8169200021797305 | 1/1/2015 | Quarter1 | sweing | Thursday | 6 | 0.8 | 25.9 | 1170.0 | 1920 | 50 | 0.0 | 0 | 0 | 56.0 |
| 0.800125 | 0.7619499385051534 | 1/1/2015 | Quarter1 | sweing | Thursday | 7 | 0.8 | 25.9 | 984.0 | 6720 | 38 | 0.0 | 0 | 0 | 56.0 |
| 0.753683478 | 0.762949004815915 | 1/1/2015 | Quarter1 | sweing | Thursday | 3 | 0.75 | 28.08 | 795.0 | 6900 | 45 | 0.0 | 0 | 0 | 57.5 |
| 0.753097531 | 0.752386052621824 | 1/1/2015 | Quarter1 | sweing | Thursday | 2 | 0.75 | 19.87 | 733.0 | 6000 | 34 | 0.0 | 0 | 0 | 55.0 |
+--------------+------------------------+----------+----------+------------+----------+------+-----------------------+-------+--------+-----------+-----------+-----------+----------+--------------------+---------------+
Want to try it out for yourself? Sign up for a free MindsDB account and join our community!
Engage with MindsDB community on Slack or Github to ask questions, share and express ideas and thoughts!
Make sure to check out the official website of MindsDB. For more check out other tutorials and MindsDB documentation.
Top comments (0)