I am a student at the University of Gothenburg studying Cognitive Science. I have spent the last couple of weeks as a summer intern at Stratiteq, working on an AI project about detecting social distancing with drone surveillance.
Social distancing is important in our society nowadays, and autonomous tools that help us keep a safe distance would be useful. In this post I will explain how I trained a custom model and used it to calculate the distance between people.
The model was trained with Azure Custom Vision, an AI service that makes it easy to customise and train your own models. It was then tested with JavaScript code on a custom-made web page. The drone used for this project was a DJI Mavic Mini.
The first step is to make sure that the model we create in Custom Vision is trained to detect a person and to distinguish people from other objects in the picture. This requires enough training data. We used the Aerial Semantic Segmentation Drone Dataset found on Kaggle, together with a few test pictures we took with the drone at our Stratiteq after-work event. In total I used 100 pictures for the training.
When you upload a picture, the so far untrained model points out what it thinks is an object, and you then need to tag it correctly. In this case we point out all the people in each picture and tag them with "person". After that the model can be trained and tested. For each iteration you get a performance measure consisting of precision, recall and mAP:
- Precision – the fraction of relevant instances among the retrieved instances
- Recall – the fraction of the total amount of relevant instances that were retrieved
- mAP (mean average precision) – overall object detector performance across all tags
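To make the first two measures concrete, here is a minimal sketch of how precision and recall follow from simple counts. The helper name and the example numbers are made up for illustration; Custom Vision calculates these values for you.

```javascript
// Hypothetical helper, not part of Custom Vision.
// truePositives:  detections that really are a person
// falsePositives: detections that are not a person
// falseNegatives: people the model missed
function precisionRecall(truePositives, falsePositives, falseNegatives) {
  const retrieved = truePositives + falsePositives; // everything the model detected
  const relevant = truePositives + falseNegatives;  // everyone actually in the pictures
  return {
    precision: retrieved === 0 ? 0 : truePositives / retrieved,
    recall: relevant === 0 ? 0 : truePositives / relevant,
  };
}

// Made-up example: 8 correct detections, 2 false detections, 2 missed people.
const metrics = precisionRecall(8, 2, 2);
console.log(metrics.precision, metrics.recall);
```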
A good starting point for creating the custom-made web page is to use Microsoft’s Quickstart: Analyze a remote image using the REST API and JavaScript in Computer Vision. We can easily modify this quick start example for our own calculations.
Besides defining the subscription key and the endpoint URL, we need to change the quickstart example to use the Custom Vision endpoint. The values can be found in the Custom Vision dashboard under "Prediction URL".
var uriBase = endpoint + "customvision/v3.0/Prediction/…";
We also need to set the custom header “Prediction-Key” for our request.
xhrObj.setRequestHeader("Prediction-Key","…");
Custom Vision analyzes the pictures we send and returns result data from the model we created. For our testing purposes we uploaded the pictures to Azure Blob Storage.
To calculate the distance in code, we use the prediction values for the detected people. From the x, y, width and height values of each bounding box we calculate the center of the box.
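As a rough sketch, the request for a picture stored at a URL could be put together like this. It uses `fetch` instead of the XHR object from the quickstart, and `buildPredictionRequest` with its three parameters is a hypothetical helper. I am assuming the JSON body with a `Url` field that the "Prediction URL" dialog describes for image URLs; the prediction URL and key are placeholders that you copy from the dashboard.

```javascript
// Sketch only: builds the arguments for fetch(); nothing is sent here.
// predictionUrl and predictionKey are placeholders for the values shown
// in the Custom Vision dashboard under "Prediction URL".
function buildPredictionRequest(predictionUrl, predictionKey, imageUrl) {
  return {
    url: predictionUrl,
    options: {
      method: "POST",
      headers: {
        "Prediction-Key": predictionKey,
        "Content-Type": "application/json",
      },
      // Assumed body format for pictures referenced by URL.
      body: JSON.stringify({ Url: imageUrl }),
    },
  };
}

// Possible usage in the browser (not run here):
// const { url, options } = buildPredictionRequest(myPredictionUrl, myKey, blobImageUrl);
// const result = await fetch(url, options).then((response) => response.json());
```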
var x0 = x[i] + width[i] / 2;
var y0 = y[i] + height[i] / 2;
Applying the Pythagorean theorem gives us the distance between two centers, which in our case is the distance between two people.
var distanceInPixels = Math.sqrt((x0 - x1)**2 + (y0 - y1)**2);
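Putting the two steps together, a minimal sketch for all detected people could look like this. The input shape (one object with `x`, `y`, `width` and `height` in pixels per person) and the function names are assumptions for illustration, and `Math.hypot` is used as a shorthand for the square root of the summed squares.

```javascript
// Assumed input: one { x, y, width, height } object per detected person,
// in pixels, where (x, y) is the top-left corner of the bounding box.
function center(box) {
  return { x: box.x + box.width / 2, y: box.y + box.height / 2 };
}

// Returns the distance in pixels for every unordered pair of boxes.
function pairwiseDistances(boxes) {
  const centers = boxes.map(center);
  const pairs = [];
  for (let i = 0; i < centers.length; i++) {
    for (let j = i + 1; j < centers.length; j++) {
      const distanceInPixels = Math.hypot(
        centers[i].x - centers[j].x,
        centers[i].y - centers[j].y
      );
      pairs.push({ i, j, distanceInPixels });
    }
  }
  return pairs;
}

// Made-up example boxes.
const boxes = [
  { x: 0, y: 0, width: 20, height: 40 },   // center (10, 20)
  { x: 30, y: 40, width: 20, height: 40 }, // center (40, 60)
];
console.log(pairwiseDistances(boxes)); // one pair, 50 pixels apart
```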
The calculation is currently made in pixels and we would like to have it in meters. When taking the test pictures with the drone we made measurements and markings on the ground to be able to tell the actual area size. Before we tested the pictures, we cropped them to these markings. We also knew the flight height of the drone.
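One simple way to do the conversion is sketched below. It assumes that the cropped picture spans exactly the marked ground area, that the camera points straight down and that the ground is flat, so a single scale factor applies to the whole picture. The function names and numbers are made up and this is not necessarily the exact calculation used in the project.

```javascript
// Assumption: the cropped picture spans exactly the marked ground area,
// so its width in pixels corresponds to a known width in meters.
function pixelsPerMeter(imageWidthPx, groundWidthMeters) {
  return imageWidthPx / groundWidthMeters;
}

// Converts a distance measured in the picture to meters on the ground.
function pixelsToMeters(distanceInPixels, scale) {
  return distanceInPixels / scale;
}

// Made-up numbers: a 1000 px wide crop covering 10 m of ground.
const scale = pixelsPerMeter(1000, 10); // 100 px per meter
console.log(pixelsToMeters(150, scale)); // 1.5
```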
The calculations were visualised on our web page by drawing a bounding box for each detected person and by drawing lines between them. The line between two people is green if the distance is 2 meters or more and red if the distance is less than 2 meters.
var canvas = document.getElementById('resultImage');
var context = canvas.getContext('2d');
context.beginPath();
context.moveTo(x0, y0);
context.lineTo(x1, y1);
if (distanceInPixels < meterToPixel) {
  context.strokeStyle = 'Red';
} else {
  context.strokeStyle = 'LightGreen';
}
context.lineWidth = 4;
context.stroke();
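The same drawing logic can also be wrapped in a small function that takes the distance in meters, which makes the 2 meter rule explicit. This is only a sketch: `drawDistanceLine` and its parameters are hypothetical names, and `context` is expected to be the 2D canvas context from the snippet above.

```javascript
// Hypothetical helper: draws one line between two centers and colors it
// red when the people are closer than minMeters, green otherwise.
function drawDistanceLine(context, from, to, distanceInMeters, minMeters = 2) {
  context.beginPath();
  context.moveTo(from.x, from.y);
  context.lineTo(to.x, to.y);
  context.strokeStyle = distanceInMeters < minMeters ? "Red" : "LightGreen";
  context.lineWidth = 4;
  context.stroke();
}

// Possible usage in the browser (not run here):
// const context = document.getElementById('resultImage').getContext('2d');
// drawDistanceLine(context, { x: x0, y: y0 }, { x: x1, y: y1 }, distanceInMeters);
```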
In the following animated GIF you can see the test results from 16 pictures we used in the testing.
Beyond the Covid-19 situation, this kind of autonomous distance calculation can be useful in other areas. Workplaces that deal with hazardous materials, for example, could make use of it. That would of course require improvements to this simple model. It would need to detect other kinds of objects and not only people, which would require additional training data.
Looking at this workflow and the result you can see that Microsoft Azure Cognitive Services provides an easy way to develop custom applications powered by Artificial Intelligence.
Thank you for reading. I hope this post gave you an idea of how to build custom models and how to empower your applications with AI!
Top comments (4)
Amazing....👍
Nice idea 😄 Is the distance in pixels accurate? We need the distance in meters, right? Also, how did you create that video with bounding boxes, using JavaScript? What I am planning is to capture objects from a camera, convert the video to frames, detect the objects, apply bounding boxes, and then convert the frames back to video. Hope this is possible.