KNN, or K Nearest Neighbors, is one of the simplest Machine Learning algorithms and, quite consequently, one of the slowest too 😂
Nonetheless, it's a great algorithm to start your ML journey with.
A Machine Learning algorithm can be used for:
- Regression (predict a numeric value based on past data)
- Classification (sort a data point into one of its possible types)
KNN can be used for both; however, it's most commonly used for classification problems.
The Basics/Pre-Requisites
If you've done Pythagoras' theorem before, that's really all you need to know to understand the algorithm, and this algo's gonna be a cakewalk for you.
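To make the Pythagoras connection concrete, here is a minimal sketch of the straight-line distance between two points on a 2D graph. The function name `euclidean_distance` and the sample points are assumptions made up for illustration.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two 2D points, using Pythagoras' theorem."""
    dx = p[0] - q[0]  # horizontal side of the right triangle
    dy = p[1] - q[1]  # vertical side of the right triangle
    return math.sqrt(dx ** 2 + dy ** 2)  # length of the hypotenuse

# A 3-4-5 right triangle: the distance is the hypotenuse.
print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```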
The Algorithm
Consider a graph of a few plotted points
Classification with KNN
Let's say we need to find if the star falls under class A or class B.
What are Nearest Neighbors in KNN
The Nearest Neighbors would be the points that are closest to the Star.
We find the distance from each point to the star and sort the points by that distance in ascending order.
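The distance-and-sort step could look roughly like this. The coordinates, the labels, and the helper name `distance_to` are all made up for illustration; they are not the points from the graph.

```python
import math

# Hypothetical points as (x, y, label); not the article's actual data.
points = [
    (1.0, 2.0, "A"),
    (2.0, 5.0, "A"),
    (6.0, 6.0, "B"),
    (5.0, 3.0, "B"),
]
star = (4.0, 4.0)  # the unknown point we want to classify

def distance_to(point, target):
    """Straight-line distance from a labeled point to the target."""
    return math.hypot(point[0] - target[0], point[1] - target[1])

# Closest point first, farthest point last.
sorted_points = sorted(points, key=lambda p: distance_to(p, star))

for x, y, label in sorted_points:
    print(label, round(distance_to((x, y), star), 3))
```

This is also where the slowness comes from: every prediction measures the distance to every known point.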
What is K in KNN
K is the number of closest neighbors we take into account.
So,
for K=3, we would consider the top 3 nearest neighbors for classification
for K=6, we would consider the top 6 nearest neighbors for classification
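As a rough sketch, picking the top K neighbors could be written with the standard library's `heapq.nsmallest`, which returns the K smallest items by a key without sorting the whole list by hand. The function name `k_nearest` and the sample data are assumptions for illustration.

```python
import heapq
import math

def k_nearest(points, target, k):
    """Return the k points closest to target; each point is (x, y, label)."""
    return heapq.nsmallest(
        k,
        points,
        key=lambda p: math.hypot(p[0] - target[0], p[1] - target[1]),
    )

# Hypothetical data, not the points from the article's graph.
points = [
    (0, 1, "A"), (2, 4, "A"), (3, 2, "A"),
    (5, 5, "B"), (7, 7, "B"), (4, 7, "B"),
]
star = (4, 4)

print([label for _, _, label in k_nearest(points, star, 3)])
print([label for _, _, label in k_nearest(points, star, 6)])
```

Changing K only changes how many of the closest points make it into the result.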
Choosing K for KNN
The predicted class may vary a lot from one value of K to another, so we need to choose an optimal (best) value.
There is no specific way to find the optimal value of K, but most of the time a good K tends to be close to the square root of the total number of points.
So start with K=√n and play around by increasing or decreasing K until you are happy.
Here we have 10 total points,
so K = √10, which is approximately 3.
K = 3 is therefore the value we'd expect to work best here.
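The square-root rule of thumb is simple enough to write down as a tiny helper. The name `starting_k` is hypothetical, and the result is only a starting point to tune from, not a guaranteed optimum.

```python
import math

def starting_k(n_points):
    """Rule-of-thumb starting value for K: the square root of n, rounded."""
    # max(1, ...) makes sure we always look at at least one neighbor.
    return max(1, round(math.sqrt(n_points)))

print(starting_k(10))  # 3, since the square root of 10 is about 3.16
```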
The Classifying
Now that we have the top K nearest neighbors, all we have to do is find the mode of their classes.
We find the class that the majority of those points belong to and voilà, we've classified the star.
For K=3, Yellow is the mode, so the star falls under Class B.
For K=6, Purple is the mode, so the star falls under Class A.
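Putting the steps together, here is a minimal sketch of the classification. The function name `knn_classify` and the ten points are made up so that the result flips between K=3 and K=6 like in the example above; they are not the coordinates from the graph.

```python
import math
from collections import Counter

def knn_classify(points, target, k):
    """Predict a label for target: the mode of its k nearest neighbors' labels."""
    nearest = sorted(
        points,
        key=lambda p: math.hypot(p[0] - target[0], p[1] - target[1]),
    )[:k]
    votes = Counter(label for _, _, label in nearest)
    # most_common(1) gives the label with the most votes; on a tie it
    # returns whichever tied label was counted first.
    return votes.most_common(1)[0][0]

# Hypothetical data: 10 points as (x, y, label), with the star at the origin.
points = [
    (1, 0, "B"), (0, 1.5, "B"), (2, 0, "A"), (0, 2.5, "A"), (-3, 0, "A"),
    (0, -3.5, "A"), (4, 0, "B"), (0, 4.5, "B"), (-5, 0, "A"), (0, -5.5, "B"),
]
star = (0, 0)

print(knn_classify(points, star, 3))  # B
print(knn_classify(points, star, 6))  # A
```

With two classes, an odd K avoids tied votes.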
Regression with KNN
We were able to find the class of the star.
But what if each of the points had a value (say, a price)?
With classification you found the class of the point, but you don't know the price of the point (let's say the points represent cars).
Regression is almost exactly the same thing as classification, except here we find the mean of the values of the top K nearest neighbors instead of the mode of their classes.
And there we have predicted both the class and the value of an unknown data point from a data set.
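The regression variant described above swaps the mode for a mean. Here is a minimal sketch, where the function name `knn_regress` and the car prices are invented for illustration.

```python
import math

def knn_regress(points, target, k):
    """Predict a value for target: the mean of its k nearest neighbors' values."""
    nearest = sorted(
        points,
        key=lambda p: math.hypot(p[0] - target[0], p[1] - target[1]),
    )[:k]
    values = [value for _, _, value in nearest]
    return sum(values) / len(values)

# Hypothetical cars as (x, y, price); the numbers are made up.
cars = [
    (1, 0, 10000),
    (0, 2, 12000),
    (3, 0, 14000),
    (0, 4, 30000),
]
star = (0, 0)

# The 3 nearest cars cost 10000, 12000 and 14000, so the mean is 12000.0.
print(knn_regress(cars, star, 3))  # 12000.0
```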
Closing Thoughts
This might not seem like ML, but this is a very commonly used ML algorithm.
ML gets a lot easier when you have a solid understanding of the math behind it.
Thanks for the read