Introduction
Of all the machine learning algorithms I have come across, K-Nearest Neighbors (KNN) has easily been the simplest to pick up. Despite its simplicity, it has proven to be incredibly effective at certain tasks (as you will see in this article).
And even better? It can be used for both classification and regression problems! It is far more commonly used for classification, however; I have seldom seen KNN applied to a regression task. My aim here is to illustrate and emphasize how KNN can be equally effective when the target variable is continuous.
In this article, we will first understand the intuition behind the KNN algorithm, look at the different ways to calculate distances between points, and finally implement the algorithm in Python on the Big Mart Sales dataset. Let’s go!
Table of contents
- A simple example to understand the intuition behind KNN
- How does the KNN algorithm work?
- Methods of calculating distance between points
- How to choose the k factor?
- Working on a dataset
- Additional resources
The full article is available here.
This interactive demo lets you explore the K-Nearest Neighbors algorithm for classification.
Each point in the plane is colored with the class that the K-Nearest Neighbors algorithm would assign to it. Points where the vote among the nearest neighbors results in a tie are colored white.
You can move points around by clicking and dragging!
Try it here.
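To make the coloring rule concrete, here is a minimal sketch of the vote that decides each point's class. It is not the demo's actual code: the function name `knn_classify`, the sample points, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Return the majority class among the k nearest training points.

    Returns None when the top two classes receive the same number of
    votes, which corresponds to a point the demo would color white.
    Euclidean distance is an assumption; other metrics would also work.
    """
    # Rank training points by their distance to the query point.
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    # Count the class labels of the k closest points.
    votes = Counter(train_labels[i] for i in order[:k])
    ranked = votes.most_common()
    # A tie means the two most common classes have equal counts.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical sample data: two small clusters in the plane.
sample_points = [(0, 0), (1, 0), (5, 5), (6, 5)]
sample_labels = ["red", "red", "blue", "blue"]

print(knn_classify(sample_points, sample_labels, (0.5, 0.2), k=3))
print(knn_classify(sample_points, sample_labels, (3, 2.5), k=4))
```

With an odd k and two classes a tie cannot occur, which is one common reason to prefer odd values of k in binary classification.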
Credit: Data Science Central By: TcGyver