About kNN

What is kNN?

kNN (k-nearest neighbours) is a supervised machine learning (ML) algorithm used for both classification and regression tasks.

How does kNN work?

Classification Tasks

kNN finds the k training data points (neighbours) closest to the new data point, that is, the k points with the lowest Euclidean distance to it, and assigns the new point the label of the majority class among those k neighbours.
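The majority vote described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `knn_classify` and the toy data are hypothetical, and it assumes the features are numeric tuples of equal length.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, new_point, k):
    """Return the majority label among the k training points nearest to new_point."""
    # Euclidean distance from the new point to every training point
    distances = [math.dist(p, new_point) for p in train_points]
    # Indices of the k training points with the smallest distances
    nearest = sorted(range(len(train_points)), key=lambda i: distances[i])[:k]
    # Count the labels of those neighbours and return the most common one
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy data (made up for illustration): two "red" points and two "blue" points
train_points = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5)]
train_labels = ["red", "red", "blue", "blue"]
prediction = knn_classify(train_points, train_labels, (1.2, 1.4), k=3)
print(prediction)
```

With k = 3, the three nearest neighbours are two "red" points and one "blue" point, so the vote goes to "red". Note that this sketch does not handle ties explicitly.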

What happens when there is a tie?

Increase or decrease k by 1 - repeat the vote with the new k until the tie is broken.
Random tie-breaking - the class is chosen at random from among the tied classes. This approach is less common and is typically used only as a last resort, when the other approaches cannot be applied.
Use a baseline classifier - the baseline classifier assigns the test example the class that occurs most frequently in the training set.
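As one possible illustration of the first strategy, the sketch below decreases k by 1 whenever the top two classes have the same number of votes. The function name `knn_classify_with_tie_break` and the data are hypothetical. The other two strategies could be substituted at the point where k is decremented.

```python
import math
from collections import Counter

def knn_classify_with_tie_break(train_points, train_labels, new_point, k):
    """Majority vote over the k nearest neighbours; on a tie, retry with k - 1."""
    if not train_points:
        raise ValueError("training set must not be empty")
    # Training indices sorted from nearest to farthest
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], new_point))
    while k >= 1:
        votes = Counter(train_labels[i] for i in order[:k]).most_common()
        # A unique winner exists if only one class voted or the top two counts differ
        if len(votes) == 1 or votes[0][1] > votes[1][1]:
            return votes[0][0]
        k -= 1  # tie: drop the farthest neighbour and vote again
    return None  # not reached: with k = 1 there is always a single winner

# Toy data (made up): with k = 4 the vote is 2 vs 2, so k falls back to 3
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
labels = ["a", "b", "b", "a"]
result = knn_classify_with_tie_break(points, labels, (0.4, 0.0), k=4)
print(result)
```

Here the four neighbours split evenly between "a" and "b". Dropping the farthest neighbour leaves one "a" and two "b" votes, so the tie resolves to "b".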

Regression Tasks

The algorithm takes the target values of the k nearest neighbours and computes their mean, which becomes the prediction for the new data point.
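A minimal sketch of the regression case follows, using the same nearest-neighbour search as in classification but averaging the targets instead of voting. The name `knn_regress` and the numbers are hypothetical, and an unweighted mean is assumed.

```python
import math

def knn_regress(train_points, train_targets, new_point, k):
    """Predict the mean target value of the k training points nearest to new_point."""
    # Training indices sorted from nearest to farthest
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], new_point))
    nearest = order[:k]
    # Unweighted mean of the neighbours' target values
    return sum(train_targets[i] for i in nearest) / len(nearest)

# Toy data (made up): the far-away point (10, 10) is excluded when k = 3
reg_points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (10.0, 10.0)]
reg_targets = [10.0, 20.0, 30.0, 100.0]
estimate = knn_regress(reg_points, reg_targets, (2.0, 2.0), k=3)
print(estimate)
```

The three nearest neighbours have targets 10, 20 and 30, so the prediction is their mean, 20.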

How do we compute the Euclidean distance of two points?

Suppose we have two points:

Point A: (x1, y1)
Point B: (x2, y2)

We can compute the Euclidean distance as follows:

d = \sqrt{(x_{2} - x_{1})^{2} + (y_{2} - y_{1})^{2}}

In kNN, we compute the distance between the new data point and each known (training) data point. Assuming there are 2 features (x and y):

d = \sqrt{(x_{\text{new}} - x)^{2} + (y_{\text{new}} - y)^{2}}
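The two-feature distance formula can be checked with a short Python sketch. The function name `euclidean_distance` and the sample points are hypothetical. The standard library's `math.dist` computes the same quantity for points of any dimension.

```python
import math

def euclidean_distance(point_a, point_b):
    """Euclidean distance between two 2-D points given as (x, y) tuples."""
    x1, y1 = point_a
    x2, y2 = point_b
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)

# Worked example: the differences are 3 and 4, so d = sqrt(9 + 16) = 5
d = euclidean_distance((1.0, 2.0), (4.0, 6.0))
print(d)
```

For Point A = (1, 2) and Point B = (4, 6), the squared differences are 9 and 16, giving a distance of 5.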