# kNN Optimization

#### K Nearest Neighbor Optimization Parameters Explained

- n_neighbors
- weights
- algorithm

These are the most commonly adjusted parameters of **k Nearest Neighbor algorithms**. Let’s take a closer look at what they do and how to change their values:

**n_neighbors:** (default: **5**) This is the most fundamental parameter of kNN algorithms. It sets how many neighbors are consulted when an item is being classified.
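As a minimal sketch of how n_neighbors can change a prediction, the snippet below uses a made-up one-dimensional dataset. The data and the query point are illustrative assumptions, not taken from a real problem.

```python
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data: a single class-1 point surrounded by class-0 points.
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [0, 0, 1, 0, 0]
query = [[2.1]]

# With k=1 only the closest point (2.0, class 1) is consulted.
knn_1 = KNeighborsClassifier(n_neighbors=1).fit(X, y)
pred_k1 = knn_1.predict(query)[0]

# With k=3 the points 2.0, 3.0 and 1.0 vote, and class 0 wins 2 to 1.
knn_3 = KNeighborsClassifier(n_neighbors=3).fit(X, y)
pred_k3 = knn_3.predict(query)[0]

print(pred_k1, pred_k3)
```

A small k follows local detail closely, while a larger k smooths the decision by letting more points vote.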

**weights:** (default: “uniform”) Another important parameter, weights, determines how much influence each neighbor has on the prediction.

“__uniform__” : This value will cause weights to be distributed equally among all neighbor values.

“__distance__” : This value weights each neighbor by the inverse of its distance. Closer neighbors will have a higher weight in the algorithm.

__[callable]__ : You can also define a function and assign it to this parameter. The function receives an array of distances and must return an array of the same shape containing the weights.
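The three kinds of weights values can be compared side by side. This is a sketch on a made-up dataset; `inverse_square` is a hypothetical helper written for this illustration, not something scikit-learn provides.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data: one class-1 point very close to the query,
# and two class-0 points further away.
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [0, 0, 1, 0, 0]
query = [[2.1]]

def inverse_square(distances):
    # Receives an array of distances and returns an array of the same shape.
    # The small constant guards against division by zero.
    return 1.0 / (np.asarray(distances) ** 2 + 1e-9)

predictions = {}
for name, w in [("uniform", "uniform"),
                ("distance", "distance"),
                ("callable", inverse_square)]:
    knn = KNeighborsClassifier(n_neighbors=3, weights=w).fit(X, y)
    predictions[name] = int(knn.predict(query)[0])

print(predictions)
```

With uniform weights the two farther class-0 neighbors outvote the single close class-1 neighbor. With distance-based weights the close neighbor dominates.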

**algorithm:** (default: “auto”) Selects the algorithm that will be used to compute the nearest neighbors.

“__auto__”: Attempts to choose the most suitable algorithm based on the data passed to fit.

“__ball_tree__”: Uses the BallTree algorithm

“__kd_tree__”: Uses the KDTree algorithm

“__brute__”: Uses a brute-force search
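The algorithm parameter is a choice of search strategy, so it mainly affects speed and memory rather than the answer. The sketch below, on randomly generated data (an assumption made for illustration), checks that all four settings give the same predictions. Exact ties in distance could in principle be broken differently, which is unlikely with continuous random data.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data, assumed only for this illustration.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 3))
y_train = (X_train[:, 0] > 0).astype(int)
X_query = rng.normal(size=(20, 3))

predictions = {}
for algo in ["auto", "ball_tree", "kd_tree", "brute"]:
    knn = KNeighborsClassifier(n_neighbors=5, algorithm=algo)
    knn.fit(X_train, y_train)
    predictions[algo] = knn.predict(X_query)

# Compare every algorithm's predictions against the brute-force result.
same = all(np.array_equal(predictions["brute"], p) for p in predictions.values())
print(same)
```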

## Examples:

`knn = KNeighborsClassifier(n_neighbors=40)`

`knn = KNeighborsClassifier(n_neighbors=40, weights="distance")`

`knn = KNeighborsClassifier(algorithm="brute")`
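The one-liners above only construct the classifier. A minimal end-to-end sketch, assuming the Iris dataset bundled with scikit-learn and an arbitrary 75/25 split, might look like this:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Iris is used here only as a convenient example dataset.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

candidates = {
    "n_neighbors=40": KNeighborsClassifier(n_neighbors=40),
    "n_neighbors=40, distance": KNeighborsClassifier(n_neighbors=40, weights="distance"),
    "algorithm=brute": KNeighborsClassifier(algorithm="brute"),
}

scores = {}
for label, knn in candidates.items():
    knn.fit(X_train, y_train)
    # score() returns the mean accuracy on the held-out test data.
    scores[label] = knn.score(X_test, y_test)
    print(f"{label}: {scores[label]:.3f}")
```

The accuracies depend on the dataset and the split, so they should be read as a comparison between settings, not as general results.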

## More parameters

#### More kNN Optimization Parameters for fine tuning

The following parameters allow further tuning. They help avoid slow queries, excessive memory use, and suboptimal results:

- leaf_size
- p
- n_jobs

### leaf_size

*(default: 30)*

*leaf_size is passed on to BallTree or KDTree, so it only has an effect when one of these tree-based algorithms is used.*

*It can affect the speed of tree construction and queries, as well as the memory required to store the tree. The best value depends on the nature of the problem.*
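Since leaf_size tunes the tree structure rather than the model, changing it is expected to leave predictions unchanged. A small sketch on synthetic data (assumed for illustration) checks this for a few values:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data, assumed only for this illustration.
rng = np.random.default_rng(42)
X_train = rng.normal(size=(300, 2))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)
X_query = rng.normal(size=(25, 2))

results = []
for leaf_size in (5, 30, 100):
    knn = KNeighborsClassifier(n_neighbors=7, algorithm="kd_tree", leaf_size=leaf_size)
    knn.fit(X_train, y_train)
    results.append(knn.predict(X_query))

# The predictions should match; only build/query time and memory differ.
unchanged = all(np.array_equal(results[0], r) for r in results[1:])
print(unchanged)
```

To find a good value for a given dataset, time fit and predict for a few candidates instead of comparing accuracy.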

### p

*(default: 2)*

The p parameter is the power used in the Minkowski metric, which measures the distance between points.

__1__: manhattan_distance (l1)

__2__: euclidean_distance (l2)

*For arbitrary p, minkowski_distance (l_p) is used.*
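To make the role of p concrete, here is a small plain-Python sketch of the Minkowski distance. The function `minkowski_distance` below is a hypothetical helper written for this illustration and is not scikit-learn's implementation.

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length sequences."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

a, b = (0.0, 0.0), (3.0, 4.0)

d1 = minkowski_distance(a, b, 1)  # Manhattan: |3| + |4|
d2 = minkowski_distance(a, b, 2)  # Euclidean: sqrt(9 + 16)
d3 = minkowski_distance(a, b, 3)  # (27 + 64) ** (1/3)

print(d1, d2, d3)
```

In the classifier itself the same choice is made with, for example, `KNeighborsClassifier(p=1)` for Manhattan distance.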

### n_jobs

*(default: None)*

*Sets the number of parallel jobs to run for the neighbor search.*

__None__: Equivalent to 1 (no parallelism), unless running inside a joblib.parallel_backend context.

__-1__: All processors will be used.

Official Scikit Learn Documentation: sklearn.neighbors.KNeighborsClassifier