- k Nearest Neighbor (kNN) History
- 27 July 2020
- by HolyPython
Roots in the USAF
Technical report becomes groundbreaking kNN discovery
Berkeley statisticians collaborate on USAF report
kNN is a truly awesome approach to classifying data, and its beauty lies in its simplicity.
But who first came up with the idea of k-Nearest Neighbors?
Evelyn Fix (1904-1965) was a mathematician and statistician who did her PhD at Berkeley and continued to teach statistics there.
Joseph Lawson Hodges Jr. (1922-2000) was also a statistician at Berkeley and was involved in statistical collaborations with the Twentieth Air Force of the United States Air Force (USAF) starting in 1944.
These two brilliant minds collaborated on a 1951 technical report for the USAF in which they introduced a non-parametric classification method (discriminant analysis). They never officially published the paper, probably because of the confidential nature of the work, especially considering the global atmosphere in those days, shortly after WWII.
This is the first known introduction of non-parametric classification, which later became the famous kNN algorithm.
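For readers who want to see the rule itself, here is a minimal sketch of plain majority-vote kNN classification in Python (assuming NumPy is available); the function and variable names are our own illustration, not anything from the original 1951 report.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    # Euclidean distance from the query point x to every training point
    dists = np.linalg.norm(X_train - x, axis=1)
    # Indices of the k nearest training points
    nearest = np.argsort(dists)[:k]
    # Majority vote among the neighbors' labels
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Tiny usage example with made-up 2-D data
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([0.95, 1.0]), k=3))  # prints 1
```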
Journal article: "E. Fix and J.L. Hodges (1951): An Important Contribution to Nonparametric Discriminant Analysis and Density Estimation: Commentary on Fix and Hodges (1951)" by B.W. Silverman and M.C. Jones.
Continued Contributions
Next, in 1967, Thomas Cover and Peter Hart proved an upper bound on the error rate of nearest neighbor classification in the multiclass setting: asymptotically, its error rate is at most twice the Bayes error rate.
Research paper: Nearest neighbor pattern classification (1967)
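Stated in standard notation (ours, following the usual presentation of the 1967 result), if R* denotes the Bayes error rate and R the asymptotic error rate of the nearest neighbor rule in an M-class problem, the bound reads:

```latex
R^{*} \;\le\; R \;\le\; R^{*}\left(2 - \frac{M}{M-1}\,R^{*}\right) \;\le\; 2R^{*}
```

In other words, with unlimited data the simple nearest neighbor rule can never do worse than double the error of the best possible classifier.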
Cover & Hart's discovery paved the way for a number of follow-up studies, such as:
- In 1970: Martin E. Hellman examined "the (k, k′) nearest neighbor rule with a reject option".
- In 1975: Fukunaga and Hostetler made refinements with respect to the Bayes error rate.
- In 1976: Dudani, and in 1978 Bailey and Jain, published distance-weighted approaches (a sketch of the idea follows this list).
- In 1983: Adam Jozwik introduced "A learning scheme for a fuzzy k-NN rule".
- In 1985: James Keller et al. developed fuzzy kNN (FkNN) in "A fuzzy k-nearest neighbor algorithm".
- In 2000: Bermejo and Cabestany published "Adaptive soft k-nearest-neighbour classifiers".
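To illustrate the distance-weighted idea mentioned in the list above, here is a sketch in the spirit of those approaches, where each neighbor's vote is scaled by the inverse of its distance. The inverse-distance weighting and all names here are our own simplification; Dudani's original rule uses a different (linear) weighting scheme.

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x, k=3, eps=1e-9):
    # Euclidean distances to all training points
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    # Inverse-distance weights; eps guards against division by zero
    weights = 1.0 / (dists[nearest] + eps)
    # Accumulate each class's total weight and return the heaviest class
    votes = {}
    for label, w in zip(y_train[nearest], weights):
        votes[label] = votes.get(label, 0.0) + w
    return max(votes, key=votes.get)
```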
There have been many improvements to kNN since those days, and new approaches continue to emerge to this day.
Further Reading
Naomi Altman's 1992 research paper, "An Introduction to Kernel and Nearest-Neighbor Nonparametric Regression", is also definitely worth a good read.
Finishing Thoughts
Seven decades after its initial discovery, the kNN algorithm remains popular, and it is still widely used in academia as well as by industry professionals.
The k-Nearest Neighbors algorithm was special in that no similar non-parametric supervised machine learning algorithm existed when it was first introduced.
Research continues to make the kNN algorithm more precise and more efficient. Given an algorithmic mechanism that is as simple to understand as counting, kNN's popularity and usage are likely to endure.
Thank you for visiting this machine learning tutorial. If you have enjoyed it, you can find similar articles about other algorithms via the link below: