Kmeans

An implementation of Kmeans clustering using only Python data structures (lists, tuples, dicts)

This simple implementation of Kmeans takes a list of two-dimensional tuples as input, along with the desired number of clusters k and the desired number of model iterations. The algorithm is sensitive to the locations of the initial centroids, which are selected at random, so it should be run several times to avoid getting stuck in a local minimum. The run with the highest ratio of between-cluster variance to within-cluster variance (BCV / WCV) should be selected. Within a single run, this ratio should increase on every iteration until it converges.
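The procedure described above can be sketched as follows. This is a minimal illustration, not the code from the notebook: the function names (`kmeans`, `best_of`, `bcv_wcv_ratio`), the `seed` and `restarts` parameters, and the exact variance definitions are assumptions. Here WCV is taken to be the sum of squared distances from each point to its centroid, and BCV the sum of squared distances from each point's centroid to the overall mean. At least one iteration is assumed.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two (x, y) tuples."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def mean_point(pts):
    """Coordinate-wise mean of a non-empty list of (x, y) tuples."""
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points]

def update(points, labels, centroids):
    """Move each centroid to the mean of its members (empty clusters stay put)."""
    new = []
    for c, old in enumerate(centroids):
        members = [p for p, l in zip(points, labels) if l == c]
        new.append(mean_point(members) if members else old)
    return new

def bcv_wcv_ratio(points, labels, centroids):
    """BCV / WCV under the definitions assumed in the lead-in."""
    grand = mean_point(points)
    wcv = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
    bcv = sum(sq_dist(centroids[l], grand) for l in labels)
    return bcv / wcv if wcv > 0 else float("inf")

def kmeans(points, k, iterations, seed=None):
    """One run: random initial centroids, then a fixed number of iterations (>= 1)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        labels = assign(points, centroids)
        centroids = update(points, labels, centroids)
    return list(zip(points, labels)), bcv_wcv_ratio(points, labels, centroids)

def best_of(points, k, iterations, restarts=10, seed=0):
    """Repeat the run with different seeds and keep the highest BCV / WCV."""
    runs = [kmeans(points, k, iterations, seed=seed + r) for r in range(restarts)]
    return max(runs, key=lambda run: run[1])
```

Calling `best_of(points, 2, 5)` returns a pair of the (point, cluster assignment) list and the BCV / WCV ratio of the best run. The restart loop addresses the sensitivity to the initial centroids: each seed gives a different starting position, and only the best-scoring result is kept.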

The number of model iterations should be chosen so that the BCV / WCV ratio converges, meaning it no longer changes from one iteration to the next.
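One way to pick the iteration count is to record the BCV / WCV ratio after each iteration and look for the first point where it stops changing. The helper below is a hypothetical sketch: `iterations_to_converge` and its `rel_tol` parameter are not part of the notebook, and the ratio values in the usage example are made up for illustration.

```python
import math

def iterations_to_converge(ratios, rel_tol=1e-9):
    """Given the ratio recorded after each iteration, return the number of
    iterations after which the value no longer changes, or None if the
    recorded values never settle."""
    for i in range(1, len(ratios)):
        if math.isclose(ratios[i], ratios[i - 1], rel_tol=rel_tol):
            return i
    return None

# Illustrative values only: the ratio rises, then plateaus.
history = [2.1, 5.4, 7.9, 8.3, 8.3, 8.3]
print(iterations_to_converge(history))
```

If the helper returns None, the run has not yet converged and the number of iterations should be increased.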

The output of the model is a list of tuples of the form (point, cluster assignment), where 'point' is an (x, y) pair and 'cluster assignment' is a number between 0 and k identifying the cluster the point belongs to.
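Since the output is a flat list, it is often convenient to regroup it by cluster. The sketch below assumes only the (point, cluster assignment) shape described above. The function name `group_by_cluster` and the sample values are hypothetical.

```python
from collections import defaultdict

def group_by_cluster(assignments):
    """Turn [(point, cluster), ...] into {cluster: [point, ...]}."""
    clusters = defaultdict(list)
    for point, cluster in assignments:
        clusters[cluster].append(point)
    return dict(clusters)

# Made-up output in the documented shape.
output = [((1.0, 2.0), 0), ((1.5, 1.8), 0), ((8.0, 8.0), 1)]
print(group_by_cluster(output))
# {0: [(1.0, 2.0), (1.5, 1.8)], 1: [(8.0, 8.0)]}
```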
