k-means clustering using TI Nspire

k-means is probably one of the simplest clustering algorithms, and it can easily be implemented using the built-in TI Basic. Source data can be edited in the default spreadsheet editor. A sample data set with two attributes, length and weight, on three pet types (dogs, cats, and rabbits) is used for testing. Three centroids are used, one per expected cluster.
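For readers without the calculator at hand, the algorithm can be sketched in a few lines of Python. This is only a minimal sketch, not the TI Basic program used in this post. It assumes Euclidean distance and the usual assign-then-update loop. The function name `kmeans`, the `max_iter` parameter, and the sample rows are all made up for illustration.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Alternate between assigning points and updating centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point
        # (Euclidean distance is an assumption, the post does not name one).
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # nothing moved, so the assignment is stable
        centroids = new_centroids
    return labels, centroids

# Made-up (length, weight) rows, NOT the data set from the post.
points = [(20.0, 2.0), (22.0, 2.5), (45.0, 4.0), (48.0, 4.5), (80.0, 25.0), (85.0, 30.0)]
initial = [points[0], points[2], points[4]]  # hypothetical starting centroids
labels, centroids = kmeans(points, initial)
print(labels)
print(centroids)
```

The result depends on the starting centroids, so a different choice of `initial` may lead to a different grouping.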

The following is a scatter plot using the default TI Nspire Data & Statistics application.

knn0

The same plot after assigning weight and length to the axes.

knn3

Running the k-means program. The dist_to_cluster matrix has one row for each row of data. Columns one to three hold the distance to the corresponding centroid. The last column is the cluster identified, that is, the centroid with the minimum of the three distances. The last command on the screen transposes the matrix so that the fourth column can be fetched as a row and pasted back into the spreadsheet for comparison.
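The layout of that matrix can be illustrated with a short Python sketch. It is not the TI Basic code shown in the screen. Euclidean distance and 1-based cluster numbers are assumptions here, and the points and centroids are made-up values chosen only to show the shape of the result.

```python
import math

def dist_to_cluster(points, centroids):
    """One row per point: distance to each centroid, then the nearest cluster."""
    rows = []
    for p in points:
        dists = [math.dist(p, c) for c in centroids]
        # Last column: 1-based number of the centroid with the minimum distance
        # (1-based numbering is an assumption made for this sketch).
        rows.append(dists + [dists.index(min(dists)) + 1])
    return rows

# Made-up (length, weight) rows and centroids, NOT the values from the post.
points = [(20.0, 2.0), (45.0, 4.0), (80.0, 25.0), (23.0, 2.0)]
centroids = [(20.0, 2.0), (45.0, 4.0), (80.0, 25.0)]

matrix = dist_to_cluster(points, centroids)

# Transpose the matrix, then take the last row: the cluster numbers as one row,
# ready to be compared against the known pet types.
cluster_row = list(zip(*matrix))[-1]
print(cluster_row)
```

Transposing first mirrors the step described above: once the matrix is flipped, the cluster column becomes a single row that can be copied in one piece.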

knn1

A comparison chart using R. The default plot by R looks better than the Nspire’s. The labels are grouped automatically by the cluster found.

knn2
