CLUSTER ANALYSIS

First, the quantitative variables are scaled. The elbow method then indicates that the optimal number of clusters is 5.
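As a rough illustration of the scaling and elbow steps, here is a minimal Python sketch. It assumes k-means clustering and scikit-learn, neither of which is named in the write-up, and it uses a synthetic 77-row table as a hypothetical stand-in for the cereals data, so the printed values will not reproduce the elbow at 5.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the cereals table: 77 rows, 4 numeric columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(77, 4)) * [20.0, 1.0, 5.0, 14.0] + [107.0, 2.5, 7.0, 42.0]

# Scale every column to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)

# Within-cluster sum of squares (WSS) for k = 1..10 (k-means is an assumption).
ks = list(range(1, 11))
wss = []
for k in ks:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    wss.append(model.inertia_)

for k, w in zip(ks, wss):
    print(f"k={k:2d}  WSS={w:8.2f}")
```

The elbow is the value of k after which the WSS stops dropping sharply; on real data it is usually read off a line plot of `wss` against `ks`.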

Next, the clusters of the cereals data, based on calories, nutrients, and ratings, are plotted.

A silhouette plot of the clusters is also drawn to check how well each cereal fits its assigned cluster.

Only 1 of the 77 cereals appears to be wrongly assigned across the 5 clusters, so the clustering is acceptable.
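One common way to count wrongly assigned observations is to look for negative silhouette widths. The sketch below shows that check under the same assumptions as before (k-means, scikit-learn, synthetic stand-in data); whether the write-up used exactly this criterion is not stated, and the count on synthetic data will differ from the 1 reported here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_samples, silhouette_score
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the scaled cereals data: 77 rows, 4 columns.
rng = np.random.default_rng(0)
X = StandardScaler().fit_transform(rng.normal(size=(77, 4)))

# Cluster into 5 groups (k-means is an assumption, not confirmed by the text).
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)

# One silhouette width per observation, each in [-1, 1].
widths = silhouette_samples(X, labels)

# A negative width means the point sits closer to a neighbouring cluster.
n_negative = int((widths < 0).sum())
print(f"average silhouette width: {widths.mean():.3f}")
print(f"observations with negative width: {n_negative} of {len(widths)}")
```

The mean of the per-observation widths equals the overall score returned by `silhouette_score`, which gives a single number for comparing different cluster counts.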

Finally, hierarchical clustering is performed on the cereals data: the pairwise Euclidean distances between the cereals are calculated first, and a dendrogram is then plotted using Ward's method.
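The two hierarchical steps (distance matrix, then Ward linkage) can be sketched with SciPy as follows. SciPy is an assumption here, the data is again a synthetic stand-in, and cutting the tree into 5 groups is added only for comparison with the earlier cluster count.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage
from scipy.spatial.distance import pdist

# Hypothetical stand-in for the scaled cereals data: 77 rows, 4 columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(77, 4))
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Step 1: pairwise Euclidean distances in condensed form (77 * 76 / 2 values).
distances = pdist(X, metric="euclidean")

# Step 2: agglomerative clustering with Ward's method on those distances.
Z = linkage(distances, method="ward")

# Dendrogram layout without drawing; drop no_plot=True to render it
# (rendering requires matplotlib).
tree = dendrogram(Z, no_plot=True)

# Cut the tree into at most 5 flat clusters.
labels = fcluster(Z, t=5, criterion="maxclust")
print("leaf order (first 10):", tree["leaves"][:10])
print("cluster sizes:", np.bincount(labels)[1:])
```

Ward's method is only well defined for Euclidean distances, which is why the distance step uses `metric="euclidean"`.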
