K-means clustering in R
K-means clustering is a process of automatic classification of N observations into K groups, or clusters. We will try to apply this process to the built-in iris dataset. We assume that the dataset has already been checked and cleaned, and that it has been made anonymous, i.e. stripped of the species names.

Calculating K-means clusters

To make working with datasets easier we use the dplyr library. As the next step we create a dataset without labels for clustering.
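The original listing for this step is not available here, so the following is only a minimal sketch of how the unlabelled dataset could be built with dplyr. The variable name `iris_unlabelled` is an assumption chosen for illustration.

```r
# Load dplyr for select() and the pipe operator
library(dplyr)

# Drop the Species column so that only the four numeric
# measurements remain (iris_unlabelled is an illustrative name)
iris_unlabelled <- iris %>% select(-Species)

# Quick look at the result
head(iris_unlabelled)
```

Dropping the column by name rather than by position keeps the code valid even if the column order changes.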
Now we can cluster it. We know in advance that we need to split it into 3 groups, i.e. 3 centres (kmeans()). After splitting, we compare the result with the labels from the original dataset (table()).
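A possible version of this step is sketched below. The calls to `kmeans()` and `table()` come from the text above. The seed, the `nstart = 20` argument and the names `iris_unlabelled`, `fit` and `comparison` are assumptions added for reproducibility and illustration.

```r
library(dplyr)

# Unlabelled data: the four numeric measurements only
iris_unlabelled <- iris %>% select(-Species)

# kmeans() starts from random centres, so fix the seed (assumed value)
set.seed(42)

# Split into 3 clusters; nstart = 20 (an assumption) tries several
# random starts and keeps the best solution
fit <- kmeans(iris_unlabelled, centers = 3, nstart = 20)

# Cross-tabulate cluster number against the true species
comparison <- table(fit$cluster, iris$Species)
print(comparison)
```

The cluster numbers in the rows are arbitrary, so cluster 1 does not have to correspond to the first species. What matters is how cleanly each species falls into a single row.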
As we can see, setosa is clearly separated from the other data, but there is some overlap between versicolor and virginica. Let's check it graphically.

Graphical analysis of K-means clustering

To understand the problem better, it helps to inspect the results visually. First we plot the data coloured by the 3 clusters found by k-means.
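The original plotting code is not available here, so this is a sketch using base R graphics. Petal.Length on the x axis follows the text, while Petal.Width on the y axis, the seed and the variable names are assumptions.

```r
# Recompute the clustering so that this example runs on its own
set.seed(42)
fit <- kmeans(iris[, 1:4], centers = 3, nstart = 20)

# One colour index per observation, taken from the cluster number
cluster_colours <- fit$cluster

# Scatter plot of the petal measurements coloured by cluster
plot(iris$Petal.Length, iris$Petal.Width,
     col = cluster_colours, pch = 19,
     xlab = "Petal.Length", ylab = "Petal.Width",
     main = "K-means clusters (K = 3)")
```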
Next we plot the same data split into 3 groups on the basis of their species.
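A matching sketch for the species plot, again with base R graphics and the same assumed axes (Petal.Length against Petal.Width), so that the two pictures can be compared side by side. The name `species_colours` is illustrative.

```r
# Convert the Species factor to colour indices 1, 2, 3
species_colours <- as.integer(iris$Species)

# Scatter plot of the petal measurements coloured by true species
plot(iris$Petal.Length, iris$Petal.Width,
     col = species_colours, pch = 19,
     xlab = "Petal.Length", ylab = "Petal.Width",
     main = "Iris species")

# Legend mapping each colour to its species name
legend("topleft", legend = levels(iris$Species),
       col = seq_along(levels(iris$Species)), pch = 19)
```

The colours in this plot will not necessarily match those in the cluster plot, because the cluster numbers are assigned arbitrarily.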
Comparing the two results, we can see the overlap in the area of Petal.Length around 5.0. There is no obvious automatic criterion that would separate these observations.
© 2020 MyCoding.uk - My blog about coding and further learning. This blog was written with pure Perl and front-end output was performed with TemplateToolkit.