rpart Decision tree for clustering in R
Before starting any analysis we need to check our dataset, as described in the section Data Preparing for Cluster analysis. To build a decision tree we need the library rpart.
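The original code listing is not preserved here; loading the package is simply:

library(rpart)   # recursive partitioning and regression trees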
Training and test model

When we have a lot of data, it is easiest to select the test set randomly by specifying what percentage of the data goes into the test and training sets. From my previous experience, I have found that 5% for the test set is good enough for many kinds of statistical analysis.
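The original split code is not shown above; a minimal sketch, assuming the split is done with sample() and that the sets are named iris_train and iris_test (iris_test is the name used later in the article, iris_train is an assumption):

set.seed(42)                                   # assumed seed, only for reproducibility
test_idx   <- sample(seq_len(nrow(iris)), size = floor(0.05 * nrow(iris)))
iris_test  <- iris[test_idx, ]                 # roughly 5% of the rows for testing
iris_train <- iris[-test_idx, ]                # the remaining rows for training
nrow(iris_train); nrow(iris_test)              # check the sizes of both sets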
As a result, there will be 142 observations in the training set and only 7 observations in the test set.

Decision tree

Calculate decision tree

We will calculate a decision tree (rpart()) relating Species to the petal length and petal width of the irises. Then we will draw the decision tree (plot()) and label it (text()).
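A minimal sketch of this step, assuming the tree object is called iris_tree2 (the name used later in the article) and that it is fitted on the training set:

iris_tree2 <- rpart(Species ~ Petal.Length + Petal.Width, data = iris_train)
plot(iris_tree2, margin = 0.1)   # draw the tree structure
text(iris_tree2, use.n = TRUE)   # label splits and show class counts at the leaves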
After running this code we will get the following tree:
As you can see, setosa was separated on the basis of Petal.Length, and all 47 points in this group were fitted ideally. versicolor and virginica were separated on the basis of Petal.Width, and these groups are slightly mixed: versicolor has 47 correct and 5 wrong samples, and virginica has 42 correct and 1 wrong data point.

Predict with decision tree

We can use this decision tree to predict our test set. We will apply the calculated decision tree iris_tree2 to the iris_test set with the function predict(): with type = "prob" we get the probability of each class, and with type = "class" we get the classification based on these probabilities.
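A minimal sketch of the prediction step; the tree and data names follow the text, while pred_prob and pred_class are hypothetical names used only for illustration:

pred_prob  <- predict(iris_tree2, newdata = iris_test, type = "prob")   # class probabilities per test row
pred_class <- predict(iris_tree2, newdata = iris_test, type = "class")  # predicted species
pred_prob
table(predicted = pred_class, actual = iris_test$Species)               # compare predictions with true labels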
As we can see, all the test data are classified perfectly.