R factors
Factors in R represent categorical variables (the kind that statistical models expand into dummy variables). A factor stores the unique categories of a vector as its levels, so the levels are loosely analogous to a set() in Python.
It is possible to make a factor from any vector in R:
> test <- c("low", "high", "medium", "high", "medium", "low", "high", "low")
> tt <- factor(test)
> tt
[1] low high medium high medium low high low
Levels: high low medium
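Note that the levels above come out in alphabetical order, not in the order low, medium, high. A minimal sketch of how the order could be set explicitly with the levels and ordered arguments of factor(); the variable names default_f, sized_f and ordered_f are made up for this illustration:

```r
# Same data as in the example above.
test <- c("low", "high", "medium", "high", "medium", "low", "high", "low")

# Default: the levels are the sorted unique values.
default_f <- factor(test)
levels(default_f)             # "high" "low" "medium"

# An explicit level order, passed through the levels argument.
sized_f <- factor(test, levels = c("low", "medium", "high"))
levels(sized_f)               # "low" "medium" "high"

# ordered = TRUE also makes comparisons between values meaningful.
ordered_f <- factor(test, levels = c("low", "medium", "high"), ordered = TRUE)
ordered_f[1] < ordered_f[2]   # TRUE: "low" comes before "high" in this order
```

The data itself is unchanged in all three cases; only the order of the levels, and therefore the integer codes behind them, differs.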
The levels of a factor are an ordinary character vector, so all the standard vector operations work on them.
> levels(tt) # Display all levels
[1] "high" "low" "medium"
> levels(tt)[2] # Check information about level [2]
[1] "low"
> nlevels(tt) # Check number of levels
[1] 3
The base functions for working with factors are:
- str() - shows the structure of the factor
- table() - returns the count of each level as a named vector (a table object)
- as.character() - returns the original data as a character vector
- as.numeric() - returns the integer level codes, not the original values
> str(tt)
Factor w/ 3 levels "high","low","medium": 2 1 3 1 3 2 1 2
> table(tt)
tt
  high    low medium
     3      3      2
> as.character(tt)
[1] "low" "high" "medium" "high" "medium" "low" "high" "low"
> as.numeric(tt)
[1] 2 1 3 1 3 2 1 2
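The numbers returned by as.numeric() are positions in levels(tt), which matters most when the factor was built from text that looks like numbers. A small sketch with hypothetical data (the vector raw is not part of the original example):

```r
# Hypothetical data: numbers stored as text.
raw <- c("10", "5", "10", "20")
f <- factor(raw)

levels(f)                     # "10" "20" "5" (sorted as strings)
as.numeric(f)                 # 1 3 1 2: the level codes, not the values
as.numeric(as.character(f))   # 10 5 10 20: the original numbers
as.numeric(levels(f))[f]      # 10 5 10 20: same result via the levels
```

Converting through as.character() first is the usual way to get the original numbers back.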
The most useful result is usually that of the table() function: it returns a named vector of counts, which can be indexed like any other named vector.
> unname(table(tt)["low"])
[1] 3
> unname(table(tt)[levels(tt)[2]])
[1] 3
> unname(table(tt)[2])
[1] 3
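Because the result of table() behaves like a named vector, a few other common vector operations apply as well. A short sketch using the same data; the variable name counts is an assumption made for the example:

```r
test <- c("low", "high", "medium", "high", "medium", "low", "high", "low")
tt <- factor(test)
counts <- table(tt)

names(counts)                          # "high" "low" "medium"
counts[["low"]]                        # 3: double brackets drop the name
as.vector(counts)                      # 3 3 2: plain counts without names
names(counts)[counts == max(counts)]   # "high" "low": the most frequent levels
```

Double brackets are a shorter alternative to wrapping the lookup in unname() when only a single count is needed.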
Published: 2021-11-10 21:36:41