R-Factors

Last Updated : 06 Jun, 2025

Factors in the R programming language represent categorical data, such as "male" or "female" for gender. Although they look like character vectors, factors are stored as integer codes, each mapped to a label. They are useful for data with a fixed set of possible values, known as levels. By default the levels are the unique values of the input sorted alphabetically, and once a factor is created, assigning a value outside its levels produces NA with a warning.
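To make the integer storage concrete, here is a minimal sketch. The vector sizes and its values are illustrative and not part of the examples that follow.

```r
# A small character vector of categories (illustrative data)
sizes <- c("small", "large", "medium", "small")
f <- factor(sizes)

levels(f)       # "large" "medium" "small" (sorted alphabetically)
as.integer(f)   # 3 1 2 3: each element stores the position of its level
typeof(f)       # "integer"
```

Each integer code is an index into levels(f), which is why "small" is stored as 3 here.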

Arguments of the factor() Function

  • x: The vector to be converted into a factor.
  • levels: The distinct values the factor may take. Defaults to the sorted unique values of x.
  • labels: Character labels for the levels, given in the same order as levels.
  • exclude: Values to leave out when forming the levels. Elements with these values become NA.
  • ordered: Logical flag indicating whether the levels should be treated as ordered.
  • nmax: An upper bound on the number of levels.
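The sketch below shows how several of these arguments might be combined. The ratings vector and the short labels "L", "M" and "H" are made up for illustration.

```r
ratings <- c("low", "high", "medium", "low", "unknown")

# Explicit level order, shorter labels, and an ordered factor.
# "unknown" is not listed in levels, so it becomes NA.
rating_f <- factor(ratings,
                   levels  = c("low", "medium", "high"),
                   labels  = c("L", "M", "H"),
                   ordered = TRUE)
print(rating_f)

# exclude drops a value from the levels; matching elements become NA
no_unknown <- factor(ratings, exclude = "unknown")
levels(no_unknown)   # "high" "low" "medium"
```

Because rating_f is ordered, comparisons such as rating_f < "H" are meaningful. On an unordered factor they return NA with a warning.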

1. Creating a Factor in R Programming Language

To create a factor in R, we use the factor() function, which converts a vector into a factor. Here are the two main steps:

  1. Create a vector: Start by defining a vector with the values you want to categorize.
  2. Convert the vector into a factor: Use the factor() function to turn the vector into a factor, defining its levels.

Example: Creating a Gender Factor

Let’s create a factor for gender. The input vector contains only the values "female" and "male", so these become the two levels.

R
x <- c("female", "male", "male", "female")
print(x)

gender <- factor(x)
print(gender)

Output 

[1] "female" "male" "male" "female"
[1] female male male female
Levels: female male

Levels can also be predefined by the programmer. 

R
gender <- factor(c("female", "male", "male", "female"),
                 levels = c("female", "transgender", "male"))

print(gender)

Output 

[1] female male male female
Levels: female transgender male

The levels of a factor can be inspected with the levels() function.
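As a short sketch, reusing the gender factor with predefined levels from above, levels() returns the level labels, nlevels() counts them, and table() tallies the elements per level:

```r
gender <- factor(c("female", "male", "male", "female"),
                 levels = c("female", "transgender", "male"))

levels(gender)    # "female" "transgender" "male"
nlevels(gender)   # 3
table(gender)     # counts per level; "transgender" is listed with a count of 0
```

A level that no element uses still appears in the output of table(), which is one practical difference between a factor and a character vector.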

2. Checking for a Factor in R

The function is.factor() checks whether a variable is a factor. It returns TRUE if it is and FALSE otherwise.

R
gender <- factor(c("female", "male", "male", "female"))
print(is.factor(gender))

Output 

[1] TRUE

The class() function can also be used for this check. It returns "factor" for an unordered factor.

R
gender <- factor(c("female", "male", "male", "female"))
class(gender)

Output 

[1] "factor"

3. Accessing elements of a Factor in R

The elements of a factor are accessed with square brackets, as with any vector. If gender is a factor, gender[i] returns its i-th element along with the full set of levels.

R
gender <- factor(c("female", "male", "male", "female"))
print(gender[3])

Output 

[1] male
Levels: female male

More than one element can be accessed at a time. 

R
gender <- factor(c("female", "male", "male", "female"))
print(gender[c(2, 4)])

Output 

[1] male female
Levels: female male
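Elements can also be selected by a condition rather than by position. This is a minimal sketch using the same gender factor:

```r
gender <- factor(c("female", "male", "male", "female"))

# Logical indexing keeps the elements for which the condition is TRUE
males <- gender[gender == "male"]
print(males)

which(gender == "male")   # 2 3: positions of the matching elements
```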

4. Modification of a Factor in R

After a factor is created, its elements can be modified, but any new value must be one of the predefined levels.

Example  

R
gender <- factor(c("female", "male", "male", "female"))
gender[2] <- "female"
print(gender)

Output 

[1] female female male female
Levels: female male
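For contrast, the sketch below assigns a value that is not one of the levels. R issues a warning ("invalid factor level, NA generated") and stores NA instead.

```r
gender <- factor(c("female", "male", "male", "female"))

# "other" is not a level, so R warns and stores NA instead
gender[2] <- "other"

is.na(gender[2])   # TRUE
levels(gender)     # "female" "male" (unchanged)
```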

To select every element of the factor gender except the i-th, use gender[-i]. To assign a value that is not among the predefined levels, first add the new value to the levels with levels().

R
gender <- factor(c("female", "male", "male", "female"))

levels(gender) <- c(levels(gender), "other")
gender[3] <- "other"

print(gender)

Output

[1] female male other female
Levels: female male other
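The same levels() assignment can also rename an existing level, which changes every element that uses it. This is a small sketch, and the replacement label "M" is chosen only for illustration.

```r
gender <- factor(c("female", "male", "male", "female"))

# Rename the level "male" to "M"; all matching elements follow
levels(gender)[levels(gender) == "male"] <- "M"

as.character(gender)   # "female" "M" "M" "female"
```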

5. Removing Elements from a factor in R

To remove an element, subset the factor with a negative index inside square brackets. This returns a copy without that element. The original factor changes only if the result is assigned back to it.

R
gender <- factor(c("female", "male", "male", "female"))
print(gender[-3])

Output 

[1] female male female
Levels: female male
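Removing elements does not remove levels. If the last element of a given level is dropped, that level remains until it is removed explicitly, for example with droplevels(). Here is a minimal sketch with a shortened gender factor:

```r
gender <- factor(c("female", "male", "female"))

only_female <- gender[-2]          # drop the single "male" element
levels(only_female)                # "female" "male": the unused level is kept

levels(droplevels(only_female))    # "female"
```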

6. Factors in Data Frame 

A data frame in R is a table in which each column represents a variable and each row represents one set of values for those variables. Unlike a 2D array, its columns can be of different types. When working with data frames in R, keep these points in mind:

  • Column names should not be empty.
  • Row names must be unique.
  • Columns can hold different types of data, such as numeric, character, logical, or factor.
  • Each column must have the same number of entries.
R
age <- c(40, 49, 48, 40, 67, 52, 53)  

salary <- c(103200, 106200, 150200,
            10606, 10390, 14070, 10220)

gender <- c("male", "male", "transgender", 
            "female", "male", "female", "transgender")

employee <- data.frame(age, salary, gender = factor(gender))  

print(employee)  

print(is.factor(employee$gender)) 

Output

  age salary      gender
1  40 103200        male
2  49 106200        male
3  48 150200 transgender
4  40  10606      female
5  67  10390        male
6  52  14070      female
7  53  10220 transgender
[1] TRUE
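The example above converts the column explicitly with factor(). As an alternative sketch, data.frame() accepts a stringsAsFactors argument that converts every character column to a factor. The comment about the default assumes R 4.0.0 or later, where stringsAsFactors defaults to FALSE. The small data set is illustrative.

```r
employee <- data.frame(
  age    = c(40, 49, 48),
  gender = c("male", "female", "male"),
  stringsAsFactors = TRUE    # convert character columns to factors
)

is.factor(employee$gender)   # TRUE
table(employee$gender)       # counts per level: female 1, male 2

# Assumption: R >= 4.0.0, where character columns stay character by default
plain <- data.frame(gender = c("male", "female"))
class(plain$gender)
```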

In this article, we explored the concept of factors in R, how to create and modify them, and how they are used in data frames to represent categorical data efficiently.

