Create Subsets of a Data frame in R Programming - subset() Function
The subset() function in the R programming language creates subsets of a data frame by keeping only the rows that satisfy a condition. It can also be used to keep or drop columns.
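Syntax:
subset(df, subset, select)
Parameters:
- df: Data frame to subset
- subset: Logical condition that selects the rows to keep
- select: Columns to keep, or columns to drop when prefixed with a minus sign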
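As a minimal sketch of how row filtering and column selection can be combined, the call below passes both a row condition and a select argument to subset() in a single step. The data frame scores and its column names are hypothetical, invented only for this illustration.

```r
# Hypothetical data, invented for this illustration
scores <- data.frame(
  id    = 1:4,
  name  = c("A", "B", "C", "D"),
  score = c(55, 82, 91, 47)
)

# Keep rows where score exceeds 50, and only the name and score columns
passed <- subset(scores, subset = score > 50, select = c(name, score))
print(passed)
```

Column names are written without quotes or a scores$ prefix because subset() evaluates both arguments inside the data frame.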
Create Subsets of Data Frames in R Programming Language
The examples below create subsets of a data frame using the subset() function.
Example 1: Basic Example of the subset() Function
We create a data frame df with three rows and three columns. Then we use subset() to select only the row2 column and store the result in df1. Finally, we print both the original and the modified data frames.
# Create a data frame with three columns
df <- data.frame(row1 = 0:2, row2 = 3:5, row3 = 6:8)
print("Original Data Frame")
print(df)
# Keep only the row2 column
df1 <- subset(df, select = row2)
print("Modified Data Frame")
print(df1)
Output:

Example 2: Create Subsets of Data frame
We create a data frame df with three columns (row1, row2 and row3). Then we use the subset() function to remove the row2 and row3 columns from the data frame. The modified data frame, containing only row1, is printed after the operation.
# Create a data frame with three columns
df <- data.frame(row1 = 0:2, row2 = 3:5, row3 = 6:8)
print("Original Data Frame")
print(df)
# A minus sign in select drops the listed columns
df <- subset(df, select = -c(row2, row3))
print("Modified Data Frame")
print(df)
Output:

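Because select works on column positions behind the scenes, a contiguous range of columns can also be written with the colon operator. The following is a small sketch that reuses the df from this example; the result names first_two and last_one are made up for illustration.

```r
df <- data.frame(row1 = 0:2, row2 = 3:5, row3 = 6:8)

# Keep a contiguous range of columns, from row1 through row2
first_two <- subset(df, select = row1:row2)

# Drop the same range, leaving only row3
last_one <- subset(df, select = -(row1:row2))

print(first_two)
print(last_one)
```

The result stays a data frame even when a single column remains, since subset() does not drop dimensions by default.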
Example 3: Logical AND and OR using subset
We create a data frame df with three columns: ID, Name and Age. Then we use the subset() function to create two new subsets:
- subset_df filters the data where Age is greater than 25 and ID is less than 4.
- subset_df2 filters the data where Age is greater than 30 or ID is equal to 2.
df <- data.frame(
  ID = 1:5,
  Name = c("Nishant", "Vipul", "Jayesh", "Abhishek", "Shivang"),
  Age = c(25, 30, 22, 35, 28)
)
print("Original Dataframe")
head(df)
# Logical AND: both conditions must hold
subset_df <- subset(df, subset = Age > 25 & ID < 4)
# Logical OR: at least one condition must hold
subset_df2 <- subset(df, subset = Age > 30 | ID == 2)
print("Subset 1")
head(subset_df)
print("Subset 2")
head(subset_df2)
Output:

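For comparison, the same AND filter can be written with base R bracket indexing. This sketch reuses the data from this example; the names with_subset and with_brackets are hypothetical. On data without missing values, both forms select the same rows.

```r
df <- data.frame(
  ID = 1:5,
  Name = c("Nishant", "Vipul", "Jayesh", "Abhishek", "Shivang"),
  Age = c(25, 30, 22, 35, 28)
)

# subset() evaluates column names inside the data frame
with_subset <- subset(df, Age > 25 & ID < 4)

# The bracket form needs the df$ prefix on every column
with_brackets <- df[df$Age > 25 & df$ID < 4, ]

print(with_subset)
print(with_brackets)
```

The subset() form is shorter to read, which is its main appeal for interactive work.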
Example 4: Subsetting with Missing Values
We create a data frame df with three columns: ID, Name and Age, where some values are missing (NA). Then we use the subset() function to filter out rows where the Age column has missing values. The resulting data frame, which contains only rows with non-missing Age values, is printed.
df <- data.frame(
  ID = 1:5,
  Name = c("Nishant", "Vipul", NA, "Abhishek", NA),
  Age = c(25, 30, NA, 35, NA)
)
print("Original Dataframe")
head(df)
# Keep only the rows where Age is not missing
subset_df <- subset(df, subset = !is.na(Age))
print("Resultant Dataframe")
head(subset_df)
Output:

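A related point is that subset() treats a missing value in the row condition as FALSE, so a comparison on Age drops the NA rows even without is.na(). The sketch below contrasts this with bracket indexing on the same data; the names older and older_brackets are made up for illustration.

```r
df <- data.frame(
  ID = 1:5,
  Name = c("Nishant", "Vipul", NA, "Abhishek", NA),
  Age = c(25, 30, NA, 35, NA)
)

# subset() treats an NA condition as FALSE, so rows with missing Age are dropped
older <- subset(df, Age > 25)

# Bracket indexing returns an all-NA row for every NA in the condition
older_brackets <- df[df$Age > 25, ]

print(nrow(older))           # 2
print(nrow(older_brackets))  # 4
```

With bracket indexing, the condition would need an explicit !is.na(df$Age) to give the same result.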
In this article, we explored how to create subsets of a data frame in R using the subset() function. We also demonstrated how to filter rows based on conditions, select specific columns and handle missing values.