What Are the Tidyverse Packages in R Language?
Tidyverse packages are widely used for Data Science in R. They were created specifically for data science tasks and follow a consistent design, which makes them easy to use and efficient.
Understanding Tidyverse Packages in R
This article covers eight core Tidyverse packages: ggplot2, dplyr, tidyr, readr, purrr, tibble, stringr and forcats (since tidyverse 2.0.0, lubridate is also part of the core). All of them are installed together with the install.packages("tidyverse") command and loaded at once with library(tidyverse).
Category | Popular R Packages |
---|---|
Data Visualization and Exploration | ggplot2 |
Data Wrangling and Transformation | dplyr, tidyr, stringr, forcats |
Data Import and Management | tibble, readr |
Functional Programming | purrr |
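Installing and loading are two separate steps, which the following minimal sketch illustrates. It assumes the tidyverse has already been installed once; the variable name pkgs is only illustrative.

```r
# One-time installation (commented out so the script can be re-run safely)
# install.packages("tidyverse")

# Attach the core packages for the current session
library(tidyverse)

# List every package that belongs to the tidyverse, core or not
pkgs <- tidyverse_packages()
print(pkgs)
```

Packages that appear in this list but are not part of the core still need their own library() call before use.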
In addition to these packages, the Tidyverse also has some specialized packages that are not loaded automatically and must be loaded explicitly with library(). These include DBI for relational databases, httr for web APIs, rvest for web scraping, etc. Now, let’s see the core Tidyverse packages and learn more about them:
Data Visualization and Exploration in Tidyverse in R
1. ggplot2 library
ggplot2 is an R data visualization library that is based on The Grammar of Graphics. It can create data visualizations such as bar charts, pie charts, histograms, scatterplots, error bar charts, etc. using a high-level API. It also lets you combine different components, or layers, in a single visualization.
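As a rough sketch of how layers combine, the snippet below stacks a point layer, a fitted line and labels on one plot. It assumes ggplot2 is installed and uses the built-in mtcars dataset; the object name p and the label text are illustrative choices.

```r
library(ggplot2)

# Base plot: map car weight to x and fuel economy to y
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  # Layer 1: one point per car, coloured by number of cylinders
  geom_point(aes(colour = factor(cyl))) +
  # Layer 2: a single straight-line fit across all cars
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # Labels are a separate component added with +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")

print(p)
```

Each + adds one component, so a layer can be removed or swapped without touching the rest of the plot.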
To install ggplot2 the best method is to install the tidyverse using:
install.packages("tidyverse")
Or you can install ggplot2 using:
install.packages("ggplot2")
Example: We create a bar plot from six data points and map the fill aesthetic to x inside the aes() function, so that each bar gets one of ggplot2's default colors.
library(ggplot2)
df <- data.frame(
  x = c("A", "B", "C", "D", "E", "F"),
  y = c(4, 6, 2, 9, 7, 3)
)
ggplot(df, aes(x, y, fill = x)) +
  geom_bar(stat = "identity")
Output:
Data Wrangling and Transformation in Tidyverse in R
1. dplyr library
dplyr is a popular data manipulation library in R. It provides five key functions that combine naturally with group_by(), which lets you perform any of these operations by group. These functions are:
- mutate(), which adds new variables that are functions of existing variables.
- select(), which picks variables (columns) based on their names.
- filter(), which picks rows based on their values.
- summarise(), which reduces multiple values to a single summary.
- arrange(), which changes the ordering of the rows.
To install dplyr, the best method is to install the tidyverse using:
install.packages("tidyverse")
Or you can install dplyr using:
install.packages("dplyr")
Example: We are using the dplyr package to filter the starwars dataset, selecting only the rows where the species is "Droid" and displaying the result with print().
library(dplyr)
print(starwars %>% filter(species == "Droid"))
Output:

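The five functions listed earlier are usually chained together with group_by(). The following is a minimal sketch on the same starwars dataset; the column name bmi and the result name by_species are illustrative, not part of the dataset.

```r
library(dplyr)

by_species <- starwars %>%
  # filter(): keep rows where both measurements are present
  filter(!is.na(height), !is.na(mass)) %>%
  # select(): keep only the columns needed below
  select(name, species, height, mass) %>%
  # mutate(): add a body mass index (height is in cm, mass in kg)
  mutate(bmi = mass / (height / 100)^2) %>%
  # group_by() + summarise(): one summary row per species
  group_by(species) %>%
  summarise(n = n(), mean_bmi = mean(bmi)) %>%
  # arrange(): largest groups first
  arrange(desc(n))

print(by_species)
```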
2. tidyr library
tidyr is an essential data cleaning library in R that helps us create tidy data where each cell contains a single value, each column represents a variable and each row is an observation. This tidy data approach saves time by reducing the need for continuous cleaning and modifying tools to handle messy data.
tidyr functions fall into five main categories:
- Pivoting (changing data between long and wide forms)
- Nesting (grouping data into a single row with a nested data frame)
- Splitting and Combining character columns
- Rectangling (converting nested lists into tidy tibbles)
- Handling missing values.
To install tidyr, it’s best to install the tidyverse package using:
install.packages("tidyverse")
Or you can install tidyr using:
install.packages("tidyr")
Example: The gather() function in tidyr takes multiple columns and collapses them into key-value pairs, duplicating all other columns as needed. gather() still works, but current tidyr releases mark it as superseded by pivot_longer().
library(tidyr)

# Wide data: one column per group
n <- 10
tidy_dataframe <- data.frame(
  S.No = 1:n,
  Group.1 = c(23, 345, 76, 212, 88,
              199, 72, 35, 90, 265),
  Group.2 = c(117, 89, 66, 334, 90,
              101, 178, 233, 45, 200),
  Group.3 = c(29, 101, 239, 289, 176,
              320, 89, 109, 199, 56)
)
head(tidy_dataframe)

# Collapse the three Group columns into Group/Frequency pairs
long <- tidy_dataframe %>%
  gather(Group, Frequency,
         Group.1:Group.3)
head(long)
Output:

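The same wide-to-long reshaping can be sketched with pivot_longer(), the newer pivoting function in tidyr. The three-row data frame wide below is a shortened, illustrative version of the data above.

```r
library(tidyr)

# Shortened wide data: one column per group (illustrative values)
wide <- data.frame(
  S.No = 1:3,
  Group.1 = c(23, 345, 76),
  Group.2 = c(117, 89, 66),
  Group.3 = c(29, 101, 239)
)

# Move the column names into "Group" and the cell values into "Frequency"
long <- pivot_longer(
  wide,
  cols = Group.1:Group.3,
  names_to = "Group",
  values_to = "Frequency"
)

print(long)
```

Unlike gather(), which stacks the output one source column at a time, pivot_longer() keeps the rows that came from the same original row next to each other.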
3. stringr library
stringr is a library designed for data cleaning and preparation tasks, specifically focusing on string manipulation. Functions in stringr start with str_ and typically take a string vector as the first argument. Some common functions include str_detect(), str_extract(), str_match(), str_count(), str_replace() and str_subset().
To install stringr, it’s best to install the tidyverse using:
install.packages("tidyverse")
Or you can install stringr like:
install.packages("stringr")
Example: We are using the stringr package to calculate the length of the string "GeeksforGeeks" with the str_length() function, which returns the number of characters in the string.
library(stringr)
str_length("GeeksforGeeks")
Output:
[1] 13
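Several of the other functions named above can be tried on a small vector. This is a minimal sketch; the vector fruits and its contents are made up for illustration.

```r
library(stringr)

fruits <- c("apple pie", "banana split", "cherry tart")

# Does each string contain "an"?
detected <- str_detect(fruits, "an")        # FALSE TRUE FALSE

# How many times does "a" occur in each string?
counts <- str_count(fruits, "a")            # 1 3 1

# First run of lowercase letters in each string
first_word <- str_extract(fruits, "[a-z]+") # "apple" "banana" "cherry"

# Replace the first space with an underscore
snake <- str_replace(fruits, " ", "_")

# Keep only the strings that match the pattern
with_rr <- str_subset(fruits, "rr")         # "cherry tart"
```

Because every function takes the string vector first, these calls also chain cleanly with the pipe.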
4. forcats library
forcats is an R library designed to handle issues related to factors or categorical variables which are vectors that can only take a predefined set of values. It helps manage tasks like reordering these vectors or adjusting the order of their levels.
Some useful functions in forcats include:
- fct_relevel(), which lets you manually reorder the levels of a factor
- fct_reorder(), which reorders the levels of a factor based on another variable
- fct_infreq(), which orders the levels of a factor by their frequency.
To install forcats, it’s best to install the tidyverse package using:
install.packages("tidyverse")
Or you can install forcats using:
install.packages("forcats")
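Example: Below is an example that loads the forcats library together with dplyr, which supplies the starwars dataset, filter() and count(). It counts the characters in each species, the kind of categorical summary whose levels forcats functions can then reorder.
library(forcats)
library(dplyr)
starwars %>%
  filter(!is.na(species)) %>%
  count(species, sort = TRUE) %>%
  head()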
Output:

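To see the three forcats functions listed earlier in action, here is a minimal sketch on small hand-made factors. The names sizes, grp and score are illustrative.

```r
library(forcats)

sizes <- factor(c("b", "a", "b", "c", "b", "a"))
levels(sizes)                          # "a" "b" "c" (alphabetical default)

# fct_infreq(): most frequent level first
by_freq <- fct_infreq(sizes)
levels(by_freq)                        # "b" "a" "c"

# fct_relevel(): move chosen levels to the front by hand
manual <- fct_relevel(sizes, "c")
levels(manual)                         # "c" "a" "b"

# fct_reorder(): order levels by another variable (median by default)
grp <- factor(c("a", "b", "c"))
score <- c(3, 1, 2)
by_score <- fct_reorder(grp, score)
levels(by_score)                       # "b" "c" "a"
```

Only the level order changes in each case; the underlying values stay the same, which is what matters for plot axes and model contrasts.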
Data Import and Management in Tidyverse in R
1. readr library
The readr library provides a simple way to read rectangular data from delimited files such as csv and tsv and from fixed-width (fwf) files. It handles parsing at two levels: functions such as read_csv() parse a whole file, while the parse_*() functions (for example parse_integer()) parse an individual vector or column.
The column specification defines how the data in each column is converted from a character vector to the most suitable data type. In most cases readr guesses it automatically.
readr reads different file formats using different functions:
- read_csv() for comma-separated files
- read_tsv() for tab-separated files
- read_table() for whitespace-separated tabular files
- read_fwf() for fixed-width files
- read_delim() for files with any other delimiter
- read_log() for web log files
To install readr, the best method is to install the tidyverse using:
install.packages("tidyverse")
Or you can install readr using:
install.packages("readr")
Example: We are using the readr package to read a tab-separated file ("geeksforgeeks.txt") without column names using the read_tsv() function. The data is stored in the variable myData and we print the content using print().
library(readr)
myData = read_tsv("geeksforgeeks.txt", col_names = FALSE)
print(myData)
Output:
A computer science portal for geeks.
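The column specification described earlier can be left to readr or supplied by hand. The following is a minimal sketch that writes a small, made-up CSV file to a temporary path so it runs without any external file; the file contents and the names path, guessed and typed are assumptions for illustration.

```r
library(readr)

# Hypothetical sample data written to a temporary file
path <- tempfile(fileext = ".csv")
writeLines(c("id,name,score", "1,Ana,9.5", "2,Raj,7.25"), path)

# Let readr guess the column types (it reports its guess as a message)
guessed <- read_csv(path)

# Supply the column specification explicitly instead
typed <- read_csv(
  path,
  col_types = cols(
    id = col_integer(),
    name = col_character(),
    score = col_double()
  )
)

# Inspect the specification readr used for the guessed version
spec(guessed)
```

With guessing, whole numbers such as id are read as doubles; the explicit specification is one way to force an integer column.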
2. tibble library
A tibble is a modern form of the data.frame that keeps its useful parts and drops the parts that are less helpful. Unlike data.frames, tibbles don’t change variable names or types and don’t do partial matching. They also bring problems to the forefront much sooner, for example by complaining when a variable does not exist.
As a result, code that uses tibbles tends to be cleaner and more robust. Tibbles also have an enhanced print() method that makes them easier to use with larger datasets that contain more complex objects.
To install tibble the best method is to install the tidyverse using:
install.packages("tidyverse")
Or you can install tibble using:
install.packages("tibble")
Example: We are using the tibble package to create a tibble named data with three columns: a, b and c. The column a contains the numbers from 1 to 3, b contains the first three letters of the alphabet and c contains the dates of the previous 3 days. We then print the tibble using print().
library(tibble)
data <- tibble(a = 1:3, b = letters[1:3],
               c = Sys.Date() - 1:3)
print(data)
Output:

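The differences from data.frame mentioned earlier (names left unchanged, no partial matching) can be checked side by side. This is a minimal sketch; the column names my var and label are made up for illustration.

```r
library(tibble)

df <- data.frame(`my var` = 1:3, label = c("a", "b", "c"))
tb <- tibble(`my var` = 1:3, label = c("a", "b", "c"))

# data.frame() rewrites the non-syntactic name, tibble() keeps it
names(df)   # "my.var" "label"
names(tb)   # "my var" "label"

# data.frame allows partial matching with $ ...
partial_df <- df$lab

# ... while a tibble returns NULL and warns about the unknown column
partial_tb <- suppressWarnings(tb$lab)
```

The warning is suppressed here only to keep the sketch quiet; in normal use it is the early signal that a column name is misspelled.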
Functional Programming in Tidyverse in R
1. purrr library
purrr is a set of tools for functional programming in R that makes it easier to work with functions and vectors. One useful feature is the map() function, which replaces many for loops with cleaner, more readable code. Additionally, all purrr functions are type-stable, meaning they return the expected output type or throw an error if that’s not possible.
To install purrr, the best approach is to install the tidyverse package using:
install.packages("tidyverse")
Or you can install purrr using:
install.packages("purrr")
Example: We are using purrr to split the mtcars dataset by the cyl column, apply a linear regression model to each subset, extract the summary and then return the R-squared values for each group.
library(purrr)
mtcars %>%
split(.$cyl) %>%
map(~ lm(mpg ~ wt, data = .)) %>%
map(summary) %>%
map_dbl("r.squared")
Output:
        4         6         8
0.5086326 0.4645102 0.4229655
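To make the comparison with a for loop and the idea of type stability concrete, here is a minimal sketch. The list values and the other variable names are illustrative.

```r
library(purrr)

values <- list(a = 1:3, b = 4:6, c = 7:9)

# For-loop version: pre-allocate the result, then fill it in
means_loop <- numeric(length(values))
for (i in seq_along(values)) {
  means_loop[i] <- mean(values[[i]])
}

# map_dbl() version: one call that always returns a numeric vector
means_map <- map_dbl(values, mean)

# Type stability: map_dbl() refuses to return anything but doubles,
# so a function that yields text raises an error instead
result <- tryCatch(
  map_dbl(values, function(x) "text"),
  error = function(e) "error"
)
```

map_dbl() also carries over the list names, so means_map is labelled a, b and c, whereas the loop result is unnamed.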
In this article we explored Tidyverse packages in R which provide a cohesive set of tools for data science tasks from data import and cleaning to transformation, visualization and functional programming.