Get unique values from a column in Pandas DataFrame

Last Updated : 28 Nov, 2024

In Pandas, retrieving the unique values of a DataFrame column is useful for analyzing categorical data and spotting duplicates. Let's learn how to get unique values from a column in a Pandas DataFrame.

Get the Unique Values of a Pandas Column using unique()

The .unique() method returns the distinct values of a column, typically as a NumPy array (or as an extension array for extension dtypes such as categorical). It is useful when working with categorical data or when checking a column for unexpected values. The unique values are returned in order of first occurrence; they are not sorted.

Syntax: DataFrame['column_name'].unique()

Consider the following example: we are retrieving and printing the unique values from the 'B' column using the unique() method.

Python

# Import pandas package
import pandas as pd

# Create a dictionary with five columns of five values each
data = {
    'A': ['A1', 'A2', 'A3', 'A4', 'A5'],
    'B': ['B1', 'B2', 'B3', 'B4', 'B4'],
    'C': ['C1', 'C2', 'C3', 'C3', 'C3'],
    'D': ['D1', 'D2', 'D2', 'D2', 'D2'],
    'E': ['E1', 'E1', 'E1', 'E1', 'E1']}

# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
print("Pandas DataFrame:")
print(df)  # display(df) is only available in Jupyter/IPython

# Get the unique values of 'B' column
unique_values = df['B'].unique()

# Print the unique values
print("\nUnique values in 'B' column:")
print(unique_values)

Output:

[Image: the DataFrame and the unique values from the 'B' column]

The unique values returned are ['B1', 'B2', 'B3', 'B4'].
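The example above has no missing data, so it may help to see how .unique() treats NaN. The following is a minimal sketch using a hypothetical Series (not part of the DataFrame above): a missing value is kept once in the result, and dropping it first gives a clean list in order of first occurrence.

```python
import numpy as np
import pandas as pd

# Hypothetical column with repeated and missing values (illustration only)
s = pd.Series(['B1', 'B2', np.nan, 'B2', 'B4', np.nan], name='B')

# unique() keeps the missing value as a single entry
unique_array = s.unique()

# Drop missing values first, then convert the result to a plain Python list
unique_list = list(s.dropna().unique())

print(len(unique_array))  # 4 distinct entries, including the missing value
print(unique_list)        # ['B1', 'B2', 'B4']
```

Converting to a list is optional; it is convenient when the values are passed on to code that does not expect an array.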

Count the unique values in a column using nunique()

Let's use the .nunique() method to count the unique values in columns 'A' through 'D' of the DataFrame above. By default, missing values (NaN) are not counted.

Python

# Count and print the number of unique values in columns 'A' through 'D'
for column in ['A', 'B', 'C', 'D']:
    unique_count = df[column].nunique()
    print(f"Number of unique values in '{column}' column:", unique_count)

Output:

Number of unique values in 'A' column: 5
Number of unique values in 'B' column: 4
Number of unique values in 'C' column: 3
Number of unique values in 'D' column: 2
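Calling .nunique() on the whole DataFrame returns the count for every column at once. The sketch below assumes a small DataFrame with a hypothetical column 'F' that contains missing values, to show the effect of the dropna parameter.

```python
import numpy as np
import pandas as pd

# 'F' is a hypothetical column added here to illustrate missing values
df_demo = pd.DataFrame({
    'A': ['A1', 'A2', 'A3', 'A4', 'A5'],
    'B': ['B1', 'B2', 'B3', 'B4', 'B4'],
    'F': ['F1', np.nan, 'F1', np.nan, 'F2'],
})

# One count per column; missing values are ignored by default
counts = df_demo.nunique()

# Count the missing value as its own distinct entry
counts_with_nan = df_demo.nunique(dropna=False)

print(counts)
print(counts_with_nan)
```

Here 'F' has 2 unique values by default and 3 when dropna=False is passed.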

In addition to the .unique() method, there are other ways to retrieve unique values from a Pandas DataFrame column: .drop_duplicates(), .value_counts(), and Python's built-in set().

Get Unique values from a Column in Pandas DataFrame using .drop_duplicates()

The .drop_duplicates() method removes duplicate values from the selected column, keeping the first occurrence of each value by default. When called on a single column it returns a Series containing only the unique values.

Syntax: DataFrame['column_name'].drop_duplicates()

Example: Get unique values from column 'C'

Python

unique_values = df['C'].drop_duplicates()

print(unique_values)

Output:

0    C1
1    C2
2    C3
Name: C, dtype: object

This method returns the unique values as a Series. Each value keeps the index label of its first occurrence in the original DataFrame.
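Because the original index labels are kept, the result can have gaps in its index. The following sketch, using a hypothetical DataFrame, shows how reset_index(drop=True) renumbers the result, and how drop_duplicates() on several columns returns the unique combinations of values.

```python
import pandas as pd

# Hypothetical data where first occurrences are not on consecutive rows
df_demo = pd.DataFrame({
    'C': ['C1', 'C1', 'C2', 'C1', 'C3'],
    'D': ['D1', 'D1', 'D2', 'D2', 'D2'],
})

# Index labels of the first occurrences are preserved: 0, 2, 4
unique_c = df_demo['C'].drop_duplicates()

# Renumber the result from 0 and discard the old labels
unique_c_reset = unique_c.reset_index(drop=True)

# Unique (C, D) combinations across two columns
unique_pairs = df_demo[['C', 'D']].drop_duplicates()

print(unique_c)
print(unique_c_reset)
print(unique_pairs)
```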

Extracting Unique values in Pandas DataFrame Using .value_counts()

The .value_counts() method counts the occurrences of each unique value in the column and returns the result as a Series, sorted by count in descending order.

Syntax: DataFrame['column_name'].value_counts()

Example: Get unique values from column 'D' along with their counts

Python

unique_values_count = df['D'].value_counts()

print(unique_values_count)

Output:

D
D2    4
D1    1
Name: count, dtype: int64

This method provides both the unique values and the frequency of each value. To extract just the unique values, you can use .index on the result. Note that the values are then ordered by frequency, not by first appearance.

Python

unique_values = df['D'].value_counts().index

print(unique_values)

Output:

Index(['D2', 'D1'], dtype='object', name='D')
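As a small extension of the example above, the sketch below rebuilds column 'D' as a standalone Series and shows two common variations: normalize=True returns relative frequencies instead of counts, and .index.tolist() turns the unique values into a plain Python list.

```python
import pandas as pd

# Same values as column 'D' in the DataFrame above
s = pd.Series(['D1', 'D2', 'D2', 'D2', 'D2'], name='D')

# Relative frequency of each unique value (fractions that sum to 1)
shares = s.value_counts(normalize=True)

# Unique values as a list, most frequent first
values_by_frequency = s.value_counts().index.tolist()

print(shares)
print(values_by_frequency)  # ['D2', 'D1']
```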

Get Unique values from a column in Pandas DataFrame using set()

You can also use Python’s built-in set() function, which converts the column values into a set, automatically removing duplicates.

Syntax: set(DataFrame['column_name'])

Example: Get unique values from column 'D'

Python

unique_values = set(df['D'])

print(unique_values)

Output:

{'D1', 'D2'}

Using set() does not preserve the order of the unique values, but it is a quick way to get distinct values.
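If a predictable order is needed with plain Python tools, two simple options are sketched below. This assumes a column without missing values: sorted() gives alphabetical order, while dict.fromkeys() keeps the order of first appearance, since dictionaries preserve insertion order.

```python
import pandas as pd

# Hypothetical column with no missing values
s = pd.Series(['D2', 'D1', 'D2', 'D2', 'D1'], name='D')

# Sort the set to get a deterministic, alphabetical result
sorted_unique = sorted(set(s))

# dict keys are unique and keep insertion order
ordered_unique = list(dict.fromkeys(s))

print(sorted_unique)   # ['D1', 'D2']
print(ordered_unique)  # ['D2', 'D1']
```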

In short:

The .unique() method returns a NumPy array of unique values, preserving their order of appearance.
The .drop_duplicates() method returns a Series with unique values, preserving the original index.
The .value_counts() method provides both the unique values and their frequency count.
The set() function quickly returns unique values but does not preserve their order.
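The differences summarized above are easiest to see side by side. The following sketch applies all four approaches to the values of column 'C' and prints the type of object each one returns.

```python
import pandas as pd

# Same values as column 'C' in the DataFrame above
s = pd.Series(['C1', 'C2', 'C3', 'C3', 'C3'], name='C')

results = {
    'unique()': s.unique(),
    'drop_duplicates()': s.drop_duplicates(),
    'value_counts()': s.value_counts(),
    'set()': set(s),
}

# Show which kind of object each approach returns
for name, result in results.items():
    print(f"{name:<18} -> {type(result).__name__}")
```

Which one to choose mostly depends on what happens next: an array or set for membership checks, a Series when the index matters, and value_counts() when frequencies are needed.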


rajput-ji
