Create a Pandas DataFrame from Lists
Given one or more lists, the task is to create a Pandas DataFrame from them. A DataFrame is a two-dimensional labeled data structure in Pandas similar to an Excel table where data is stored in rows and columns.
Let’s explore different methods to convert lists into DataFrames efficiently.
Using Dictionary of Lists
We create a dictionary where each key represents a column name, and its corresponding list contains the column values. Then we pass it directly to the pd.DataFrame() constructor.
import pandas as pd
names = ["Aparna", "Pankaj", "Sudhir", "Garvit"]
degrees = ["MBA", "BCA", "MTech", "MBA"]
scores = [90, 40, 80, 98]
df = pd.DataFrame({'Name': names, 'Degree': degrees, 'Score': scores})
print(df)
Output
     Name Degree  Score
0  Aparna    MBA     90
1  Pankaj    BCA     40
2  Sudhir  MTech     80
3  Garvit    MBA     98
Explanation:
- Each key ('Name', 'Degree', 'Score') becomes a column name.
- Each list provides data for that column.
- The pd.DataFrame() constructor matches the lists by position, so the i-th element of each list lands in row i. All lists must have the same length.
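Because the lists are matched by position, they need to be the same length. The following is a minimal sketch of what happens when they are not; the shortened scores list and the error_message variable are made up for illustration.

```python
import pandas as pd

names = ["Aparna", "Pankaj", "Sudhir"]
scores = [90, 40]  # deliberately one item short (illustrative data)

try:
    pd.DataFrame({"Name": names, "Score": scores})
    error_message = None
except ValueError as exc:
    # pandas refuses to build a DataFrame from columns of unequal length
    error_message = str(exc)

print(error_message)
```

Checking the list lengths before calling the constructor is usually simpler than catching the exception afterwards.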
Using zip() Function
The zip() function pairs the elements of multiple lists position by position into tuples, and each tuple becomes one row. We convert the zipped result into a list of tuples and pass it to the DataFrame constructor.
import pandas as pd
names = ["Aparna", "Pankaj", "Sudhir", "Geeku"]
values = [11, 22, 33, 44]
df = pd.DataFrame(list(zip(names, values)), columns=['Name', 'Value'])
print(df)
Output
     Name  Value
0  Aparna     11
1  Pankaj     22
2  Sudhir     33
3   Geeku     44
Explanation:
- zip(names, values) combines both lists element-wise.
- list(zip(...)) converts the zipped object into a list of tuples.
- columns parameter assigns column names to the DataFrame.
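One detail worth knowing: zip() stops at the shortest list, so extra items in a longer list are silently dropped. The sketch below contrasts that with itertools.zip_longest from the standard library, which pads the shorter list instead. The shortened values list and the variable names truncated and padded are illustrative.

```python
import pandas as pd
from itertools import zip_longest

names = ["Aparna", "Pankaj", "Sudhir"]
values = [11, 22]  # one item short (illustrative data)

# zip() stops at the shortest input, so "Sudhir" is dropped
truncated = pd.DataFrame(list(zip(names, values)), columns=["Name", "Value"])

# zip_longest() pads the shorter input with None, which pandas treats as missing
padded = pd.DataFrame(list(zip_longest(names, values)), columns=["Name", "Value"])

print(len(truncated))  # 2
print(len(padded))     # 3
print(padded["Value"].isna().sum())  # 1
```

Which behaviour is appropriate depends on the data; if the lists are expected to match, comparing their lengths first is the safer check.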
Using Multi-Dimensional List
When data is already structured row-wise, you can directly convert a list of lists into a DataFrame. Each inner list represents a row, and you can specify column names manually.
import pandas as pd
data = [['Tom', 25], ['Krish', 30], ['Nick', 26], ['Juli', 22]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
print(df)
Output
    Name  Age
0    Tom   25
1  Krish   30
2   Nick   26
3   Juli   22
Explanation:
- Each inner list (['Tom', 25]) forms a single row.
- The columns parameter defines header names.
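If the columns parameter is left out, pandas numbers the columns starting from 0. A short sketch of that default, followed by renaming the columns afterwards with rename(); the names df_default and df_named are illustrative.

```python
import pandas as pd

data = [['Tom', 25], ['Krish', 30], ['Nick', 26], ['Juli', 22]]

# No columns argument: pandas falls back to integer labels 0, 1, ...
df_default = pd.DataFrame(data)
print(list(df_default.columns))  # [0, 1]

# The labels can still be assigned later
df_named = df_default.rename(columns={0: 'Name', 1: 'Age'})
print(list(df_named.columns))  # ['Name', 'Age']
```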
Changing Data Type After Creating DataFrame
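After creating a DataFrame, you can convert a column's data type with the .astype() method, provided every value in the column can be cast to the target type. The method returns a new object, so assign the result back to the column.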
import pandas as pd
data = [['Tom', 'Reacher', 25], ['Krish', 'Pete', 30], ['Nick', 'Wilson', 26], ['Juli', 'Williams', 22]]
df = pd.DataFrame(data, columns=['FName', 'LName', 'Age'])
df['Age'] = df['Age'].astype(float)
print(df)
Output
   FName     LName   Age
0    Tom   Reacher  25.0
1  Krish      Pete  30.0
2   Nick    Wilson  26.0
3   Juli  Williams  22.0
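To confirm that a conversion took effect, it can help to compare the dtype before and after. The sketch below also passes a dictionary to DataFrame.astype(), which converts selected columns in one call; the variable name converted is illustrative.

```python
import pandas as pd

data = [['Tom', 'Reacher', 25], ['Krish', 'Pete', 30],
        ['Nick', 'Wilson', 26], ['Juli', 'Williams', 22]]
df = pd.DataFrame(data, columns=['FName', 'LName', 'Age'])

# Python ints are inferred as a 64-bit integer column
print(df['Age'].dtype)  # int64

# A dict maps column names to target types; other columns are left as they are
converted = df.astype({'Age': 'float64'})
print(converted['Age'].dtype)  # float64

# astype() returns a new DataFrame, so the original is unchanged
print(df['Age'].dtype)  # int64
```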
Using Index and Column Names
We can specify both custom indices (row labels) and column names during DataFrame creation. This approach is helpful when you want custom labels for rows or columns.
import pandas as pd
data = ["Aparna", "Pankaj", "Sudhir", "Garvit"]
df = pd.DataFrame(data, index=['a', 'b', 'c', 'd'], columns=['Names'])
print(df)
Output
    Names
a  Aparna
b  Pankaj
c  Sudhir
d  Garvit
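One practical use of custom row labels is looking up rows by label with .loc. A brief sketch using the same data; the variable names value and subset are illustrative.

```python
import pandas as pd

data = ["Aparna", "Pankaj", "Sudhir", "Garvit"]
df = pd.DataFrame(data, index=['a', 'b', 'c', 'd'], columns=['Names'])

# Select a single cell by row label and column name
value = df.loc['b', 'Names']
print(value)  # Pankaj

# Select several rows by passing a list of labels
subset = df.loc[['a', 'd']]
print(subset['Names'].tolist())  # ['Aparna', 'Garvit']
```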
Creating DataFrame from Single List
You can also convert a simple list directly into a single-column DataFrame. Because this produces only one column, use a dictionary of lists or a list of lists when the data has multiple attributes.
import pandas as pd
data = [1, 2, 3, 4, 5]
df = pd.DataFrame(data, columns=['Numbers'])
print(df)
Output
   Numbers
0        1
1        2
2        3
3        4
4        5
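As an alternative sketch, a single list can first be wrapped in a named pd.Series and then turned into a DataFrame with to_frame(). This is not part of the method above, just an equivalent route that may be convenient when the data already lives in a Series; the variable names are illustrative.

```python
import pandas as pd

data = [1, 2, 3, 4, 5]

# Route 1: pass the list straight to the constructor
df_direct = pd.DataFrame(data, columns=['Numbers'])

# Route 2: build a named Series, then convert it to a one-column DataFrame
series = pd.Series(data, name='Numbers')
df_from_series = series.to_frame()

# Both routes give the same values, index, and column name
print(df_from_series.equals(df_direct))  # True
```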