pandas.concat() function in Python
The pandas.concat() function concatenates two or more pandas objects, such as DataFrames or Series, along a particular axis. It is especially useful for combining datasets either vertically (row-wise) or horizontally (column-wise). Example:
import pandas as pd
df1 = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']})
df2 = pd.DataFrame({'A': ['A2', 'A3'], 'B': ['B2', 'B3']})
res = pd.concat([df1, df2])
print(res)
Output
    A   B
0  A0  B0
1  A1  B1
0  A2  B2
1  A3  B3
Explanation: pd.concat([df1, df2]) stacks the DataFrames row-wise and keeps each DataFrame's original index labels, so the labels 0 and 1 each appear twice in the result.
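The duplicate labels matter for label-based lookups. The following is a minimal sketch that reuses df1 and df2 from the example above; the variable names rows_at_zero and raised are illustrative only. It shows that .loc[0] matches two rows, and that the verify_integrity parameter of pd.concat can be used to reject duplicate labels instead of keeping them.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']})
df2 = pd.DataFrame({'A': ['A2', 'A3'], 'B': ['B2', 'B3']})

res = pd.concat([df1, df2])

# Label 0 occurs once per input DataFrame, so .loc[0] selects two rows.
rows_at_zero = res.loc[0]
print(rows_at_zero)
print(res.index.is_unique)

# verify_integrity=True makes concat raise a ValueError
# when the resulting index would contain duplicate labels.
try:
    pd.concat([df1, df2], verify_integrity=True)
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

If unique labels are needed without raising an error, ignore_index=True (described below) is the usual alternative.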
Syntax
pandas.concat(objs, axis=0, join='outer', ignore_index=False, keys=None, ...)
Parameters:
Parameter | Description |
---|---|
objs | A list or tuple of pandas objects (DataFrames or Series) to concatenate |
axis | The axis along which to concatenate: 0 for rows (default), 1 for columns |
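join | How to handle labels on the other axis: 'outer' (default) keeps the union of labels, 'inner' keeps only the labels common to all objects |
ignore_index | If True, the original labels along the concatenation axis are discarded and the result is labeled 0, 1, ..., n - 1 |
keys | A sequence of labels, one per object, used to build a hierarchical index on the concatenation axis |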
Returns: A new pandas object (typically a DataFrame) that is the result of concatenation.
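The return type depends on the inputs and the axis. As a small sketch (the Series names 'left' and 'right' are made up for illustration), concatenating Series along axis=0 gives a Series, while concatenating them along axis=1 gives a DataFrame whose columns are taken from the Series names.

```python
import pandas as pd

s1 = pd.Series(['a', 'b'], name='left')
s2 = pd.Series(['c', 'd'], name='right')

# Stacking Series end to end (axis=0) returns a Series.
stacked = pd.concat([s1, s2])
print(type(stacked).__name__)

# Placing Series side by side (axis=1) returns a DataFrame.
side_by_side = pd.concat([s1, s2], axis=1)
print(type(side_by_side).__name__)
print(list(side_by_side.columns))
```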
Examples
Example 1: This example concatenates two DataFrames side by side (column-wise) using axis=1.
import pandas as pd
df1 = pd.DataFrame({'A': ['A0', 'A1']})
df2 = pd.DataFrame({'B': ['B0', 'B1']})
res = pd.concat([df1, df2], axis=1)
print(res)
Output
    A   B
0  A0  B0
1  A1  B1
Explanation: pd.concat([df1, df2], axis=1) joins DataFrames column-wise, aligning rows by index. It places the columns of df2 to the right of df1, forming a wider DataFrame.
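Because rows are aligned by index label rather than by position, inputs with different indexes do not simply line up top to bottom. The sketch below assumes two DataFrames with the hypothetical indexes [0, 1] and [1, 2]; with the default join='outer', the result contains every label from both, and cells with no matching row are filled with NaN.

```python
import pandas as pd

# Illustrative data: only label 1 is shared between the two indexes.
df1 = pd.DataFrame({'A': ['A0', 'A1']}, index=[0, 1])
df2 = pd.DataFrame({'B': ['B1', 'B2']}, index=[1, 2])

res = pd.concat([df1, df2], axis=1)
print(res)

# Label 0 exists only in df1, so column B is missing there;
# label 2 exists only in df2, so column A is missing there.
print(res.isna())
```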
Example 2: This example concatenates two DataFrames vertically (row-wise) and resets the index using ignore_index=True.
import pandas as pd
df1 = pd.DataFrame({'A': ['A0', 'A1']})
df2 = pd.DataFrame({'B': ['B0', 'B1']})
res = pd.concat([df1, df2], ignore_index=True)
print(res)
Output
     A    B
0   A0  NaN
1   A1  NaN
2  NaN   B0
3  NaN   B1
Explanation: pd.concat([df1, df2], ignore_index=True) stacks the DataFrames row-wise and resets the index in the result. Since the columns don’t match, pandas fills missing values with NaN.
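The NaN fill can also change column dtypes. The following sketch uses made-up integer data, not the DataFrames from the example, and assumes the default NumPy-backed int64 dtype: since NaN is a floating-point value, integer columns that receive missing values are upcast to float64.

```python
import pandas as pd

# Illustrative integer columns with no names in common.
df1 = pd.DataFrame({'A': [1, 2]})
df2 = pd.DataFrame({'B': [3, 4]})

res = pd.concat([df1, df2], ignore_index=True)

print(df1['A'].dtype)  # dtype of the input column
print(res['A'].dtype)  # dtype after NaN has been introduced
print(res)
```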
Example 3: This example concatenates two DataFrames vertically and labels each one with a group name using keys.
import pandas as pd
df1 = pd.DataFrame({'A': ['A0', 'A1']})
df2 = pd.DataFrame({'B': ['B0', 'B1']})
res = pd.concat([df1, df2], keys=['group1', 'group2'])
print(res)
Output
            A    B
group1 0   A0  NaN
       1   A1  NaN
group2 0  NaN   B0
       1  NaN   B1
Explanation: pd.concat([df1, df2], keys=['group1', 'group2']) stacks DataFrames vertically and adds a hierarchical index. The keys label each block, making it easier to identify which DataFrame each row came from.
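One practical use of the hierarchical index is pulling a block back out by its key. This is a minimal sketch using the same df1 and df2 as above; the names group1_rows and single_row are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1']})
df2 = pd.DataFrame({'B': ['B0', 'B1']})

res = pd.concat([df1, df2], keys=['group1', 'group2'])

# Selecting by the outer key returns the rows that came from df1,
# indexed by their original labels.
group1_rows = res.loc['group1']
print(group1_rows)

# A (key, label) tuple selects a single row as a Series.
single_row = res.loc[('group2', 1)]
print(single_row)
```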
Example 4: This example concatenates two DataFrames while keeping only the common columns using join='inner', and resets the index.
import pandas as pd
df1 = pd.DataFrame({'A': ['A0'], 'B': ['B0']})
df2 = pd.DataFrame({'B': ['B1'], 'C': ['C1']})
res = pd.concat([df1, df2], join='inner', ignore_index=True)
print(res)
Output
    B
0  B0
1  B1
Explanation: pd.concat([df1, df2], join='inner', ignore_index=True) combines DataFrames row-wise, keeping only the common columns ('B' here). ignore_index=True resets the index in the final result.
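The join parameter always applies to the axis that is not being concatenated. As a sketch with hypothetical data, combining join='inner' with axis=1 keeps only the row labels present in every input, rather than only the common columns.

```python
import pandas as pd

# Illustrative data: only row label 1 appears in both DataFrames.
df1 = pd.DataFrame({'A': ['A0', 'A1']}, index=[0, 1])
df2 = pd.DataFrame({'B': ['B1', 'B2']}, index=[1, 2])

# Column-wise concatenation, keeping only shared row labels.
res = pd.concat([df1, df2], axis=1, join='inner')
print(res)
```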