Pandas DataFrame.astype() - Python
The DataFrame.astype() method in pandas casts a pandas object, such as a DataFrame or Series, to a specified data type. It is especially useful when you need columns to have the correct type, for example when converting strings to integers or floats to strings.
For example:
import pandas as pd
df = pd.DataFrame({'A': [1, 2], 'B': [3.5, 4.5]})
res = df.astype('string')
print(res)
print(res.dtypes)
Output
   A    B
0  1  3.5
1  2  4.5
A    string[python]
B    string[python]
dtype: object
Explanation: astype('string') converts all DataFrame columns to string type, ensuring that all values, whether integers or floats, are treated as strings.
Syntax
DataFrame.astype(dtype, copy=True, errors='raise')
Parameters:
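- dtype: Target data type. It can be a single type applied to every column, or a dict mapping column names to types.
- copy: If True (default), returns a copy. If False, pandas may skip the copy where no conversion is needed, so the result can share data with the original. astype() does not convert the caller in place in either case.
- errors: How to handle data that cannot be converted: 'raise' (default) raises an exception, while 'ignore' suppresses it and returns the original object.
Returns: A new object of the same type as the caller (DataFrame or Series) with the updated data types.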
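As a rough sketch of how the parameters above fit together, the snippet below passes a single dtype, then a dict, and then calls astype() on a Series. The variable names (all_float, only_a, b_float) are illustrative only and do not come from the pandas API.

```python
import pandas as pd

df = pd.DataFrame({'A': ['1', '2'], 'B': ['3.5', '4.5']})

# A single dtype is applied to every column.
all_float = df.astype(float)

# A dict converts only the listed columns; 'B' keeps its original values.
only_a = df.astype({'A': 'int64'})

# A Series accepts the same call and returns a Series.
b_float = df['B'].astype(float)

print(all_float.dtypes)
print(only_a.dtypes)
print(b_float.dtype)

# astype() returned new objects; df still holds the original strings.
print(df['A'].tolist())
```

Because the result is a new object, it has to be assigned back (for example df = df.astype(...)) if the converted types should replace the originals.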
Examples
Example 1: This example converts column 'A' to integer, while column 'B' remains unchanged.
import pandas as pd
df = pd.DataFrame({'A': ['1', '2'], 'B': ['3.0', '4.0']})
res = df.astype({'A': int})
print(res)
print(res.dtypes)
Output
   A    B
0  1  3.0
1  2  4.0
A     int64
B    object
dtype: object
Explanation: The astype() method converts column 'A' from string to integer type. Column 'B' is not listed in the dict, so it remains object (string).
Example 2: This example converts column 'A' to integer and column 'B' to float.
import pandas as pd
df = pd.DataFrame({'A': ['1', '2'], 'B': ['3.5', '4.5']})
res = df.astype({'A': int, 'B': float})
print(res)
print(res.dtypes)
Output
   A    B
0  1  3.5
1  2  4.5
A      int64
B    float64
dtype: object
Explanation: The astype() method is called with a dictionary to convert 'A' to integer and 'B' to float. Both columns are now numeric, so arithmetic on them operates on numbers rather than on text.
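To illustrate why the numeric form matters for calculations, the following minimal sketch adds the two columns before and after the conversion. The names text_sum and numeric_sum are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': ['1', '2'], 'B': ['3.5', '4.5']})

# On string columns, + concatenates the text.
text_sum = df['A'] + df['B']
print(text_sum.tolist())

# After the conversion, + performs numeric addition.
res = df.astype({'A': int, 'B': float})
numeric_sum = res['A'] + res['B']
print(numeric_sum.tolist())
```

The first result is a column of joined strings, while the second is a column of floats.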
Example 3: This example tries to convert column 'A' to integer with errors='ignore'. Because 'A' contains a non-numeric value, the conversion is skipped and the original types are retained.
import pandas as pd
df = pd.DataFrame({'A': ['1', 'two'], 'B': ['3.0', '4.5']})
res = df.astype({'A': int}, errors='ignore')
print(res)
print(res.dtypes)
Output
     A    B
0    1  3.0
1  two  4.5
A    object
B    object
dtype: object
Explanation: The astype() method tries to convert column 'A' to integer, but the value 'two' is not numeric. Because errors='ignore' is set, pandas suppresses the exception and returns the original data with its dtypes unchanged.
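For comparison, here is a minimal sketch of the default errors='raise' behavior on the same data. It assumes the failure surfaces as a ValueError, and the converted flag is an illustrative helper rather than part of pandas.

```python
import pandas as pd

df = pd.DataFrame({'A': ['1', 'two'], 'B': ['3.0', '4.5']})

try:
    # errors='raise' is the default, so the bad value 'two' stops the conversion.
    res = df.astype({'A': int})
    converted = True
except ValueError as exc:
    converted = False
    print(f"Conversion failed: {exc}")

# The original DataFrame is left as it was.
print(df['A'].tolist())
```

Catching the exception explicitly makes a failed conversion visible, whereas errors='ignore' silently hands back the unconverted data.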
Example 4: This example converts column 'A' to float with copy=False, which lets pandas avoid copying data that does not need conversion.
import pandas as pd
df = pd.DataFrame({'A': ['1.1', '2.2'], 'B': ['3', '4']})
res = df.astype({'A': float}, copy=False)
print(res)
print(res.dtypes)
Output
     A  B
0  1.1  3
1  2.2  4
A    float64
B     object
dtype: object
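Explanation: Converting 'A' from strings to floats has to produce new float data whatever the value of copy, so copy=False does not make the conversion happen in place. What it allows is skipping the copy where no conversion is needed, so the unconverted column 'B' may share data with the original DataFrame. Column 'B' remains object.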