Identifying the standard deviation of a dataset
The standard deviation is the square root of the variance. It is typically more intuitive because it is expressed in the same units as the dataset, for example, kilometers (km). The variance, by contrast, is expressed in the square of those units, for example, kilometers squared (km²), which makes it harder to interpret.
To compute the standard deviation of a dataset, we will use the std method from the numpy library in Python.
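As a quick illustration of the relationship between the two measures, the following minimal sketch computes both on a small array. The values and the name distances_km are made up for this example and are not part of the recipe's dataset.

```python
import numpy as np

# Hypothetical distances in kilometers (illustrative values only)
distances_km = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

variance = np.var(distances_km)  # population variance, in km squared
std_dev = np.std(distances_km)   # population standard deviation, in km

# The standard deviation is the square root of the variance
assert np.isclose(std_dev, np.sqrt(variance))

print(variance, std_dev)
```

Note that np.std divides by n by default (ddof=0), which gives the population standard deviation. Passing ddof=1 gives the sample standard deviation instead, which is also what the pandas std method returns by default.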
Getting ready
We will work with the COVID-19 cases again for this recipe.
How to do it…
We will compute the standard deviation using the numpy library:
- Import the numpy and pandas libraries:
    import numpy as np
    import pandas as pd
- Load the .csv into a dataframe using read_csv. Then subset the dataframe to include only relevant columns:
    covid_data = pd.read_csv("covid-data.csv")
    covid_data = covid_data[[...