From the course: Fundamentals of Data Transformation for Data Engineering


Wrangling unstructured data

- [Instructor] In this lesson, we're going to cover wrangling unstructured data with pandas. You'll notice a number of differences between how we handle things in Python and how we handled them in SQL, and for some of them, things will be a bit easier. So it's worth paying close attention to those differences. In our pandas cells, we first have to import the library, as you would any other Python library. Pulling in our data source is going to look a little different, though: we're using the read_parquet function from the pandas library to read the dataset. You don't have to worry too much about that, but if you're interested in data analysis or data engineering with Parquet files, it's something you'll use in the future. So again, this is the same dataset that we looked at using SQL: we have a list of national parks. You can see that we have our name of the park, we have the…
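The import and read_parquet steps described in the transcript might look roughly like the sketch below. The course's actual file and column names are not shown in this excerpt, so the sample rows and the column names park_name and state are made up for illustration. The sketch round-trips a tiny table through an in-memory Parquet buffer so it can run without the course files. It assumes a Parquet engine such as pyarrow is installed, and falls back to the in-memory table if none is.

```python
import io

import pandas as pd

# Illustrative stand-in for the course's national parks dataset.
# Column names and rows are assumptions, not the course's real schema.
sample = pd.DataFrame(
    {
        "park_name": ["Yellowstone", "Yosemite", "Zion"],
        "state": ["Wyoming", "California", "Utah"],
    }
)

try:
    # Write the sample to an in-memory Parquet buffer, then read it back
    # with read_parquet, the function mentioned in the lesson.
    buffer = io.BytesIO()
    sample.to_parquet(buffer, index=False)
    buffer.seek(0)
    parks = pd.read_parquet(buffer)
except ImportError:
    # pandas raises ImportError when no Parquet engine is installed;
    # fall back to the in-memory table so the rest of the sketch still runs.
    parks = sample.copy()

# Peek at the first rows, as you would in a notebook cell.
print(parks.head())
```

With a real file, the buffer would be replaced by a path, for example pd.read_parquet("national_parks.parquet"), where the file name is hypothetical.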
