From the course: Fundamentals of Data Transformation for Data Engineering
Wrangling unstructured data
- [Instructor] In this lesson, we're going to cover wrangling unstructured data with pandas. You'll notice a number of differences between how we handle things in Python and how we handled them in SQL, and some of these tasks will be a bit easier in pandas. So this will be important to pay attention to, but it's a good lesson generally. In our pandas cells, we first have to import the library, as you would any other Python library. Pulling in our data source, though, is going to look a little bit different. We're using the read_parquet function from the pandas library to read the dataset. You don't have to worry too much about that, but if you're interested in data analysis or data engineering with Parquet files, it's something you'll use in the future. So again, this is the same dataset that we looked at using SQL: a list of national parks. You can see that we have the name of the park, we have the…
Contents
- DataFrame basics (6m 14s)
- Wrangling unstructured data (13m 41s)
- Select and filter (10m 49s)
- Order and aggregate (9m 47s)
- Advanced filters (3m 53s)
- Data generation (6m 37s)
- Windows (5m 26s)
- Apply (6m 14s)
- pandas challenge (2m)
- pandas solution (12m 45s)