From the course: ETL in Python and SQL
Introduction to data warehouses and data lakes
- [Instructor] By now, you should be able to load and transform data from various sources. Congratulations! Now, let's talk about the differences between databases, data warehouses, data lakes, and data lakehouses. I know this might sound confusing and may seem like a mouthful, but these are all different ways organizations manage and store their data. Each has different characteristics and particular use cases, which we'll discuss.

First, what are databases? Databases are organized collections of data that are usually controlled using a database management system (DBMS). A database management system is software that allows you to access, interact with, and manipulate a database and its contents. Databases focus on operational or transactional data and on managing day-to-day CRUD (create, read, update, and delete) operations. They do not always hold historical information, since they are optimized for retrieving small amounts of data. Usually, when people speak about databases, they…
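To make the CRUD operations mentioned above concrete, here is a minimal sketch using Python's built-in `sqlite3` module as the DBMS. The `customers` table, its columns, the sample rows, and the `run_crud_demo` function name are all hypothetical and chosen only for illustration; they do not come from the course's exercise files.

```python
import sqlite3


def run_crud_demo():
    """Run one create, read, update, and delete cycle on an in-memory database."""
    # An in-memory database keeps the example self-contained (nothing is written to disk).
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Hypothetical table, used only for this illustration.
    cur.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
    )

    # Create: insert two rows.
    cur.execute("INSERT INTO customers (name, city) VALUES (?, ?)", ("Ada", "London"))
    cur.execute("INSERT INTO customers (name, city) VALUES (?, ?)", ("Grace", "New York"))

    # Read: fetch a single row by its primary key.
    cur.execute("SELECT name, city FROM customers WHERE id = ?", (1,))
    first = cur.fetchone()

    # Update: change one column of an existing row.
    cur.execute("UPDATE customers SET city = ? WHERE name = ?", ("Paris", "Ada"))

    # Delete: remove a row.
    cur.execute("DELETE FROM customers WHERE name = ?", ("Grace",))
    conn.commit()

    # Read again to see the state left behind by the update and delete.
    remaining = cur.execute(
        "SELECT id, name, city FROM customers ORDER BY id"
    ).fetchall()
    conn.close()
    return first, remaining


if __name__ == "__main__":
    first_row, remaining_rows = run_crud_demo()
    print(first_row)
    print(remaining_rows)
```

Each statement touches only one or two rows, which is the kind of small, frequent operation the transcript describes transactional databases as being optimized for.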