From the course: Complete Guide to Analytics Engineering


Data lakes: An alternative storage method


- [Instructor] Picture a data warehouse as a fairly organized store, where the employees take the new groceries, remove them from their boxes, and set them on the shelves in predetermined rows, aisles, and sections. A crate of apples gets offloaded from the truck and brought to the produce section, where the crate is opened and the apples are unloaded next to the other fruits. That's an oversimplification of how a traditional database works, but essentially we bring data into the warehouse, create a schema, and store it for use. There's a less common but useful data storage solution called a data lake. Instead of bringing the groceries into the store and shelving them, we bring the box into the store and leave it wherever there is space. We don't unbox it until someone comes looking for it. The data we bring into a data lake can be structured, semi-structured, or unstructured. Its schema is not determined until it's needed. Some of the benefits of data lakes include one, storage costs for raw data…
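
To make the "don't unbox it until someone comes looking" idea concrete, here is a minimal sketch of a schema being applied only at read time. Everything in it is an assumption for illustration: the `RAW_LAKE` records, the `read_with_schema` helper, and the field names are hypothetical and are not taken from the course or from any particular data lake product.

```python
import json

# Hypothetical raw records, stored exactly as they arrived (the unopened boxes).
RAW_LAKE = [
    '{"item": "apple", "qty": 40, "section": "produce"}',  # structured
    '{"item": "milk", "qty": "12"}',                        # semi-structured: qty is text, no section
    'free-text note: delivery truck arrived late',          # unstructured
]


def read_with_schema(raw_records, schema):
    """Apply a schema at read time and return only the records that fit it.

    `schema` maps a field name to a callable that casts the raw value.
    """
    rows = []
    for raw in raw_records:
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # not JSON: it stays in the lake, untouched by this query
        if not isinstance(record, dict):
            continue
        try:
            rows.append({field: cast(record[field]) for field, cast in schema.items()})
        except (KeyError, ValueError, TypeError):
            continue  # record lacks a field or a value cannot be cast
    return rows


# Two different consumers "unbox" the same raw data with different schemas.
inventory = read_with_schema(RAW_LAKE, {"item": str, "qty": int})
locations = read_with_schema(RAW_LAKE, {"item": str, "section": str})
print(inventory)
print(locations)
```

In a warehouse, by contrast, the cast-and-validate step would run once at load time, and records that do not fit the predetermined schema would have to be cleaned up or rejected before being stored.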
