From the course: Introduction to Data Engineering on AWS: Data Sourcing and Storage
Overview of data lakes - Amazon Web Services (AWS) Tutorial
Overview of data lakes
- [Instructor] A data lake is a centralized repository that allows you to store your structured and unstructured data at any scale. So we look at it as a repository where we can store structured, unstructured, or semi-structured data coming from any kind of source. The data can arrive in batch mode, or in real time or near real time. There is no restriction on what kind of data you store in a data lake or how you store it. The coolest feature of a data lake is that you don't have to decide on the use case you want to implement before you bring the data in, so you don't need to impose any structure on the data lake up front. And the best practice is to build your data lake incrementally. Now, let's have a quick look at what advantages a data lake on the cloud offers, apart from the standard cloud benefits like low-cost storage, scalability, reliability, et cetera. So it is…