From the course: AWS Certified Data Engineer Associate (DEA-C01) Cert Prep

Design considerations

- [Instructor] There are many design considerations when creating a data pipeline. In this lesson, we'll cover some of the most important ones. When designing a data pipeline, it's important to decouple the data bus by implementing each stage with a tool purpose-built for that task, instead of trying to do everything with one tool, like a relational database. Decoupling results in more fault-tolerant architectures and better overall performance. Before designing the pipeline, you'll need a good understanding of the business and technical requirements. There are trade-offs between cost and performance, so you need to design to meet the requirements while minimizing costs. For example, you'll need to identify the data sources and their storage formats so you can select an appropriate collection tool and design any transformations that may be necessary before storage. The maximum latency is a business requirement that determines how fresh the data must be when it is analyzed…
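
To make the decoupling idea concrete, here is a minimal sketch, not the course's own example. It assumes three hypothetical stages (`collect`, `transform`, `store`) that communicate only through in-memory buffers. In a real AWS pipeline each stage and buffer would be a separate purpose-built service; the standard-library `queue.Queue` simply stands in for that boundary here.

```python
import json
import queue


def collect(raw_records, buffer):
    """Collection stage: push raw records into a buffer.

    It knows nothing about how later stages parse or store the data.
    """
    for record in raw_records:
        buffer.put(record)


def transform(in_buffer, out_buffer):
    """Transformation stage: parse JSON strings, skipping malformed ones.

    Returns the number of skipped records so a bad record does not
    stop the rest of the pipeline.
    """
    skipped = 0
    while True:
        try:
            raw = in_buffer.get_nowait()
        except queue.Empty:
            break
        try:
            out_buffer.put(json.loads(raw))
        except json.JSONDecodeError:
            skipped += 1
    return skipped


def store(buffer):
    """Storage stage: drain the buffer into a list (a stand-in for a data store)."""
    rows = []
    while True:
        try:
            rows.append(buffer.get_nowait())
        except queue.Empty:
            break
    return rows


def run_pipeline(raw_records):
    """Wire the stages together; the buffers are the only coupling points."""
    raw_buffer = queue.Queue()
    clean_buffer = queue.Queue()
    collect(raw_records, raw_buffer)
    skipped = transform(raw_buffer, clean_buffer)
    return store(clean_buffer), skipped


if __name__ == "__main__":
    rows, skipped = run_pipeline(['{"id": 1}', "not json", '{"id": 2}'])
    print(rows, skipped)
```

Because each stage only reads from or writes to a buffer, any one of them can be replaced or scaled without touching the others, which is the property the instructor attributes to decoupled designs.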
