From the course: Data Engineering Project: Build Streaming Ingestion Pipelines for Snowflake with AWS

Overview of streaming pipeline project

- [Narrator] In this video, we'll walk through this course's project so you understand the end state we are working toward. We'll be using MSK, or Managed Streaming for Kafka. This is AWS's fully managed, highly available Apache Kafka event streaming platform. Event streaming platforms are event-driven architectures that specialize in streaming event data, such as change data capture, from a data provider to a destination, generally a database management service of some sort, such as a lakehouse or a time series database. Other event streaming services include Amazon Kinesis, AWS's principal streaming service, and Google Datastream. At a high level, in this chapter you will create a provisioned Kafka cluster, create Kafka producers and connectors, create topics in the Kafka cluster, and create a Snowflake database and associated permissions. Specifically, our final pipeline will consist of an AWS EC2 instance, which will…
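
To make the idea of streaming change data capture events a little more concrete, here is a minimal sketch of what one such event might look like before a producer sends it to a Kafka topic. It is not taken from the course: the `ChangeEvent` fields, the `orders` table, and the `to_kafka_record` helper are all hypothetical, and it uses only the Python standard library rather than a real Kafka client.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ChangeEvent:
    """Hypothetical shape of one change data capture event."""
    table: str               # source table the change came from
    op: str                  # "insert", "update", or "delete"
    key: str                 # primary key of the changed row
    before: Optional[dict]   # row state before the change, if any
    after: Optional[dict]    # row state after the change, if any


def to_kafka_record(event: ChangeEvent) -> tuple[bytes, bytes]:
    """Serialize an event into the (key, value) byte pair a producer would send.

    Kafka messages carry raw bytes; JSON encoding here is an assumption,
    not something the course specifies.
    """
    key = event.key.encode("utf-8")
    value = json.dumps(asdict(event), sort_keys=True).encode("utf-8")
    return key, value


# Illustrative data only: an order row changing status.
event = ChangeEvent(
    table="orders",
    op="update",
    key="42",
    before={"status": "new"},
    after={"status": "shipped"},
)
key, value = to_kafka_record(event)
print(key, value)
```

Keying each record by the row's primary key is a common choice because records with the same key go to the same partition, which keeps changes to a given row in order.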
