From the course: Apache Spark Essential Training: Big Data Engineering
Unlock the full course today
Join today to access over 24,800 courses taught by industry experts.
Real-time use case: Problem - Apache Spark Tutorial
From the course: Apache Spark Essential Training: Big Data Engineering
Real-time use case: Problem
- [Instructor] What will a real-time data engineering solution be like? Let's design and implement an example real-time data engineering pipeline in this chapter. The business scenario for the real-time example is website analytics. An enterprise runs an ecommerce website selling multiple items, like amazon.com. This website is used across multiple countries. The enterprise wants to track visits to the website by its users. It wants to know about the visit date, country, and duration for which the users stay in the website. It also wants to track the last action before the user exited out of the website. This last action may be in the catalog page, the FAQ page, the shopping cart, or the order fulfillment page. To enable this real-time analytics, we need to build a pipeline that will receive a stream of website visit records. For each user who visits this website, the stream will have one record. The record will contain the visit date, the last action performed by the user, the…
Practice while you learn with exercise files
Download the files the instructor uses to teach the course. Follow along and learn by watching, listening and practicing.