From the course: Complete Guide to Data Lakes and Lakehouses

Solution: Vehicle health analytics in Jupyter

(bright music) - [Instructor] Let's dive into my solution for analyzing vehicle maintenance and alerts data. You can find this notebook in the solutions branch of the GitHub repo. I will go through each section of the code and explain what's happening at each step. Remember, your solution may differ from mine, and that's okay.

We start by connecting to Dremio as we did before, and then we load the vehicle health logs table into a pandas DataFrame, also just as we did before. And here we have our DataFrame. We can see that the alerts and maintenance history columns are nested. Feel free to explore the structure of these nested columns. You could do that here, in Dremio, or even directly on the source file, since we have access to it.

Next, let's analyze the maintenance frequency. The maintenance history is a nested list within our DataFrame, so we need to flatten it. We use json_normalize from pandas to flatten the maintenance history column. The explode function…
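The flattening step described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the sample rows, the column names (`vehicle_id`, `maintenance_history`), and the record fields (`date`, `type`) are all assumptions made up for the example, and the real table's schema may differ.

```python
import pandas as pd

# Hypothetical sample standing in for the vehicle health logs table.
# Column and field names are assumptions, not the course's actual schema.
df = pd.DataFrame({
    "vehicle_id": ["V1", "V2", "V3"],
    "maintenance_history": [
        [{"date": "2024-01-10", "type": "oil_change"},
         {"date": "2024-03-02", "type": "brake_check"}],
        [{"date": "2024-02-15", "type": "oil_change"}],
        [],  # a vehicle with no maintenance records
    ],
})

# explode turns each list element into its own row. An empty list
# becomes a single NaN row, so those rows are dropped afterwards.
exploded = (
    df[["vehicle_id", "maintenance_history"]]
    .explode("maintenance_history")
    .dropna(subset=["maintenance_history"])
    .reset_index(drop=True)
)

# json_normalize expands each record dict into one column per field.
records = pd.json_normalize(exploded["maintenance_history"].tolist())

# Reattach the vehicle identifier; both frames share a 0..n-1 index.
flat = pd.concat([exploded[["vehicle_id"]], records], axis=1)

# Maintenance frequency: number of records per vehicle.
maintenance_counts = flat.groupby("vehicle_id").size()
print(flat)
print(maintenance_counts)
```

Exploding first and normalizing second keeps one row per maintenance record while preserving the link back to the vehicle. Note that in this sketch vehicles with no records drop out of the counts entirely, so a reindex against the full vehicle list would be needed if zero counts matter.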
