From the course: Complete Guide to Data Lakes and Lakehouses
Advanced product reviews analytics
- [Instructor] Now that we have our data frame ready, let's run some basic statistics to get an overview of the data we are dealing with. As you can see, we have a total of 50 reviews with a mean of 4.2, which is pretty good, and then we see the distribution of reviews per vehicle model. Now we are ready to run some more advanced analytics. The first one is going to be word clouds. In this section, we will pre-process the review text and then aggregate it by vehicle model to visualize the most common words in the reviews. This can help us see which aspects of the vehicles customers are talking about the most. Let's start by installing the wordcloud library. And now, let me walk you through the actual word cloud creation. First, we import some libraries that we will need later. The vehicle_models variable holds a list of unique vehicle models from the product reviews data frame. These models will be excluded from the word clouds to focus on other words in the reviews. Then the…
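The steps described so far (basic statistics, text pre-processing, aggregation by vehicle model, and excluding model names) can be sketched roughly as follows. This is a minimal illustration, not the notebook code from the course: the sample reviews, the field names `vehicle_model`, `rating`, and `review_text`, the stop-word list, and the helper functions are all assumptions, and plain Python records stand in for the data frame. Only the `vehicle_models` name comes from the transcript.

```python
import re
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical sample rows standing in for the product reviews data frame.
reviews = [
    {"vehicle_model": "Falcon", "rating": 5,
     "review_text": "The Falcon has great mileage and a smooth ride."},
    {"vehicle_model": "Falcon", "rating": 4,
     "review_text": "Smooth ride, but the Falcon's infotainment is slow."},
    {"vehicle_model": "Ranger", "rating": 3,
     "review_text": "Ranger brakes feel soft; mileage is average."},
]

# Basic statistics: mean rating and number of reviews per vehicle model.
mean_rating = mean(row["rating"] for row in reviews)
reviews_per_model = Counter(row["vehicle_model"] for row in reviews)

# Unique vehicle models, lowercased so they can be excluded from the counts.
# This simple approach assumes single-word model names.
vehicle_models = sorted({row["vehicle_model"].lower() for row in reviews})

# A small illustrative stop-word list (an assumption, not the course's list).
STOP_WORDS = {"the", "a", "an", "and", "but", "is", "has", "of", "s"}


def preprocess(text, excluded):
    """Lowercase, keep alphabetic tokens, drop stop words and excluded words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS and t not in excluded]


def word_frequencies_by_model(rows, excluded):
    """Aggregate word counts per vehicle model across all of its reviews."""
    freqs = defaultdict(Counter)
    for row in rows:
        freqs[row["vehicle_model"]].update(preprocess(row["review_text"], excluded))
    return dict(freqs)


frequencies = word_frequencies_by_model(reviews, set(vehicle_models))


def render_word_clouds(freqs):
    """Build one WordCloud per model; returns {} if wordcloud is not installed."""
    try:
        from wordcloud import WordCloud  # installed with: pip install wordcloud
    except ImportError:
        return {}
    return {
        model: WordCloud(width=400, height=200, background_color="white")
        .generate_from_frequencies(counts)
        for model, counts in freqs.items()
        if counts
    }
```

With the wordcloud package installed, calling `render_word_clouds(frequencies)` returns one `WordCloud` object per model, and each can be saved with its `to_file` method or displayed with matplotlib. Counting words before rendering keeps the pre-processing easy to inspect on its own.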