From the course: Machine Learning and AI Foundations: Clustering and Association
Which variables should be used with k-means?
- [Instructor] We have to talk about a very important topic. It's a question that I often get: what variables should I be using when I perform my cluster analysis? So let's take a look at these initial variables. They're really at the heart of the matter. These are total spend variables that have been built from a lot of transactional data. So I have one row per customer, and then I have several variables that represent how much each customer spent in seven different product families. What if we go ahead and analyze these variables right now? What's going to happen? Well, the big-screen TVs sold in the entertainment department cost a lot more money than the software or the video games. So if you proceed right now, entertainment sales are going to dominate the solution. So just as we've seen in hierarchical clustering, for instance, where it will automatically transform for you, we have to somehow get those variables in a form where they all have the…
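The excerpt cuts off before the instructor names a specific transformation, so the following is only a minimal sketch of one common option, z-score standardization, and not necessarily the method used in the course. The customer names, the three product families, and all spend figures are made up for illustration. The sketch compares which customer is closest to customer A before and after rescaling, using the squared Euclidean distance that k-means relies on.

```python
from statistics import mean, pstdev

# Hypothetical total-spend data, one row per customer (illustrative values only).
customers = {
    "A": {"entertainment": 2400.0, "software": 40.0, "video_games": 60.0},
    "B": {"entertainment": 2500.0, "software": 400.0, "video_games": 300.0},
    "C": {"entertainment": 300.0, "software": 45.0, "video_games": 55.0},
}


def zscore_columns(rows):
    """Rescale every variable to mean 0 and (population) standard deviation 1."""
    columns = list(next(iter(rows.values())))
    scaled = {name: {} for name in rows}
    for col in columns:
        values = [row[col] for row in rows.values()]
        center = mean(values)
        spread = pstdev(values) or 1.0  # guard: leave a constant column at 0
        for name, row in rows.items():
            scaled[name][col] = (row[col] - center) / spread
    return scaled


def squared_distance(p, q):
    """Squared Euclidean distance between two customers."""
    return sum((p[col] - q[col]) ** 2 for col in p)


def nearest(rows, target):
    """Name of the customer closest to `target`, excluding the target itself."""
    others = [name for name in rows if name != target]
    return min(others, key=lambda name: squared_distance(rows[target], rows[name]))


scaled_customers = zscore_columns(customers)
raw_neighbor = nearest(customers, "A")
scaled_neighbor = nearest(scaled_customers, "A")
print("Nearest to A on raw spend:   ", raw_neighbor)
print("Nearest to A on scaled spend:", scaled_neighbor)
```

With these made-up numbers, A is closest to B on raw spend because both spend heavily on entertainment, while after rescaling A is closest to C, whose software and video game spend are nearly identical to A's. Whether that is the more useful grouping depends on the business question; the point of the sketch is only that the choice of scale changes the answer.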
Contents
- How does k-means work? (2m 3s)
- Which variables should be used with k-means? (2m 46s)
- Interpreting a box plot (6m 49s)
- Running a k-means cluster analysis (3m 28s)
- Interpreting cluster analysis output (5m 42s)
- What does silhouette mean? (2m 20s)
- Which cases should be used with k-means? (4m 44s)
- Finding optimum value for k: k = 3 (5m 7s)
- Finding optimum value for k: k = 4 (5m 51s)
- Finding optimum value for k: k = 5 (5m 3s)
- What's the best solution? (3m 56s)