From the course: Python for Data Science and Machine Learning Essential Training Part 2
Cluster analysis with the K-means method - Python Tutorial
- [Instructor] K-means clustering is an unsupervised machine learning algorithm that you can use to predict subgroups within a dataset. With K-means clustering, you usually have an idea of how many subgroups are appropriate. Use cases for K-means clustering include market price and cost modeling, insurance claim fraud detection, hedge fund classification, and customer segmentation. The K-means clustering algorithm is a simple unsupervised algorithm that's used for quickly predicting groupings from within an unlabeled dataset. Predictions are based on the number of centroids, represented by the parameter k, and on the nearest mean values, given a Euclidean distance measurement between observations. There are some things to keep in mind when using K-means. You'll need to scale your variables, and you'll want to look at a scatter plot or the data table to estimate the appropriate number of centroids to use for the k…
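To make the ideas in the transcript concrete, here is a minimal pure-Python sketch of K-means: it scales the variables first, then repeatedly assigns each observation to the nearest centroid by Euclidean distance and moves each centroid to the mean of its members. This is not the course's own code. The function names `standardize` and `kmeans`, the toy dataset, and the choice of k = 2 are all assumptions made for illustration.

```python
import math
import random


def standardize(rows):
    """Scale each column to zero mean and unit variance (z-scores)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)]
            for row in rows]


def kmeans(rows, k, n_iter=100, seed=0):
    """Illustrative K-means: assign to nearest centroid, then update means."""
    rng = random.Random(seed)
    # Start from k distinct observations picked at random.
    centroids = [list(p) for p in rng.sample(rows, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(rows, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Hypothetical toy data: two features on very different scales.
data = [
    [1.0, 1000.0], [1.2, 1100.0], [0.8, 900.0],
    [5.0, 5000.0], [5.2, 5100.0], [4.8, 4900.0],
]

scaled = standardize(data)
labels, centroids = kmeans(scaled, k=2)
print(labels)
```

Scaling matters here because Euclidean distance would otherwise be dominated by the second column, whose values are about a thousand times larger than the first. The cluster label numbers themselves are arbitrary; only which observations share a label is meaningful.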