From the course: Machine Learning with Python: Decision Trees
How do classification trees measure impurity? - Python Tutorial
- [Instructor] Classification trees are built using a process known as recursive partitioning. The objective of recursive partitioning is to create child partitions that are purer than their parents. Classification tree algorithms use a mathematical formula to quantify the degree of impurity within a partition. Two of the most commonly used measures of impurity are entropy and Gini. Entropy, the preferred measure of impurity for the C5.0 decision tree algorithm, is a concept borrowed from information theory. When applied to decision trees, it quantifies the level of randomness or disorder within a partition. A partition with low entropy is one in which most of the items share the same value or outcome, while a partition with high entropy is one with no dominant outcome. The entropy of a partition S, with C possible outcomes or labels, is calculated as shown here. To…
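The slide formula referenced above is not reproduced in the transcript, but the standard entropy measure from information theory is Entropy(S) = -Σ pᵢ log₂(pᵢ), summed over each class i, where pᵢ is the proportion of items in partition S with label i (Gini impurity is 1 - Σ pᵢ²). A minimal sketch of both impurity measures, with hypothetical function names and example labels chosen for illustration:

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a partition: -sum(p_i * log2(p_i)) over each class i."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def gini(labels):
    """Gini impurity of a partition: 1 - sum(p_i ** 2) over each class i."""
    counts = Counter(labels)
    total = len(labels)
    return 1 - sum((c / total) ** 2 for c in counts.values())

# A pure partition (one dominant outcome) has minimal impurity,
# while an even 50/50 split has maximal impurity for two classes.
print(entropy(["yes"] * 5 + ["no"] * 5))  # 1.0 (maximum entropy for 2 classes)
print(gini(["yes"] * 5 + ["no"] * 5))     # 0.5 (maximum Gini for 2 classes)
```

Both functions return 0 for a perfectly pure partition, matching the intuition in the transcript that low impurity means most items share the same outcome.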