From the course: Python for Data Science and Machine Learning Essential Training Part 1
Transforming data set distributions - Python Tutorial
Transforming data set distributions
- [Instructor] The term data transformation refers to the practice of changing data from its original state into a different format. This often includes turning raw data into a format that is clean and ready for use. In this coding demonstration, we're going to explore a variety of beneficial data transformations and look into the scenarios in which they're necessary. We'll focus on two specific data transformation techniques: normalization and standardization. Normalization, also known as min-max scaling, is a method where data values are shifted and rescaled to fall within a range of zero to one. This technique preserves the shape of the original distribution; only the range of the values changes. Standardization, on the other hand, is a technique that rescales data so that it has a mean of zero and a standard deviation of one. This centers the data and puts it on a common scale, but it does not change the shape of the distribution. Keep in mind that in machine learning, not every data set necessitates normalization. It's only required…
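The two techniques described in the transcript can be sketched in a few lines of plain Python. This is a minimal illustration, not the instructor's demonstration code: the function names `min_max_normalize` and `standardize` and the `sample` values are assumptions made up for this example, and the standard deviation used is the population form (dividing by n), which is one common convention.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Linearly rescale values so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: the range is zero, so scaling is undefined.
        raise ValueError("min-max scaling needs at least two distinct values")
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift and rescale values to have mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (assumed convention)
    if sigma == 0:
        raise ValueError("standardization needs a nonzero standard deviation")
    return [(v - mu) / sigma for v in values]


# Hypothetical sample data, chosen so the mean is 5 and the std deviation is 2.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

normalized = min_max_normalize(sample)
standardized = standardize(sample)

print(normalized)    # values between 0 and 1
print(standardized)  # values centered on 0
```

Both functions are linear rescalings, so the relative spacing between values is unchanged; only the location and scale differ. In practice, libraries such as scikit-learn provide `MinMaxScaler` and `StandardScaler` for the same purpose on whole feature matrices.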