From the course: Machine Learning with SageMaker by Pearson
Validating data quality with AWS tools - Amazon SageMaker Tutorial
Here, on the right side of our screen, we see the lifecycle of an ML model: collect data, ingest it, prepare it, validate it, store it, explore it, train, monitor, archive, delete. And around we go. Now we are talking about validation of our data. We want to ensure consistency and reliability in our models, address anomalies, outliers, and missing data, and prevent biases. Biases will be reflected in the predictions that our ML models produce. Data Wrangler and DataBrew, we've talked about these extensively in the previous lessons. Also, Model Monitor, we haven't talked about this as much, but it can be used to detect data drift and biases.

So we are using Data Wrangler and DataBrew to clean our data, explore it, find those outliers, find the missing data, and impute and normalize where necessary. Model Monitor can then be run as our model is up and running in production. We can use Model Monitor to detect drift and any biases that could be creeping in. RCF, or Random Cut…
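The validation steps the instructor describes, imputing missing values, flagging outliers, normalizing, and then watching for drift in production, can be sketched locally with pandas. This is a minimal illustration of the ideas, not the Data Wrangler, DataBrew, or Model Monitor APIs themselves; the column name, the IQR outlier rule, and the simple mean-shift drift check are all assumptions chosen for the sketch.

```python
import numpy as np
import pandas as pd

# Hypothetical training data with a missing value and an obvious outlier.
train = pd.DataFrame({"age": [25, 31, np.nan, 29, 27, 120]})

# Impute missing values with the median (the kind of transform
# Data Wrangler or DataBrew would apply for us).
clean = train.fillna({"age": train["age"].median()})

# Flag outliers with the 1.5 * IQR rule.
q1, q3 = clean["age"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = clean[(clean["age"] < q1 - 1.5 * iqr) |
                 (clean["age"] > q3 + 1.5 * iqr)]

# Normalize the trimmed data to [0, 1].
trimmed = clean.drop(outliers.index)
normalized = (trimmed["age"] - trimmed["age"].min()) / (
    trimmed["age"].max() - trimmed["age"].min())

# Crude drift check, in the spirit of Model Monitor: compare the
# production distribution against the training baseline and alert
# when the mean has shifted by more than two baseline standard deviations.
production = pd.Series([45, 48, 52, 50, 47])
drift = abs(production.mean() - trimmed["age"].mean()) > 2 * trimmed["age"].std()
```

In production, Model Monitor automates this pattern: it computes a statistical baseline from the training data and compares incoming inference traffic against it on a schedule, rather than relying on ad hoc checks like the one above.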