Training and data processing in SageMaker pipelines demonstration - Amazon SageMaker Tutorial

From the course: Machine Learning with SageMaker by Pearson


Training and data processing in SageMaker pipelines demonstration


Building on the pipeline demo from the previous lesson, I'm going to show you how to integrate data processing as the first step of the pipeline. In the previous example, I assumed that the data being fed into the model training step was already clean. In this example, it isn't. Let me hop over to our repository real quick. In the data sets pipeline dirty directory, there are 10 lines in this file that are missing values. Lines 10 through 14 are missing the age, and down at the bottom, lines 4980 to 4984 are missing the age as well. The reason I chose five at the beginning and five at the end is that this data is going to be split into training data and validation data by our preprocessing script.

So we're going to set this up. We're going to create a pipeline just like we did in the previous example. The difference is that we're going to have a preprocessing script that is passed to our pipeline and then executed on an instance that…
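To make the idea of a preprocessing script concrete, here is a minimal sketch in plain Python of the kind of cleaning and splitting described above. It is not the script used in the course. The column names, the choice to drop rows with a missing age (rather than fill them in), the split fraction, and the split in file order are all assumptions made for illustration, and `clean_and_split` and `read_rows` are hypothetical helper names.

```python
import csv
import io


def clean_and_split(rows, train_fraction=0.8, required_field="age"):
    """Drop rows whose required field is empty, then split in file order.

    Assumption: rows with a missing value are dropped, and the split is
    sequential (no shuffling). The course script may do either differently.
    """
    cleaned = [row for row in rows if (row.get(required_field) or "").strip()]
    cut = int(len(cleaned) * train_fraction)
    return cleaned[:cut], cleaned[cut:]


def read_rows(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


# Tiny made-up sample: the rows with id 2 and id 6 have no age.
SAMPLE = """id,age,label
1,34,0
2,,1
3,51,0
4,29,1
5,42,0
6,,1
"""

rows = read_rows(SAMPLE)
train, validation = clean_and_split(rows, train_fraction=0.75)
print(len(rows), len(train), len(validation))
```

With a split in file order like this one, dirty rows near the top of the file would otherwise land in the training portion and dirty rows near the bottom in the validation portion, which is one plausible reading of why the missing values were placed at both ends.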
