From the course: Machine Learning with SageMaker by Pearson


Preventing overfitting and underfitting

So, what is overfitting and what is underfitting? Overfitting is when a model performs well on training data but poorly on unseen data. When we're creating our models, we feed in a bunch of data, maybe with some test data associated with that training data, and we train the model. As it's being trained, it's doing internal validation and testing, and it performs well on that training data; then you hand it some unseen data, and it does not do well. That is overfitting. It usually happens when the model memorizes data patterns instead of generalizing. So what can we do about that? That's what we're going to talk about during this particular presentation. Common causes are excess complexity in your model, maybe some features that aren't actually needed, replication of data, or too small an amount of data, things like that. We'll talk about that here in just a bit. Underfitting is performing poorly on both training and validation data, and it usually happens when the model is too simple…
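The train-versus-unseen-data gap described above can be made concrete with a small sketch (not from the course; a minimal NumPy illustration using polynomial fits as the "model", where the degree stands in for model complexity):

```python
import numpy as np

rng = np.random.default_rng(0)

# A small, noisy dataset whose true relationship is quadratic: y = x^2 + noise
x_train = np.linspace(-1, 1, 10)
y_train = x_train**2 + rng.normal(0, 0.1, size=10)
x_test = np.linspace(-1, 1, 50)          # "unseen" data from the same process
y_test = x_test**2 + rng.normal(0, 0.1, size=50)

def fit_errors(degree):
    # Fit a polynomial of the given degree to the training data,
    # then measure mean squared error on training and unseen data.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_mse, test_mse

for degree in (1, 2, 9):
    train_mse, test_mse = fit_errors(degree)
    print(f"degree={degree}: train MSE={train_mse:.4f}, test MSE={test_mse:.4f}")
```

Degree 1 underfits (high error on both sets), degree 2 matches the data-generating process, and degree 9 has enough capacity to memorize all ten training points, so its training error collapses toward zero while its error on the unseen points stays much higher. That growing gap between training and validation error is the symptom to watch for.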
