Hey there, 
 
We're starting Module 2 of the Machine Learning Zoomcamp!
 
Over the next two weeks, we will explore the topic of machine learning for regression.

You'll build a car price prediction model while learning linear regression, feature engineering, and regularization.
Homework 1: Deadline Extension
Before we get into the details of the second module, let's congratulate everyone who submitted the first homework:

1150+ people submitted the homework 🎉

We've extended the deadline for Homework 1 to October 2, 2025, 1:00 AM CET. Take this time to submit your homework if you haven't already.

Each submission helps you retain the course material, and the leaderboard is a fun way to stay motivated.

Now, to the learning.
Submit Homework 1
What You’ll Learn During Module 2
  1. Car price prediction project
  2. Data preparation
  3. Exploratory data analysis
  4. Setting up the validation framework
  5. Linear regression
  6. Linear regression: vector form
  7. Training linear regression: Normal equation
  8. Baseline model for car price prediction project
  9. Root mean squared error
  10. Using RMSE on validation data
  11. Feature engineering
  12. Categorical variables
  13. Regularization
  14. Tuning the model
  15. Using the model
  16. Car price prediction project summary
  17. Explore more
  18. Homework
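
To give a feel for topics 7, 9, and 13, here is a minimal sketch of linear regression trained with the normal equation, scored with RMSE. It is an illustration, not the course's reference code: the function names, the synthetic data, and the choice to add the regularization term `r` to every diagonal entry (including the bias) are assumptions made for this example.

```python
import numpy as np

def train_linear_regression(X, y, r=0.0):
    """Solve the normal equation (X^T X + r*I) w = X^T y.

    r is a hypothetical regularization strength; r=0.0 gives plain
    least squares. The bias term is regularized too, for simplicity.
    """
    ones = np.ones(X.shape[0])
    X = np.column_stack([ones, X])          # prepend the bias column
    XTX = X.T @ X + r * np.eye(X.shape[1])  # regularize the diagonal
    w = np.linalg.solve(XTX, X.T @ y)
    return w[0], w[1:]                      # bias, feature weights

def rmse(y, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y - y_pred) ** 2)))

# Synthetic data (an assumption for illustration, not the car price dataset).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = 2.0 + X @ np.array([1.5, -0.5, 3.0]) + rng.normal(scale=0.1, size=200)

w0, w = train_linear_regression(X, y)
y_pred = w0 + X @ w
score = rmse(y, y_pred)
print(w0, w, score)
```

With low noise, the recovered bias and weights should land close to the values used to generate the data.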
All the course materials are available in the GitHub repo. Each module has its own folder (e.g., 01-intro, 03-classification), and cohort-specific homework is in cohorts/2025.

Lectures are pre-recorded and available in the YouTube playlist. If new workshops or updated videos are added, we’ll announce them. If you don’t see an announcement, assume everything you need is already there.

Join Module 2
Homework for Module 2
Your assignment for this module is to:
  • Download and filter the dataset
  • Identify missing values and compute median horsepower
  • Split the data into train/validation/test sets
  • Compare imputation strategies
  • Tune the regularization strength and select the best value
  • Check model stability across different random seeds
  • Train the final model and evaluate on the test set
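
If you'd like to see how these steps fit together before opening the notebook, here is a rough sketch of the workflow on synthetic data. Everything specific in it is an assumption made for illustration and not the homework's actual setup: the 60/20/20 split, the two fill values (zero and the training mean), the candidate `r` values, the seeds, and the helper names.

```python
import numpy as np

def fit(X, y, r=0.0):
    # Regularized normal equation; r is a hypothetical strength.
    X = np.column_stack([np.ones(len(X)), X])
    return np.linalg.solve(X.T @ X + r * np.eye(X.shape[1]), X.T @ y)

def predict(w, X):
    return w[0] + X @ w[1:]

def rmse(y, y_pred):
    return float(np.sqrt(np.mean((y - y_pred) ** 2)))

def split(n, seed):
    # Shuffle indices, then cut into 60% train / 20% val / 20% test.
    idx = np.arange(n)
    np.random.default_rng(seed).shuffle(idx)
    n_val = n_test = int(n * 0.2)
    n_train = n - n_val - n_test
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def impute(X, value):
    # Fill missing values in column 0 (a stand-in for horsepower).
    X = X.copy()
    X[np.isnan(X[:, 0]), 0] = value
    return X

# Synthetic stand-in data: two features, ~10% missing in the first.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(loc=[150.0, 3000.0], scale=[30.0, 500.0], size=(n, 2))
y = 40.0 - 0.05 * X[:, 0] - 0.005 * X[:, 1] + rng.normal(scale=1.0, size=n)
X[rng.random(n) < 0.1, 0] = np.nan

tr, va, te = split(n, seed=42)

# Compare imputation strategies; the mean comes from the training part only.
fills = {"zero": 0.0, "train_mean": float(np.nanmean(X[tr, 0]))}
imputation_rmse = {
    name: rmse(y[va], predict(fit(impute(X[tr], v), y[tr]), impute(X[va], v)))
    for name, v in fills.items()
}

# Tune the regularization strength on the validation set.
fill = fills["train_mean"]
candidates = [0.0, 0.01, 0.1, 1.0, 10.0]
val_scores = {
    r: rmse(y[va], predict(fit(impute(X[tr], fill), y[tr], r), impute(X[va], fill)))
    for r in candidates
}
best_r = min(val_scores, key=val_scores.get)

# Check stability: spread of validation RMSE across random seeds.
seed_scores = []
for seed in range(5):
    s_tr, s_va, _ = split(n, seed)
    w = fit(impute(X[s_tr], fill), y[s_tr])
    seed_scores.append(rmse(y[s_va], predict(w, impute(X[s_va], fill))))
seed_std = float(np.std(seed_scores))

# Final model: train on train + validation, evaluate once on test.
full = np.concatenate([tr, va])
w_final = fit(impute(X[full], fill), y[full], best_r)
test_rmse = rmse(y[te], predict(w_final, impute(X[te], fill)))
print(imputation_rmse, best_r, seed_std, test_rmse)
```

The test set is touched only once, at the very end, so the reported score is not influenced by the tuning steps.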
The deadline for this homework is October 7, 2025, 1:00 AM CET.
Submit Homework 2
How the Homework Assignments Work
The process is simple: solve the tasks locally, publish your solution to a public GitHub repo (or similar), and submit the link through the form.

Each homework comes with a strict deadline listed on the schedule. After the deadline passes, the form closes.

Submissions also appear on the leaderboard.

You can also gain extra points by sharing your learning publicly with the hashtag #mlzoomcamp and tagging Alexey Grigorev or DataTalksClub.
Got stuck while starting the homework?
Having questions is a normal part of the learning process.
 
Here's what to do:
  1. Check the FAQ
  2. If you don't find an answer to your question in the FAQ, ask it on the ML Zoomcamp Slack channel. We will update the FAQ with answers.
Join Our Slack Channel
💡 Pro Tips
  • Expect to spend about 6-7 hours on this module if you're a beginner, including the materials and the homework assignment.
  • Break it into smaller chunks, and don't wait until the last day.
Hundreds of learners are starting this module alongside you. Be sure to collaborate and share knowledge.
You've Got This!
This is Module 2 of 10: completing the homework brings you one step closer to earning your final certificate.

It's okay if you don't get everything right away; most students don't.

Every attempt gets you closer to mastering ML.
Good luck with learning!

Alexey and the DataTalks.Club Team
Copyright © 2025 DataTalks, All rights reserved.

