From the course: Python for AI Projects: From Data Exploration to Impact


Building Classification Pipelines in Python


- [Instructor] Now for the core of this tutorial: building scikit-learn pipelines for multi-class classification to predict products for each user. We'll start with logistic regression, then move to random forest and LightGBM by tweaking our pipeline code. This fast-paced section covers data cleaning, model fitting, cross-validation, hyperparameter tuning, and deployment prep, with fully reusable code for your own ML projects. We'll review the code for our Python function, train_multiclass_logistic_regression_model. Since the data lacks period splits, we create a stratified train/validation split to handle class imbalance. We label-encode the product name and evaluate model performance using a standard scikit-learn classification report, which includes precision, recall, and F1-score. This function also returns a dictionary of artifacts, which we'll use downstream. The function is well-documented with parameter…
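The steps described in the transcript (label encoding, a stratified split, fitting a pipeline, and a classification report) can be sketched roughly as follows. This is a minimal illustration, not the course's actual code: the function name is taken from the transcript, but its signature, the scaler step, the artifact keys, the synthetic data, and the product names are all assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, StandardScaler


def train_multiclass_logistic_regression_model(X, product_names, test_size=0.2, random_state=42):
    """Hypothetical signature; the course's real parameters are not shown in the transcript."""
    # Label-encode the string product names into integer class ids.
    encoder = LabelEncoder()
    y = encoder.fit_transform(product_names)

    # Stratify on the target so each class keeps its share in both splits.
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=random_state
    )

    # Assumed pipeline steps: scaling followed by logistic regression.
    pipeline = Pipeline([
        ("scaler", StandardScaler()),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    pipeline.fit(X_train, y_train)

    # Standard report with precision, recall, and F1-score per class.
    report = classification_report(
        y_val,
        pipeline.predict(X_val),
        labels=np.arange(len(encoder.classes_)),
        target_names=encoder.classes_,
        zero_division=0,
    )

    # Artifact keys are illustrative; the course may return different ones.
    return {
        "pipeline": pipeline,
        "label_encoder": encoder,
        "report": report,
        "validation_data": (X_val, y_val),
    }


# Synthetic stand-in data; the product names below are made up for the example.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_classes=3, random_state=0
)
demo_products = np.array(["checking", "credit_card", "savings"])[y_demo]
artifacts = train_multiclass_logistic_regression_model(X_demo, demo_products)
print(artifacts["report"])
```

Swapping in random forest or LightGBM, as the transcript mentions, would in a sketch like this mostly mean replacing the `"clf"` step; the returned encoder lets downstream code map predicted class ids back to product names with `inverse_transform`.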
