# MLOps Project - Vehicle Insurance Data Pipeline
Welcome to this MLOps project, designed to demonstrate a robust pipeline for managing vehicle insurance data. It showcases, for recruiters and visitors alike, the tools, techniques, services, and features that go into building and deploying a machine learning pipeline for real-world data management. Follow along to learn about project setup, data processing, model deployment, and CI/CD automation!

## Project Setup and Structure

### Step 1: Project Template

Start by executing the `template.py` file to create the initial project template, which includes the required folder structure and placeholder files.
### Step 2: Package Management

Declare the local packages to be installed in the `setup.py` and `pyproject.toml` files.

> Tip: Learn more about these files from `crashcourse.txt`.
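As a minimal sketch of what the packaging configuration might look like (the package name and version here are assumptions, not the project's actual metadata):

```toml
# pyproject.toml — minimal sketch so local packages become importable
[build-system]
requires = ["setuptools>=64"]          # >=64 supports editable installs (pip install -e .)
build-backend = "setuptools.build_meta"

[project]
name = "vehicle-insurance-pipeline"    # hypothetical name for illustration
version = "0.0.1"
```

With this in place, `pip install -e .` installs the local packages in editable mode, so they show up in `pip list` and can be imported from anywhere in the environment.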
### Step 3: Virtual Environment and Dependencies

Create a virtual environment and install the required dependencies from `requirements.txt`:

```bash
conda create -n vehicle python=3.10 -y
conda activate vehicle
pip install -r requirements.txt
```

Verify the local packages by running:

```bash
pip list
```
## MongoDB Setup and Data Management

### Step 4: MongoDB Atlas Configuration

1. Sign up for MongoDB Atlas and create a new project.
2. Set up a free M0 cluster, configure the username and password, and allow access from any IP address (0.0.0.0/0).
3. Retrieve the MongoDB connection string for Python and save it (replace `<password>` with your password).
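One detail worth noting when assembling the connection string: special characters in the username or password must be percent-escaped, or URI parsing fails. A small stdlib-only sketch (the helper name and default database are assumptions for illustration):

```python
from urllib.parse import quote_plus

def build_mongo_uri(username: str, password: str, host: str, db: str = "vehicle_db") -> str:
    """Build a MongoDB Atlas connection string, escaping characters
    (e.g. '@' or ':') that would otherwise break URI parsing."""
    return f"mongodb+srv://{quote_plus(username)}:{quote_plus(password)}@{host}/{db}"
```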
### Step 5: Pushing Data to MongoDB

1. Create a folder named `notebook`, add the dataset, and create a notebook file `mongoDB_demo.ipynb`.
2. Use the notebook to push data to the MongoDB database.
3. Verify the data in MongoDB Atlas under Database > Browse Collections.
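The core of the notebook's upload step can be sketched as a small helper. The function name is an assumption; `collection` is any pymongo-style collection object exposing `insert_many()`, so the sketch itself needs no third-party imports:

```python
def push_records(collection, records):
    """Insert a list of dicts into a pymongo-style collection and return
    the number of documents written."""
    if not records:          # insert_many rejects an empty list
        return 0
    result = collection.insert_many(records)
    return len(result.inserted_ids)
```

In the notebook this would typically be fed with `df.to_dict("records")` from the loaded dataset.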
## Logging, Exception Handling, and EDA

### Step 6: Set Up Logging and Exception Handling

Create the logging and exception handling modules, then exercise them in a demo script, `demo.py`.
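A minimal sketch of what these two modules might contain (the function and class names are assumptions; the real modules may differ):

```python
import logging
import sys

def get_logger(name: str) -> logging.Logger:
    """Return a logger that writes timestamped messages to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid attaching duplicate handlers on re-import
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("[%(asctime)s] %(name)s - %(levelname)s - %(message)s")
        )
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger

class PipelineException(Exception):
    """Wrap an original error with its file and line number for easier debugging."""
    def __init__(self, message: str, error: Exception):
        tb = error.__traceback__
        location = f"{tb.tb_frame.f_code.co_filename}:{tb.tb_lineno}" if tb else "unknown"
        super().__init__(f"{message} [{location}]: {error}")
```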
### Step 7: Exploratory Data Analysis (EDA) and Feature Engineering

Analyze the data and engineer features in the `EDA and Feature Engg` notebook; the resulting features feed the later stages of the pipeline.
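Vehicle-insurance datasets typically carry categorical fields that need mapping to numbers before modeling. The column names and category values below are assumptions for illustration, not the project's actual schema:

```python
def engineer_features(record: dict) -> dict:
    """Map categorical fields of one raw record to numeric features.
    Column names and categories are hypothetical examples."""
    vehicle_age_map = {"< 1 Year": 0, "1-2 Year": 1, "> 2 Years": 2}
    out = dict(record)
    out["Gender"] = 1 if record.get("Gender") == "Male" else 0
    out["Vehicle_Damage"] = 1 if record.get("Vehicle_Damage") == "Yes" else 0
    out["Vehicle_Age"] = vehicle_age_map.get(record.get("Vehicle_Age"), -1)
    return out
```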
## Data Ingestion

### Step 8: Data Ingestion Pipeline

1. Define the MongoDB connection functions in `configuration.mongo_db_connections.py`.
2. Develop the data ingestion components in the `data_access` module and `components.data_ingestion.py` to fetch and transform data.
3. Update `entity/config_entity.py` and `entity/artifact_entity.py` with the relevant ingestion configurations.
4. Run `demo.py` after setting up the MongoDB connection as an environment variable.
#### Setting Environment Variables

Set the MongoDB URL:

```bash
# For Bash
export MONGODB_URL="mongodb+srv://<username>:<password>...."
```

```powershell
# For PowerShell
$env:MONGODB_URL = "mongodb+srv://<username>:<password>...."
```

> Note: On Windows, you can also set environment variables through the system settings.
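On the Python side, the connection code can read this variable and fail fast with a clear message when it is missing. A stdlib-only sketch (the helper name is an assumption):

```python
import os

def get_mongodb_url() -> str:
    """Read the MongoDB connection string from the environment, raising
    a clear error if it has not been exported."""
    url = os.environ.get("MONGODB_URL")
    if not url:
        raise EnvironmentError("MONGODB_URL is not set; export it before running the pipeline.")
    return url
```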
## Data Validation, Transformation & Model Training

### Step 9: Data Validation

Define the dataset schema in `config.schema.yaml` and implement the data validation functions in `utils.main_utils.py`.
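A column-level check against the schema might look like the sketch below. The `schema` dict mirrors what a YAML loader would produce from `config.schema.yaml`; the function name and schema layout are assumptions:

```python
def validate_columns(df_columns, schema: dict) -> list:
    """Compare a dataframe's columns against the schema and return a list
    of problems found; an empty list means validation passed."""
    expected = set(schema.get("columns", {}))
    actual = set(df_columns)
    errors = []
    for missing in sorted(expected - actual):
        errors.append(f"missing column: {missing}")
    for extra in sorted(actual - expected):
        errors.append(f"unexpected column: {extra}")
    return errors
```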
### Step 10: Data Transformation

Implement the data transformation logic in `components.data_transformation.py` and create `estimator.py` in the `entity` folder.
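The transformation component typically wraps preprocessing in a scikit-learn `Pipeline` so the exact same steps apply at training and prediction time. A minimal numeric-only sketch, assuming scikit-learn is among the project's dependencies:

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

def build_transform_pipeline() -> Pipeline:
    """Impute missing numeric values with the median, then standardize
    to zero mean and unit variance."""
    return Pipeline([
        ("imputer", SimpleImputer(strategy="median")),
        ("scaler", StandardScaler()),
    ])
```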
### Step 11: Model Training

Define and implement the model training steps in `components.model_trainer.py`, using the estimator code from `estimator.py`.
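The trainer boils down to fitting an estimator on a train split and scoring it on held-out data. A hedged sketch using a random forest (the actual project may use a different model and hyperparameters):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def train_model(X, y, **params):
    """Fit a RandomForest on a train split and report held-out accuracy."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=42)
    model = RandomForestClassifier(random_state=42, **params)
    model.fit(X_tr, y_tr)
    accuracy = accuracy_score(y_te, model.predict(X_te))
    return model, accuracy
```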
## AWS Setup for Model Evaluation & Deployment

### Step 12: AWS Setup

- Log in to the AWS console, create an IAM user, and grant it AdministratorAccess.
- Set the AWS credentials as environment variables:

  ```bash
  # For Bash
  export AWS_ACCESS_KEY_ID="YOUR_AWS_ACCESS_KEY_ID"
  export AWS_SECRET_ACCESS_KEY="YOUR_AWS_SECRET_ACCESS_KEY"
  ```

- Configure the S3 bucket constants in `constants/__init__.py`, reading the access keys from the environment variables above rather than hard-coding them in source.
### Step 13: Model Evaluation and Pushing to S3

1. Create an S3 bucket named `my-model-mlopsproj` in the `us-east-1` region.
2. Develop code to push and pull models to and from the S3 bucket in `src.aws_storage` and `entity/s3_estimator.py`.
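The push/pull helpers reduce to two boto3 calls. In this sketch (function names are assumptions) the client is injectable so the functions can be unit-tested without AWS credentials, and `boto3` is imported lazily so the module loads even where it is not installed:

```python
def push_model(local_path: str, bucket: str, key: str, s3_client=None) -> None:
    """Upload a serialized model file to S3. `s3_client` is anything exposing
    upload_file()/download_file(); defaults to a real boto3 client."""
    if s3_client is None:
        import boto3                     # deferred: only needed for real AWS calls
        s3_client = boto3.client("s3")
    s3_client.upload_file(local_path, bucket, key)

def pull_model(bucket: str, key: str, local_path: str, s3_client=None) -> str:
    """Download a model file from S3 and return the local path."""
    if s3_client is None:
        import boto3
        s3_client = boto3.client("s3")
    s3_client.download_file(bucket, key, local_path)
    return local_path
```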
## Model Evaluation, Model Pusher, and Prediction Pipeline

### Step 14: Model Evaluation & Model Pusher

1. Implement the model evaluation and model pusher components.
2. Create the prediction pipeline and set up `app.py` for API integration.
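Conceptually, the prediction pipeline ties feature preparation to a trained model so `app.py` only has to forward incoming requests. A sketch with duck-typed dependencies (the class name and feature list are assumptions):

```python
class PredictionPipeline:
    """Arrange an incoming JSON-style record into model input and predict.
    `model` is any object with a predict() method, e.g. the estimator
    pulled back from S3. The feature order below is hypothetical."""
    FEATURES = ["Gender", "Age", "Vehicle_Age", "Vehicle_Damage", "Annual_Premium"]

    def __init__(self, model):
        self.model = model

    def predict(self, record: dict):
        # Order the fields exactly as the model was trained on them
        row = [record[name] for name in self.FEATURES]
        return self.model.predict([row])[0]
```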
### Step 15: Static and Template Directories

Add the `static` and `template` directories for the web UI.
## Project Workflow Summary

Data Ingestion → Data Validation → Data Transformation → Model Training → Model Evaluation → Model Deployment

This README provides a structured walkthrough of the MLOps project, showcasing the end-to-end pipeline and its cloud integration.