A machine learning project following the MLE-Star methodology (Model design, Learning pipeline, Evaluation, Systematic testing, Training optimization, Analysis validation, Refinement deployment).
```
ml-experiment/
├── data/                  # Data storage
│   ├── raw/               # Original, immutable data
│   ├── processed/         # Cleaned and processed data
│   └── external/          # Third-party datasets
├── models/                # Trained models and artifacts
├── notebooks/             # Jupyter notebooks for exploration
│   ├── 01_model_design.ipynb
│   ├── 02_training_pipeline.ipynb
│   ├── 03_model_evaluation.ipynb
│   ├── 04_hyperparameter_tuning.ipynb
│   ├── 05_model_analysis.ipynb
│   └── 06_deployment.ipynb
├── src/                   # Source code
│   ├── data/              # Data processing modules
│   ├── features/          # Feature engineering
│   ├── models/            # Model definitions and training
│   ├── visualization/     # Visualization utilities
│   └── api/               # API endpoints for serving
├── tests/                 # Unit tests
├── configs/               # Configuration files
├── outputs/               # Generated outputs
│   ├── models/            # Saved models
│   ├── figures/           # Generated plots
│   └── reports/           # Analysis reports
├── requirements.txt       # Python dependencies
├── config.yaml            # Main configuration
└── README.md              # This file
```
## MLE-Star Workflow Stages

### 1. Model Design
- Define the problem statement and success metrics
- Select appropriate ML algorithms and architectures
- Design the model architecture and components (see the sketch below)
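As a concrete starting point, here is a minimal sketch of a model definition as it might live in `src/models/`. The class name, layer sizes, and binary-classification task are illustrative placeholders, not something the template prescribes:

```python
import torch.nn as nn

class BaselineClassifier(nn.Module):
    """Illustrative feed-forward baseline; all sizes are placeholders."""

    def __init__(self, in_features: int = 32, hidden: int = 64, n_classes: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, x):
        return self.net(x)  # raw logits; pair with nn.CrossEntropyLoss
```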
### 2. Learning Pipeline
- Implement data preprocessing and feature engineering
- Create training and validation pipelines
- Set up data loaders and transformation pipelines (sketch below)
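For the pipeline stage, a minimal sketch of a `Dataset`/`DataLoader` pairing as it could appear in `src/data/`. The tensor shapes and the random stand-in data are assumptions for illustration:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class TabularDataset(Dataset):
    """Wraps pre-processed feature/label tensors (placeholder shapes)."""

    def __init__(self, features: torch.Tensor, labels: torch.Tensor):
        self.features, self.labels = features, labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]

# Random tensors stand in for the contents of data/processed/
train_ds = TabularDataset(torch.randn(1000, 32), torch.randint(0, 2, (1000,)))
train_loader = DataLoader(train_ds, batch_size=64, shuffle=True)
```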
### 3. Evaluation
- Define evaluation metrics and validation strategies
- Implement comprehensive model evaluation (sketch below)
- Create performance monitoring and reporting
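A hedged sketch of what an evaluation loop for this stage could look like; `model` and `loader` come from the sketches above, and accuracy is just one example metric:

```python
import torch

@torch.no_grad()
def evaluate(model, loader, device="cpu"):
    """Compute accuracy over a DataLoader (one illustrative metric)."""
    model.eval()
    correct = total = 0
    for features, labels in loader:
        logits = model(features.to(device))
        preds = logits.argmax(dim=1)
        correct += (preds == labels.to(device)).sum().item()
        total += labels.size(0)
    return correct / total
```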
### 4. Systematic Testing
- Implement unit tests for all components
- Create integration tests for pipelines
- Add data validation and model testing (sketch below)
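For the testing stage, a small pytest sketch of a data-validation check as it might live in `tests/`. The invariants (aligned lengths, no NaNs, valid label range) are generic examples, not project requirements:

```python
import torch

def test_dataset_shapes_and_values():
    """Basic data-validation invariants for a processed dataset."""
    features = torch.randn(100, 32)          # stand-in for processed data
    labels = torch.randint(0, 2, (100,))
    assert len(features) == len(labels)      # features and labels aligned
    assert not torch.isnan(features).any()   # no missing values leaked through
    assert labels.min() >= 0 and labels.max() < 2  # labels in expected range
```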
### 5. Training Optimization
- Implement hyperparameter tuning (sketch below)
- Optimize training procedures and schedules
- Add model selection and ensemble methods
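Building on the earlier sketches (`BaselineClassifier`, `evaluate`, `train_loader`), a minimal random-search sketch for the tuning step. The search space, single-epoch training, and scoring are deliberately simplified:

```python
import random
import torch
import torch.nn as nn

def train_and_score(lr: float, train_loader, val_loader) -> float:
    """Train briefly at one learning rate, return validation accuracy."""
    model = BaselineClassifier()                 # from the model-design sketch
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for features, labels in train_loader:        # single epoch, for brevity
        optimizer.zero_grad()
        loss = loss_fn(model(features), labels)
        loss.backward()
        optimizer.step()
    return evaluate(model, val_loader)           # from the evaluation sketch

# Random search over learning rates; reuses train_loader as a stand-in
# validation set purely to keep the sketch short.
trials = [10 ** random.uniform(-4, -2) for _ in range(5)]
best_lr = max(trials, key=lambda lr: train_and_score(lr, train_loader, train_loader))
```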
### 6. Analysis Validation
- Perform model interpretability analysis (sketch below)
- Validate model assumptions and behavior
- Generate comprehensive analysis reports
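For the analysis stage, one of many possible interpretability techniques is gradient saliency. This sketch is an assumption about how such an analysis could be wired up, not a method mandated by the template:

```python
import torch

def input_saliency(model, features: torch.Tensor, target_class: int):
    """Gradient of the target logit w.r.t. inputs, as a crude importance score."""
    model.eval()
    features = features.clone().requires_grad_(True)
    logits = model(features)
    logits[:, target_class].sum().backward()   # backprop only the target logit
    return features.grad.abs().mean(dim=0)     # mean per-feature importance
```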
### 7. Refinement Deployment
- Refine the model based on analysis results
- Prepare the model for deployment (sketch below)
- Create deployment infrastructure and monitoring
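For deployment preparation, a minimal export sketch using TorchScript; the file path under `outputs/models/` is illustrative, and the directory is assumed to exist:

```python
import torch

model = BaselineClassifier()                 # from the model-design sketch
model.eval()
scripted = torch.jit.script(model)           # serialize architecture + weights
scripted.save("outputs/models/baseline.pt")  # illustrative path

# Later, in a serving process, no Python class definition is needed:
restored = torch.jit.load("outputs/models/baseline.pt")
```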
## Quick Start

1. **Install dependencies**

   ```bash
   pip install -r requirements.txt
   ```

2. **Configure the project**

   Edit `configs/config.yaml` with your specific settings.

3. **Run the MLE-Star workflow**

   ```bash
   # Initialize project (already done)
   claude-flow automation mle-star status

   # Run individual stages
   claude-flow automation mle-star stage model_design
   claude-flow automation mle-star stage learning_pipeline

   # Or run the complete workflow
   claude-flow automation mle-star run
   ```
## ML Framework

This project is configured to use PyTorch as the primary ML framework.

PyTorch configuration:

- Version: 1.9.0+
- GPU support: available if CUDA is installed
- Key components: `torch`, `torchvision`, `torch.nn`, `torch.optim`

```python
import torch                              # core tensors and autograd
import torch.nn as nn                     # layers and loss functions
import torch.optim as optim               # optimizers (SGD, Adam, ...)
from torch.utils.data import DataLoader   # batched data iteration
```
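Since GPU support depends on the local CUDA install, the usual device-selection idiom applies (this is generic PyTorch, not template-specific; the module and shapes are placeholders):

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(32, 2).to(device)        # placeholder module on the chosen device
batch = torch.randn(8, 32, device=device)  # inputs must live on the same device
logits = model(batch)
```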
{{#if (eq mlFramework "tensorflow")}} TensorFlow Configuration:
- Version: 2.6.0+
- GPU Support: Available if CUDA is installed
- Key Components: tf.keras, tf.data, tf.nn
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers{{/if}}
{{#if (eq mlFramework "scikit-learn")}} Scikit-Learn Configuration:
- Version: 1.0.0+
- CPU-optimized machine learning
- Key Components: sklearn.model_selection, sklearn.metrics
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.ensemble import RandomForestClassifier{{/if}}
## Running Stages

```bash
# Model design stage
claude-flow automation mle-star stage model_design --framework pytorch

# Training pipeline stage
claude-flow automation mle-star stage learning_pipeline

# Model evaluation
claude-flow automation mle-star stage evaluation_setup

# Run all stages
claude-flow automation mle-star run --continue-on-error

# Check status
claude-flow automation mle-star status

# Validate environment
claude-flow automation mle-star validate
```

## Deployment

```bash
# Deploy as API
claude-flow automation mle-star deploy --service api

# Deploy with Docker
claude-flow automation mle-star deploy --service docker
```

## Configuration

The main configuration is in `configs/config.yaml`. Key settings include:
- Data paths: Input and output data directories
- Model parameters: Architecture and hyperparameters
- Training settings: Batch size, epochs, learning rate
- Evaluation metrics: Performance measurement criteria
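A sketch of reading that configuration from Python. The key names below are hypothetical, since the actual schema lives in `configs/config.yaml`; `yaml` is the PyYAML package:

```python
import yaml  # PyYAML

with open("configs/config.yaml") as fh:
    config = yaml.safe_load(fh)

# Hypothetical keys -- adjust to the real schema in configs/config.yaml
batch_size = config["training"]["batch_size"]
lr = config["training"]["learning_rate"]
```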
## Workflow

- Data Exploration → `notebooks/01_model_design.ipynb`
- Pipeline Development → `notebooks/02_training_pipeline.ipynb`
- Model Training → `src/models/train.py`
- Evaluation → `notebooks/03_model_evaluation.ipynb`
- Optimization → `notebooks/04_hyperparameter_tuning.ipynb`
- Analysis → `notebooks/05_model_analysis.ipynb`
- Deployment → `notebooks/06_deployment.ipynb`
## Testing

Run tests with:

```bash
pytest tests/
```

Generate a coverage report:

```bash
pytest --cov=src tests/
```

## Contributing

- Follow the MLE-Star methodology for all changes
- Add tests for new functionality
- Update documentation
- Run validation before submitting
## License

[Specify your license here]
Generated with MLE-Star methodology for systematic ML development.