From the course: GenAIOps Foundations
The machine learning lifecycle
- [Narrator] Let's begin the course by reviewing the basic concepts of the machine learning lifecycle. Machine learning is like software development. In software development, we go through a structured process of requirements analysis, design, development, testing, deployment, and operations. Similarly, machine learning has a structured lifecycle that goes from concept to production.
A machine learning application goes through a journey as it evolves. It starts from a concept, where the goals for machine learning are determined. Then the model is built and integrated into the application. The model and the application are then deployed and used, and the model is continuously improved over time. This is an iterative process in which the steps are repeated many times. During each iteration, the model is refined for better performance and fewer errors or less bias. Refinements are triggered by the availability of new training data, newly identified use cases, and model degradation noticed over time.
Let's review this lifecycle with a diagram. We begin the ML lifecycle by identifying the requirements. The requirements are typically for the application that will use machine learning. Within them, the parts that apply to the model are identified. Next comes workflow design for the application, which determines where ML models are brought in. This identifies the inputs available to the model and the outputs expected from it.
To build a model, we may need training data. Training data is collected from available sources. Raw training data goes through feature engineering to get it ready for model training. Then model training kicks in. The model is created based on the requirements and trained on the training data. The resulting model is usually placed under model management to track its versions and evolution.
The model then goes through test and evaluation to validate that it performs as expected. If the performance is not satisfactory, the model is retrained. This cycle continues until satisfactory results are obtained. The model is then deployed in production along with the application and used for serving.
Data is collected from real model use in production. This helps with diagnostics as well as with identifying errors. Production data may also be used for training if new use cases are identified. This data first goes through a labeling process. The labeled data is then added to the training data set. Now the cycle can repeat, with feature engineering, training, evaluation, and deployment done on the updated training data.
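As a rough illustration of the loop just described, here is a minimal Python sketch. Nothing in it comes from the course: the toy task (flagging a message as "long"), the single-threshold "model", and every name (`engineer_features`, `train`, `evaluate`, `run_lifecycle`, `ModelVersion`) are hypothetical stand-ins chosen only to show the order of the steps. A real pipeline would use an actual ML library and a proper model registry.

```python
from dataclasses import dataclass


@dataclass
class ModelVersion:
    """One entry in a toy model registry (hypothetical structure)."""
    version: int
    threshold: int
    accuracy: float


def engineer_features(raw_rows):
    """Feature engineering: turn raw (text, label) rows into (length, label)."""
    return [(len(text), label) for text, label in raw_rows]


def evaluate(threshold, features):
    """Fraction of rows where 'length >= threshold' matches the label."""
    correct = sum((length >= threshold) == label for length, label in features)
    return correct / len(features)


def train(features):
    """Training: pick the length threshold that best fits the training features."""
    candidates = sorted({length for length, _ in features})
    return max(candidates, key=lambda t: evaluate(t, features))


def run_lifecycle(training_data, test_data, labeled_production_batches,
                  target_accuracy=0.9):
    """Train, evaluate, and retrain on newly labeled data until satisfactory."""
    registry = []
    training_data = list(training_data)
    batches = iter(labeled_production_batches)
    while True:
        threshold = train(engineer_features(training_data))
        accuracy = evaluate(threshold, engineer_features(test_data))
        # Model management: record every trained version.
        registry.append(ModelVersion(len(registry) + 1, threshold, accuracy))
        if accuracy >= target_accuracy:
            return registry  # good enough to deploy
        new_batch = next(batches, None)
        if new_batch is None:
            return registry  # no more labeled data to learn from
        # Labeled production data is added to the training set; cycle repeats.
        training_data.extend(new_batch)


# Made-up example data: label is True when the message counts as "long".
initial_training = [("hi", False), ("a long message", True)]
held_out_test = [("ok", False), ("hello", True), ("greetings", True), ("no", False)]
production_batches = [[("howdy", True), ("yes", False)]]

registry = run_lifecycle(initial_training, held_out_test, production_batches)
for entry in registry:
    print(entry)
```

In this sketch the first version misses the accuracy target on the held-out test data, so a batch of labeled production data is folded into the training set and a second version is trained. The registry keeps both versions, which mirrors the idea of tracking a model's evolution.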