In this blog, we are going to discuss the following topics:
- MLOps stack requirements
- The AI model life cycle
- How to use the Katonic MLOps platform
We have already discussed what MLOps is in an earlier post: Click to read.
We should also know which tools and technologies make up an MLOps stack.
We will then walk through the MLOps life cycle, since MLOps is not a single technology or platform but a set of technologies working together in a systematic way to achieve a goal.
MLOps Stack Requirements
- Data gathering and preparation
For data gathering and preparation, the platform should be able to ingest batch as well as streaming data to automate the process. You have to build a pipeline that takes in any batch or streaming data and performs the cleaning and transformation needed to prepare it for model fitting.
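As a minimal sketch, a batch preparation step might look like the following, assuming a pandas DataFrame loaded from a hypothetical CSV file with made-up column names; a streaming variant would apply the same cleaning per micro-batch:

```python
# A minimal batch data-preparation sketch. "raw_events.csv" and the column
# names ("target", "amount", "category") are hypothetical placeholders.
import pandas as pd

def prepare_batch(path: str) -> pd.DataFrame:
    df = pd.read_csv(path)                      # gather a batch of raw data
    df = df.drop_duplicates()                   # cleaning: remove duplicate rows
    df = df.dropna(subset=["target"])           # drop rows missing the label
    df["amount"] = df["amount"].fillna(df["amount"].median())      # impute a numeric column
    df["category"] = df["category"].astype("category").cat.codes  # encode for the model
    return df

features = prepare_batch("raw_events.csv")
```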
- Source Control
It is a tool we need for saving versions of many entities. On an MLOps platform we experiment with a lot of data, so we need version control over that data to make work reproducible, i.e. at any point in time we can go back and check which data was used for which experiment.
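Dedicated tools like DVC handle this properly, but a content hash stored with each run illustrates the idea. A minimal sketch, assuming a file-based dataset with a hypothetical filename:

```python
# Record which dataset version an experiment used, so the run is reproducible.
import hashlib
import json

def dataset_fingerprint(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

run_metadata = {
    "dataset": "raw_events.csv",  # hypothetical file
    "dataset_sha256": dataset_fingerprint("raw_events.csv"),
}
with open("run_metadata.json", "w") as f:  # stored alongside the experiment
    json.dump(run_metadata, f, indent=2)
```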
- Experimentation
The MLOps platform should be data-scientist friendly. It should work out of the box with popular machine learning frameworks like PyTorch, Keras, TensorFlow, scikit-learn and many more.
- Hyperparameter Tuning
Choosing the correct hyperparameters for a machine learning or deep learning model is one of the best ways to extract the last bit of performance from it, so we need a hyperparameter tuning framework to select the combination that delivers the best results.
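A minimal sketch of this search with scikit-learn's GridSearchCV, on a small built-in dataset with an illustrative parameter grid:

```python
# Exhaustively search a small hyperparameter grid with cross-validation.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [3, 5, None],
}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_, search.best_score_)  # best combination and its CV score
```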
- Distributed Model Training
The MLOps platform should be able to automatically train the model using different sets of data. Training a model is resource intensive, so the platform should provide a service that can scale itself and perform distributed model training. The platform also needs CI/CD tools for scheduling and task queuing.
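As an illustration of the training half, here is a hedged skeleton of data-parallel training with PyTorch's DistributedDataParallel; the model and data are toy placeholders, and the script is assumed to be launched with `torchrun --nproc_per_node=N train.py`:

```python
# Each process (rank) trains on its own shard; gradients are averaged across ranks.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="gloo")   # "nccl" on GPU clusters
    rank = dist.get_rank()

    model = torch.nn.Linear(10, 1)            # placeholder model
    ddp_model = DDP(model)
    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)
    loss_fn = torch.nn.MSELoss()

    for step in range(100):
        inputs = torch.randn(32, 10)          # placeholder per-rank data shard
        targets = torch.randn(32, 1)
        optimizer.zero_grad()
        loss = loss_fn(ddp_model(inputs), targets)
        loss.backward()                       # gradient sync happens here
        optimizer.step()

    if rank == 0:
        print("final loss:", loss.item())
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```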
- Auto ML
The platform should include AutoML, which helps you iterate over hundreds of algorithms and automatically gives you the model with the best accuracy.
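The core idea can be sketched in a few lines: loop over candidate algorithms and keep the one with the best cross-validated accuracy. Real AutoML tools also search preprocessing steps and hyperparameters:

```python
# Try several model families and pick the best-scoring one.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(random_state=0),
    "svm": SVC(),
}

scores = {name: cross_val_score(est, X, y, cv=5).mean() for name, est in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores[best])  # the model family with the best accuracy
```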
- Deployment
The platform should provide automated deployment using container services like Docker and Kubernetes.
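What gets containerized is typically a small serving app. A minimal sketch with FastAPI, assuming a scikit-learn model pickled to a hypothetical `model.pkl`; run it with `uvicorn app:app`:

```python
# Serve a trained model behind an HTTP prediction endpoint.
import pickle

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

with open("model.pkl", "rb") as f:  # hypothetical model artifact
    model = pickle.load(f)

class PredictRequest(BaseModel):
    features: list[float]

@app.post("/predict")
def predict(req: PredictRequest):
    # model.predict expects a 2-D array: one row per sample
    prediction = model.predict([req.features])
    return {"prediction": prediction.tolist()}
```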
- Model Registry
It is a repository for hosting all the important artifacts of production-ready models, such as the training data used, the model type, and key artifacts.
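A hedged sketch of this using the open-source MLflow model registry, assuming MLflow is installed and a tracking location is configured; "churn-model" is a hypothetical registry name:

```python
# Log a trained model with its metadata, then register it under a versioned name.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

with mlflow.start_run() as run:
    mlflow.log_param("max_iter", 1000)        # key training metadata
    mlflow.sklearn.log_model(model, "model")  # store the model artifact

# promote the run's artifact into the registry under a versioned name
mlflow.register_model(f"runs:/{run.info.run_id}/model", "churn-model")
```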
- Feature Store
A feature store is a central repository of engineered features. It is used to facilitate collaboration between teams, reduce duplication, and reduce the feature engineering cost per model, which in turn increases the productivity of ML engineers.
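An illustrative toy sketch of the idea: features are computed once, registered by an entity key, and read back by any team. Production feature stores (e.g. Feast) add versioning, point-in-time correctness, and online/offline serving:

```python
# A toy feature store: register features once, reuse them across teams.
import pandas as pd

class SimpleFeatureStore:
    def __init__(self):
        self._tables: dict[str, pd.DataFrame] = {}

    def register(self, name: str, df: pd.DataFrame, key: str) -> None:
        # index by an entity key so any team can look features up by it
        self._tables[name] = df.set_index(key)

    def get_features(self, name: str, keys: list) -> pd.DataFrame:
        return self._tables[name].loc[keys]

store = SimpleFeatureStore()
customers = pd.DataFrame({"customer_id": [1, 2], "avg_order_value": [42.0, 18.5]})
store.register("customer_features", customers, key="customer_id")
print(store.get_features("customer_features", [1]))
```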
- Monitoring
Monitoring provides a real-time view of the performance of deployed models, along with an alerting mechanism that fires upon data drift or a deviation in model performance.
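A minimal sketch of drift detection on a single feature, comparing live traffic against the training distribution with a two-sample Kolmogorov-Smirnov test; the synthetic data and the alerting threshold are illustrative:

```python
# Flag a feature whose live distribution has shifted away from training.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
training_feature = rng.normal(0.0, 1.0, size=5_000)  # distribution at training time
live_feature = rng.normal(0.5, 1.0, size=1_000)      # shifted distribution in production

stat, p_value = ks_2samp(training_feature, live_feature)
if p_value < 0.01:  # hypothetical alerting threshold
    print(f"ALERT: possible data drift (KS={stat:.3f}, p={p_value:.2e})")
```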
- Governance
We require governance services to control access, implement policy, and track model activity, including the capability to trace an end-to-end trail of ML assets.
- Compliance and Audit Services
Compliance and audit services include authorization and authentication, and should detect and correct any biased or unethical outcomes.
The AI Model Life Cycle
The AI model life cycle represents how data from different sources flows through training to yield insightful information.
The data comes from sources like:
- SaaS
- Data Warehouse
- External systems
The data is fed into the automated pipeline, which includes the following processes to train a model:
- Data extraction
- Data preparation
- Feature engineering
- Model Training
- Model evaluation
- Model Validation
The offline data is processed through experimentation and prototyping, which includes:
- Problem definition
- Data selection
- Data exploration
- Feature engineering
- Model prototyping
- Model validation
After this, the resulting pipeline code is stored in the source repository, which is connected to the automated pipeline.
The resultant model is stored in the model registry, and the end result is an API that we can use in our web applications.
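Calling that API from a web application is then a plain HTTP request. A hedged example, assuming the serving endpoint sketched earlier is exposed at a hypothetical URL:

```python
# Query the deployed model's prediction endpoint.
import requests

response = requests.post(
    "http://models.example.com/predict",  # hypothetical endpoint
    json={"features": [5.1, 3.5, 1.4, 0.2]},
)
print(response.json())  # e.g. {"prediction": [0]}
```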
How to Use the Katonic MLOps Platform
Katonic is a collection of cloud-native tools for all stages of the model development life cycle (MDLC): data exploration, feature preparation, model training/tuning, model serving, model testing, and model versioning. Katonic provides a unified UI and tooling that allows these traditionally separate tools to work seamlessly together. An important part of this tooling is the pipeline system, which allows users to build integrated end-to-end pipelines that connect all components of their MDLC.
Katonic is for both data scientists and data engineers looking to build production-grade machine learning implementations. Katonic can run either locally in your development environment or on a production cluster. Often, pipelines are developed locally and migrated to the cluster once they are ready. Katonic provides a unified system, leveraging Kubernetes for containerization and scalability, for the portability and repeatability of its pipelines.
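As an illustration of what such an end-to-end pipeline looks like, here is a minimal sketch in the style of the open-source Kubeflow Pipelines SDK; the component bodies and names such as "demo-pipeline" and the storage URIs are hypothetical placeholders, not Katonic's own API:

```python
# Chain preparation and training steps into one pipeline DAG.
from kfp import dsl

@dsl.component
def prepare_data() -> str:
    # in a real pipeline: pull from the source, clean, and emit a dataset URI
    return "s3://bucket/prepared.csv"  # hypothetical location

@dsl.component
def train_model(data_uri: str) -> str:
    # in a real pipeline: fit the model on the prepared data
    print(f"training on {data_uri}")
    return "s3://bucket/model.pkl"  # hypothetical artifact

@dsl.pipeline(name="demo-pipeline")
def demo_pipeline():
    data = prepare_data()
    train_model(data_uri=data.output)  # steps chained into one DAG

# compile to a spec that a Kubernetes-based pipeline engine can run
if __name__ == "__main__":
    from kfp import compiler
    compiler.Compiler().compile(demo_pipeline, "demo_pipeline.yaml")
```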
To learn how you can use the Katonic MLOps platform, here is an End to end Katonic Walkthrough.
Hope you found this blog insightful! ✨