In this blog, we are going to discuss the following topics:
- MLOps stack requirements
- The AI model life cycle
- How to use the Katonic MLOps platform
We have already discussed what MLOps is in an earlier post: Click to read.
We should also know which tools and technologies make up an MLOps stack.
We will then walk through the MLOps life cycle, since MLOps is not a single technology or platform but a set of technologies working together in a systematic way to achieve a goal.
MLOps Stack Requirements
- Data gathering and preparation
For data gathering and preparation, the platform should be able to ingest batch as well as streaming data to automate the process. You have to build a pipeline that takes in any batch or streaming data and performs the cleaning and transformation needed to prepare it for model fitting.
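As a minimal sketch, a batch preparation step might look like the following, assuming a pandas DataFrame loaded from a hypothetical CSV file with made-up column names; a streaming variant would apply the same cleaning per micro-batch:

```python
# A minimal batch data-preparation sketch. "raw_events.csv" and the column
# names ("target", "amount", "category") are hypothetical placeholders.
import pandas as pd

def prepare_batch(path: str) -> pd.DataFrame:
    df = pd.read_csv(path)                      # gather a batch of raw data
    df = df.drop_duplicates()                   # cleaning: remove duplicate rows
    df = df.dropna(subset=["target"])           # drop rows missing the label
    df["amount"] = df["amount"].fillna(df["amount"].median())      # impute a numeric column
    df["category"] = df["category"].astype("category").cat.codes  # encode for the model
    return df

features = prepare_batch("raw_events.csv")
```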
- Source Control
It is a tool we need for saving versions of many entities. On an MLOps platform we experiment with a lot of data, so we need version control over that data to make work reproducible, i.e. at any point in time we can go back and check which data was used for which experiment.
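Dedicated tools like DVC handle this properly, but a content hash stored with each run illustrates the idea. A minimal sketch, assuming a file-based dataset with a hypothetical filename:

```python
# Record which dataset version an experiment used, so the run is reproducible.
import hashlib
import json

def dataset_fingerprint(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

run_metadata = {
    "dataset": "raw_events.csv",  # hypothetical file
    "dataset_sha256": dataset_fingerprint("raw_events.csv"),
}
with open("run_metadata.json", "w") as f:  # stored alongside the experiment
    json.dump(run_metadata, f, indent=2)
```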
- Experimentation
The MLOps platform should be data-scientist friendly. It should work out of the box with popular machine learning frameworks like PyTorch, Keras, TensorFlow, scikit-learn and many more.
- Hyperparameter Tuning
Choosing the correct hyperparameters for a machine learning or deep learning model is one of the best ways to extract the last bit of performance from it, so we need a hyperparameter tuning framework to select the combination that delivers the best results.
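A minimal sketch of this search with scikit-learn's GridSearchCV, on a small built-in dataset with an illustrative parameter grid:

```python
# Exhaustively search a small hyperparameter grid with cross-validation.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [3, 5, None],
}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_, search.best_score_)  # best combination and its CV score
```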
- Distributed Model Training
The MLOps platform should be able to automatically train the model using different sets of data. Training a model is resource intensive, so the platform should provide a service that can scale itself and perform distributed model training. The platform also needs CI/CD tools for scheduling and task queuing.
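As an illustration of the training half, here is a hedged skeleton of data-parallel training with PyTorch's DistributedDataParallel; the model and data are toy placeholders, and the script is assumed to be launched with `torchrun --nproc_per_node=N train.py`:

```python
# Each process (rank) trains on its own shard; gradients are averaged across ranks.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="gloo")   # "nccl" on GPU clusters
    rank = dist.get_rank()

    model = torch.nn.Linear(10, 1)            # placeholder model
    ddp_model = DDP(model)
    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)
    loss_fn = torch.nn.MSELoss()

    for step in range(100):
        inputs = torch.randn(32, 10)          # placeholder per-rank data shard
        targets = torch.randn(32, 1)
        optimizer.zero_grad()
        loss = loss_fn(ddp_model(inputs), targets)
        loss.backward()                       # gradient sync happens here
        optimizer.step()

    if rank == 0:
        print("final loss:", loss.item())
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```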
- Auto ML
The platform should include AutoML, which helps you iterate over hundreds of algorithms and automatically gives you the model with the best accuracy.
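The core idea can be sketched in a few lines: loop over candidate algorithms and keep the one with the best cross-validated accuracy. Real AutoML tools also search preprocessing steps and hyperparameters:

```python
# Try several model families and pick the best-scoring one.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(random_state=0),
    "svm": SVC(),
}

scores = {name: cross_val_score(est, X, y, cv=5).mean() for name, est in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores[best])  # the model family with the best accuracy
```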
- Deployment
The platform should provide automated deployment using container services like Docker and Kubernetes.
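What gets containerized is typically a small serving app. A minimal sketch with FastAPI, assuming a scikit-learn model pickled to a hypothetical `model.pkl`; run it with `uvicorn app:app`:

```python
# Serve a trained model behind an HTTP prediction endpoint.
import pickle

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

with open("model.pkl", "rb") as f:  # hypothetical model artifact
    model = pickle.load(f)

class PredictRequest(BaseModel):
    features: list[float]

@app.post("/predict")
def predict(req: PredictRequest):
    # model.predict expects a 2-D array: one row per sample
    prediction = model.predict([req.features])
    return {"prediction": prediction.tolist()}
```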
- Model Registry
It is a repository for hosting all the important artifacts of production-ready models, such as the training data used, the model type, and key artifacts.
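A hedged sketch of this using the open-source MLflow model registry, assuming MLflow is installed and a tracking location is configured; "churn-model" is a hypothetical registry name:

```python
# Log a trained model with its metadata, then register it under a versioned name.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

with mlflow.start_run() as run:
    mlflow.log_param("max_iter", 1000)        # key training metadata
    mlflow.sklearn.log_model(model, "model")  # store the model artifact

# promote the run's artifact into the registry under a versioned name
mlflow.register_model(f"runs:/{run.info.run_id}/model", "churn-model")
```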
- Feature Store
A feature store is a central repository of engineered features. It is used to facilitate collaboration between teams, reduce duplication, and reduce the feature engineering cost per model, which in turn increases the productivity of ML engineers.
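An illustrative toy sketch of the idea: features are computed once, registered by an entity key, and read back by any team. Production feature stores (e.g. Feast) add versioning, point-in-time correctness, and online/offline serving:

```python
# A toy feature store: register features once, reuse them across teams.
import pandas as pd

class SimpleFeatureStore:
    def __init__(self):
        self._tables: dict[str, pd.DataFrame] = {}

    def register(self, name: str, df: pd.DataFrame, key: str) -> None:
        # index by an entity key so any team can look features up by it
        self._tables[name] = df.set_index(key)

    def get_features(self, name: str, keys: list) -> pd.DataFrame:
        return self._tables[name].loc[keys]

store = SimpleFeatureStore()
customers = pd.DataFrame({"customer_id": [1, 2], "avg_order_value": [42.0, 18.5]})
store.register("customer_features", customers, key="customer_id")
print(store.get_features("customer_features", [1]))
```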
- Monitoring
Monitoring provides a real-time view of the performance of deployed models, along with an alerting mechanism that fires upon data drift or a deviation in model performance.
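A minimal sketch of drift detection on a single feature, comparing live traffic against the training distribution with a two-sample Kolmogorov-Smirnov test; the synthetic data and the alerting threshold are illustrative:

```python
# Flag a feature whose live distribution has shifted away from training.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
training_feature = rng.normal(0.0, 1.0, size=5_000)  # distribution at training time
live_feature = rng.normal(0.5, 1.0, size=1_000)      # shifted distribution in production

stat, p_value = ks_2samp(training_feature, live_feature)
if p_value < 0.01:  # hypothetical alerting threshold
    print(f"ALERT: possible data drift (KS={stat:.3f}, p={p_value:.2e})")
```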
- Governance
We require governance services to control access, implement policy, and track model activity, including the capability to trace an end-to-end trail of ML assets.
- Compliance and Audit Services
Compliance and audit services include authorization and authentication, and should detect and correct any biased or unethical outcomes.
The AI Model Life Cycle
The AI model life cycle represents how data from different sources flows through training to yield insightful information.
The data comes from sources like:
- SaaS
- Data Warehouse
- External systems
The data is fed into the automated pipeline, which includes the following processes to train a model:
- Data extraction
- Data preparation
- Feature engineering
- Model Training
- Model evaluation
- Model Validation
The offline data is processed through experimentation and prototyping, which includes:
- Problem definition
- Data selection
- Data exploration
- Feature engineering
- Model prototyping
- Model validation
After this, the resulting pipeline code is stored in the source repository, which is connected to the automated pipeline.
The resultant model is stored in the model registry, and the end result is an API that we can use in our web applications.
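Calling that API from a web application is then a plain HTTP request. A hedged example, assuming the serving endpoint sketched earlier is exposed at a hypothetical URL:

```python
# Query the deployed model's prediction endpoint.
import requests

response = requests.post(
    "http://models.example.com/predict",  # hypothetical endpoint
    json={"features": [5.1, 3.5, 1.4, 0.2]},
)
print(response.json())  # e.g. {"prediction": [0]}
```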
How to Use the Katonic MLOps Platform
Katonic is a collection of cloud-native tools for all stages of the model development life cycle (MDLC): data exploration, feature preparation, model training/tuning, model serving, model testing, and model versioning. Katonic provides a unified UI and tooling that allows these traditionally separate tools to work seamlessly together. An important part of this tooling is the pipeline system, which allows users to build integrated end-to-end pipelines that connect all components of their MDLC.
Katonic is for both data scientists and data engineers looking to build production-grade machine learning implementations. Katonic can run either locally in your development environment or on a production cluster. Often, pipelines are developed locally and migrated to the cluster once they are ready. Katonic provides a unified system, leveraging Kubernetes for containerization and scalability, for the portability and repeatability of its pipelines.
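As an illustration of what such an end-to-end pipeline looks like, here is a minimal sketch in the style of the open-source Kubeflow Pipelines SDK; the component bodies and names such as "demo-pipeline" and the storage URIs are hypothetical placeholders, not Katonic's own API:

```python
# Chain preparation and training steps into one pipeline DAG.
from kfp import dsl

@dsl.component
def prepare_data() -> str:
    # in a real pipeline: pull from the source, clean, and emit a dataset URI
    return "s3://bucket/prepared.csv"  # hypothetical location

@dsl.component
def train_model(data_uri: str) -> str:
    # in a real pipeline: fit the model on the prepared data
    print(f"training on {data_uri}")
    return "s3://bucket/model.pkl"  # hypothetical artifact

@dsl.pipeline(name="demo-pipeline")
def demo_pipeline():
    data = prepare_data()
    train_model(data_uri=data.output)  # steps chained into one DAG

# compile to a spec that a Kubernetes-based pipeline engine can run
if __name__ == "__main__":
    from kfp import compiler
    compiler.Compiler().compile(demo_pipeline, "demo_pipeline.yaml")
```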
To learn how you can use the Katonic MLOps platform, here is an End to end Katonic Walkthrough.
Hope you found this blog insightful! ✨