Daniel Chernenkov

Kubeflow Intro — Machine Learning on Kubernetes

If you’ve been following the tech news, it’s a little hard to miss the buzz around Kubernetes and Kubeflow. 

In this short guide, we are going to discuss some of Kubernetes’s and Kubeflow’s basic concepts, what sort of problems they solve, and how you can use them in your development endeavors.

As our data grows, so does the time it takes to build our models. As part of our solution, we started looking for a new way to distribute the building blocks of our machine learning models, one that lets us build highly available, scalable software that’s also easy to manage and update.

What is Kubeflow and what is it used for

Kubeflow was initially created to provide a more straightforward way to run TensorFlow jobs on Kubernetes, and it was based on a TensorFlow Extended pipeline. It was later extended to support various architectures and clouds so that it could serve as a general machine learning pipeline framework. In short, it’s the machine learning toolkit for Kubernetes.

Kubeflow was launched as a response to two huge IT trends that were beginning to earn the spotlight: cloud-native architectures and data science/machine learning. Kubeflow runs on Kubernetes clusters, either locally or in the cloud, and can distribute model training across multiple devices to speed up the process.
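To make the idea of training on multiple devices more concrete, here is a minimal sketch of what a distributed TensorFlow training job can look like as a Kubeflow TFJob resource. It only builds the manifest as a Python dictionary and prints it as JSON; nothing is sent to a cluster. The job name, image name, and worker count are hypothetical placeholders, and the sketch assumes the TFJob operator (API version kubeflow.org/v1) is installed in the cluster.

```python
import json


def tfjob_manifest(name, image, workers):
    """Build a TFJob manifest as a plain dict (no cluster access needed)."""
    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "TFJob",
        "metadata": {"name": name},
        "spec": {
            "tfReplicaSpecs": {
                "Worker": {
                    # Each replica is a pod running the same training image.
                    "replicas": workers,
                    "restartPolicy": "OnFailure",
                    "template": {
                        "spec": {
                            "containers": [
                                # The TFJob operator expects the training
                                # container to be named "tensorflow".
                                {"name": "tensorflow", "image": image}
                            ]
                        }
                    },
                }
            }
        },
    }


# Hypothetical job name and image, used only for illustration.
manifest = tfjob_manifest("mnist-train", "example.com/mnist-train:latest", workers=3)
print(json.dumps(manifest, indent=2))
```

Since Kubernetes accepts JSON manifests as well as YAML, the printed output could be saved to a file and applied with kubectl. Scaling the job up or down is then a matter of changing the worker count rather than the training image.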

Kubeflow’s tools help developers and engineers build machine learning models, analyze their performance, tune hyperparameters, deploy models to production, and manage compute resources.

Kubeflow is built around three main principles that simplify machine learning: composability, scalability, and portability. Of course, developers can build containerized machine learning pipelines on Kubernetes without Kubeflow, but Kubeflow helps standardize the process and make it more efficient. It also makes it easier to configure deployments to use hardware accelerators without changing the model code.
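As a rough illustration of how an accelerator can be requested without touching the training code, the sketch below adds a GPU limit to a container spec. On Kubernetes, GPUs are requested through the container's resource limits (for NVIDIA devices the resource name is nvidia.com/gpu, assuming the NVIDIA device plugin is installed on the cluster). The helper name with_gpus, the container name, and the image are hypothetical, and the sketch assumes the training script itself picks up a GPU when one is available.

```python
import copy


def with_gpus(container, gpu_count, resource_name="nvidia.com/gpu"):
    """Return a copy of a container spec that requests `gpu_count` GPUs.

    Only the resource limits change; the image and command stay the same.
    """
    updated = copy.deepcopy(container)
    limits = updated.setdefault("resources", {}).setdefault("limits", {})
    limits[resource_name] = gpu_count
    return updated


# Hypothetical CPU-only container spec for a training step.
cpu_container = {
    "name": "trainer",
    "image": "example.com/trainer:latest",
    "command": ["python", "train.py"],
}

gpu_container = with_gpus(cpu_container, 2)
print(gpu_container["resources"])
```

The same image and command run in both variants; the difference lives entirely in the deployment configuration.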

Kubeflow can be used as a multi-cloud framework and for monitoring, workflow management, documentation, and model deployment.

Kubeflow integrates TensorBoard, which provides tools for visualizing the training process and spotting problems early. Being able to monitor training as it runs lets developers and engineers refine the model’s parameters in real time, save resources, and shorten build times through rapid iteration.

There are several ways to deploy a model with Kubeflow, starting with KFServing, which supports multiple machine learning frameworks such as PyTorch, scikit-learn, and TensorFlow.
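For a sense of what a KFServing deployment involves, here is a minimal sketch that builds an InferenceService manifest as a Python dictionary. It assumes the KFServing v1beta1 API (serving.kubeflow.org/v1beta1), where the predictor section names the framework and points at the stored model. The service name and storage URI are hypothetical placeholders, and the helper only covers the three frameworks mentioned above.

```python
import json

# Predictor keys for the frameworks mentioned in the text
# (TensorFlow, PyTorch, scikit-learn).
PREDICTOR_KEYS = {"tensorflow", "pytorch", "sklearn"}


def inference_service(name, framework, storage_uri):
    """Build a KFServing InferenceService manifest as a plain dict."""
    if framework not in PREDICTOR_KEYS:
        raise ValueError(f"unsupported framework in this sketch: {framework}")
    return {
        "apiVersion": "serving.kubeflow.org/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": {
            "predictor": {
                # The framework key tells KFServing which model server to run.
                framework: {"storageUri": storage_uri}
            }
        },
    }


# Hypothetical service name and model location, for illustration only.
service = inference_service("mnist", "tensorflow", "gs://example-bucket/models/mnist")
print(json.dumps(service, indent=2))
```

Switching frameworks means changing the predictor key and the model location; the rest of the manifest stays the same.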

How Kubeflow elevates an ML workflow

Constructing, training and deploying ML systems is an iterative process consisting of several stages. One needs to evaluate the output of various stages of the ML workflow and apply changes to the model and parameters when necessary. The diagram below showcases how Kubeflow contributes to each stage:

The next diagram shows a concrete example of an ML workflow that trains and serves a model on the MNIST dataset:


In the race to build highly available, scalable software that’s also easy to manage and update, containers are quickly becoming a viable option for more and more companies across many different types of industries. As more organizations look toward container-driven infrastructure, open-source tools like Kubernetes (K8s) have rapidly become industry standards for automating containerized applications’ deployment, management, and scaling.

This surge of interest in K8s has led to newer platforms like Kubeflow, which aims to make managing K8s even easier by abstracting most of the work away from users and letting them focus on their data science projects.



