The rise of machine learning has brought new challenges to software delivery, requiring adaptations of traditional DevOps practices. Models are no longer code—they depend on data and parameters that constantly change.
We spoke with Denys Tovstohan about how MLOps addresses these challenges, the tools and practices involved, and the cultural shifts needed for effective collaboration between DevOps engineers and data scientists.
Background & experience:
Over 10 years of experience in software engineering and DevOps. Denys leads MLOps initiatives, specialising in Python development, cloud infrastructure, and implementing scalable, efficient, and forward-looking machine learning solutions.
How do you see the role of DevOps practices in working with ML models?
Denys Tovstohan: When we talk about the role of MLOps, it's essentially an extension of existing DevOps practices and the ability to apply them in the world of machine learning.
In classical DevOps, everything is more deterministic: if we have the same code and the same input data, we'll always get the same result — the artefact remains unchanged as long as the code doesn't change.
In MLOps, the situation is different. Not only does the code change, but the data and model parameters do as well. These three factors together influence the outcome. Even without changing the code, a model's behaviour can vary simply because the data has changed. That's why MLOps deals with a much broader set of challenges.
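As an illustration, the three ingredients that jointly determine a model artefact can be combined into a single fingerprint: if any one of them changes, the artefact changes. This is a minimal sketch, with hypothetical names and sample values, not the API of any specific tool:

```python
import hashlib
import json

def model_fingerprint(code: str, data: bytes, params: dict) -> str:
    """Combine code, training data, and parameters into one reproducible
    identifier. If any of the three changes, the fingerprint changes,
    even when the code itself is untouched."""
    h = hashlib.sha256()
    h.update(code.encode("utf-8"))
    h.update(data)
    # Serialise params with sorted keys so the hash is deterministic.
    h.update(json.dumps(params, sort_keys=True).encode("utf-8"))
    return h.hexdigest()

code = "def predict(x): return w * x + b"
params = {"learning_rate": 0.01, "epochs": 10}

v1 = model_fingerprint(code, b"training-batch-A", params)
v2 = model_fingerprint(code, b"training-batch-B", params)
print(v1 != v2)  # same code and params, new data: a different artefact
```

This is exactly the contrast with classical DevOps: there, hashing the code alone would be enough to identify the artefact.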
An MLOps engineer's job involves continuously monitoring changes in parameters, data, and model performance to determine whether the model is improving or degrading. This requires ongoing experimentation, comparison, and retraining, since the external environment is constantly evolving.
In real life, we can't do the same things every day under the same conditions. Each new day brings different weather, temperature, and challenges. The same applies to model behaviour. Data and circumstances change, which leads to model drift when the model begins to degrade and behave differently from what was initially expected.
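A minimal sketch of how such drift can be detected, under a deliberately simple assumption: a single numeric feature is flagged when its live mean moves several reference standard deviations away from the training-time distribution. The function name and thresholds are illustrative:

```python
import statistics

def mean_shift_drift(reference: list[float],
                     live: list[float],
                     threshold: float = 3.0) -> bool:
    """Flag drift when the live feature mean moves more than `threshold`
    reference standard deviations away from the reference mean."""
    ref_mean = statistics.mean(reference)
    ref_std = statistics.stdev(reference)
    z = abs(statistics.mean(live) - ref_mean) / ref_std
    return z > threshold

# Training-time distribution vs. two production windows.
reference = [10.0, 11.0, 9.5, 10.5, 10.2, 9.8]
stable = [10.1, 9.9, 10.4]
shifted = [15.2, 16.0, 15.5]   # the "weather" has changed

print(mean_shift_drift(reference, stable))   # False: behaviour as expected
print(mean_shift_drift(reference, shifted))  # True: retraining may be due
```

Production systems typically use richer statistical tests over many features, but the principle is the same: compare what the model sees now with what it was trained on.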
MLOps practices build on DevOps principles but adapt them for machine learning. For example, the tool Kubeflow is based on Kubernetes, which is now considered the de facto standard in the DevOps field.
This field is developing rapidly and resembles the early DevOps era around 2013–2016, when the demand for specialists grew sharply, but best practices were still being formed. The same is happening now with MLOps. I'm confident that by 2030, the demand for MLOps professionals will only continue to grow.
Today, the toolset has become more stable — Kubernetes and Docker have become standards, while new approaches such as AgentOps are adding layers of automation and intelligence. We now see the emergence of AI agents capable of validating pipeline logic and even "asking questions" about model performance or correctness.
An MLOps specialist stands at the intersection of DevOps and data science. On the one hand, they understand how to set up and manage infrastructure and push a model into production. On the other hand, they understand how the model behaves, from building and training to real-time monitoring. They can build pipelines, launch training processes, track metrics, and automate model updates.
However, MLOps doesn't replace DevOps — just as a good developer can't replace a great data scientist. MLOps is the next level that unites both worlds. DevOps ensures stability and delivery, while MLOps accounts for the behavioural dynamics of models and adapts to a constantly changing environment.
What are the biggest challenges in integrating MLOps into an existing CI/CD pipeline?
DT: As models become more complex and the surrounding ecosystem becomes increasingly dynamic, the challenges will only continue to grow.
1. Non-deterministic models: As I mentioned earlier, the main challenge lies in the fact that DevOps is based on deterministic artefacts — stable and predictable outputs. If we have the same code and the same data, we'll get the same result every time.
In MLOps, however, artefacts (i.e., models) are non-deterministic, and that's where the complexity begins. ML models are large and complex — they can weigh tens or even hundreds of gigabytes, and they depend simultaneously on code, data, and parameters. Any change in one of these components can affect the final result and the model's behaviour.
Therefore, we can't simply take existing DevOps practices and apply them "as is". MLOps introduces a third critical component: continuous training.
Because the external environment changes over time, models begin to degrade (model drift), and they must be retrained on new data to maintain performance and reliability.
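One way to express such a retraining decision is a simple rule that combines staleness, measured quality, and a drift signal. This is an illustrative sketch with hypothetical names and thresholds, not any particular platform's API:

```python
def should_retrain(days_since_training: int,
                   live_accuracy: float,
                   drift_detected: bool,
                   max_age_days: int = 30,
                   accuracy_floor: float = 0.90) -> bool:
    """A model is retrained when it is stale, when its measured quality
    falls below an agreed floor, or when input drift has been flagged."""
    return (days_since_training > max_age_days
            or live_accuracy < accuracy_floor
            or drift_detected)

print(should_retrain(10, 0.95, False))  # False: model is still healthy
print(should_retrain(10, 0.85, False))  # True: quality dropped
print(should_retrain(45, 0.95, False))  # True: model is stale
```

In a real pipeline this check would run on a schedule and trigger an automated training job rather than a print statement.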
2. The need for continuous training and monitoring: Another vital aspect is continuous monitoring — constant tracking and logging of model performance. This allows teams to detect changes, anomalies, or even potential attacks early. If a model starts producing incorrect results, the system must have alerts and metrics configured to immediately notify the team about issues.
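A toy sketch of such alerting, assuming a rolling window of per-batch accuracy and an agreed quality floor. All names and thresholds here are illustrative, not taken from any specific monitoring product:

```python
from collections import deque

class ModelHealthMonitor:
    """Track a rolling window of per-batch accuracy and record an alert
    the moment rolling performance drops below an agreed floor."""

    def __init__(self, floor: float = 0.90, window: int = 5):
        self.floor = floor
        self.scores = deque(maxlen=window)  # keeps only the last `window` scores
        self.alerts: list[str] = []

    def record(self, accuracy: float) -> None:
        self.scores.append(accuracy)
        rolling = sum(self.scores) / len(self.scores)
        if rolling < self.floor:
            self.alerts.append(
                f"rolling accuracy {rolling:.3f} below floor {self.floor}"
            )

monitor = ModelHealthMonitor(floor=0.90, window=3)
for acc in [0.95, 0.94, 0.93, 0.80, 0.75]:  # performance starts degrading
    monitor.record(acc)

print(len(monitor.alerts))  # 2: the last two batches breach the floor
```

In production, the `alerts` list would instead be a pager or chat notification wired into the team's incident workflow.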
3. Constant adaptation of security policies: Security policies need to be adapted for each model version individually. Unlike traditional code, you can't just deploy a firewall or run a vulnerability scanner — a retrained model might behave differently every time. The constantly evolving environment requires continuous updates to policies and monitoring systems.
A good example can be found in large language models like ChatGPT. For instance, after the release of GPT-5, many users reported that it performed worse than GPT-4 on certain tasks and fell short on security and safety metrics. This shows that models react to environmental and data shifts in unpredictable ways — something traditional DevOps pipelines aren't designed to handle.
There are also more serious ethical and security cases. For example, a researcher once tricked a language model, through a game scenario, into generating detailed instructions for making dangerous items at home. Such incidents underline the importance of real-time cybersecurity monitoring and safety mechanisms that can detect and respond to abnormal model behaviour.
4. Evolving model behaviour influenced by users and the environment: User behaviour itself is evolving. People are starting to treat AI models not just as tools but as companions, advisors, or even psychological supports. This creates a new, dynamic environment that machine learning systems must continuously adapt to.
What tools do you consider the most promising for a unified software supply chain?
DT: There is no single universal tool that can be used to build a unified software supply chain. Everything depends on the technologies a company already uses in its infrastructure and development processes.
For example, if we talk about cloud providers such as Google Cloud or AWS, it makes sense to use tools that are natively supported within their ecosystems.
Google Cloud's Vertex AI, for instance, is built on MLOps principles and deeply integrated with Kubernetes. To be more specific, MLOps is an extension of DevOps — it uses the same tools, but in a new way, adapted for machine learning. So, if a company already operates within the Google Cloud ecosystem, that becomes a natural choice.
On the other hand, if we take Databricks as an example, it has its own ecosystem that uses MLflow under the hood for experiment tracking and model management. In that case, leveraging MLflow makes perfect sense, since it's already integrated into the platform.
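To illustrate the idea behind experiment tracking — which MLflow implements far more completely — here is a standard-library-only sketch that logs each run's parameters and metrics and retrieves the best run. All names are hypothetical:

```python
import json
import tempfile
import time
import uuid
from pathlib import Path

class RunTracker:
    """Log each training run's parameters and metrics to a JSON file,
    so runs can be compared and the best model traced back to the
    exact configuration that produced it."""

    def __init__(self, store: Path):
        self.store = store
        self.store.mkdir(parents=True, exist_ok=True)

    def log_run(self, params: dict, metrics: dict) -> str:
        run_id = uuid.uuid4().hex
        record = {"run_id": run_id, "timestamp": time.time(),
                  "params": params, "metrics": metrics}
        (self.store / f"{run_id}.json").write_text(json.dumps(record))
        return run_id

    def best_run(self, metric: str) -> dict:
        runs = [json.loads(p.read_text()) for p in self.store.glob("*.json")]
        return max(runs, key=lambda r: r["metrics"][metric])

tracker = RunTracker(Path(tempfile.mkdtemp()))
tracker.log_run({"lr": 0.1}, {"accuracy": 0.88})
tracker.log_run({"lr": 0.01}, {"accuracy": 0.93})
print(tracker.best_run("accuracy")["params"])  # {'lr': 0.01}
```

MLflow adds artefact storage, a UI, and a model registry on top of this same core idea, which is why it makes sense to use it when the platform already ships it.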
It's always a combination of technologies working together: orchestration with Kubernetes or Kubeflow, experiment tracking with MLflow, and managed platforms such as Vertex AI or Databricks, depending on the ecosystem.
The ultimate goal is to integrate these components seamlessly, enabling faster development cycles, automated experimentation, and reliable traceability and governance of models.
How does the shift in collaboration culture between DevOps engineers and data scientists affect the speed and quality of ML solution delivery?
DT: The collaboration culture between DevOps and data scientists still differs significantly. In DevOps, the approach is built around deterministic artefacts — pieces of code that can be reliably deployed and are expected to behave consistently across environments. For DevOps teams, the key priorities are deployment, security, CI/CD, logging, and stability.
For data scientists, the picture is entirely different. A machine learning model depends heavily on the quality of data, which constantly changes. You can't simply "package" a model as code and hand it over to DevOps — it won't behave as predictably as a traditional application.
In most cases, data scientists work locally — they create, train, and test models on their machines or in cloud notebooks (such as Google Colab). But deploying a model to production often involves cloud migration of data pipelines and infrastructure, in addition to integration, monitoring, and retraining — all of which extend far beyond classical DevOps practices.
The problem is that DevOps doesn't see the data context. From their perspective, everything looks fine — the deployment succeeded, firewalls are in place, and logging and monitoring are functioning correctly. However, from the model's point of view, things might be going wrong: data patterns could have changed, or new types of input could appear that the model doesn't recognise. In other words, the infrastructure remains stable, but the model degrades.
That's why there's a growing need for professionals who understand both worlds — DevOps and data science. Such a person can explain where intervention is needed, which steps can be automated, how to organise retraining workflows, what data to store, and how to track changes over time.
The biggest challenge lies in working with data. We all go about our daily routines — having coffee, buying things, going to work, and so on. But in its raw form, data has no value. Only after it's processed, cleaned, and integrated through data engineering does it become useful for models. The ability to handle data pipelines — processing, storage, and retraining — is precisely what differentiates MLOps from DevOps.
So, DevOps engineers and data scientists alone often can't fully understand each other. There needs to be a bridge — the MLOps role, which connects both sides. If a model starts performing poorly, the MLOps specialist investigates whether the issue lies in the data, code, parameters, or security policies.
For DevOps, the main goal is a stable and secure system, while data scientists care about the quality of the model's output. Sometimes a model can generate unsafe or inaccurate results — and DevOps might not even notice because the system is technically "working." This highlights the importance of closer collaboration between the two sides.
What are your thoughts on the cultural shift toward MLOps roles?
DT: The cultural shift is already happening. Large companies are increasingly involving DevOps engineers in ML-related projects, especially due to the shortage of experienced MLOps specialists.
In my opinion, most MLOps professionals today come either from a data science background or from software engineering. DevOps engineers rarely transition into MLOps — their workload in maintaining infrastructure is already substantial. In contrast, developers or data scientists who want a deeper understanding of infrastructure often move into MLOps roles.
I came into MLOps from a software development background. I've always enjoyed the DevOps side — configuration, automation, and system design. And yes, programming knowledge is required at a higher level than in traditional DevOps, because you need to understand not only code, but also model logic and data behaviour.