Atomiqx Technologies Co.

Jin Cong Ho

In this series, the author shares his hands-on journey building real-time ML pipelines using available open-source tools and documents the difficulties along the way.

How much value can you generate by updating models with real-time streams? Photo by Todd Trapani on Unsplash

In Machine learning is going real-time, author Chip Huyen classifies two levels of real-time machine learning systems:

  • Level 1: ML systems that can make predictions in real-time (online predictions)
  • Level 2: ML systems that can continuously learn from new data and update the model in real-time (online learning)

Systems at the first level are capable of making relevant, timely predictions, such as ad ranking, fall detection, fraud detection, etc. Predicting a stock's value in real time is more valuable than a month later, right? For Level 1 ML systems, the technical challenges are mostly solved by maturing open-source tools such as Apache Flink (stream processing), Kafka (event-driven architecture), model compression, Kubeflow (workflow management) and Seldon (autoscaling model serving). All of these tools have clear development roadmaps for continuously improving model inference speed and easing pipeline development.
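As a toy sketch of what a Level 1 system does, consider a model trained offline that scores events the moment they arrive. This is illustrative only: a plain Python list stands in for a Kafka topic, and the data and function names are invented for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Train offline on historical data (toy, synthetic example).
X_hist = np.array([[0.0], [1.0], [2.0], [3.0]])
y_hist = np.array([0, 0, 1, 1])
model = LogisticRegression().fit(X_hist, y_hist)

def predict_stream(events):
    """Score each incoming event immediately (online prediction).

    In a real Level 1 system, `events` would be a Kafka consumer or a
    Flink stream rather than an in-memory iterable.
    """
    for event in events:
        yield model.predict(np.array([[event]]))[0]

predictions = list(predict_stream([0.5, 2.5]))
```

The model itself is static here; only the *predictions* happen in real time, which is exactly what separates Level 1 from Level 2.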

The harder part, however, is implementing Level 2 ML systems that can do online learning and update models with new data in real time. There is little discussion and no consensus in the MLOps community yet on how to build them. In fact, SIG MLOps from the CD Foundation lists online learning as 'research required' until 2024. So, how easy is it to build a real-time machine learning pipeline with current open-source tools?

Wait, before we go further down this road, why do we want online learning at all? In recent years, the maturity of stream processors has allowed DataOps to (1) process real-time data in a scalable manner and (2) unify batch and stream processing, moving from the Lambda architecture to the simpler Kappa architecture. These processors feed constant streams of real-time data into our pipelines and make us ask: can we extract more value from this data in real time?

The answer (you’ve probably guessed it): yes, we can. Online learning updates our ML models more frequently to handle cases such as trending content on social media, rare events (Black Friday) and even cold starts (new user behaviours). It shines when you need to improve model performance quickly in response to dynamic data. As such, it’s worth asking: how much value can you generate by increasing the frequency of model updates?
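A minimal sketch of what "updating the model with new data" can look like, using scikit-learn's `partial_fit`, which updates model weights incrementally batch by batch. The mini-batches below are synthetic stand-ins for data arriving off a stream; this is one simple way to do online learning, not the only one.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier(random_state=42)
classes = np.array([0, 1])  # all classes must be declared up front for partial_fit

rng = np.random.default_rng(0)
for _ in range(50):  # each iteration = one micro-batch from the stream
    X = rng.normal(size=(16, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)  # synthetic, linearly separable labels
    model.partial_fit(X, y, classes=classes)

# The continuously updated model can score fresh events at any point.
preds = model.predict([[2.0, 2.0], [-2.0, -2.0]]).tolist()
```

Because `partial_fit` never revisits old batches, the model naturally tracks the most recent data, which is the behaviour you want for trending content and cold starts.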

Okay, now that we understand the benefits, how should we design real-time ML pipelines? There are a few industry references available online and, unsurprisingly, these companies possess huge amounts of real-time user-interaction data. Before looking at them, it’s worth reviewing offline ML pipelines, because evolving them into online pipelines still involves the same processes (ingestion, training and deployment), except that the pipeline becomes a long-running continuous loop.
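The continuous loop above can be sketched schematically. Every name here (`CounterModel`, `validate`, `deploy`, the window size) is an illustrative placeholder, not a real framework API; the point is the shape of the loop: buffer incoming events, update the model, validate, then deploy.

```python
class CounterModel:
    """Toy 'model' that just tracks how many events it has trained on."""
    def __init__(self, seen=0):
        self.seen = seen

    def update(self, batch):
        # Stand-in for an incremental training step on one window of data.
        return CounterModel(self.seen + len(batch))

deployed = []

def validate(model):
    # Deployment gate: in practice, evaluate on held-out recent data.
    return model.seen > 0

def deploy(model):
    # Stand-in for swapping the serving model atomically.
    deployed.append(model)

def run_pipeline(stream, model, window=100):
    buffer = []
    for event in stream:
        buffer.append(event)                  # ingestion
        if len(buffer) >= window:
            model = model.update(buffer)      # training (incremental)
            if validate(model):
                deploy(model)                 # deployment, gated on validation
            buffer.clear()
    return model

final = run_pipeline(range(250), CounterModel(), window=100)
```

With 250 events and a window of 100, two model updates are deployed and the last 50 events stay buffered, which mirrors the "some delay in deployment to validate models" caveat in the figure above.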

(Source: https://www.oreilly.com/library/view/building-machine-learning/9781492053187/ch01.html; online ML pipelines update the model continuously, though deployment may be delayed to validate models)

During Flink Forward Virtual 2020, Weibo (the social media platform) shared the design of WML, its real-time ML architecture and pipeline. Essentially, they integrated previously separate offline and online model training into a unified pipeline with Apache Flink. The talk focused on how Flink is used to generate training samples by joining offline data (social posts, user profiles, etc.) with multiple streams of real-time interaction events (click stream, read stream, etc.) and extracted multimedia content features. No specific model architecture was mentioned, but they are working toward online training of DNNs.
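A toy illustration of the sample-generation step Weibo describes: enriching each real-time interaction event with offline features to form a labelled training sample. In WML this join runs inside Flink; here, plain Python dicts stand in for the offline tables and the event stream, and all field names are invented for the example.

```python
# Offline data, precomputed and looked up at join time.
user_profiles = {"u1": {"age": 30}, "u2": {"age": 25}}   # user profile features
post_features = {"p1": {"topic": "sports"}}              # extracted content features

# Real-time interaction events (stand-in for a click stream).
click_stream = [
    {"user": "u1", "post": "p1", "clicked": 1},
    {"user": "u2", "post": "p1", "clicked": 0},
]

def make_samples(events):
    """Join each event with offline features to produce a training sample."""
    for e in events:
        yield {
            **user_profiles[e["user"]],   # join offline user features
            **post_features[e["post"]],   # join extracted content features
            "label": e["clicked"],        # the interaction provides the label
        }

samples = list(make_samples(click_stream))
```

In a real deployment this lookup is itself a streaming problem (late events, out-of-order joins, state size), which is precisely why Weibo leans on Flink's stateful stream joins rather than in-memory dicts.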

