So far in our MLOps journey, we have created ML research and ML model-building pipelines and saved the resulting models in serialized form. Saving models this way allows us to take a serialized ML model and load it into an application.

We will now take the saved ML model…

Photo by Joshua Newton on Unsplash

In previous articles, we covered the basics of MLOps and set up our orchestrator. Now we will put together our Python applications and their interactions with Databricks Spark.

Why Databricks Spark?

We will be using Databricks Spark as a general platform to run our code. In some cases, we will use Spark to run code…

Photo by Gustavo Espíndola on Unsplash

In our first article, we introduced the basics of MLOps. Now we will talk about the core application in our tech stack, Airflow. Airflow will be the central orchestrator for all batch-related tasks.

Swapping technologies

This tech stack is designed for flexibility and scalability. There should be no issues using alternative tooling…

Photo by Sigmund on Unsplash

MLOps?

MLOps (Machine Learning Operations) is the practice of applying the lessons learned from DevOps to the productionisation of machine learning. Its role is to fill the gap between data scientists and the consumers of machine learning.

Machine Learning? Data Science?

Machine Learning can be understood as the process of applying a set of techniques…

Photo by Vincentiu Solomon on Unsplash

“Do not collect weapons or practice with weapons beyond what is useful.” Miyamoto Musashi, Dokkodo

Students of the Ichi school Way of Strategy should train from the start with the (normal) sword and the long sword in either hand. This is a truth: when you sacrifice your life, you must…

When working on multiple Python projects, it's common to run into issues with Python versioning and package management. I am going to introduce two projects to help you tackle these common issues. I’m not going to talk about the Conda project, simply because in my experience 90% of the time…

Photo by Fikri Rasyid on Unsplash

When starting a new project, it's a good idea to evaluate your data storage needs. I’m going to shy away from the term database and instead, I’ll use the term data store because oftentimes labels are loaded with baggage that will distract us. …

History

Data Engineering is a relatively new concept, although the skills have been around for some time. If you Google around, you will find that the skills, tools, and job responsibilities vary significantly. I take a broad, modern approach to the data engineering role. Many hyperspecialized roles also exist…

Photo by Phil Hearing on Unsplash

If you believe it, they believe it.

267th Ferengi Rule of Acquisition

All war is deception

Sun Tzu

The dangers of testing

Let's face it: testing software can be hard. Even with the best intentions, our tests can easily break. Tests that break this easily are called brittle unit tests.

Brittle unit tests

Unit testing has a very bad reputation…

Photo by Joshua Sortino on Unsplash

Note: I have avoided discussing the many Spark options available on the market and am instead focusing on Databricks, because they offer a very good, easy-to-use product and are vendor-neutral. …

Brian Lipp
