A Beginner's Guide to MLOps Stacks on Databricks
Equipped with an almost excessive amount of diagrams!
MLOps Stacks is a template using Databricks Asset Bundles (aka DABs) to implement an MLOps workflow. It is easily customizable, but if you are not familiar with DABs or MLOps, it can get overwhelming quite quickly. There are a lot of folders. A lot of files. But by the end of this blog, you will understand how to use this template for your own use case.
Instantiating your first MLOps Stack is quite easy. I would recommend creating a basic one using the instructions here and walking through this blog alongside it. You can also take a look at the template directly in the public GitHub repository: databricks/mlops-stacks.
Your project bundle is controlled by the databricks.yml file. This contains all of the configurations (e.g., what the STAGE workspace is, what the DEV workspace is, what the `prod` catalog is, and so on). This file also points to all of the workflows in the project. Surprise! These configurations are also written in YAML. And the workflow configurations point to the notebooks in the project.
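To make this more concrete, here is a minimal sketch of what the targets in a databricks.yml could look like. The bundle name, workspace URLs, and catalog variable below are placeholders for illustration, not the template's actual values:

```yaml
# Illustrative databricks.yml sketch; all names and URLs are placeholders
bundle:
  name: my_mlops_project

include:
  # The workflow (job) configurations referenced by the bundle
  - ./resources/*.yml

variables:
  catalog:
    description: Unity Catalog catalog to read from and write to
    default: dev

targets:
  dev:
    workspace:
      host: https://my-dev-workspace.cloud.databricks.com
  test:
    workspace:
      host: https://my-staging-workspace.cloud.databricks.com
    variables:
      catalog: test
  prod:
    workspace:
      host: https://my-prod-workspace.cloud.databricks.com
    variables:
      catalog: prod
```

Each target maps a name (which you can pass to the Databricks CLI, more on that later) to a workspace and a set of variable values.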
This walkthrough follows the “Deploy code” approach, which is generally recommended by Databricks. This means that the code moves from development to staging and then to production, and the model is retrained in each environment. However, there are certain scenarios where “Deploy model” works better, such as when your model training process is quite expensive. This is why the “Deploy model” approach is more common in LLMOps, but more on that in a later blog post.
Let’s walk through a theoretical example of how this would work.
Development
Step One: Exploratory Data Analysis
In the DEV environment, we explore new data alongside existing production data stored in the `dev` catalog. Perhaps, during this exploration, there is a discovery! There is a eureka moment! Now, we need to train and tune a new model.
Step Two: Model Training
We are still in the DEV environment, but now we can use MLflow to keep track of everything. If you are unfamiliar with MLflow, I would recommend taking a look at this demo notebook linked here. In short, MLflow is a way for you to track your experiments and, within them, your runs (each run is one iteration of training). If that concept is still confusing, take a look at the docs here.
Using MLflow, we can log key metrics, parameters, and artifacts across different runs, enabling us to compare and contrast different trained models and pick the best one. Finally, once we are satisfied with the model that we have created, we can register it in the `dev` catalog.
Step Three: Push Changes
Let’s update the repository now! Within a Databricks workspace, we can create a temporary branch (here we call it `dev`) and add the updated notebooks to it.
Note: if you are following along via the MLOps Stacks template, you can see that there is a logical separation in the template structure:

- `feature_engineering` contains feature transformations. In this example, we are using the Databricks Feature Store, which is a centralized repository for managing and serving features.
- `monitoring` contains code for model monitoring. We separate this from validation because monitoring is ongoing in production, while validation occurs pre-deployment. In this template, `ModelValidation.py` is the second part of the model training workflow.
- `validation` contains code for model validation.
- `deployment` contains serving endpoint setup and configuration.
- `training` contains model training logic (e.g., all of the MLflow code).
This separation follows the principle of separation of concerns: each directory has a specific responsibility in the ML lifecycle. It also makes it easier to:

- have different teams work on different aspects of the lifecycle
- maintain and update specific parts of the pipeline
- reuse components across different projects
- implement proper testing of each component
Now, after the branch is created, we can commit the code directly to the `dev` branch. This moves us to the next part of the process.
Staging
Step One: Pull Request
Once our code is in the `dev` branch, we can open a Pull Request to merge the new changes into our `main` branch. It is time to test all of these changes.
Databricks Asset Bundles allow you to easily maintain Databricks resources, but you will still need a way to automate and run the workflows. I have personally used GitHub Actions a lot, but you can also set up Azure DevOps Pipelines, GitLab CI/CD pipelines, etc.
Step Two: Unit and Integration Tests
Opening a PR immediately triggers multiple workflows (via GitHub Actions or whatever CI/CD tool you have set up) for Unit Tests and Integration Tests.
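As a rough sketch (assuming GitHub Actions), the unit test workflow can be as simple as checking out the repository and running pytest on the runner. The file paths, Python version, and action versions below are illustrative assumptions:

```yaml
# .github/workflows/run-unit-tests.yml (illustrative sketch)
name: Unit Tests
on:
  pull_request:
    branches:
      - main

jobs:
  unit_tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.10'
      - name: Install dependencies
        run: pip install -r requirements.txt pytest
      - name: Run unit tests
        run: pytest tests/
```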
The Integration Tests workflow (link to GitHub Actions workflow) uses Databricks Asset Bundle (DAB) commands to create the assets in the Staging environment and trigger the workflows that run the necessary notebooks:
```bash
# Validate the bundle configuration
databricks bundle validate -t test
# Create all necessary assets in the STAGE environment
databricks bundle deploy -t test
# Execute the feature engineering pipeline
databricks bundle run write_feature_table_job -t test
# Execute the model training pipeline
databricks bundle run model_training_job -t test
```
The `-t test` flag specifies that these commands should target the STAGE environment, as defined in the databricks.yml file.
We are running two jobs: the feature engineering workflow (link), which computes all of the features and stores them in the Databricks Feature Store, and the model training workflow (link), which trains the model using features from the Feature Store and then validates it.
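Wired into GitHub Actions, the integration test workflow might look roughly like the following. The trigger, secret names, and CLI setup step are illustrative assumptions, and authentication details will vary with your cloud and workspace setup:

```yaml
# .github/workflows/run-integration-tests.yml (illustrative sketch)
name: Integration Tests
on:
  pull_request:
    branches:
      - main

jobs:
  integration_tests:
    runs-on: ubuntu-latest
    env:
      DATABRICKS_HOST: ${{ secrets.STAGING_WORKSPACE_HOST }}
      DATABRICKS_TOKEN: ${{ secrets.STAGING_WORKSPACE_TOKEN }}
    steps:
      - uses: actions/checkout@v4
      # Install the Databricks CLI, which provides the bundle commands
      - uses: databricks/setup-cli@main
      - name: Validate and deploy the bundle to staging
        run: |
          databricks bundle validate -t test
          databricks bundle deploy -t test
      - name: Run the feature engineering and model training jobs
        run: |
          databricks bundle run write_feature_table_job -t test
          databricks bundle run model_training_job -t test
```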
These tests can be expanded even further to cover all aspects of the model workflows:

- Feature Engineering tests verify the data transformation pipeline. For example, test that the data types are as expected, missing values are handled properly, feature values fall within an expected range, and the Databricks Feature Store itself is working properly.
- Model Training tests verify the model training process. For example, ensure that resources are utilized properly, metrics and parameters are correctly logged in MLflow, and the model can be saved and loaded.
- Model Validation tests verify the model’s performance and behavior. For example, make sure the model predictions are as expected, compare the model’s performance against previous versions, and check that the model meets the expected thresholds.
- Model Deployment tests verify that the model can be deployed and served. For example, test performance under expected traffic.
- Model Inference tests verify the model’s behavior during prediction. For example, ensure that inference speeds meet requirements, monitor resource consumption, and verify that data is being logged to the Inference Tables correctly.
- Model Monitoring tests verify that the monitoring system is working. For example, test the alerting system and validate the visualizations in the Dashboards.
After the jobs run successfully, you can merge the `dev` branch into the `main` branch. Let’s move on to the next and final part of the process.
Production
Step One: Release Branch
After successfully validating the changes in the STAGE environment, we can now create a `release` branch to promote the changes to the PROD environment. This branching strategy provides a clear snapshot of what is being deployed to production, enables hotfixes without disrupting the main branch if issues arise, and creates a release history.
The creation of a `release` branch triggers a production deployment workflow in your CI/CD system.
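With GitHub Actions, for example, the production deployment workflow might be keyed off pushes to release branches. The branch pattern, secret names, and setup steps below are illustrative assumptions:

```yaml
# .github/workflows/deploy-prod.yml (illustrative sketch)
name: Production Deployment
on:
  push:
    branches:
      - 'release/**'   # fires when a release branch is created or updated

jobs:
  deploy_prod:
    runs-on: ubuntu-latest
    env:
      DATABRICKS_HOST: ${{ secrets.PROD_WORKSPACE_HOST }}
      DATABRICKS_TOKEN: ${{ secrets.PROD_WORKSPACE_TOKEN }}
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main
      - name: Validate and deploy the bundle to production
        run: |
          databricks bundle validate -t prod
          databricks bundle deploy -t prod
```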
Step Two: Create Assets
This is similar to what happened before:
```bash
# Validate the bundle configuration
databricks bundle validate -t prod
# Deploy assets to the PROD environment
databricks bundle deploy -t prod
```
Instead of running any specific jobs, as we did in the STAGE environment, we are simply deploying the assets (link to GitHub Actions workflow) because the production workflows run on a schedule. This schedule is defined in each workflow configuration.
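For reference, a schedule in one of these workflow configuration files might look something like the following. The resource name, cron expression, and notebook path are placeholders, and cluster configuration is omitted for brevity:

```yaml
# resources/batch-inference-workflow-resource.yml (illustrative sketch)
resources:
  jobs:
    batch_inference_job:
      name: batch-inference-job
      schedule:
        # Run every day at 06:00 UTC (Quartz cron syntax)
        quartz_cron_expression: "0 0 6 * * ?"
        timezone_id: UTC
      tasks:
        - task_key: batch_inference
          notebook_task:
            notebook_path: ../deployment/batch_inference/notebooks/BatchInference.py
```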
Now, with assets deployed to production and workflows scheduled, we have completed the full MLOps lifecycle implementation.
Next Steps
Ready to start implementing?
Create your first MLOps Stack using the template repository.
Review the example notebooks in the template and understand how each component works.
Adapt the workflow configurations to match your needs.
Set up your CI/CD pipeline using GitHub Actions or your preferred tool.
In future posts, we will dive deeper into LLMOps and best practices. If you have any questions about implementing MLOps Stacks, feel free to reach out via the comments.