Model Management with MLflow, Azure, and Docker
In the first article, we explored Docker's powerful ability to package applications and their dependencies into portable containers, ensuring consistency across various environments.
Building on that foundation, this article introduces MLflow, an important tool for experiment tracking and model management in machine learning workflows. We will demonstrate how to deploy and use MLflow within a Docker container to ensure portability and avoid dependency issues. The containerized MLflow server will then be deployed on Azure, providing better scalability, remote access, and, importantly, team collaboration.
What is MLflow?
MLflow is an open-source platform that simplifies managing the machine learning lifecycle, from experiment tracking to model deployment. It offers a consistent framework to log experiments, manage code, and track model versions, keeping your workflow reproducible and organized across the team.
MLflow can be integrated into various stages of your machine learning pipeline. It provides four main components:
- MLflow Tracking: This is the most widely used feature, allowing you to log and query experiments. It tracks useful details like code versions, datasets, configurations, hyperparameters, evaluation metrics, and results. You can access this information via a user-friendly web interface.
- MLflow Projects: A packaging format that makes your code reproducible across platforms. MLflow Projects integrate with version control systems like Git, making it easy to track and manage dependencies.
- MLflow Models: This feature standardizes how you package machine learning models. It ensures that models can be easily deployed across different environments without compatibility issues.
- MLflow Model Registry: A centralized model repository that manages the entire lifecycle of your models, including versioning, approval workflows, and deployment processes.
In this article, we will focus on MLflow Tracking, which is the core feature that allows data scientists to track experiments efficiently.
The global architecture of MLflow Tracking
MLflow Tracking provides a system to log and manage all the critical data associated with your machine learning experiments. Here's how the architecture works:
MLflow Tracking Server:
This is the core component. It's a web-based service that logs your experiments and makes them accessible through a user-friendly interface. You can access this interface via a URL to track experiments from anywhere. Each time you run an experiment (e.g., training a model), the server logs:
- Code: The version of the script or code used.
- Hyperparameters: Training parameters like learning rate, batch size, etc.
- Metrics: Performance metrics such as accuracy, loss, and precision.
- Results: Outputs like model files or prediction results.
We will use an Azure Web App to deploy this server.
Backend Store (Database):
The metadata (configuration, parameters, metrics, and logs) is stored in a backend database. This is the system's brain: it keeps a structured history of all experiments, so you can easily query, compare, and analyze past runs. We will use Azure SQL Database.
Artifact Store:
The artifact store is the folder where files generated during your experiments are saved. We will use Azure Blob Storage for this. Artifacts include:
- Models: The trained models.
- Plots: Graphs or charts created during training.
- Data Files: Additional output files generated during the run.

To summarize, in our case, we will use Azure SQL Database for the backend store, Azure Blob Storage for the artifact store, and deploy the MLflow Tracking Server using Azure Container Instances and Azure WebApp to ensure scalability and ease of management.
Getting Hands-On
You can clone this folder to find all the necessary scripts for this tutorial.
Step 1: Create a Dockerfile
To host the MLflow server, we start by creating a Docker container using a Dockerfile. Here's an example configuration:
# Use Miniconda as the base image
FROM continuumio/miniconda3
# Set environment variables
ENV DEBIAN_FRONTEND=noninteractive
# Install necessary packages
RUN apt-get update -y && \
    apt-get install -y --no-install-recommends curl apt-transport-https gnupg2 unixodbc-dev
# Add Microsoft SQL Server ODBC Driver 18 repository and install
RUN curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - && \
    curl https://packages.microsoft.com/config/debian/11/prod.list > /etc/apt/sources.list.d/mssql-release.list && \
    apt-get update && \
    ACCEPT_EULA=Y apt-get install -y msodbcsql18 mssql-tools18
# Add mssql-tools to PATH
RUN echo 'export PATH="$PATH:/opt/mssql-tools18/bin"' >> ~/.bash_profile && \
    echo 'export PATH="$PATH:/opt/mssql-tools18/bin"' >> ~/.bashrc
# Define default server environment variables
ENV MLFLOW_SERVER_HOST=0.0.0.0
ENV MLFLOW_SERVER_PORT=5000
ENV MLFLOW_SERVER_WORKERS=1
# Set the working directory
WORKDIR /app
# Copy the current directory contents into the container at /app
COPY . /app
# Install Python dependencies specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
# Make sure the launch.sh script is executable
RUN chmod +x /app/launch.sh
# Expose port 5000 for MLflow
EXPOSE 5000
# Set the entrypoint to run the launch.sh script
ENTRYPOINT ["/app/launch.sh"]
This Dockerfile creates a container that runs an MLflow server. It installs the necessary tools, including the Microsoft SQL Server ODBC driver, sets up the environment, and installs the Python dependencies. It then copies our files into the container's /app folder, exposes port 5000 (MLflow's default port), and runs a launch.sh script to start the MLflow server.
The launch.sh script contains only the command that launches the MLflow server.
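As a reference, here is a minimal sketch of what such a launch.sh could look like. The exact flags and environment variable names (BACKEND_STORE_URI, MLFLOW_SERVER_DEFAULT_ARTIFACT_ROOT, and so on) are assumptions, mirroring the app settings configured later by the deploy script:

```shell
#!/bin/bash
# Hypothetical sketch of launch.sh -- the env variable names are assumptions
# based on the app settings configured by deploy.sh later in this article.
mlflow server \
    --backend-store-uri "$BACKEND_STORE_URI" \
    --default-artifact-root "$MLFLOW_SERVER_DEFAULT_ARTIFACT_ROOT" \
    --host "${MLFLOW_SERVER_HOST:-0.0.0.0}" \
    --port "${MLFLOW_SERVER_PORT:-5000}" \
    --workers "${MLFLOW_SERVER_WORKERS:-1}"
```

The `${VAR:-default}` fallbacks let the same script run locally (where only the Dockerfile defaults exist) and on Azure (where the Web App settings override them).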
Step 2: Build and Run the Docker Container Locally
- Build the Docker image in the same directory as your Dockerfile:
docker build . -t mlflowserver
# If you are on a Mac, use:
# docker build --platform=linux/amd64 -t mlflowserver:latest .
- Run the Docker container:
docker run -it -p 5000:5000 mlflowserver
After running these commands, the MLflow server starts locally, and you can access the MLflow UI at http://localhost:5000. This confirms the server is successfully deployed on your local machine. However, at this stage, while you can log experiments to MLflow, none of the results, artifacts, or metadata will be saved in the SQL database or artifact store, as those have not been configured yet. Additionally, the URL is only accessible locally, meaning your data science team cannot access it remotely.

Step 3: Set Up Azure Resources
Start by creating an Azure account and grabbing your Subscription ID from the Azure Portal.
To deploy your MLflow server and make it accessible to your team, follow these simplified steps:
- Clone the Repository: Clone this folder to your local machine.
- Run the Deployment Script: Execute the deploy.sh script as a shell script. Make sure to update the Subscription ID variable in the script before running it.
While Azure offers a graphical interface for setting up resources, this guide simplifies the process by using the deploy.sh script to automate everything with a single command.
Here's a breakdown of what the deploy.sh script does, step by step:
1. Log In and Set Subscription: First, log into your Azure account and set the subscription where all your resources will be deployed (retrieve the Subscription ID from the Azure Portal).
az login
az account set --subscription $SUBSCRIPTION_ID
2. Create a Resource Group: Create a Resource Group to organize all the resources you'll deploy for MLflow.
az group create --name $RG_NAME --location $RG_LOCATION
3. Set Up Azure SQL Database: Create an Azure SQL Server and an SQL Database where MLflow will store all experiment metadata.
az sql server create \
  --name $SQL_SERVER_NAME \
  --resource-group $RG_NAME \
  --location $RG_LOCATION \
  --admin-user $SQL_ADMIN_USER \
  --admin-password $SQL_ADMIN_PASSWORD
az sql db create \
  --resource-group $RG_NAME \
  --server $SQL_SERVER_NAME \
  --name $SQL_DATABASE_NAME \
  --service-objective S0
4. Configure the SQL Server Firewall: Allow access to the SQL Server from other Azure services by creating a firewall rule.
az sql server firewall-rule create \
  --resource-group $RG_NAME \
  --server $SQL_SERVER_NAME \
  --name AllowAllAzureIPs \
  --start-ip-address 0.0.0.0 \
  --end-ip-address 0.0.0.0
5. Create an Azure Storage Account: Set up an Azure Storage Account and a Blob Container to store artifacts (e.g., models, experiment results).
az storage account create \
  --resource-group $RG_NAME \
  --location $RG_LOCATION \
  --name $STORAGE_ACCOUNT_NAME \
  --sku Standard_LRS
az storage container create \
  --name $STORAGE_CONTAINER_NAME \
  --account-name $STORAGE_ACCOUNT_NAME
6. Create an Azure Container Registry (ACR): Create an ACR to store the Docker image of your MLflow server.
az acr create \
  --name $ACR_NAME \
  --resource-group $RG_NAME \
  --sku Basic \
  --admin-enabled true
7. Build and Push the Docker Image to ACR: Build your Docker image for the MLflow server and push it to the Azure Container Registry. For that, you first need to retrieve the ACR username and password and log into ACR.
export ACR_USERNAME=$(az acr credential show --name $ACR_NAME --query "username" --output tsv)
export ACR_PASSWORD=$(az acr credential show --name $ACR_NAME --query "passwords[0].value" --output tsv)
docker login $ACR_NAME.azurecr.io \
  --username "$ACR_USERNAME" \
  --password "$ACR_PASSWORD"
# Tag and push the image
docker tag $DOCKER_IMAGE_NAME $ACR_NAME.azurecr.io/$DOCKER_IMAGE_NAME:$DOCKER_IMAGE_TAG
docker push $ACR_NAME.azurecr.io/$DOCKER_IMAGE_NAME:$DOCKER_IMAGE_TAG
8. Create an App Service Plan: Set up an App Service Plan to host your MLflow server on Azure.
az appservice plan create \
  --name $ASP_NAME \
  --resource-group $RG_NAME \
  --sku B1 \
  --is-linux \
  --location $RG_LOCATION
9. Deploy the Web App with the MLflow Container: Create a Web App that uses your Docker image from ACR to deploy the MLflow server.
az webapp create \
  --resource-group $RG_NAME \
  --plan $ASP_NAME \
  --name $WEB_APP_NAME \
  --deployment-container-image-name $ACR_NAME.azurecr.io/$DOCKER_IMAGE_NAME:$DOCKER_IMAGE_TAG
10. Configure the Web App to Use the Container Registry: Set up your Web App to pull the MLflow Docker image from ACR, and configure environment variables.
az webapp config container set \
  --name $WEB_APP_NAME \
  --resource-group $RG_NAME \
  --docker-custom-image-name $ACR_NAME.azurecr.io/$DOCKER_IMAGE_NAME:$DOCKER_IMAGE_TAG \
  --docker-registry-server-url https://$ACR_NAME.azurecr.io \
  --docker-registry-server-user $ACR_USERNAME \
  --docker-registry-server-password $ACR_PASSWORD \
  --enable-app-service-storage true
az webapp config appsettings set \
  --resource-group $RG_NAME \
  --name $WEB_APP_NAME \
  --settings WEBSITES_PORT=$MLFLOW_PORT
az webapp log config \
  --name $WEB_APP_NAME \
  --resource-group $RG_NAME \
  --docker-container-logging filesystem
11.Set Web App Environment Variables: Set the necessary environment variables for MLflow, such as storage access, SQL backend, and port settings.
echo "Retrive artifact, access key, connection string"
export STORAGE_ACCESS_KEY=$(az storage account keys list --resource-group $RG_NAME --account-name $STORAGE_ACCOUNT_NAME --query "[0].value" --output tsv)
export STORAGE_CONNECTION_STRING=`az storage account show-connection-string --resource-group $RG_NAME --name $STORAGE_ACCOUNT_NAME --output tsv`
export STORAGE_ARTIFACT_ROOT="https://$STORAGE_ACCOUNT_NAME.blob.core.windows.net/$STORAGE_CONTAINER_NAME"
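Note that BACKEND_STORE_URI, used below, is not constructed in the snippets shown above. Here is a hedged sketch of how deploy.sh might build it from the SQL variables; the exact URI format is an assumption (a SQLAlchemy-style mssql+pyodbc URI matching the ODBC Driver 18 installed in the Dockerfile), and the values are placeholders:

```shell
# Assumed placeholder values -- in deploy.sh these come from the variables
# defined earlier (server, database, and admin credentials).
SQL_ADMIN_USER="mlflowadmin"
SQL_ADMIN_PASSWORD="ChangeMe123"
SQL_SERVER_NAME="mlflow-sql-server"
SQL_DATABASE_NAME="mlflowdb"
# SQLAlchemy-style URI for MLflow's backend store, using the ODBC Driver 18
# installed in the Dockerfile
export BACKEND_STORE_URI="mssql+pyodbc://${SQL_ADMIN_USER}:${SQL_ADMIN_PASSWORD}@${SQL_SERVER_NAME}.database.windows.net:1433/${SQL_DATABASE_NAME}?driver=ODBC+Driver+18+for+SQL+Server"
echo "$BACKEND_STORE_URI"
```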
# Set environment variables for artifacts and database
az webapp config appsettings set \
  --resource-group $RG_NAME \
  --name $WEB_APP_NAME \
  --settings AZURE_STORAGE_CONNECTION_STRING=$STORAGE_CONNECTION_STRING
az webapp config appsettings set \
  --resource-group $RG_NAME \
  --name $WEB_APP_NAME \
  --settings BACKEND_STORE_URI=$BACKEND_STORE_URI
az webapp config appsettings set \
  --resource-group $RG_NAME \
  --name $WEB_APP_NAME \
  --settings MLFLOW_SERVER_DEFAULT_ARTIFACT_ROOT=$STORAGE_ARTIFACT_ROOT
# Set environment variables for the general context
az webapp config appsettings set \
  --resource-group $RG_NAME \
  --name $WEB_APP_NAME \
  --settings MLFLOW_SERVER_PORT=$MLFLOW_PORT
az webapp config appsettings set \
  --resource-group $RG_NAME \
  --name $WEB_APP_NAME \
  --settings MLFLOW_SERVER_HOST=$MLFLOW_HOST
az webapp config appsettings set \
  --resource-group $RG_NAME \
  --name $WEB_APP_NAME \
  --settings MLFLOW_SERVER_FILE_STORE=$MLFLOW_FILESTORE
az webapp config appsettings set \
  --resource-group $RG_NAME \
  --name $WEB_APP_NAME \
  --settings MLFLOW_SERVER_WORKERS=$MLFLOW_WORKERS
Once the deploy.sh script has completed, you can verify that all your Azure services have been created by checking the Azure portal.

Go to the App Services section to retrieve the URL of your MLflow web application.

Your MLflow Tracking URL should now be live and ready to receive experiments from your data science team.
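Team members can point their local MLflow clients at this server by setting the standard MLFLOW_TRACKING_URI environment variable. A small sketch, where the app name is a placeholder (Azure Web Apps are served at <name>.azurewebsites.net):

```shell
# Placeholder web app name -- use the name you chose in deploy.sh
WEB_APP_NAME="my-mlflow-server"
# Azure Web Apps are exposed at https://<name>.azurewebsites.net
export MLFLOW_TRACKING_URI="https://${WEB_APP_NAME}.azurewebsites.net"
echo "$MLFLOW_TRACKING_URI"
```

With this variable set, calling mlflow.set_tracking_uri() in the script below becomes optional, since MLflow reads MLFLOW_TRACKING_URI from the environment.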

Step 4: Log an Experiment with Scikit-Learn and MLflow
Here's a Python script demonstrating how to log an experiment using MLflow with a simple scikit-learn model, such as logistic regression. Ensure that you update the script with your MLflow tracking URI:
import os
import mlflow
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import joblib
# Load Iris dataset
iris = load_iris()
# Split dataset into X features and Target variable
X = pd.DataFrame(data = iris["data"], columns= iris["feature_names"])
y = pd.Series(data = iris["target"], name="target")
# Split our training set and our test set
X_train, X_test, y_train, y_test = train_test_split(X, y)
# Set your variables for your environment
EXPERIMENT_NAME="experiment1"
# Set the tracking URI to your Azure web application
mlflow.set_tracking_uri("set your mlflow tracking URI")
# mlflow.set_tracking_uri("http://localhost:5000")
# Set experiment's info
mlflow.set_experiment(EXPERIMENT_NAME)
# Get our experiment info
experiment = mlflow.get_experiment_by_name(EXPERIMENT_NAME)
# Call mlflow autolog
mlflow.sklearn.autolog()
with open("test.txt", "w") as f:
    f.write("hello world!")
with mlflow.start_run(experiment_id=experiment.experiment_id):
    # Specified parameters
    c = 0.1
    # Instantiate and fit the model
    lr = LogisticRegression(C=c)
    lr.fit(X_train.values, y_train.values)
    # Store metrics
    predicted_qualities = lr.predict(X_test.values)
    accuracy = lr.score(X_test.values, y_test.values)
    # Print results
    print("LogisticRegression model")
    print("Accuracy: {}".format(accuracy))
    # Log metric
    mlflow.log_metric("Accuracy", accuracy)
    # Log parameter
    mlflow.log_param("C", c)
    # Log artifact
    mlflow.log_artifact("test.txt")
By running this script, you should be able to log your models, metrics, and artifacts to MLflow. Artifacts will be stored in Azure Blob Storage, while metadata will be saved in the Azure SQL Database.
Step 5: Check the Results
1. Check MLflow Tracking: Visit your MLflow tracking URL to find your experiment, run names, and all associated metrics and model parameters.


2. Check MLflow Artifacts: Access the artifacts in the MLflow UI and verify their presence in Azure Blob Storage.


You and your team can now submit experiments to MLflow, track them via the tracking URI, and retrieve model information or files from Azure Storage. In the next tutorial, we will explore how to create an API to read models stored in Azure Storage.
Conclusion
You've successfully set up MLflow with Azure for tracking and managing your Machine Learning experiments. Keep in mind that depending on your computer and operating system, you might encounter some issues with Docker, MLflow, or Azure services. If you run into trouble, don't hesitate to reach out for help.
Next, we'll explore how to use MLflow models stored in Azure Blob Storage to create an API, completing the automation workflow.
Thank you for reading!
Note: Some parts of this article were initially written in French and translated into English with the assistance of ChatGPT.