Integrate MLflow in Microsoft Fabric for Effective ML Management

As machine learning solutions grow more complex, managing experiments and reproducing results becomes increasingly challenging. Data scientists often train multiple models, tweak parameters, and test different datasets—making it difficult to keep track of what worked and why.

This is where platforms like Microsoft Fabric and MLflow play an important role. Microsoft Fabric provides a unified analytics environment for data engineering, data science, and analytics workloads, while MLflow helps organize and manage the machine learning lifecycle—from experiment tracking to model management.

In this blog, we will walk through how these two technologies work together to create reproducible and well-tracked machine learning workflows, making it easier to train, compare, and manage models within the Fabric ecosystem.

Understanding the Machine Learning Workflow in Microsoft Fabric

Modern machine learning projects are not limited to training a model. They involve several stages—from preparing the data to deploying models and generating insights. Microsoft Fabric provides a structured environment that supports the entire machine learning lifecycle in a unified platform.

Let’s walk through each stage of this workflow.


1. Store Data

Every machine learning project starts with data.

In Microsoft Fabric, the Store data stage represents the ingestion and storage of raw datasets in the platform. Data can come from various sources such as:

  • Databases
  • Data lakes
  • APIs
  • External storage systems

Fabric’s unified data architecture allows teams to store and manage data efficiently before it is used for analysis or model training.

At this stage, the goal is to ensure that data is accessible, organized, and ready for processing.


2. Exploratory Data Analysis (EDA)

Once the data is stored, the next step is Exploratory Data Analysis (EDA).

EDA helps data scientists understand the structure and quality of the dataset. Typical tasks include:

  • Inspecting data distributions
  • Identifying missing values
  • Detecting anomalies
  • Understanding relationships between variables

In Fabric notebooks, this stage is usually performed using libraries such as Pandas, Matplotlib, or Seaborn. Proper exploration ensures that the dataset is suitable for model development and reduces the risk of biased or inaccurate models.
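As a minimal sketch of what this stage can look like, the snippet below runs a few typical EDA checks with Pandas. It uses scikit-learn's built-in copy of the diabetes dataset purely as a stand-in for data already stored in Fabric:

```python
from sklearn.datasets import load_diabetes

# Load the diabetes dataset as a Pandas DataFrame
# (stand-in for data read from the Fabric lakehouse)
df = load_diabetes(as_frame=True).frame

# Inspect distributions and data quality
print(df.describe())           # summary statistics per column
print(df.isna().sum())         # missing values per column
print(df.corr()["target"])     # correlation of each feature with the target
```

Checks like these quickly reveal skewed distributions, gaps, and strongly correlated features before any modeling work begins.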


3. Develop and Train the Model

After exploring the data, the next step is model development and training.

During this stage, data scientists:

  • Select appropriate algorithms
  • Split datasets into training and testing sets
  • Train models using machine learning frameworks

Common libraries used in Fabric notebooks include:

  • scikit-learn
  • PySpark ML
  • TensorFlow or PyTorch

Experiment tracking tools like MLflow are typically integrated at this stage to log parameters, metrics, and artifacts from each model training run.

Tracking experiments ensures that every training attempt can be reproduced and compared later.


4. Evaluate and Score the Model

Once a model is trained, it must be evaluated to determine how well it performs.

This stage involves measuring performance using evaluation metrics such as:

  • R² Score
  • Mean Absolute Error (MAE)
  • Mean Squared Error (MSE)
  • Accuracy (for classification problems)

Evaluation helps determine whether the model generalizes well to unseen data.

Fabric allows you to compare multiple experiment runs, making it easier to identify which model configuration produces the best results.
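These metrics can be computed directly with scikit-learn. The true and predicted values below are purely illustrative:

```python
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

# Hypothetical ground truth and model predictions
y_true = [3.0, 5.0, 7.5, 10.0]
y_pred = [2.8, 5.3, 7.0, 9.6]

print("R2 :", r2_score(y_true, y_pred))
print("MAE:", mean_absolute_error(y_true, y_pred))
print("MSE:", mean_squared_error(y_true, y_pred))
```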


5. Apply the Best Model

After evaluation, the best-performing model is selected and applied.

This step typically involves:

  • Saving the trained model
  • Registering the model in a model registry
  • Deploying the model for predictions

Within Microsoft Fabric, the selected model can be reused for batch predictions, real-time scoring, or integration into analytics workflows.

This step bridges the gap between model experimentation and real business value.


6. Expose Insights

Machine learning models are only valuable when their results can be consumed by users.

The Expose insights stage focuses on delivering predictions and insights through:

  • Dashboards
  • Reports
  • Applications
  • APIs

In many organizations, predictions generated by models are visualized through analytics tools such as Power BI, enabling stakeholders to make data-driven decisions.


Why This Workflow Matters

This workflow demonstrates how Microsoft Fabric enables a structured and repeatable machine learning process.

By integrating data storage, experimentation, model tracking, and analytics in a single platform, teams can:

  • Reduce complexity in ML pipelines
  • Improve collaboration between data engineers and data scientists
  • Track experiments and model performance efficiently
  • Deploy models faster into production environments

When combined with experiment tracking tools like MLflow, this workflow becomes even more powerful, enabling reproducible and transparent machine learning development.

Load Data into a DataFrame

Now that the notebook environment is ready, the next step is to load the dataset and prepare it for model training. In this example, we will work with the diabetes dataset.

After loading the dataset, we will convert it into a Pandas DataFrame, which is one of the most widely used data structures in Python for handling tabular data. A DataFrame organizes information into rows and columns, making it easier to explore, clean, and prepare the data before training a machine learning model.

To begin, add a new code cell in your notebook. You can do this by selecting the + Code icon located below the output of the latest cell. Once the new cell is created, you can enter the code required to load the dataset and convert it into a Pandas DataFrame.

```python
# Azure storage access info for open dataset diabetes
blob_account_name = "azureopendatastorage"
blob_container_name = "mlsamples"
blob_relative_path = "diabetes"
blob_sas_token = r""  # Blank since container is Anonymous access

# Set Spark config to access blob storage
wasbs_path = "wasbs://%s@%s.blob.core.windows.net/%s" % (blob_container_name, blob_account_name, blob_relative_path)
spark.conf.set("fs.azure.sas.%s.%s.blob.core.windows.net" % (blob_container_name, blob_account_name), blob_sas_token)
print("Remote blob path: " + wasbs_path)

# Spark reads the parquet lazily; no data is actually loaded at this point
df = spark.read.parquet(wasbs_path)
```

Click the + Code icon located below the output of the current cell to insert a new code cell in the notebook. Then, paste or type the following code into the newly created cell and run it.

```python
display(df)
```

The dataset is currently loaded as a Spark DataFrame. However, since we will be using scikit-learn for model training, the input data needs to be in a Pandas DataFrame format.

```python
import pandas as pd

df = df.toPandas()
df.head()
```

Train a Machine Learning Model

Now that the dataset has been successfully loaded and prepared, the next step is to train a machine learning model. In this stage, we will use the data to build a model that can predict a quantitative measure of diabetes progression.

To accomplish this, we will train a regression model using the scikit-learn library. During the training process, we will also integrate MLflow to track important details such as parameters, evaluation metrics, and model artifacts. This allows us to monitor experiments, compare model performance, and maintain a reproducible machine learning workflow.

```python
from sklearn.model_selection import train_test_split

X, y = df[['AGE', 'SEX', 'BMI', 'BP', 'S1', 'S2', 'S3', 'S4', 'S5', 'S6']].values, df['Y'].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=0)
```

Add another new code cell to the notebook, enter the following code in it, and run it:

```python
import mlflow

experiment_name = "experiment-diabetes"
mlflow.set_experiment(experiment_name)
```

The code initializes an MLflow experiment named experiment-diabetes. All model training runs, along with their parameters, metrics, and artifacts, will be logged and tracked under this experiment using MLflow. This makes it easier to monitor, compare, and manage different model training runs within Microsoft Fabric.

```python
from sklearn.linear_model import LinearRegression

with mlflow.start_run():
    mlflow.autolog()
    model = LinearRegression()
    model.fit(X_train, y_train)
    mlflow.log_param("estimator", "LinearRegression")
```

The code trains a regression model using the Linear Regression algorithm. During the training process, MLflow automatically logs key information such as parameters, evaluation metrics, and model artifacts. In addition to the automatic logging, a custom parameter named estimator is also recorded with the value LinearRegression, making it easier to identify the algorithm used for this particular experiment run.

The code trains a regression model using the Decision Tree Regressor algorithm. During the training process, MLflow automatically records important details such as parameters, evaluation metrics, and model artifacts. In addition to this automatic logging, a custom parameter named estimator is logged with the value DecisionTreeRegressor, allowing you to easily identify the algorithm used for this specific experiment run when reviewing results.

Use MLflow to Search and View Your Experiments

After training and tracking models, you can use MLflow to explore and analyze the experiments that have been logged. MLflow provides functionality to retrieve experiment details, view different runs, and compare model performance.

By querying the experiment data, you can list all available experiments, retrieve a specific experiment by name, and examine the runs associated with it. This makes it easier to track model training history, review logged parameters and metrics, and compare the performance of different models within the same experiment.

```python
import mlflow

experiments = mlflow.search_experiments()
for exp in experiments:
    print(exp.name)
```

To retrieve a specific experiment, you can access it by using its name in MLflow. This allows you to fetch the experiment details and work with its associated runs and metadata.

```python
experiment_name = "experiment-diabetes"
exp = mlflow.get_experiment_by_name(experiment_name)
print(exp)
```

By using the experiment name, you can retrieve all the runs associated with that experiment in MLflow. This allows you to view the different training jobs, along with their logged parameters, metrics, and other details.

```python
mlflow.search_runs(exp.experiment_id)
```

To make it easier to compare different runs and their outputs, you can configure the search results in MLflow to return them in a specific order. For example, the following code retrieves the runs sorted by start_time in descending order and limits the output to the two most recent runs.

```python
mlflow.search_runs(exp.experiment_id, order_by=["start_time DESC"], max_results=2)
```

Finally, you can visualize the evaluation metrics of multiple models side by side to make it easier to compare their performance. By plotting these metrics, you can quickly identify which model performs better based on the selected evaluation criteria.

```python
import matplotlib.pyplot as plt

df_results = mlflow.search_runs(exp.experiment_id, order_by=["start_time DESC"], max_results=2)[["metrics.training_r2_score", "params.estimator"]]

fig, ax = plt.subplots()
ax.bar(df_results["params.estimator"], df_results["metrics.training_r2_score"])
ax.set_xlabel("Estimator")
ax.set_ylabel("R2 score")
ax.set_title("R2 score by Estimator")

# Annotate each bar with its rounded R2 value
for i, v in enumerate(df_results["metrics.training_r2_score"]):
    ax.text(i, v, str(round(v, 2)), ha='center', va='bottom', fontweight='bold')

plt.show()
```

The output is a bar chart comparing the R² training scores of the two models, with each bar labeled by estimator, making the better-performing configuration immediately visible.

Conclusion

In this blog, we explored how to build a simple yet effective machine learning workflow using Microsoft Fabric together with MLflow. Starting from data preparation, we walked through loading the diabetes dataset, converting it into a Pandas DataFrame, and preparing it for model training. We then trained multiple regression models using scikit-learn and leveraged MLflow to automatically track parameters, metrics, and model artifacts during each experiment run.

By using MLflow within Microsoft Fabric, it becomes much easier to organize experiments, compare model performance, and maintain a clear history of training runs. This approach helps ensure that machine learning experiments remain reproducible, transparent, and easier to manage, especially when multiple models and configurations are involved.

We also saw how experiment tracking enables us to retrieve runs, analyze results, and visualize model performance to identify the best-performing algorithm. Once the optimal model is identified, it can be saved and integrated into downstream analytics workflows, helping organizations turn data into actionable insights.

Overall, the integration of Microsoft Fabric with MLflow provides a powerful and unified platform for managing the machine learning lifecycle—from data exploration and experimentation to model evaluation and deployment. As machine learning projects continue to grow in complexity, adopting structured experiment tracking and reproducible workflows will be key to building reliable and scalable ML solutions.

Happy Reading!
