
In the world of data analytics, asking questions of your data shouldn’t require writing complex SQL queries. Microsoft Fabric’s AI Skills and Data Agents make this possible, turning natural language questions into actionable insights. This blog walks through loading the AdventureWorks dataset, creating an AI skill, and querying it programmatically—all while handling real-world challenges like ambiguous terms such as “active customers.”
Create a new workspace with the name "Fabric DataAgent"
Configure the workspace with Fabric capacity:
Workspace with the name Fabric DataAgent created:
Create a Lakehouse
I have given this Lakehouse the name DataAgentAIMF.
Once the Lakehouse is created, a SQL analytics endpoint is created along with it, as in the image below:
From here, I will ingest data via a new notebook, using the following script.
Once you open the notebook, you can observe the interface below:
Below is the notebook script that downloads the AdventureWorks files and loads them as tables into the Lakehouse.
Script:
import pandas as pd
from tqdm.auto import tqdm

base = "https://synapseaisolutionsa.z13.web.core.windows.net/data/AdventureWorks"

# load list of tables
df_tables = pd.read_csv(f"{base}/adventureworks.csv", names=["table"])

for table in (pbar := tqdm(df_tables['table'].values)):
    pbar.set_description(f"Uploading {table} to lakehouse")
    # download
    df = pd.read_parquet(f"{base}/{table}.parquet")
    # save as lakehouse table
    spark.createDataFrame(df).write.mode('overwrite').saveAsTable(table)
The script placed in a notebook cell, as below:
Copy and paste the code above into the cell, then select the Run all button to execute it. After the run is completed, you will see the results.
All the tables are uploaded: check the snapshot below for reference:
You can see the tables that are now available under the DataAgentAIMF Lakehouse Tables:
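As an optional sanity check, you can list the loaded tables from the same notebook before moving on. This is a minimal sketch, assuming the notebook is still attached to the DataAgentAIMF Lakehouse:
# Optional sanity check: list the tables registered in the attached Lakehouse.
spark.sql("SHOW TABLES").show(truncate=False)
# Peek at a few rows of one fact table to confirm the load succeeded.
spark.sql("SELECT * FROM FactInternetSales LIMIT 5").show()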
Now, let's create a new item in the workspace: a Data Agent (Preview).
I have given the Data Agent the name MFDataAgent.
The Data Agent interface looks like this:
Let's add the Lakehouse we just loaded with the AdventureWorks tables as a data source, as below:
Fetching all the tables:
- DimCustomer
- DimDate
- DimGeography
- DimProduct
- DimProductCategory
- DimPromotion
- DimReseller
- DimSalesTerritory
- FactInternetSales
- FactResellerSales
We can select the tables and start asking questions about the data in simple, natural sentences.
I have highlighted the selected tables and the important instructions, so we can keep them handy:
This time, for testing, I selected just one table, FactInternetSales, and wrote a question.
A question like, "What are the most sold products?"
Output: the agent starts analyzing the Lakehouse tables and finds the answer:
The desired output for the question above:
You can also see the SQL query behind the generated output: no need to write it on your own, it is auto-generated. Cool, right?
Output: besides the SQL query, you can also see the window below with the result of the question we asked.
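If you want to double-check the agent's answer, you can run a similar aggregation yourself in the notebook. This is a sketch of one reasonable interpretation of the question, assuming the standard AdventureWorks columns ProductKey and OrderQuantity in FactInternetSales; the agent's auto-generated SQL may differ:
# Manual cross-check for "What are the most sold products?":
# total quantity ordered per product, highest first.
spark.sql("""
    SELECT ProductKey, SUM(OrderQuantity) AS TotalQuantity
    FROM FactInternetSales
    GROUP BY ProductKey
    ORDER BY TotalQuantity DESC
    LIMIT 10
""").show()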
Next query:
"How many active customers did we have on June 1st, 2013?"
Output: once you submit the query, the output follows as below:
Note: part of the problem is that "active customer" doesn't have a formal definition. More instructions in the notes-to-the-model text box might help, for example, stating that an active customer is one who placed at least one order on the date in question. Since users might frequently ask this question, you need to make sure that the AI handles it correctly.
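To see why the definition matters, here is a sketch of one possible reading, counting distinct customers who placed an internet order on that exact date. It assumes the usual yyyymmdd convention of OrderDateKey in AdventureWorks; a different definition, such as any order in the preceding 90 days, would yield a different count, which is exactly the ambiguity the instructions should resolve:
# One possible definition of "active customers on June 1st, 2013":
# distinct customers with an internet order placed on that exact date.
spark.sql("""
    SELECT COUNT(DISTINCT CustomerKey) AS ActiveCustomers
    FROM FactInternetSales
    WHERE OrderDateKey = 20130601
""").show()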
Use the Data Agent programmatically
Both instructions and examples were added to the Data Agent. As testing proceeds, more examples and instructions can improve the AI skill even further. Work with your colleagues to see whether you provided examples and instructions that cover the kinds of questions they want to ask.
You can use the AI skill programmatically within a Fabric notebook; first, determine whether the AI skill has a published URL value.
Once you publish it, you can see the publish Data Agent wizard below.
Generate the published URL as below:
%pip install "openai==1.70.0"
You can open the notebook again and execute the above command in a code cell.
The code below is a Python template designed to programmatically query a Microsoft Fabric Data Agent (AI Skill) using the Azure OpenAI Assistants API.
It demonstrates how to leverage the familiar OpenAI SDK to send natural language questions to a secure data endpoint hosted within Microsoft Fabric.
Summary of the Code
1. Custom Authentication
It defines a custom class, FabricOpenAI, that inherits from the standard openai.OpenAI client.
The key purpose is to handle authentication securely for the Fabric environment.
It overrides the request preparation to automatically inject the Fabric environment’s AAD token (retrieved via configs.driver_aad_token) into the request header as a Bearer token for every API call. This is essential for secure communication.
import requests
import json
import pprint
import typing as t
import time
import uuid

from openai import OpenAI
from openai._exceptions import APIStatusError
from openai._models import FinalRequestOptions
from openai._types import Omit
from openai._utils import is_given
from synapse.ml.mlflow import get_mlflow_env_config
from sempy.fabric._token_provider import SynapseTokenProvider

base_url = "https://<generic published base URL value>"
question = "What datasources do you have access to?"

configs = get_mlflow_env_config()

# Create OpenAI client
class FabricOpenAI(OpenAI):
    def __init__(
        self,
        api_version: str = "2024-05-01-preview",
        **kwargs: t.Any,
    ) -> None:
        self.api_version = api_version
        default_query = kwargs.pop("default_query", {})
        default_query["api-version"] = self.api_version
        super().__init__(
            api_key="",
            base_url=base_url,
            default_query=default_query,
            **kwargs,
        )

    def _prepare_options(self, options: FinalRequestOptions) -> None:
        headers: dict[str, str | Omit] = (
            {**options.headers} if is_given(options.headers) else {}
        )
        options.headers = headers
        headers["Authorization"] = f"Bearer {configs.driver_aad_token}"
        if "Accept" not in headers:
            headers["Accept"] = "application/json"
        if "ActivityId" not in headers:
            correlation_id = str(uuid.uuid4())
            headers["ActivityId"] = correlation_id
        return super()._prepare_options(options)

# Pretty printing helper
def pretty_print(messages):
    print("---Conversation---")
    for m in messages:
        print(f"{m.role}: {m.content[0].text.value}")
    print()

fabric_client = FabricOpenAI()
# Create assistant
assistant = fabric_client.beta.assistants.create(model="not used")
# Create thread
thread = fabric_client.beta.threads.create()
# Create message on thread
message = fabric_client.beta.threads.messages.create(thread_id=thread.id, role="user", content=question)
# Create run
run = fabric_client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)
# Wait for run to complete
while run.status == "queued" or run.status == "in_progress":
    run = fabric_client.beta.threads.runs.retrieve(
        thread_id=thread.id,
        run_id=run.id,
    )
    print(run.status)
    time.sleep(2)
# Print messages
response = fabric_client.beta.threads.messages.list(thread_id=thread.id, order="asc")
pretty_print(response)
# Delete thread
fabric_client.beta.threads.delete(thread_id=thread.id)
Use the + Code icon below the cell output to add a new code cell to the notebook, enter the code above into it, and replace the base URL with your published URL value. Click the ▷ Run button and review the output.
You can see the complete status of your executed cell as below:
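If you plan to ask several questions, you can wrap the thread lifecycle from the template into a small helper. This is an optional convenience sketch that reuses the fabric_client, assistant, and pretty_print objects created above; the ask function is my own naming, not part of the Fabric or OpenAI API:
# Optional helper: send one question to the Data Agent and clean up afterwards.
# Reuses fabric_client, assistant, time, and pretty_print from the template above.
def ask(question: str) -> None:
    thread = fabric_client.beta.threads.create()
    fabric_client.beta.threads.messages.create(
        thread_id=thread.id, role="user", content=question
    )
    run = fabric_client.beta.threads.runs.create(
        thread_id=thread.id, assistant_id=assistant.id
    )
    while run.status in ("queued", "in_progress"):
        run = fabric_client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
        time.sleep(2)
    pretty_print(fabric_client.beta.threads.messages.list(thread_id=thread.id, order="asc"))
    fabric_client.beta.threads.delete(thread_id=thread.id)

ask("What are the most sold products?")
ask("How many active customers did we have on June 1st, 2013?")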
Conclusion: Conversational Analytics is Now a Reality
The journey we've completed, from creating a Fabric workspace and loading the AdventureWorks dataset into a Lakehouse to deploying and programmatically querying an AI Data Agent, demonstrates a seismic shift in data analytics.
The Microsoft Fabric Data Agent empowers business users and data professionals alike to bypass the steep learning curve of SQL. It delivers on the promise of conversational analytics, making data access as simple as asking a question.
Happy Reading!!
