Use the MLflow Python API to drive better model development
MLflow is a fantastic way to speed up your machine learning model development process through its powerful experimentation component. It allows data scientists to record the best algorithms and parameter combinations and to iterate quickly on model development.
This blog is intended to show you how to get the most out of MLflow experiments. We will focus on the start_run() function and its parameters, which can improve your experimentation process. In addition, we will cover the search_runs()
function, which provides a comprehensive view of your experimentation history and allows for greater flexibility in analysis.
If you are new to MLflow, I suggest you take a look at the MLflow documentation, a few blog posts, or video tutorials before jumping into this post.
mlflow.start_run()
Most of these tips are parameters of the start_run() function. We call this function to start our experiment run, and it becomes the active run to which we can log parameters, metrics, and other information.
This is the feature I use the most in MLflow and the one that offers the most instant value to users.
1. run_id
The run_id is a UUID that is specific to each experiment run. Once a run has started, it is not possible to overwrite properties such as the model type or parameter values. However, you can use the run_id to log additional values retrospectively, such as metrics, tags, or a description.
# Setup assumed by the examples in this post: scikit-learn's diabetes dataset
from math import sqrt

import mlflow
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

diabetes = datasets.load_diabetes()

# End any existing runs
mlflow.end_run()

# Start MLflow run for this experiment
with mlflow.start_run() as run:
    # Turn autolog on to save model artifacts, requirements, etc.
    mlflow.autolog(log_models=True)

    print(run.info.run_id)

    diabetes_X = diabetes.data
    diabetes_y = diabetes.target

    # Split data into training and test sets, 3:1 ratio
    (
        diabetes_X_train,
        diabetes_X_test,
        diabetes_y_train,
        diabetes_y_test,
    ) = train_test_split(diabetes_X, diabetes_y, test_size=0.25, random_state=42)

    alpha = 0.9
    solver = "cholesky"
    regr = linear_model.Ridge(alpha=alpha, solver=solver)
    regr.fit(diabetes_X_train, diabetes_y_train)
    diabetes_y_pred = regr.predict(diabetes_X_test)

    # Log desired metrics
    mlflow.log_metric("mse", mean_squared_error(diabetes_y_test, diabetes_y_pred))
    mlflow.log_metric(
        "rmse", sqrt(mean_squared_error(diabetes_y_test, diabetes_y_pred))
    )
In this case, we may also want to record our coefficient of determination (r²) value for this run:
with mlflow.start_run(run_id="3fcf403e1566422493cd6e625693829d") as run:
mlflow.log_metric("r2", r2_score(diabetes_y_test, diabetes_y_pred))
The run_id can be extracted with print(run.info.run_id) from the previous run, or by querying mlflow.search_runs(), but more on that later.
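As a quick preview, here is a minimal sketch of pulling the most recent run_id from the current experiment and reopening that run to add a tag retrospectively (the tag name is just illustrative):

# Fetch runs in the current experiment, newest first, and grab the latest run_id
last_run_id = mlflow.search_runs(order_by=["start_time DESC"]).iloc[0]["run_id"]

# Reopen that run and attach an extra tag after the fact
with mlflow.start_run(run_id=last_run_id):
    mlflow.set_tag("reviewed", "true")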
2. experiment_id
You can configure which experiment a run is logged to in several ways in MLflow. The first command sets the experiment for all subsequent runs to “mlflow_sdk_test”.
mlflow.set_experiment("/mlflow_sdk_test")
This can also be configured on a run-by-run basis via the experiment_id parameter.
my_experiment = mlflow.set_experiment("/mlflow_sdk_test")
experiment_id = my_experiment.experiment_id
This value can be reused when passed to start_run():
# End any existing runs
mlflow.end_run()

with mlflow.start_run(experiment_id=experiment_id) as run:
    # Turn autolog on to save model artifacts, requirements, etc.
    mlflow.autolog(log_models=True)

    print(run.info.run_id)

    diabetes_X = diabetes.data
    diabetes_y = diabetes.target

    # Split data into training and test sets, 3:1 ratio
    (
        diabetes_X_train,
        diabetes_X_test,
        diabetes_y_train,
        diabetes_y_test,
    ) = train_test_split(diabetes_X, diabetes_y, test_size=0.25, random_state=42)

    alpha = 0.8
    solver = "cholesky"
    regr = linear_model.Ridge(alpha=alpha, solver=solver)
    regr.fit(diabetes_X_train, diabetes_y_train)
    diabetes_y_pred = regr.predict(diabetes_X_test)

    # Log desired metrics
    mlflow.log_metric("mse", mean_squared_error(diabetes_y_test, diabetes_y_pred))
    mlflow.log_metric(
        "rmse", sqrt(mean_squared_error(diabetes_y_test, diabetes_y_pred))
    )
    mlflow.log_metric("r2", r2_score(diabetes_y_test, diabetes_y_pred))
3. run_name
When you specify the name of your run, you have more control over the naming process than relying on the default names generated by MLflow. This allows you to establish a consistent naming convention for your experiment runs, similar to how you might manage other resources in your environment.
from datetime import date as dt  # assumed import for the dt alias used below

# End any existing runs
mlflow.end_run()

# Explicitly name runs
today = dt.today()
run_name = "Ridge Regression " + str(today)

# Start MLflow run for this experiment
with mlflow.start_run(run_name=run_name) as run:
    # Turn autolog on to save model artifacts, requirements, etc.
    mlflow.autolog(log_models=True)

    print(run.info.run_id)

    diabetes_X = diabetes.data
    diabetes_y = diabetes.target

    # Split data into training and test sets, 3:1 ratio
    (
        diabetes_X_train,
        diabetes_X_test,
        diabetes_y_train,
        diabetes_y_test,
    ) = train_test_split(diabetes_X, diabetes_y, test_size=0.25, random_state=42)

    alpha = 0.5
    solver = "cholesky"
    regr = linear_model.Ridge(alpha=alpha, solver=solver)
    regr.fit(diabetes_X_train, diabetes_y_train)
    diabetes_y_pred = regr.predict(diabetes_X_test)

    # Log desired metrics
    mlflow.log_metric("mse", mean_squared_error(diabetes_y_test, diabetes_y_pred))
    mlflow.log_metric(
        "rmse", sqrt(mean_squared_error(diabetes_y_test, diabetes_y_pred))
    )
    mlflow.log_metric("r2", r2_score(diabetes_y_test, diabetes_y_pred))
However, keep in mind that run_name is not a unique constraint in MLflow, so you can end up with multiple runs (each with its own unique run_id) that share the same name.
In other words, every time you enter a new with mlflow.start_run(run_name=...) block, MLflow creates a brand-new run with that name rather than adding details to the existing run, as the quick sketch below shows.
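As a minimal illustration (assuming the setup from the earlier examples), two runs started with the same run_name still receive distinct run IDs:

# Start two runs that share a run_name; each still gets its own run_id
for attempt in (1, 2):
    with mlflow.start_run(run_name="Ridge Regression duplicate-name demo") as run:
        mlflow.log_param("attempt", attempt)
        print(run.info.run_id)  # a different UUID each time

# Both runs appear when filtering the run history by the (non-unique) name
df = mlflow.search_runs()
print(df[df["tags.mlflow.runName"] == "Ridge Regression duplicate-name demo"])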
This nicely leads us to the next parameter.
4. nested
You may be familiar with nested experiment runs if you have used the scikit-learn GridSearchCV function to perform hyperparameter optimization.
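As a rough sketch of that pattern, autologging a GridSearchCV fit produces a parent run for the search plus child runs for the best parameter combinations; the dataset, parameter grid, and max_tuning_runs value below are purely illustrative:

import mlflow
import mlflow.sklearn
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Autolog the search: one parent run, plus child runs for the
# top parameter combinations (capped by max_tuning_runs)
mlflow.sklearn.autolog(max_tuning_runs=4)

X, y = load_diabetes(return_X_y=True)
param_grid = {"alpha": [0.1, 0.2, 0.5, 0.9], "solver": ["cholesky", "lsqr"]}

with mlflow.start_run(run_name="Ridge GridSearchCV"):
    search = GridSearchCV(Ridge(), param_grid, scoring="neg_mean_squared_error", cv=5)
    search.fit(X, y)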
Nested experiments look like the following in MLflow:
Note that the metrics here are saved to the parent run, which holds the best values recorded across the child runs; the child runs' own metric values are blank.
While nested experiments are great for evaluating and recording parameter combinations to determine the best model, they also serve as a great logical container for organizing your work. With the ability to group experiments, you can compartmentalize individual data science investigations and keep your experiments page organized and uncluttered.
# End any existing runs
mlflow.end_run()

# Explicitly name runs
run_name = "Ridge Regression Nested"

with mlflow.start_run(run_name=run_name) as parent_run:
    print(parent_run.info.run_id)

    with mlflow.start_run(run_name="Child Run: alpha 0.1", nested=True):
        # Turn autolog on to save model artifacts, requirements, etc.
        mlflow.autolog(log_models=True)

        diabetes_X = diabetes.data
        diabetes_y = diabetes.target

        # Split data into training and test sets, 3:1 ratio
        (
            diabetes_X_train,
            diabetes_X_test,
            diabetes_y_train,
            diabetes_y_test,
        ) = train_test_split(diabetes_X, diabetes_y, test_size=0.25, random_state=42)

        alpha = 0.1
        solver = "cholesky"
        regr = linear_model.Ridge(alpha=alpha, solver=solver)
        regr.fit(diabetes_X_train, diabetes_y_train)
        diabetes_y_pred = regr.predict(diabetes_X_test)

        # Log desired metrics
        mlflow.log_metric("mse", mean_squared_error(diabetes_y_test, diabetes_y_pred))
        mlflow.log_metric(
            "rmse", sqrt(mean_squared_error(diabetes_y_test, diabetes_y_pred))
        )
        mlflow.log_metric("r2", r2_score(diabetes_y_test, diabetes_y_pred))
If you need to add to this nested run later, pass the parent run's run_id as a parameter in subsequent runs to attach more child runs.
# End any existing runs
mlflow.end_run()

with mlflow.start_run(run_id="61d34b13649c45699e7f05290935747c") as parent_run:
    print(parent_run.info.run_id)

    with mlflow.start_run(run_name="Child Run: alpha 0.2", nested=True):
        # Turn autolog on to save model artifacts, requirements, etc.
        mlflow.autolog(log_models=True)

        diabetes_X = diabetes.data
        diabetes_y = diabetes.target

        # Split data into training and test sets, 3:1 ratio
        (
            diabetes_X_train,
            diabetes_X_test,
            diabetes_y_train,
            diabetes_y_test,
        ) = train_test_split(diabetes_X, diabetes_y, test_size=0.25, random_state=42)

        alpha = 0.2
        solver = "cholesky"
        regr = linear_model.Ridge(alpha=alpha, solver=solver)
        regr.fit(diabetes_X_train, diabetes_y_train)
        diabetes_y_pred = regr.predict(diabetes_X_test)

        # Log desired metrics
        mlflow.log_metric("mse", mean_squared_error(diabetes_y_test, diabetes_y_pred))
        mlflow.log_metric(
            "rmse", sqrt(mean_squared_error(diabetes_y_test, diabetes_y_pred))
        )
        mlflow.log_metric("r2", r2_score(diabetes_y_test, diabetes_y_pred))
One thing to note about this approach is that your metrics will now be logged on every child run.
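Since each child run carries its own metrics, you can pull them back out for comparison. Here is a minimal sketch that filters on the tags.mlflow.parentRunId tag MLflow sets on nested runs (assuming parent_run from the block above is still in scope):

# Collect the child runs of a given parent and compare their metrics
df = mlflow.search_runs()
children = df[df["tags.mlflow.parentRunId"] == parent_run.info.run_id]
print(children[["run_id", "metrics.mse", "metrics.rmse", "metrics.r2"]])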
5. mlflow.search_runs()
This tip uses the search_runs() function.
This function allows us to programmatically query the data behind the experiments UI, and the results are returned in a tabular format that is easy to understand and manipulate.
In the example below, we can select specific fields from our experiment runs and load them into a Pandas dataframe. Notice that the available columns far exceed those available in the experiments GUI!
# Create DataFrame of all runs in *current* experiment
df = mlflow.search_runs(order_by=["start_time DESC"])

# Print a list of the columns available
# print(list(df.columns))

# Create DataFrame with subset of columns
runs_df = df[
    [
        "run_id",
        "experiment_id",
        "status",
        "start_time",
        "metrics.mse",
        "tags.mlflow.source.type",
        "tags.mlflow.user",
        "tags.estimator_name",
        "tags.mlflow.rootRunId",
    ]
].copy()
runs_df.head()
Since this is a Pandas DataFrame, we can add columns that can be useful for analysis:
# Feature engineering to create some additional columns
runs_df["start_date"] = runs_df["start_time"].dt.date

runs_df["is_nested_parent"] = runs_df[["run_id", "tags.mlflow.rootRunId"]].apply(
    lambda x: 1 if x["run_id"] == x["tags.mlflow.rootRunId"] else 0, axis=1
)
runs_df["is_nested_child"] = runs_df[["run_id", "tags.mlflow.rootRunId"]].apply(
    lambda x: 1
    if x["tags.mlflow.rootRunId"] is not None
    and x["run_id"] != x["tags.mlflow.rootRunId"]
    else 0,
    axis=1,
)

runs_df
If we want to aggregate the result set to provide information for runs over time, we can use:
pd.DataFrame(runs_df.groupby("start_date")["run_id"].count()).reset_index()
The automatically logged tags.estimator_name field allows us to review how many runs have been recorded for each algorithm.
pd.DataFrame(runs_df.groupby("tags.estimator_name")["run_id"].count()).reset_index()
Since this is a DataFrame, we can also export the data for any reporting requirements, giving visibility to users who may not have access to the workspace, or to compare runs across workspaces.
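For example, a one-line export to CSV (the file name here is just illustrative):

# Export the run history for reporting outside the workspace
runs_df.to_csv("mlflow_run_history.csv", index=False)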
These are just a few examples of how to extend the use of MLflow functions and parameters in your experimentation process, but there are many more available in the Python API.
Hopefully, this post has inspired you to explore some of the available functions and parameters and see if they can benefit your model development process. For additional information, see the API documentation and experiment with different configurations to find the one that best suits your needs.
If you’re currently using any functions or parameters that I haven’t mentioned in this post, please let me know in the comments!
All the code can be found in my GitHub repository.