Beginner's Guide to Machine Learning Testing with DeepChecks

Image by Author | Canva

DeepChecks is a Python package that provides a wide variety of built-in checks for detecting problems with model performance, data distribution, data integrity, and more.

In this tutorial, we will learn about DeepChecks and use it to validate a dataset and test a trained machine learning model, generating a full report for each. We will also learn how to run a single check on a model instead of generating a full report.

Why do we need machine learning tests?

Machine learning testing is essential for ensuring the reliability, fairness, and security of AI models. It helps verify model performance, detect bias, harden models against adversarial attacks (especially large language models, or LLMs), ensure regulatory compliance, and enable continuous improvement. Tools like DeepChecks provide a comprehensive testing solution that covers AI and ML validation from research to production, which makes them valuable for building robust and reliable AI systems.

Getting started with DeepChecks

In this getting started guide, we will load the dataset and run a data integrity test. This step checks that our dataset is reliable and accurate before we train a model on it.

We will start by installing the DeepChecks Python package using the `pip` command.

!pip install deepchecks --upgrade

Import the essential Python packages.
Load the dataset using the pandas library. It consists of 569 samples and 30 features. The Cancer Classification dataset is derived from digitized images of fine needle aspirates (FNAs) of breast masses, where each feature represents a characteristic of the cell nuclei present in the image. These features allow us to predict whether the cancer is benign or malignant.
Split the dataset into training and testing sets, stratifying on the target column 'benign_0__mal_1'.

import pandas as pd
from sklearn.model_selection import train_test_split

# Load Data
cancer_data = pd.read_csv("/kaggle/input/cancer-classification/cancer_classification.csv")
label_col = "benign_0__mal_1"
# Stratify on the label so both splits keep the same benign/malignant ratio
df_train, df_test = train_test_split(cancer_data, stratify=cancer_data[label_col], random_state=0)
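
To see what the `stratify` argument does, here is a minimal plain-Python sketch of a stratified split on made-up labels. It illustrates the idea only and is not scikit-learn's implementation; the `stratified_split` helper and the toy labels are hypothetical.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, test_size=0.25, seed=0):
    """Return (train_idx, test_idx) so each class keeps about the same share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within each class, then cut
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Toy labels (made up): 60 benign (0) and 40 malignant (1) samples.
labels = [0] * 60 + [1] * 40
train_idx, test_idx = stratified_split(labels)

train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)
print(train_counts[1] / len(train_idx), test_counts[1] / len(test_idx))
```

Both splits end up with the same share of malignant samples, which is why label drift between the train and test sets should stay close to zero.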

Create the DeepChecks dataset by providing additional metadata. Since our dataset has no categorical features, we pass an empty list.

from deepchecks.tabular import Dataset

ds_train = Dataset(df_train, label=label_col, cat_features=[])
ds_test = Dataset(df_test, label=label_col, cat_features=[])

Run the data integrity suite on the training dataset.

from deepchecks.tabular.suites import data_integrity

integ_suite = data_integrity()
integ_suite.run(ds_train)

The report generation will take a few seconds.

The data integrity report contains test results on:

Feature-Feature Correlation
Feature-Label Correlation
Single Value in Column
Special Characters
Mixed Nulls
Mixed Data Types
String Mismatch
Data Duplicates
String Length Out of Bounds
Conflicting Labels
Outlier Sample Detection
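
To give a feel for what a few of these checks look for, here is a rough plain-Python sketch on a made-up table. It is a simplified illustration, not how DeepChecks implements its checks; the rows, column names, and helper functions are all hypothetical.

```python
from collections import Counter

# Made-up rows; the column names are illustrative, not from the cancer dataset.
rows = [
    {"radius": 14.2, "texture": "smooth", "unit": "mm"},
    {"radius": 20.1, "texture": "rough", "unit": "mm"},
    {"radius": 14.2, "texture": "smooth", "unit": "mm"},   # exact copy of the first row
    {"radius": "17.5", "texture": "rough", "unit": "mm"},  # number stored as a string
]
columns = list(rows[0])

def duplicate_ratio(rows):
    """Share of rows that exactly repeat an earlier row."""
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    return sum(n - 1 for n in counts.values()) / len(rows)

def single_value_columns(rows):
    """Columns that hold the same value in every row."""
    return [c for c in columns if len({row[c] for row in rows}) == 1]

def mixed_type_columns(rows):
    """Columns whose values have more than one Python type."""
    return [c for c in columns if len({type(row[c]) for row in rows}) > 1]

print(duplicate_ratio(rows))       # 0.25
print(single_value_columns(rows))  # ['unit']
print(mixed_type_columns(rows))    # ['radius']
```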

Testing machine learning models

Let's train our model and then run a model evaluation suite to learn more about the model's performance.

Import the essential Python packages.
Create three machine learning models (logistic regression, random forest classifier, and Gaussian Naive Bayes).
Combine them into an ensemble using the voting classifier.
Fit the ensemble model on the training dataset.

from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import VotingClassifier

# Train Model
clf1 = LogisticRegression(random_state=1,max_iter=10000)
clf2 = RandomForestClassifier(n_estimators=50, random_state=1)
clf3 = GaussianNB()

# Hard voting: each model casts one vote and the majority class wins
V_clf = VotingClassifier(
    estimators=[('lr', clf1), ('rf', clf2), ('gnb', clf3)],
    voting='hard')

V_clf.fit(df_train.drop(label_col, axis=1), df_train[label_col]);
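
As a rough illustration of what `voting='hard'` means, the sketch below takes a per-sample majority vote over made-up predictions from three models. The `hard_vote` helper and the prediction lists are hypothetical, and this is not scikit-learn's implementation. With three models and two classes a tie cannot occur, so tie-breaking is not handled here.

```python
from collections import Counter

def hard_vote(predictions_per_model):
    """Majority vote per sample; expects one list of predicted labels per model."""
    voted = []
    for sample_preds in zip(*predictions_per_model):
        # most_common(1) returns [(label, count)] for the most frequent label
        voted.append(Counter(sample_preds).most_common(1)[0][0])
    return voted

# Made-up predictions for four samples (0 = benign, 1 = malignant).
lr_preds = [0, 1, 1, 0]
rf_preds = [0, 1, 0, 0]
gnb_preds = [1, 1, 0, 0]

print(hard_vote([lr_preds, rf_preds, gnb_preds]))  # [0, 1, 0, 0]
```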

After the training phase is complete, run the DeepChecks model evaluation suite using the training and testing data sets and the model.

from deepchecks.tabular.suites import model_evaluation

evaluation_suite = model_evaluation()
suite_result = evaluation_suite.run(ds_train, ds_test, V_clf)
suite_result.show()

The model evaluation report contains test results on:

Unused Features: Train Dataset
Unused Features: Test Dataset
Train Test Performance
Prediction Drift
Simple Model Comparison
Model Inference Time: Train Dataset
Model Inference Time: Test Dataset
Confusion Matrix Report: Train Dataset
Confusion Matrix Report: Test Dataset

There are other checks available in the suite that did not run because of the ensemble model type. If you ran a simpler model, such as logistic regression, you would likely get a more complete report.

If you want to use a model evaluation report in a structured format, you can always use the `.to_json()` function to convert your report to JSON format.

Additionally, you can save this interactive report as a web page using the `.save_as_html()` function.

Running a single check

If you don't want to run the entire model evaluation suite, you can also test your model with a single check.

For example, you can check for label drift by providing the training and testing datasets.

from deepchecks.tabular.checks import LabelDrift
check = LabelDrift()
result = check.run(ds_train, ds_test)
result

As a result, you will get a distribution graph and a drift score.

You can also extract the drift score and the method used to compute it from the result.

{'Drift score': 0.0, 'Method': "Cramer's V"}
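
To make the 0.0 score more concrete, here is a minimal plain-Python sketch of Cramér's V computed between the split (train or test) and the label, using made-up label counts. It uses the plain, uncorrected formula; DeepChecks' own computation may differ in details such as bias correction. The `cramers_v` helper is hypothetical.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V between two categorical sequences."""
    n = len(xs)
    x_counts, y_counts = Counter(xs), Counter(ys)
    joint = Counter(zip(xs, ys))
    chi2 = 0.0
    for x, x_n in x_counts.items():
        for y, y_n in y_counts.items():
            expected = x_n * y_n / n  # count expected if split and label were independent
            chi2 += (joint[(x, y)] - expected) ** 2 / expected
    k = min(len(x_counts), len(y_counts)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Made-up labels: both splits are 60% benign (0) and 40% malignant (1).
train_labels = [0] * 45 + [1] * 30
test_labels = [0] * 15 + [1] * 10
split = ["train"] * len(train_labels) + ["test"] * len(test_labels)
same = cramers_v(split, train_labels + test_labels)

# A skewed test split (20% benign) gives a clearly non-zero score.
skewed_test = [0] * 5 + [1] * 20
drifted = cramers_v(split, train_labels + skewed_test)

print(round(same, 3), round(drifted, 3))  # 0.0 0.346
```

When the class proportions match in both splits, as they do after a stratified split, the score is zero. The more the label distribution differs between train and test, the closer the score gets to one.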

Conclusion

In this beginner's guide, we have learned how to generate data validation and machine learning evaluation reports using DeepChecks. If you are having trouble running the code, I suggest you take a look at the Machine Learning Testing with DeepChecks Kaggle Notebook and run it yourself.

The next step in your learning journey is to automate the machine learning testing process and track performance. You can do that with GitHub Actions by following the DeepChecks in CI/CD guide.

Abid Ali Awan (@1abidaliawan) is a certified professional data scientist who loves building machine learning models. Currently, he focuses on content creation and writing technical blogs on data science and machine learning technologies. Abid has a master's degree in technology management and a bachelor's degree in telecommunications engineering. His vision is to build an artificial intelligence product using a graph neural network for students struggling with mental illness.
