# Calibrated Explanations

Extract calibrated explanations from machine learning models.
`calibrated-explanations` is a Python package for the Calibrated Explanations method, supporting both classification and regression. The proposed method is based on Venn-Abers (classification) and Conformal Predictive Systems (regression) and has the following characteristics:
- Fast, reliable, stable and robust feature importance explanations.
- Calibration of the underlying model to ensure that predictions reflect reality.
- Uncertainty quantification of the prediction from the underlying model and the feature importance weights.
- Rules with straightforward interpretation in relation to the feature weights.
- Possibility to generate counterfactual rules with uncertainty quantification of the expected predictions.
- Conjunctive rules conveying the joint contribution of features.
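To make the points about uncertainty quantification and rules concrete, here is an illustrative sketch of what a factual rule with an uncertainty interval conceptually contains. The field names are assumptions chosen for illustration, not the library's actual data structure:

```python
# Illustrative only: a factual rule with an uncertainty interval around its
# feature weight. Field names are hypothetical, not the library's API.
rule = {
    "rule": "income > 50000",   # condition satisfied by the explained instance
    "weight": 0.18,             # calibrated feature importance
    "weight_low": 0.12,         # lower bound of the uncertainty interval
    "weight_high": 0.24,        # upper bound of the uncertainty interval
}

# The interval conveys how much the weight itself can be trusted.
assert rule["weight_low"] <= rule["weight"] <= rule["weight_high"]
```

A narrow interval means the feature's contribution is estimated reliably; a wide one signals that the weight should be interpreted with caution.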
## Install

First, you need a Python environment with pip installed. `calibrated-explanations` can then be installed from PyPI:

```shell
pip install calibrated-explanations
```

The required dependencies are installed automatically by pip.
## Getting started

The notebooks folder contains a number of notebooks illustrating different use cases for `calibrated-explanations`.

Let us illustrate how to use `calibrated_explanations` to generate explanations from a classifier trained on a dataset from www.openml.org. We first split the data into a training and a test set using `train_test_split` from `sklearn`, and then further split the training set into a proper training set and a calibration set:
```python
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split

dataset = fetch_openml(name="qsar-biodeg")

X = dataset.data.values.astype(float)
y = dataset.target.values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y)
X_prop_train, X_cal, y_prop_train, y_cal = train_test_split(X_train, y_train,
                                                            test_size=0.25)
```
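The two chained splits determine the relative sizes of the three sets. Assuming, say, a 20% test split followed by a 25% calibration split of the remaining training data, the proportions work out as follows (pure-Python arithmetic, just to make the bookkeeping explicit):

```python
n = 1000                      # total number of instances (example figure)
n_test = int(n * 0.2)         # 20% held out for testing -> 200
n_train = n - n_test          # 800 remain for training
n_cal = int(n_train * 0.25)   # 25% of the training data for calibration -> 200
n_prop = n_train - n_cal      # proper training set -> 600

print(n_prop, n_cal, n_test)  # 600 200 200
```

In other words, the model is fitted on 60% of the data, calibrated on 20%, and evaluated on the remaining 20%.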
We now fit a model on our data:

```python
from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(n_jobs=-1)
rf.fit(X_prop_train, y_prop_train)
```
Let's extract explanations for our test set using `calibrated_explanations`:

```python
from calibrated_explanations import CalibratedExplainer, __version__

print(__version__)

explainer = CalibratedExplainer(rf, X_cal, y_cal)

if __version__ >= "0.0.8":
    factual_explanations = explainer.get_factuals(X_test)
else:
    factual_explanations = explainer(X_test)
```
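One caveat about version checks like the one above: comparing version strings lexicographically breaks down once a component reaches two digits (e.g. `"0.0.10"` compares as less than `"0.0.8"`). A safer sketch, using only the standard library, compares parsed tuples instead:

```python
def parse_version(v: str) -> tuple:
    """Parse a dotted version string into a tuple of ints for safe comparison."""
    return tuple(int(part) for part in v.split("."))

# Lexicographic string comparison gets multi-digit components wrong:
assert "0.0.10" < "0.0.8"

# Tuple comparison orders versions correctly:
assert parse_version("0.0.10") > parse_version("0.0.8")
```

For production code, `packaging.version.Version` handles the full PEP 440 version grammar.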
Once we have the explanations, we can plot them using `plot_regular` or `plot_uncertainty`. You can also add and remove conjunctive rules.

```python
factual_explanations.plot_regular()
factual_explanations.plot_uncertainty()

factual_explanations.add_conjunctive_factual_rules().plot_regular()
factual_explanations.remove_conjunctive_rules().plot_regular()
```
An alternative to factual rules is to extract counterfactual rules. From version 0.0.8, `get_counterfactuals` can be called to get counterfactual rules with an appropriate discretizer automatically assigned. Otherwise, first change the discretizer to `entropy` (for classification) and then call the `CalibratedExplainer` object as above.
```python
if __version__ >= "0.0.8":
    counterfactual_explanations = explainer.get_counterfactuals(X_test)
else:
    explainer.set_discretizer('entropy')
    counterfactual_explanations = explainer(X_test)
```
Counterfactuals are visualized using `plot_counterfactuals`. Adding or removing conjunctions is done as before.

```python
counterfactual_explanations.plot_counterfactuals()

counterfactual_explanations.add_conjunctive_counterfactual_rules().plot_counterfactuals()
counterfactual_explanations.remove_counterfactual_rules().plot_counterfactuals()
```
`calibrated_explanations` supports multiclass classification, which is demonstrated in demo_multiclass. That notebook also demonstrates how feature names as well as target and categorical labels can be added to improve interpretability.
Extracting explanations for regression is very similar to how it is done for classification.

```python
dataset = fetch_openml(name="house_sales", version=3)

X = dataset.data.values.astype(float)
y = dataset.target.values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
X_prop_train, X_cal, y_prop_train, y_cal = train_test_split(X_train, y_train,
                                                            test_size=0.25)
```
Let us now fit a `RandomForestRegressor` from `sklearn` to the proper training set:

```python
from sklearn.ensemble import RandomForestRegressor

rf = RandomForestRegressor()
rf.fit(X_prop_train, y_prop_train)
```
Define a `CalibratedExplainer` object using the new model and data. The `mode` parameter must be explicitly set to `'regression'`. Regular and uncertainty plots work in the same way as for classification.

```python
explainer = CalibratedExplainer(rf, X_cal, y_cal, mode='regression')

if __version__ >= '0.0.8':
    factual_explanations = explainer.get_factuals(X_test)
else:
    factual_explanations = explainer(X_test)
```
```python
factual_explanations.plot_regular()
factual_explanations.plot_uncertainty()

factual_explanations.add_conjunctive_factual_rules().plot_regular()
factual_explanations.remove_conjunctive_rules().plot_regular()
```
From version 0.0.8, `get_counterfactuals` works exactly the same as for classification. Otherwise, the discretizer must be set explicitly, and the `decile` discretizer is recommended. Counterfactual plots work in the same way as for classification.

```python
if __version__ >= '0.0.8':
    counterfactual_explanations = explainer.get_counterfactuals(X_test)
else:
    explainer.set_discretizer('decile')
    counterfactual_explanations = explainer(X_test)

counterfactual_explanations.plot_counterfactuals()

counterfactual_explanations.add_conjunctive_counterfactual_rules().plot_counterfactuals()
counterfactual_explanations.remove_counterfactual_rules().plot_counterfactuals()
```
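The `decile` discretizer recommended above bins a numeric feature at its decile boundaries. A minimal pure-Python sketch of the idea (not the library's implementation):

```python
import statistics

# 100 evenly spread values; their decile boundaries split them into 10 bins.
values = list(range(1, 101))
edges = statistics.quantiles(values, n=10)  # 9 interior cut points

print(len(edges))  # 9
assert edges == sorted(edges)  # boundaries are monotonically increasing
```

Conditions such as "feature <= first decile" or "feature > ninth decile" then give the counterfactual rules a fixed, interpretable granularity for numeric features.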
Regression offers many more options; to learn more about them, see the demo_regression or demo_probabilistic_regression notebooks.
## Development

This project has tests that can be executed using `pytest`. Just run the following command from the project root:

```shell
pytest
```
## Further reading

The calibrated-explanations library is based on the paper "Calibrated Explanations: with Uncertainty Information and Counterfactuals" by Helena Löfström, Tuwe Löfström, Ulf Johansson and Cecilia Sönströd. If you would like to cite this work, please cite the above paper.