py-ciu

Explainable Machine Learning through Contextual Importance and Utility

The py-ciu library provides methods to generate post-hoc explanations for machine learning-based classifiers. It is model-agnostic and, given a classification decision, answers the following questions:

  • How important is a specific feature or feature combination for the classification decision? (Contextual Importance, CI)

  • How typical is a specific feature or feature combination for the given class? (Contextual Utility, CU)
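To make the two quantities concrete, the sketch below estimates CI and CU for a single feature of a toy scoring function. It follows the usual definitions from the CIU literature: CI is the range of outputs reachable by varying the feature in the current context, divided by the model's full output range, and CU is the position of the current output within that reachable range. The names `toy_model` and `ci_cu_for_feature`, as well as the grid sweep, are assumptions made for illustration; this is not py-ciu's internal implementation.

```python
def toy_model(age, income):
    """Hypothetical black box: returns an approval score in [0, 1]."""
    return 0.7 * min(income / 10000, 1.0) + 0.3 * min(age / 100, 1.0)


def ci_cu_for_feature(predict, case, feature, lo, hi, steps=100,
                      out_min=0.0, out_max=1.0):
    """Estimate CI and CU of `feature` for `case` by sweeping it over [lo, hi]."""
    outputs = []
    for i in range(steps + 1):
        probe = dict(case)  # copy the case, then vary only one feature
        probe[feature] = lo + (hi - lo) * i / steps
        outputs.append(predict(**probe))
    c_min, c_max = min(outputs), max(outputs)
    current = predict(**case)
    # CI: share of the full output range this feature can cover in this context.
    ci = (c_max - c_min) / (out_max - out_min)
    # CU: where the current output sits inside the reachable range.
    # It is undefined when the feature has no effect; 0.5 is a neutral
    # placeholder chosen for this sketch (an assumption).
    cu = (current - c_min) / (c_max - c_min) if c_max > c_min else 0.5
    return ci, cu


case = {"age": 40, "income": 5000}
ci, cu = ci_cu_for_feature(toy_model, case, "income", 0, 10000)
print(f"income: CI={ci:.2f}, CU={cu:.2f}")
```

A CI close to 1 means the feature alone can move the output across most of its range; a CU close to 1 means the current feature value is near the most favorable one for the class.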

Usage

Install py-ciu:

pip install py-ciu

Import the library:

from ciu import determine_ciu

For the sake of the example, let us also import a data generator, create a synthetic data set, and train a model:

from sklearn.ensemble import RandomForestClassifier
from ciu_tests.loan_data_generator import generate_data

data = generate_data()
train_data = data['train'][1]
test_data = data['test'][1]
# Drop the target column so only the encoded features remain.
test_data_encoded = test_data.drop(['approved'], axis=1)
random_forest = RandomForestClassifier(
    n_estimators=1000,
    random_state=42
)

# Separate the target labels from the training features, then fit the model.
labels = train_data[['approved']].values.ravel()
train_features = train_data.drop(['approved'], axis=1)
random_forest.fit(train_features, labels)

Then we classify the case we want to explain and determine the prediction index for the class we are interested in:

feature_names = [
    'age', 'assets', 'monthly_income', 'gender_female', 'gender_male',
    'gender_other', 'job_type_fixed', 'job_type_none', 'job_type_permanent'
]

case = test_data_encoded.values[0]
example_prediction = random_forest.predict([case])
example_prediction_prob = random_forest.predict_proba([case])
prediction_index = 0 if example_prediction[0] > 0.5 else 1

print(feature_names)
print(f'Case: {case}; Prediction {example_prediction}; Probability: {example_prediction_prob}')
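scikit-learn classifiers order the columns of `predict_proba` by their `classes_` attribute, so the column for a given class label can be looked up rather than hard-coded. The following is a minimal sketch; `class_index` is a hypothetical helper, and the plain list stands in for `random_forest.classes_` so that the snippet runs on its own.

```python
def class_index(classes, label):
    """Return the predict_proba column that corresponds to `label`."""
    return list(classes).index(label)


# Stand-in for `random_forest.classes_`; with a fitted model you would pass
# that attribute directly. The 0/1 label encoding is an assumption.
classes = [0, 1]
label_of_interest = 1

index_for_label = class_index(classes, label_of_interest)
print(index_for_label)
```

Looking the index up this way keeps the explanation tied to the intended class even if the label encoding or class order changes.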

Now, we can call py-ciu's determine_ciu function. The function takes the following parameters:

  • case_data: A dictionary that contains the data of the case.

  • predictor: The prediction function of the black-box model py-ciu should call.

  • min_maxs: A dictionary that maps each feature name (key) to a list holding the feature's minimum value, its maximum value, and a flag that indicates whether the feature must be an integer ('feature_name': [min, max, is_int]).

  • samples (optional): The number of samples py-ciu will generate. Defaults to 1000.

  • prediction_index (optional): In case the model returns several predictions, it is possible to provide the index of the relevant prediction. Defaults to None.

  • category_mapping (optional): A mapping of one-hot encoded categorical variables to lists of categories and category names. Defaults to None.

  • feature_interactions (optional): A list of {key: list} dictionaries, each naming a group of features whose interaction should be evaluated. Defaults to [].
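Writing the min_maxs dictionary by hand is tedious for wide data sets. The sketch below derives it from observed data in the ['feature_name': [min, max, is_int]] shape described above. The helper `infer_min_maxs` and the sample rows are hypothetical and not part of py-ciu; note that observed ranges may be narrower than the ranges a feature can validly take, so the result is best treated as a starting point.

```python
def infer_min_maxs(rows):
    """Build {'feature': [min, max, is_int]} from a list of feature dicts."""
    min_maxs = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        # Treat a feature as integer-valued only if every observed value is whole.
        is_int = all(float(v).is_integer() for v in values)
        min_maxs[name] = [min(values), max(values), is_int]
    return min_maxs


# Made-up rows for illustration only.
rows = [
    {"age": 25, "monthly_income": 1800.5},
    {"age": 61, "monthly_income": 7200.0},
    {"age": 43, "monthly_income": 3100.25},
]
print(infer_min_maxs(rows))
```

With a pandas DataFrame, the same rows could be obtained via `df.to_dict('records')`.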

We configure the CIU parameters and call the CIU function:

category_mapping = {
    'gender': ['gender_female', 'gender_male', 'gender_other'],
    'job_type': ['job_type_fixed', 'job_type_none', 'job_type_permanent']
}

feature_interactions = [{'assets_income': ['assets', 'monthly_income']}]

ciu = determine_ciu(
    test_data_encoded.iloc[0, :].to_dict(),
    random_forest.predict_proba,
    {
        'age': [20, 70, True],
        'assets': [-20000, 150000, True],
        'monthly_income': [0, 20000, True],
        'gender_female': [0, 1, True],
        'gender_male': [0, 1, True],
        'gender_other': [0, 1, True],
        'job_type_fixed': [0, 1, True],
        'job_type_none': [0, 1, True],
        'job_type_permanent': [0, 1, True]
    },
    1000,
    prediction_index,
    category_mapping,
    feature_interactions
)

The function returns a ciu object, from which we retrieve the CIU metrics:

print(ciu.ci, ciu.cu)
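For a quick overview, the metrics can be sorted so that the most important features come first. The sketch below assumes that `ciu.ci` and `ciu.cu` are dictionaries mapping feature names to floats, which this description does not state explicitly; `rank_features` is a hypothetical helper and the numbers are invented stand-ins, not output from a real model.

```python
def rank_features(ci, cu):
    """Return (feature, ci, cu) tuples sorted by descending CI."""
    return sorted(
        ((name, ci[name], cu[name]) for name in ci),
        key=lambda item: item[1],
        reverse=True,
    )


# Made-up values for illustration only; with py-ciu you would pass
# ciu.ci and ciu.cu here, assuming both are dicts keyed by feature name.
example_ci = {"age": 0.12, "assets": 0.48, "monthly_income": 0.35}
example_cu = {"age": 0.60, "assets": 0.25, "monthly_income": 0.90}

for name, ci_value, cu_value in rank_features(example_ci, example_cu):
    print(f"{name:>15}  CI={ci_value:.2f}  CU={cu_value:.2f}")
```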

We can also auto-generate CIU plots:

ciu.plot_ci()
ciu.plot_cu()

Moreover, we can generate textual explanations based on CIU:

print(ciu.text_explain())

Take a look at the examples directory to learn more.

License

The library is released under the BSD 2-Clause License.
