clana

clana is a toolkit for classifier analysis. One key contribution of clana is Confusion Matrix Ordering (CMO) as explained in chapter 5 of Analysis and Optimization of Convolutional Neural Network Architectures. It is a technique that can be applied to any multi-class classifier and helps to understand which groups of classes are most similar.
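To make the idea of ordering a confusion matrix more concrete, here is a minimal sketch. It is not clana's implementation. It assumes one plausible objective (confusions weighted by their distance from the diagonal) and a plain random-swap hill climb; the names `ordering_score`, `apply_permutation` and `reorder_by_swaps` are hypothetical.

```python
import numpy as np


def ordering_score(cm):
    """Sum of all entries weighted by |row - column| (assumed objective)."""
    idx = np.arange(cm.shape[0])
    weights = np.abs(idx[:, None] - idx[None, :])
    return int((cm * weights).sum())


def apply_permutation(cm, perm):
    """Reorder rows and columns of cm with the same class permutation."""
    perm = np.asarray(perm)
    return cm[np.ix_(perm, perm)]


def reorder_by_swaps(cm, steps=1000, seed=0):
    """Hill climbing: try random class swaps, keep those that lower the score."""
    rng = np.random.default_rng(seed)
    n = cm.shape[0]
    perm = np.arange(n)
    best = ordering_score(cm)
    for _ in range(steps):
        i, j = rng.choice(n, size=2, replace=False)
        candidate = perm.copy()
        candidate[[i, j]] = candidate[[j, i]]
        score = ordering_score(apply_permutation(cm, candidate))
        if score < best:
            perm, best = candidate, score
    return perm.tolist(), best


# Toy matrix: classes 0/2 and classes 1/3 are confused with each other.
cm = np.array([
    [50, 0, 9, 0],
    [0, 50, 0, 8],
    [9, 0, 50, 0],
    [0, 8, 0, 50],
])
start = ordering_score(cm)
perm, best = reorder_by_swaps(cm)
print("start score:", start)
print("best score:", best, "permutation:", perm)
```

After reordering, classes that are confused with each other sit next to each other, so the confusions form blocks close to the diagonal.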

Installation

The recommended way to install clana is:

$ pip install clana --user

If you want the latest version:

$ git clone https://github.com/MartinThoma/clana.git; cd clana
$ pip install -e . --user

Usage

$ clana --help
Usage: clana [OPTIONS] COMMAND [ARGS]...

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  distribution   Get the distribution of classes in a dataset.
  get-cm         Calculate the confusion matrix (CSV inputs).
  get-cm-simple  Calculate the confusion matrix (one label per...
  visualize      Optimize confusion matrix.

The visualize command gives you images like this:

Figure: confusion matrix of the WiLI-2018 dataset after Confusion Matrix Ordering.

MNIST example

$ cd docs/
$ python mnist_example.py  # creates `train-pred.csv` and `test-pred.csv`
$ clana get-cm --gt gt-train.csv  --predictions train-pred.csv --n 10
2019-09-14 09:47:30,655 - root - INFO - cm was written to 'cm.json'
$ clana visualize --cm cm.json --zero_diagonal
Score: 13475
2019-09-14 09:49:41,593 - root - INFO - n=10
2019-09-14 09:49:41,593 - root - INFO - ## Starting Score: 13475.00
2019-09-14 09:49:41,594 - root - INFO - Current: 13060.00 (best: 13060.00, hot_prob_thresh=100.0000%, step=0, swap=False)
[...]
2019-09-14 09:49:41,606 - root - INFO - Current: 9339.00 (best: 9339.00, hot_prob_thresh=100.0000%, step=238, swap=False)
Score: 9339
Perm: [0, 6, 5, 8, 3, 2, 1, 7, 9, 4]
2019-09-14 09:49:41,639 - root - INFO - Classes: [0, 6, 5, 8, 3, 2, 1, 7, 9, 4]
Accuracy: 93.99%
2019-09-14 09:49:41,725 - root - INFO - Save figure at '/home/moose/confusion_matrix.tmp.pdf'
2019-09-14 09:49:41,876 - root - INFO - Found threshold for local connection: 398
2019-09-14 09:49:41,876 - root - INFO - Found 9 clusters
2019-09-14 09:49:41,877 - root - INFO - silhouette_score=-0.012313948323292875
    1: [0]
    1: [6]
    1: [5]
    1: [8]
    1: [3]
    1: [2]
    1: [1]
    2: [7, 9]
    1: [4]
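
As a rough illustration of what goes into such a run, the following sketch builds a confusion matrix from ground-truth and predicted labels, derives the accuracy, and zeroes the diagonal. It is not clana's code. The function names are hypothetical, rows are assumed to be true classes and columns predicted classes, and the reading of --zero_diagonal (dropping correct predictions so that only confusions remain) is an assumption.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Count how often true class t (row) was predicted as class p (column)."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


def accuracy(cm):
    """Fraction of samples on the diagonal, i.e. predicted correctly."""
    return np.trace(cm) / cm.sum()


def zero_diagonal(cm):
    """Return a copy with the correct predictions removed."""
    out = cm.copy()
    np.fill_diagonal(out, 0)
    return out


y_true = [0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 0, 2]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
print(cm.tolist())
print(f"Accuracy: {accuracy(cm):.2%}")
print(zero_diagonal(cm).tolist())
```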

Label Manipulation

Prepare a labels.csv file, which must have a header row, and pass it with --labels:

$ clana visualize --cm cm.json --zero_diagonal --labels mnist/labels.csv

Data distribution

$ clana distribution --gt gt.csv --labels labels.csv [--out out/] [--long]

prints one line per label, e.g.

60% cat (56789 elements)
20% dog (12345 elements)
 5% mouse (1337 elements)
 1% tux (314 elements)

If --out is specified, it creates a horizontal bar chart with the bars sorted from the most common class to the least common.

It uses the short labels unless --long is added to the command.
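
The per-label lines can be reproduced in a few lines of Python. This is only a sketch of the computation, not clana's code; `distribution_lines` is a hypothetical helper and the labels are made up.

```python
from collections import Counter


def distribution_lines(labels):
    """Return one 'percent label (count elements)' line per class, most common first."""
    counts = Counter(labels)
    total = sum(counts.values())
    lines = []
    for label, count in counts.most_common():
        percent = 100 * count / total
        lines.append(f"{percent:2.0f}% {label} ({count} elements)")
    return lines


labels = ["cat"] * 12 + ["dog"] * 6 + ["mouse"] * 2
for line in distribution_lines(labels):
    print(line)
```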

Metrics

$ clana metrics --gt gt.csv --preds preds.csv

prints the following metrics, one per line:

  • Line 1: Accuracy
  • Line 2: Precision
  • Line 3: Recall
  • Line 4: F1-Score
  • Line 5: Mean accuracy
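
For orientation, here is a minimal sketch of how the first four metrics could be computed from ground-truth and predicted labels. It is not clana's code. The macro averaging over classes is an assumption, since the averaging mode is not stated here, and `macro_metrics` is a hypothetical name. "Mean accuracy" is left out because it is not defined here.

```python
import numpy as np


def macro_metrics(y_true, y_pred, n_classes):
    """Accuracy plus macro-averaged precision, recall and F1 (assumed averaging)."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1  # rows: true class, columns: predicted class
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)  # how often each class was predicted
    actual = cm.sum(axis=1)     # how often each class actually occurs
    # Classes that never occur (or are never predicted) get a score of 0.
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return {
        "accuracy": tp.sum() / cm.sum(),
        "precision": precision.mean(),
        "recall": recall.mean(),
        "f1": f1.mean(),
    }


y_true = [0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 0, 2]
metrics = macro_metrics(y_true, y_pred, n_classes=3)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```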

Visualizations

See visualizations

Development

Run the tests with tox.
