
Library for bayesian active learning.

Bayesian Active Learning (Baal)

BaaL is an active learning library developed at ElementAI. This repository contains techniques and reusable components to make active learning accessible for all.

Read the documentation at https://baal.readthedocs.io.

Our paper can be read on arXiv. It includes tips and tricks to make active learning usable in production.

You can also read our blog post.

Installation and requirements

BaaL requires Python>=3.6.

To install baal using pip: pip install baal

To install baal from source: pip install -e .

For requirements please see: requirements.txt.

What is Active Learning?

Active learning is a special case of machine learning in which a learning algorithm can interactively query the user (or some other information source) to obtain the desired outputs at new data points. To understand the concept in more depth, refer to our tutorial.

BaaL Framework

At the moment BaaL supports the following methods to perform active learning.

  • Monte-Carlo Dropout (Gal et al. 2015)
  • MCDropConnect (Mobiny et al. 2019)

Please see our Roadmap below.

The Monte-Carlo Dropout method is a known approximation for Bayesian neural networks. In this method, the dropout layers stay active at both training and test time. By running the model multiple times while randomly dropping units, we obtain a set of stochastic predictions, from which we calculate the uncertainty of the prediction using one of the uncertainty measurements in heuristics.py.
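To make the idea concrete, here is a minimal pure-Python sketch of Monte-Carlo Dropout on a toy linear classifier. It is not BaaL's implementation: the function names, the weights, and the default of 50 iterations are all assumptions chosen for illustration. It keeps dropout switched on at prediction time, collects several stochastic predictions, and derives two common uncertainty scores from them (predictive entropy and the BALD mutual-information score).

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def stochastic_forward(features, weights, drop_rate, rng):
    """One forward pass of a toy linear classifier with dropout left ON.

    Each input feature is zeroed with probability `drop_rate`; survivors
    are rescaled by 1 / (1 - drop_rate) (inverted dropout).
    """
    keep = 1.0 - drop_rate
    dropped = [x / keep if rng.random() < keep else 0.0 for x in features]
    logits = [sum(w * x for w, x in zip(row, dropped)) for row in weights]
    return softmax(logits)


def mc_dropout_uncertainty(features, weights, iterations=50, drop_rate=0.5, seed=0):
    """Return (mean prediction, predictive entropy, BALD score)."""
    rng = random.Random(seed)
    samples = [stochastic_forward(features, weights, drop_rate, rng)
               for _ in range(iterations)]
    n_classes = len(weights)
    mean = [sum(s[c] for s in samples) / iterations for c in range(n_classes)]
    predictive_entropy = entropy(mean)
    expected_entropy = sum(entropy(s) for s in samples) / iterations
    # BALD: entropy of the averaged prediction minus the average entropy
    # of the individual stochastic predictions.
    return mean, predictive_entropy, predictive_entropy - expected_entropy


# Hypothetical 2-class model over 3 features.
WEIGHTS = [[2.0, -1.0, 0.5], [-1.5, 1.0, 0.5]]
mean, pred_entropy, bald = mc_dropout_uncertainty([1.0, 1.0, 1.0], WEIGHTS)
print(mean, pred_entropy, bald)
```

A high BALD score means the stochastic passes disagree with each other, which is the signal an active learning heuristic uses to decide that an item is worth labelling. With `drop_rate=0.0` every pass is identical and the score collapses to zero.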

The framework consists of four main parts, as demonstrated in the flowchart below:

  • ActiveLearningDataset
  • Heuristics
  • ModelWrapper
  • ActiveLearningLoop

To get started, wrap your dataset in our ActiveLearningDataset class. This will ensure that the dataset is split into training and pool sets. The pool set represents the portion of the training set which is yet to be labelled.
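The training/pool split can be pictured with a small stand-in class. This is only a sketch of the concept, not the ActiveLearningDataset code: the class name `ToyActiveDataset` and its attributes are hypothetical, and only `label_randomly` mirrors a method name used in this README.

```python
import random


class ToyActiveDataset:
    """Hypothetical stand-in: tracks which items of a dataset are labelled."""

    def __init__(self, items):
        self.items = list(items)
        self.labelled = [False] * len(self.items)

    def __len__(self):
        # The "training set" is only what has been labelled so far.
        return sum(self.labelled)

    @property
    def pool_indices(self):
        """Indices (into the full dataset) of items still waiting for a label."""
        return [i for i, done in enumerate(self.labelled) if not done]

    @property
    def n_unlabelled(self):
        return len(self.pool_indices)

    def label(self, pool_positions):
        """Mark items as labelled, addressed by position within the pool."""
        pool = self.pool_indices  # snapshot, so positions stay consistent
        for pos in pool_positions:
            self.labelled[pool[pos]] = True

    def label_randomly(self, n, seed=None):
        """Label `n` pool items chosen uniformly at random."""
        rng = random.Random(seed)
        self.label(rng.sample(range(self.n_unlabelled), n))


dataset = ToyActiveDataset(range(100))
dataset.label_randomly(10, seed=0)
print(len(dataset), dataset.n_unlabelled)  # prints "10 90"
```

Addressing items by their position in the pool, rather than by their index in the full dataset, is a design choice made here so that a ranking computed over the pool can be passed straight to `label`.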

We provide a lightweight object, ModelWrapper, similar to keras.Model, to make it easier to train and test the model. If your model is not ready for active learning, we provide Modules to prepare it.

For example, the MCDropoutModule wrapper changes the existing dropout layers so that they are used at both training and inference time, and the ModelWrapper specifies the number of iterations to run at training and inference.

In conclusion, your script should be similar to this:

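# Import paths below assume the package layout of this release; adjust if yours differs.
from baal.active import ActiveLearningDataset, ActiveLearningLoop, heuristics
from baal.bayesian.dropout import MCDropoutModule
from baal.modelwrapper import ModelWrapper

# your_dataset, your_model, your_criterion, optimizer, use_cuda and the
# upper-case constants are placeholders that you define for your own experiment.
dataset = ActiveLearningDataset(your_dataset)
dataset.label_randomly(INITIAL_POOL)  # label some data
model = MCDropoutModule(your_model)  # keep dropout active at inference time
model = ModelWrapper(model, your_criterion)
active_loop = ActiveLearningLoop(dataset,
                                 get_probabilities=model.predict_on_dataset,
                                 heuristic=heuristics.BALD(shuffle_prop=0.1),
                                 ndata_to_label=NDATA_TO_LABEL)
for al_step in range(N_ALSTEP):
    # Train on the currently labelled data, then ask the loop to label more.
    model.train_on_dataset(dataset, optimizer, BATCH_SIZE, use_cuda=use_cuda)
    if not active_loop.step():
        # We're done!
        break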

For complete examples of how to write an active training process, see the scripts in experiments/. Generally, we use the ActiveLearningLoop provided at src/baal/active/active_loop.py. After every epoch (or every few epochs), this class gets the predictions on the unlabelled pool, ranks the pool items by their calculated uncertainty, and selects the next set of items to be labelled.
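The ranking step described above can be sketched in a few lines of plain Python. This illustrates the idea only and is not the code in active_loop.py: `active_step` and the hard-coded scores are hypothetical, and in a real run the scores would be recomputed from fresh model predictions after each round of training.

```python
def active_step(pool_indices, get_uncertainty, ndata_to_label):
    """Pick the `ndata_to_label` most uncertain pool items.

    Returns an empty list when the pool is exhausted, which the caller
    can treat as the signal to stop.
    """
    if not pool_indices:
        return []
    scored = [(get_uncertainty(i), i) for i in pool_indices]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # most uncertain first
    return [i for _, i in scored[:ndata_to_label]]


# Hypothetical uncertainty scores, standing in for a heuristic computed
# from the model's predictions on the pool.
scores = {0: 0.10, 1: 0.65, 2: 0.30, 3: 0.69, 4: 0.02}
pool = sorted(scores)
labelled = []
while True:
    chosen = active_step(pool, scores.get, ndata_to_label=2)
    if not chosen:
        break
    labelled.extend(chosen)  # in practice: obtain labels, then retrain
    pool = [i for i in pool if i not in chosen]
print(labelled)  # prints "[3, 1, 2, 0, 4]"
```

Items 3 and 1 are labelled first because they carry the highest scores; the loop ends once the pool is empty, mirroring the `if not active_loop.step(): break` pattern in the script above.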

Roadmap (Subject to change depending on the community.)

  • Initial FOSS release with MCDropout (Gal et al. 2015)
  • MCDropConnect (Mobiny et al. 2019)
  • Bayesian layers (Shridhar et al. 2019)
  • Unsupervised methods
  • NNGP (Panov et al. 2019)
  • SWAG (Maddox et al. 2019)

Re-run our Experiments

nvidia-docker build [--target base_baal] -t baal .
nvidia-docker run --rm baal python3 experiments/vgg_mcdropout_cifar10.py 

Use BaaL for YOUR Experiments

Simply clone the repo and create your own experiment script, similar to the example at experiments/vgg_experiment.py. Make sure to use the four main parts of the BaaL framework. Happy experimenting!

Dev install

Simply build the Dockerfile as below:

git clone git@github.com:ElementAI/baal.git
nvidia-docker build [--target base_baal] -t baal-dev .

Now you have all the requirements to start contributing to BaaL. YEAH!

Contributing!

To contribute, see CONTRIBUTING.md.

Who We Are!

"There is passion, yet peace; serenity, yet emotion; chaos, yet order."

At ElementAI, the BaaL team tests and implements the most recent papers on uncertainty estimation and active learning. The BaaL team is here to serve you!

How to cite

If you used BaaL in one of your projects, we would greatly appreciate it if you cited this library using this BibTeX entry:

@misc{atighehchian2019baal,
  title={BaaL, a bayesian active learning library},
  author={Atighehchian, Parmida and Branchaud-Charron, Frederic and Freyberg, Jan and Pardinas, Rafael and Schell, Lorne},
  year={2019},
  howpublished={\url{https://github.com/ElementAI/baal/}},
}

Licence

For information on the licence of this library, please read LICENCE.
