Bayesian Tuning and Bandits


An open source project from the Data to AI Lab at MIT.

A simple, extensible backend for developing auto-tuning systems.

License: MIT
Development Status: Pre-Alpha
Documentation: https://HDI-Project.github.io/BTB
Homepage: https://github.com/HDI-Project/BTB

Overview

BTB ("Bayesian Tuning and Bandits") is a simple, extensible backend for developing auto-tuning systems such as AutoML systems. It provides an easy-to-use interface for tuning models and selecting between models.

It is currently being used in several AutoML systems:

ATM, a distributed, multi-tenant AutoML system for classifier tuning
MIT's system for the DARPA Data-driven discovery of models (D3M) program
AutoBazaar, a flexible, general-purpose AutoML system


Install

Requirements

BTB has been developed and tested on Python 3.5, 3.6 and 3.7.

Although it is not strictly required, using a virtualenv is highly recommended in order to avoid interfering with other software installed on the system where BTB is run.

Install with pip

The easiest and recommended way to install BTB is using pip:

pip install baytune

This will pull and install the latest stable release from PyPI.

If you want to install from source or contribute to the project, please read the Contributing Guide.

Quickstart

In this short tutorial we will guide you through the necessary steps to get started using BTB to select between models and tune a model to solve a Machine Learning problem.

In particular, in this example we will use BTBSession to solve the Wine classification problem by selecting between the DecisionTreeClassifier and the SGDClassifier models from scikit-learn while also searching for their best hyperparameter configuration.

Prepare a scoring function

The first step in order to use the BTBSession class is to develop a scoring function.

This is a Python function that, given a model name and a hyperparameter configuration, evaluates the performance of the model on your data and returns a score.

from sklearn.datasets import load_wine
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier


dataset = load_wine()
models = {
    'DTC': DecisionTreeClassifier,
    'SGDC': SGDClassifier,
}

def scoring_function(model_name, hyperparameter_values):
    model_class = models[model_name]
    model_instance = model_class(**hyperparameter_values)
    scores = cross_val_score(
        estimator=model_instance,
        X=dataset.data,
        y=dataset.target,
        scoring=make_scorer(f1_score, average='macro')
    )
    return scores.mean()

Define the tunable hyperparameters

The second step is to define the hyperparameters that we want to tune for each model as Tunables.

from btb.tuning import Tunable
from btb.tuning import hyperparams as hp

tunables = {
    'DTC': Tunable({
        'max_depth': hp.IntHyperParam(min=3, max=200),
        'min_samples_split': hp.FloatHyperParam(min=0.01, max=1)
    }),
    'SGDC': Tunable({
        'max_iter': hp.IntHyperParam(min=1, max=5000, default=1000),
        'tol': hp.FloatHyperParam(min=1e-3, max=1, default=1e-3),
    })
}
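To make the meaning of these ranges concrete, here is a minimal sketch of drawing one random configuration from a search space like the 'DTC' one above. It uses only the standard library and a hypothetical plain-dict description of the space (`SEARCH_SPACE`, `sample_configuration`); it is not how BTB represents or samples a Tunable internally.

```python
import random

# Hypothetical plain-dict mirror of the 'DTC' search space above.
# Each entry is (kind, minimum, maximum); this is not a BTB object.
SEARCH_SPACE = {
    'max_depth': ('int', 3, 200),
    'min_samples_split': ('float', 0.01, 1.0),
}


def sample_configuration(space, rng):
    """Draw one value per hyperparameter, uniformly within its bounds."""
    config = {}
    for name, (kind, low, high) in space.items():
        if kind == 'int':
            # randint includes both end points
            config[name] = rng.randint(low, high)
        else:
            config[name] = rng.uniform(low, high)
    return config


rng = random.Random(0)
config = sample_configuration(SEARCH_SPACE, rng)
print(config)
```

A dictionary of this shape is what the scoring function receives as hyperparameter_values and unpacks into the model constructor.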

Start the searching process

Once you have defined a scoring function and the tunable hyperparameter specification of your models, you can start searching for the best model and hyperparameter configuration by using btb.BTBSession.

All you need to do is create an instance, passing the tunable hyperparameter specification and the scoring function.

from btb import BTBSession

session = BTBSession(
    tunables=tunables,
    scorer=scoring_function
)

And then call the run method indicating how many tunable iterations you want the BTBSession to perform:

best_proposal = session.run(20)

The result will be a dictionary indicating the name of the best model that could be found and the hyperparameter configuration that was used:

{
    'id': '826aedc2eff31635444e8104f0f3da43',
    'name': 'DTC',
    'config': {
        'max_depth': 21,
        'min_samples_split': 0.044010284821858835
    },
    'score': 0.907229308339589
}
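The select-then-tune loop that produces such a result can be pictured with the following standard-library sketch. It is an illustration under stated assumptions, not BTBSession's implementation: the UCB1 rule, the random sampling of configurations, and the names `ucb1_select`, `run_session`, `samplers` and `toy_scorer` are all hypothetical stand-ins chosen for this example.

```python
import math
import random


def ucb1_select(scores_by_model):
    """Pick the model with the highest UCB1 value; untried models go first."""
    total = sum(len(scores) for scores in scores_by_model.values())
    best_name, best_value = None, float('-inf')
    for name, scores in scores_by_model.items():
        if not scores:
            return name
        mean = sum(scores) / len(scores)
        bonus = math.sqrt(2 * math.log(total) / len(scores))
        if mean + bonus > best_value:
            best_name, best_value = name, mean + bonus
    return best_name


def run_session(samplers, scorer, iterations, seed=0):
    """Alternate between choosing a model and scoring one configuration for it."""
    rng = random.Random(seed)
    scores_by_model = {name: [] for name in samplers}
    best = None
    for _ in range(iterations):
        name = ucb1_select(scores_by_model)
        config = samplers[name](rng)
        score = scorer(name, config)
        scores_by_model[name].append(score)
        if best is None or score > best['score']:
            best = {'name': name, 'config': config, 'score': score}
    return best


# Toy stand-ins for two tunable models and a scoring function.
samplers = {
    'A': lambda rng: {'x': rng.uniform(0, 1)},
    'B': lambda rng: {'x': rng.uniform(0, 1)},
}


def toy_scorer(name, config):
    # Model 'A' can reach 1.0 at x = 0.5, while model 'B' is capped at 0.5.
    peak = 1.0 if name == 'A' else 0.5
    return peak - (config['x'] - 0.5) ** 2


best = run_session(samplers, toy_scorer, iterations=20)
print(best)
```

In this toy setup every score of model 'A' is higher than any score of model 'B', so the best proposal always names 'A'.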

How does BTB perform?

We have a comprehensive benchmarking framework that we use to evaluate the performance of our Tuners. For every release, we benchmark against hundreds of challenges, comparing tuners against each other in terms of number of wins. The leaderboard from the latest release is presented below:

Number of Wins on latest Version

tuner	with ties	without ties
`Ax.optimize`	220	32
`BTB.GCPEiTuner`	139	2
`BTB.GCPTuner`	252	90
`BTB.GPEiTuner`	208	16
`BTB.GPTuner`	213	24
`BTB.UniformTuner`	177	1
`HyperOpt.tpe`	186	6
`SMAC.HB4AC`	180	4
`SMAC.SMAC4HPO_EI`	220	31
`SMAC.SMAC4HPO_LCB`	205	16
`SMAC.SMAC4HPO_PI`	221	35

Detailed results from which this summary emerged are available here.
If you want to compare your own tuner, follow the steps in our benchmarking framework here.
If you have a proposal for a tuner that we should include in our benchmarking, get in touch with us at dailabmit@gmail.com.
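As a rough illustration of how a table of wins can be derived, the sketch below counts wins per tuner from per-challenge scores. It assumes that a win "with ties" means reaching the best score on a challenge, shared or not, and that a win "without ties" means being the only tuner to reach it; that reading, the function name `count_wins`, and the toy data are assumptions made for this example, not taken from the benchmarking framework.

```python
def count_wins(results):
    """Count wins per tuner; results maps challenge -> {tuner: best score}."""
    tuners = {tuner for scores in results.values() for tuner in scores}
    wins = {tuner: {'with ties': 0, 'without ties': 0} for tuner in tuners}
    for scores in results.values():
        top = max(scores.values())
        winners = [tuner for tuner, score in scores.items() if score == top]
        for tuner in winners:
            wins[tuner]['with ties'] += 1
        if len(winners) == 1:
            # A sole winner also counts in the stricter column.
            wins[winners[0]]['without ties'] += 1
    return wins


# Made-up scores for three hypothetical tuners on three challenges.
results = {
    'challenge_1': {'tuner_a': 0.90, 'tuner_b': 0.90, 'tuner_c': 0.85},
    'challenge_2': {'tuner_a': 0.70, 'tuner_b': 0.75, 'tuner_c': 0.60},
    'challenge_3': {'tuner_a': 0.80, 'tuner_b': 0.80, 'tuner_c': 0.80},
}
wins = count_wins(results)
for tuner in sorted(wins):
    print(tuner, wins[tuner])
```

Under this reading, the "with ties" column can sum to more than the number of challenges, which is consistent with the large "with ties" counts in the table.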

Citing BTB

If you use BTB, please consider citing the following paper:

@article{smith2019mlbazaar,
  author = {Smith, Micah J. and Sala, Carles and Kanter, James Max and Veeramachaneni, Kalyan},
  title = {The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development},
  journal = {arXiv e-prints},
  year = {2019},
  eid = {arXiv:1905.08942},
  pages = {arXiv:1905.08942},
  archivePrefix = {arXiv},
  eprint = {1905.08942},
}

History

0.3.12 - 2020-09-08

In this release BTB includes two new tuners, GCP and GCPEi, which use a GaussianProcessRegressor meta-model from sklearn.gaussian_process, applying copulas.univariate.Univariate transformations to the input data and reverting them afterwards for the predictions.
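The transform-then-revert idea can be sketched with the standard library alone. The example below swaps in a rank-based empirical CDF in place of the fitted copulas.univariate.Univariate distributions, so it only illustrates the principle and is not the GCP tuners' code; the function names are hypothetical and the sample is assumed to contain distinct values.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def to_normal_scores(values):
    """Map each value to a standard-normal quantile via its empirical CDF rank."""
    ordered = sorted(values)
    n = len(ordered)
    # (rank + 0.5) / n keeps every probability strictly inside (0, 1).
    return [STANDARD_NORMAL.inv_cdf((ordered.index(v) + 0.5) / n) for v in values]


def from_normal_scores(z_values, reference):
    """Revert normal scores to the original scale using the reference sample."""
    ordered = sorted(reference)
    n = len(ordered)
    restored = []
    for z in z_values:
        probability = STANDARD_NORMAL.cdf(z)
        index = min(n - 1, max(0, round(probability * n - 0.5)))
        restored.append(ordered[index])
    return restored


values = [0.1, 5.0, 0.3, 100.0]
z_values = to_normal_scores(values)
restored = from_normal_scores(z_values, values)
print(z_values)
print(restored)
```

The heavily skewed input becomes a symmetric set of normal scores that preserves the original ordering, and the inverse step recovers the original values.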

Resolved Issues

Issue #15: Implement a GaussianCopulaProcessRegressor.
Issue #205: Separate datasets from MLChallenge.
Issue #208: Implement GaussianCopulaProcessMetaModel.

0.3.11 - 2020-06-12

With this release we fix the AX.optimize tuning function by casting the values of the hyperparameters to the type of value that they represent.
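A fix of this kind can be pictured as follows. This is a minimal sketch with hypothetical names (`HYPERPARAMETER_TYPES`, `cast_configuration`), assuming an optimizer that returns every value as a generic number; it is not the code used in the benchmark.

```python
# Hypothetical mapping from hyperparameter name to the Python type it represents.
HYPERPARAMETER_TYPES = {'max_iter': int, 'tol': float}


def cast_configuration(config, types):
    """Return a copy of config with each value cast to its declared type."""
    return {name: types[name](value) for name, value in config.items()}


# An optimizer may hand back an integer hyperparameter as a float, or vice versa.
raw = {'max_iter': 1000.0, 'tol': 1}
config = cast_configuration(raw, HYPERPARAMETER_TYPES)
print(config)
```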

Resolved Issues

Issue #201: Fix AX.optimize malfunction.

0.3.10 - 2020-05-29

With this release we integrate a new tuning library, SMAC, with our benchmarking process. A new leaderboard including this library has been generated. The following two tuners from this library have been added:

SMAC4HPO: Bayesian optimization using a Random Forest model of pyrfr.
HB4AC: Uses Successive Halving for proposals.
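For readers unfamiliar with Successive Halving, the following is a generic sketch of the technique, not SMAC's implementation. The function names, the `eta` reduction factor and the toy objective are assumptions made for this example.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Keep the best 1/eta of the candidates each round while growing the budget."""
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget), reverse=True)
        survivors = ranked[:max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0]


def toy_evaluate(config, budget):
    # Stand-in objective: best at 0.3; the penalty shrinks as the budget grows.
    return -(config - 0.3) ** 2 - 0.01 / budget


candidates = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
winner = successive_halving(candidates, toy_evaluate)
print(winner)
```

Eight candidates are reduced to four, two and finally one, with the evaluation budget doubling at each round.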

Internal improvements

Renamed btb_benchmark/tuners to btb_benchmark/tuning_functions.
Ready to use tuning functions from btb_benchmark/tuning_functions.

Resolved Issues

Issue #195: Integrate SMAC for benchmarking.

0.3.9 - 2020-05-18

With this release we integrate a new tuning library, Ax, with our benchmarking process. A new leaderboard including this library has been generated.

Resolved Issues

Issue #194: Integrate Ax for benchmarking.

0.3.8 - 2020-05-08

This version adds new functionality that allows running the benchmarking framework on a Kubernetes cluster. This way the benchmarking process can be executed in a distributed fashion, which reduces the time necessary to generate a new leaderboard.

Internal improvements

btb_benchmark.kubernetes.run_dask_function: Run dask function inside a pod using the given config.
btb_benchmark.kubernetes.run_on_kubernetes: Start a Dask Cluster using dask-kubernetes and run a function.
Documentation updated.
Jupyter notebooks with examples on how to run the benchmarking process and how to run it on kubernetes.

0.3.7 - 2020-04-15

This release brings a new benchmark framework with a public leaderboard. As part of our benchmarking efforts we will run the framework at every release and make the results public. In each run we compare BTB to other tuners and optimizer libraries. We are constantly adding new libraries for comparison. If you have suggestions for a tuner library we should include in our comparison, please contact us via email at dailabmit@gmail.com.

Resolved Issues

Issue #159: Implement more MLChallenges and generate a public leaderboard.
Issue #180: Update BTB Benchmarking module.
Issue #182: Integrate HyperOPT with benchmarking.
Issue #184: Integrate dask to benchmarking.

0.3.6 - 2020-03-04

This release improves BTBSession error handling and allows Tunables with cardinality equal to 1 to be scored with BTBSession. We also provide new documentation for this version of BTB.

Internal Improvements

Improved documentation, unittests and integration tests.

Resolved Issues

Issue #164: Improve documentation for v0.3.5+.
Issue #166: Wrong error raised by BTBSession on too many errors.
Issue #170: Tuner has no scores attribute until record is run once.
Issue #175: BTBSession crashes when record is not performed.
Issue #176: BTBSession fails to select a proper Tunable when normalized_scores becomes None.

0.3.5 - 2020-01-21

With this release we improve BTBSession by making private the attributes that are not intended to be accessed or modified by the user, and by improving its documentation.

Internal Improvements

Improved docstrings, unittests and public interface of BTBSession.

Resolved Issues

Issue #162: Fix session with the given comments on PR 156.

0.3.4 - 2019-12-24

With this release we introduce a BTBSession class. This class represents the process of selecting and tuning several tunables until the best possible configuration for a specific scorer is found. We have also improved the code and fixed some minor bugs (described in the issues below).

New Features

BTBSession that makes BTB more user friendly.

Internal Improvements

Improved unittests, removed old dependencies, added more MLChallenges and fixed an issue with the bound methods.

Resolved Issues

Issue #145: Implement BTBSession.
Issue #155: Set default to None for CategoricalHyperParam is not possible.
Issue #157: Metamodel _MODEL_KWARGS_DEFAULT becomes mutable.
Issue #158: Remove mock dependency from the package.
Issue #160: Add more Machine Learning Challenges and more estimators.

0.3.3 - 2019-12-11

Fix a bug where creating an instance of Tuner ends in an error.

Internal Improvements

Improve unittests to use spec_set in order to detect errors while mocking an object.

Resolved Issues

Issue #153: Bug with tuner logger message that avoids creating the Tuner.

0.3.2 - 2019-12-10

With this release we add the new benchmark challenge MLChallenge which allows users to perform benchmarking over datasets with machine learning estimators, and also some new features to make the workflow easier.

New Features

New MLChallenge challenge that allows performing cross-validation over datasets and machine learning estimators.
New from_dict function for the Tunable class in order to instantiate it from a dictionary that contains information about hyperparameters.
New default value for each hyperparameter type.

Resolved Issues

Issue #68: Remove btb.tuning.constants module.
Issue #120: Tuner repr not helpful.
Issue #121: HyperParameter repr not helpful.
Issue #141: Implement proper logging to the tuning section.
Issue #150: Implement Tunable from_dict.
Issue #151: Add default value for hyperparameters.
Issue #152: Support None as a choice in CategoricalHyperParameters.

0.3.1 - 2019-11-25

With this release we introduce a benchmark module for BTB which allows the users to perform a benchmark over a series of challenges.

New Features

New benchmark module.
New submodule named challenges to work together with the benchmark module.

Resolved Issues

Issue #139: Implement a Benchmark for BTB

0.3.0 - 2019-11-11

With this release we introduce an improved BTB that has a major reorganization of the project with emphasis on an easier way of interacting with BTB and an easy way of developing, testing and contributing new acquisition functions, metamodels, tuners and hyperparameters.

New project structure

The major reorganization comes with the btb.tuning module. This module provides everything needed for the tuning process and comes with three new additions: Acquisition, Metamodel and Tunable. There is also an update to the Hyperparameters and Tuners. These changes are meant to help developers and contributors to easily develop, test and contribute new Tuners.

New API

There is a slightly different way of using BTB, as the new Tunable class is introduced and is meant to be the only object required to instantiate a Tuner. This Tunable class represents a collection of HyperParams that need to be tuned as a whole, at once. Now, in order to create a Tuner, a Tunable instance must be created first with the hyperparameters of the objective function.

New Features

New Hyperparameters that allow an easier interaction for the final user.
New Tunable class that manages a collection of Hyperparameters.
New Tuner class that is a Python mixin that requires Acquisition and Metamodel as parents. It also now works with a single Tunable object.
New Acquisition class, meant to implement an acquisition function to be inherited by a Tuner.
New Metamodel class, meant to implement everything that a certain model needs and to be inherited by the Tuner.
Reorganization of the selection module to follow a similar API to tuning.
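The mixin arrangement described in this list can be pictured with the sketch below. All class and method names are hypothetical and the logic is deliberately simplistic (a nearest-neighbour "metamodel" over a fixed list of numeric candidates); it shows the composition pattern only and does not reproduce BTB's classes.

```python
class NearestNeighborMetamodel:
    """Hypothetical metamodel: predicts the score of the closest recorded trial."""

    def _fit(self, trials, scores):
        self._trials = list(trials)
        self._scores = list(scores)

    def _predict(self, candidates):
        predictions = []
        for candidate in candidates:
            nearest = min(range(len(self._trials)),
                          key=lambda i: abs(self._trials[i] - candidate))
            predictions.append(self._scores[nearest])
        return predictions


class HighestPredictionAcquisition:
    """Hypothetical acquisition: pick the candidate with the best prediction."""

    def _acquire(self, candidates, predictions):
        best = max(range(len(candidates)), key=lambda i: predictions[i])
        return candidates[best]


class SketchTuner(NearestNeighborMetamodel, HighestPredictionAcquisition):
    """Tuner assembled from the two mixins above."""

    def __init__(self, candidates):
        self._candidates = list(candidates)
        self._fit([], [])

    def record(self, trial, score):
        # Refit the metamodel on all trials seen so far.
        self._fit(self._trials + [trial], self._scores + [score])

    def propose(self):
        if not self._trials:
            return self._candidates[0]
        predictions = self._predict(self._candidates)
        return self._acquire(self._candidates, predictions)


tuner = SketchTuner(candidates=[1, 2, 3, 4, 5])
tuner.record(1, 0.2)
tuner.record(5, 0.9)
proposal = tuner.propose()
print(proposal)
```

Swapping either parent class changes the tuner's behaviour without touching the propose and record logic, which is the benefit of building a tuner from separate acquisition and metamodel parts.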

Resolved Issues

Issue #131: Reorganize the project structure.
Issue #133: Implement Tunable class to control a list of hyperparameters.
Issue #134: Implementation of Tuners for the new structure.
Issue #140: Reorganize selectors.

0.2.5

Bug Fixes

Issue #115: HyperParameter subclass instantiation not working properly

0.2.4

Internal Improvements

Issue #62: Test for None in HyperParameter.cast instead of HyperParameter.__init__

Bug fixes

Issue #98: Categorical hyperparameters do not support None as input
Issue #89: Fix the computation of avg_rewards in BestKReward

0.2.3

Bug Fixes

Issue #84: Error in GP tuning when only one parameter is present
Issue #96: Fix pickling of HyperParameters
Issue #98: Fix implementation of the GPEi tuner

0.2.2

Internal Improvements

Updated documentation

Bug Fixes

Issue #94: Fix unicode param_type caused error on python 2.

0.2.1

Bug fixes

Issue #74: ParamTypes.STRING tunables do not work

0.2.0

New Features

New Recommendation module
New HyperParameter types
Improved documentation and examples
Fully tested Python 2.7, 3.4, 3.5 and 3.6 compatibility
HyperParameter copy and deepcopy support
Replace print statements with logging

Internal Improvements

Integrated with Travis-CI
Exhaustive unit testing
New implementation of HyperParameter
Tuner builds a grid of real values instead of indices
Resolve Issue #29: Make args explicit in __init__ methods
Resolve Issue #34: make all imports explicit

Bug Fixes

Fix error from mixing string/numerical hyperparameters
Inverse transform for categorical hyperparameter returns single item

0.1.2

Issue #47: Add missing requirements in v0.1.1 setup.py
Issue #46: Error on v0.1.1: 'GP' object has no attribute 'X'

0.1.1

First release.



