🌿 Trava (originally stood for TrainValidation)

A framework that helps you train models, compare them, and track parameters and metrics along the way. Works with tabular data only.

pip install trava

Why

When experimenting with data and models, notebooks quickly become messy and unreliable. When solving a problem, we usually focus on a fixed set of metrics and want to compare models against each other on them. This library tries to provide a unified interface for this and related tasks.

Another important thing is experiment tracking. Trava helps you track all model parameters as well as metrics. You can subclass TravaTracker to support your own tracking system. Trava currently ships with a ready-to-use MLFlowTracker.
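To make the idea of tracking concrete, here is a minimal sketch of what a tracker records: parameters and metrics keyed by model id. The class InMemoryTracker and its method names are hypothetical and are not Trava's TravaTracker interface; check the Trava source for the methods a real subclass has to implement. The metric values are made up for illustration.

```python
from collections import defaultdict


class InMemoryTracker:
    """Hypothetical tracker that stores params and metrics per model id.

    This is NOT Trava's TravaTracker interface; all names here are assumptions.
    """

    def __init__(self):
        self.params = {}
        self.metrics = defaultdict(dict)

    def track_params(self, model_id, params):
        # Copy so later mutation of the caller's dict does not alter the record.
        self.params[model_id] = dict(params)

    def track_metric(self, model_id, name, value):
        self.metrics[model_id][name] = float(value)

    def summary(self):
        # One entry per model: its parameters alongside its metrics.
        return {
            model_id: {"params": self.params.get(model_id, {}), "metrics": dict(m)}
            for model_id, m in self.metrics.items()
        }


tracker = InMemoryTracker()
tracker.track_params("xgb", {"max_depth": 3})
tracker.track_metric("xgb", "recall", 0.8)      # illustrative value
tracker.track_metric("xgb", "precision", 0.75)  # illustrative value
print(tracker.summary())
```

A tracker backed by a real tracking system would forward the same two kinds of records to that system instead of keeping them in memory.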

How

You specify which metrics to calculate and how the results should be presented. Then you run Trava with a model of your choice and its parameters. The fit and predict process is customizable as well. See the examples/ directory for details. For now, only sklearn-style models are supported (those with fit, predict, and predict_proba methods).
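As a rough illustration of what "sklearn-style" means, the sketch below defines a toy classifier exposing fit, predict, and predict_proba. MajorityClassifier is a hypothetical class written for this example; it only shows the method shape, and whether Trava accepts it as-is has not been verified here.

```python
from collections import Counter


class MajorityClassifier:
    """Toy sklearn-style classifier: always predicts the most frequent training label."""

    def fit(self, X, y):
        counts = Counter(y)
        self.classes_ = sorted(counts)
        total = len(y)
        # Class frequencies double as the predicted probabilities.
        self.priors_ = [counts[c] / total for c in self.classes_]
        self.majority_ = max(self.classes_, key=lambda c: counts[c])
        return self  # sklearn convention: fit returns the estimator itself

    def predict(self, X):
        # One label per input row.
        return [self.majority_ for _ in X]

    def predict_proba(self, X):
        # One probability vector per input row, ordered like classes_.
        return [list(self.priors_) for _ in X]


model = MajorityClassifier().fit([[0], [1], [2], [3]], [0, 1, 1, 1])
print(model.predict([[5], [6]]))
print(model.predict_proba([[5]]))
```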

Example

Note: See examples/Basics.ipynb for an introductory tour.

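# Third-party imports used below. The Trava imports (sk, MetricsDictHandler,
# DataSplitConfig, BasicSplitLogic, Splitter, TravaSV) are not shown in this snippet;
# see examples/Basics.ipynb for them.
import pandas as pd
import xgboost as xgb
from sklearn.metrics import recall_score, precision_score

# which metrics to calculate. sk(...) wraps an sklearn metric; custom metrics are supported as well.
scorers = [sk(recall_score), sk(precision_score)]

# how to show the metrics. In this case a dictionary with metric values will be returned
dict_handler = MetricsDictHandler(scorers=scorers)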

# prepare data
df = pd.read_csv('...')

split_config = DataSplitConfig(split_logic=BasicSplitLogic(shuffle=True),
                               target_col_name='target',
                               test_size=0.3)
# just splits data into Train/Test
split_result = Splitter.split(df=df, config=split_config)

# initialize Trava
trava = TravaSV(results_handlers=[dict_handler])

# get your results
trava.fit_predict(raw_split_data=split_result, 
                  model_id='xgb',  # uniquely identifies your model
                  model_type=xgb.XGBClassifier,  # what model to run
                  model_init_params={'max_depth': 3})  # parameters to init model with

# then go on playing with other models 
...
# call this to get all previous results at once
trava.results 
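
The example above mentions that custom metrics are supported. Here is a minimal sketch of one, assuming custom metrics follow sklearn's (y_true, y_pred) calling convention. The function specificity_score and its negative_label parameter are hypothetical, and how such a function is registered with Trava is not shown here; see the examples/ directory for that.

```python
def specificity_score(y_true, y_pred, negative_label=0):
    """True negative rate: share of actual negatives that were predicted negative."""
    true_neg = sum(
        1 for t, p in zip(y_true, y_pred)
        if t == negative_label and p == negative_label
    )
    actual_neg = sum(1 for t in y_true if t == negative_label)
    # Return 0.0 when there are no negatives, to avoid dividing by zero.
    return true_neg / actual_neg if actual_neg else 0.0


y_true = [0, 0, 0, 1, 1]
y_pred = [0, 1, 0, 1, 0]
# Two of the three actual negatives are predicted as negative.
print(specificity_score(y_true, y_pred))
```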

Prerequisites

pandas
numpy
Python 3.7 (the library was written with this version; the true minimum version is not yet confirmed)

It's also convenient to use the library with sklearn (e.g. to take metrics from there). A couple of extensions are also based on sklearn classes.
