Core functionality for lightweight, collaborative data science projects
Project description
ballet
A lightweight framework for collaborative data science projects through feature engineering.
Ballet projects maintain a feature engineering pipeline invariant: at any point, the code and features within a project repository can be used for end-to-end feature engineering for a given dataset. To expand on an existing feature engineering pipeline, well-structured feature source code submissions can be proposed by contributors and extensively validated for compatibility and performance.
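The invariant can be illustrated with a minimal, standard-library sketch. It does not use Ballet's API: `run_pipeline`, `try_accept`, and the function-based features are hypothetical names chosen for illustration, and the only validation shown is "the extended pipeline still runs end to end", which is far weaker than the compatibility and performance checks described above.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]
# Hypothetical stand-in for a feature: a function mapping a raw row to one value.
FeatureFn = Callable[[Row], float]


def run_pipeline(features: List[FeatureFn], rows: List[Row]) -> List[List[float]]:
    """Apply every feature to every row, producing a feature matrix."""
    return [[feature(row) for feature in features] for row in rows]


def try_accept(features: List[FeatureFn], candidate: FeatureFn, rows: List[Row]) -> bool:
    """Add the candidate only if the extended pipeline still runs end to end."""
    try:
        run_pipeline(features + [candidate], rows)
    except Exception:
        return False  # the invariant would break, so the submission is rejected
    features.append(candidate)
    return True


rows = [{"height": 2.0, "width": 3.0}, {"height": 4.0, "width": 0.5}]
accepted: List[FeatureFn] = [lambda r: r["height"]]

# A compatible submission is accepted; one that needs a missing column is not.
ok_area = try_accept(accepted, lambda r: r["height"] * r["width"], rows)
ok_depth = try_accept(accepted, lambda r: r["depth"], rows)

# The accepted features can still be run end to end at any point.
matrix = run_pipeline(accepted, rows)
```

Because a rejected candidate is never appended, the list of accepted features remains usable for end-to-end feature engineering after every submission, whether it succeeded or failed.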
Ballet provides the following functionality:
- ballet-quickstart, a command to generate a new predictive modeling project that uses the Ballet framework
- Feature objects, which store feature metadata as well as a robust DelegatingRobustTransformer transformer pipeline built alongside the sklearn_pandas project
- ballet.eng, a library of versatile transformers and transformer building blocks for developing features that learn
- an extensive feature validation suite, which checks project structure and feature API adherence and runs a streaming logical feature selection algorithm
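To make the idea of a feature API check concrete, here is a small standard-library sketch. It is not Ballet's implementation: `ToyFeature`, `Doubler`, and `api_failures` are hypothetical names, and the two rules checked (a string input column and a transformer exposing scikit-learn style `fit` and `transform` methods) are assumptions about what such a check might cover.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class ToyFeature:
    """Hypothetical stand-in for a feature: input column, transformer, metadata."""
    input: str
    transformer: Any
    name: str = ""


class Doubler:
    """A transformer with the scikit-learn style fit/transform interface."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [2 * x for x in X]


def api_failures(feature: Any) -> List[str]:
    """Return human-readable API problems; an empty list means the check passed."""
    failures = []
    if not isinstance(getattr(feature, "input", None), str):
        failures.append("input must be a column name")
    transformer = getattr(feature, "transformer", None)
    for method in ("fit", "transform"):
        if not callable(getattr(transformer, method, None)):
            failures.append(f"transformer lacks {method}()")
    return failures


good = ToyFeature(input="height", transformer=Doubler(), name="double height")
bad = ToyFeature(input="height", transformer=object())

good_failures = api_failures(good)
bad_failures = api_failures(bad)
```

Returning a list of failures rather than a single boolean lets a validation report tell a contributor exactly which rule a submitted feature violated.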
Ballet is under active development; please report all bugs.
- Free software: MIT license
- Documentation: https://hdi-project.github.io/ballet
- Homepage: https://github.com/HDI-Project/ballet
History
0.5 (2018-10-14)
- Add project template and ballet-quickstart command
- Add project structure checks and feature API checks
- Implement multi-stage validation routine driver
0.4 (2018-09-21)
- Implement Modeler for versatile modeling and evaluation
- Change project name
0.3 (2018-04-28)
- Implement PullRequestFeatureValidator
- Add util.travis, util.modutil, util.git util modules
0.2
- Implement ArrayLikeEqualityTestingMixin
- Implement get_contrib_features
0.1
- First release on PyPI
Hashes for ballet-0.5.1-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 81a40a39ef8f5b115e23b3dfdb9e7a361f59e27dc7c1851f2b0005a6c0fdbad2
MD5 | e7c10acd0a41dbc7e26ea3473f8d6412
BLAKE2b-256 | c169762be4d5a7a05adf4bd3ea7a2ecba3553a38259a5faba963fef991f8e134