A very basic package providing some useful log transformations for scikit-learn
Scikit-transformers: Scikit-learn + custom transformers
About
A basic package that provides useful transformers for scikit-learn pipelines.
The first transformer implemented is LogTransformer, a simple wrapper around the NumPy log function.
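To make the idea concrete, here is a minimal sketch of what a log transformer built on the scikit-learn estimator API can look like. `SketchLogTransformer` is a hypothetical name used for illustration only. It is not the package's actual implementation, which may validate inputs or handle zeros and negative values differently.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class SketchLogTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical stand-in, NOT the package's actual LogTransformer."""

    def fit(self, X, y=None):
        # Nothing to learn: the log is a stateless, element-wise operation.
        return self

    def transform(self, X):
        # np.log applied to a DataFrame keeps its index and column labels.
        return np.log(X)


df = pd.DataFrame({"a": [1.0, 10.0, 100.0], "b": [2.0, 4.0, 8.0]})
out = SketchLogTransformer().fit_transform(df)
print(out)
```

Because `fit` returns `self` and `transform` takes only `X`, an object like this can be dropped into a scikit-learn `Pipeline` like any built-in transformer.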
Installation
Using the regular pip and venv tools:
python3 -m venv .venv
source .venv/bin/activate
pip install scikit-transformers
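After installing, it can be handy to confirm which version ended up in the active environment. The sketch below uses only the standard library. The helper name `installed_version` is an assumption made for this example and is not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="scikit-transformers"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not present in the current environment.
        return None


print(installed_version())
```

A result of `None` usually means the virtual environment was not activated before running `pip install`.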
Usage
For very basic usage:
import pandas as pd
from sktransf import LogTransformer

df = pd.DataFrame(
    {
        "a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        "b": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    }
)

# Fit the transformer, then apply it to the same data
logger = LogTransformer()
logger.fit_transform(df)
df_transf = logger.transform(df)
Using the common transformers:
import pandas as pd
from sktransf import LogTransformer, DropUniqueColumnTransformer, BoolColumnTransformer

df = pd.DataFrame(
    {
        "a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        "b": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    }
)

# Each transformer can be applied on its own
df_bool = BoolColumnTransformer().fit_transform(df)
df_unique = DropUniqueColumnTransformer().fit_transform(df)
df_logged = LogTransformer().fit_transform(df)
Using a pipeline:
import pandas as pd
from sklearn.pipeline import Pipeline
from sktransf import LogTransformer, DropUniqueColumnTransformer, BoolColumnTransformer

# The steps run in the order they are listed
pipe = Pipeline([
    ('bool', BoolColumnTransformer()),
    ('unique', DropUniqueColumnTransformer()),
    ('log', LogTransformer()),
])

df = pd.DataFrame(
    {
        "a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        "b": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    }
)

df_transf = pipe.fit_transform(df)
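For readers who want to see how a pipeline chains its steps without installing this package, the sketch below builds one from scikit-learn's own `FunctionTransformer`. The two steps (`shift` and `log`) are stand-ins chosen for illustration and are not the package's transformers.

```python
import numpy as np
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer

# Stand-in steps built from scikit-learn only (assumption: these are NOT
# the package's transformers, just two simple element-wise functions).
pipe = Pipeline([
    ("shift", FunctionTransformer(lambda X: X + 1)),
    ("log", FunctionTransformer(np.log)),
])

df = pd.DataFrame({"a": [0, 1, 2], "b": [3, 4, 5]})

# fit_transform runs the steps in order: first X + 1, then log of that result
df_transf = pipe.fit_transform(df)

# Applying the same functions by hand gives the reference result
expected = np.log(df + 1)
print(np.allclose(np.asarray(df_transf), np.asarray(expected)))
```

Individual steps stay accessible by name through `pipe.named_steps`, which helps when inspecting an intermediate transformer.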
Using a pipeline with a scikit-learn model:
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LinearRegression
from sktransf import LogTransformer, DropUniqueColumnTransformer, BoolColumnTransformer

# The transformers preprocess X before it reaches the final estimator
pipe = Pipeline([
    ('bool', BoolColumnTransformer()),
    ('unique', DropUniqueColumnTransformer()),
    ('log', LogTransformer()),
    ('model', LinearRegression()),
])

X = pd.DataFrame(
    {
        "a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        "b": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    }
)
y = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

pipe.fit(X, y)
y_pred = pipe.predict(X)
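To illustrate why a log step can matter before a linear model, here is a small sketch that uses scikit-learn's `FunctionTransformer(np.log)` as a stand-in for a log transformer. The data is synthetic and constructed for this example: the target is exactly linear in `log(a)`, so the fitted coefficients can be checked by eye.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer

X = pd.DataFrame({"a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})
# Synthetic target, exactly linear in log(a): y = 3 * log(a) + 2
y = 3.0 * np.log(X["a"]) + 2.0

pipe = Pipeline([
    ("log", FunctionTransformer(np.log)),  # stand-in, not the package's class
    ("model", LinearRegression()),
])
pipe.fit(X, y)

y_pred = pipe.predict(X)
r2 = pipe.score(X, y)  # coefficient of determination on the training data
model = pipe.named_steps["model"]
print(model.coef_, model.intercept_, r2)
```

On this constructed data the model should recover a slope close to 3 and an intercept close to 2. Real data will rarely be this clean.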
Documentation
For more specific information, please refer to the notebooks:
- Pipelines notebook
- BoolColumnTransformer notebook
- DropUniqueColumnTransformer notebook
- LogColumnTransformer notebook
Complete documentation is available on the GitHub page.
Changelog, Releases and Roadmap
Please refer to the changelog page for more information.
Contributing
Pull requests are welcome.
For major changes, please open an issue first to discuss what you would like to change.
For more information, please refer to the contributing page.
License
Hashes for scikit_transformers-0.2.1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 3a7925fa06a636e4b3d42b78054ddb448a34ac674464773239a91af038ff1d3a
MD5 | dd5c0d742fdbd922df6a65fd1023192a
BLAKE2b-256 | b3fc4cdedb1bb00888fdf80a51fa3b62e61b9237a326fc095c7c4360ad398489
Hashes for scikit_transformers-0.2.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 2cfdd2465edf024262f9a7c93350a75a39985fdc157c38895536b76cca2ac578
MD5 | 3a7444c1d03349b98534ee5bb076d7e7
BLAKE2b-256 | 32aca91be379f9692aa5fe31de9190c763b6a3062b6079655912357b615b0989
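The published digests can be used to check a downloaded file. The sketch below computes a SHA256 digest with the standard library. The helper names `sha256_of` and `matches` are assumptions made for this example, and the file is assumed to have been downloaded manually beforehand.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and spaces."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For instance, with the source distribution saved in the current directory, `matches("scikit_transformers-0.2.1.tar.gz", "3a7925fa06a636e4b3d42b78054ddb448a34ac674464773239a91af038ff1d3a")` would be expected to return `True` for an intact download.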