Very basic package providing some useful log transformations for scikit-learn


Scikit-transformers: Scikit-learn + Custom transformers

About

Basic package providing useful transformers for scikit-learn pipelines.

The first transformer implemented is LogTransformer, a simple wrapper around the NumPy log function.
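To illustrate what "a simple wrapper around the NumPy log function" can look like, here is a minimal sketch of a scikit-learn compatible transformer. The class name `SketchLogTransformer` is hypothetical and this is not the package's actual implementation; it only assumes the usual `fit`/`transform` contract and the natural logarithm.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class SketchLogTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in, NOT the package's LogTransformer."""

    def fit(self, X, y=None):
        # Nothing to learn: the log transform is stateless.
        return self

    def transform(self, X):
        # Element-wise natural log; inputs are assumed strictly positive.
        return np.log(np.asarray(X, dtype=float))


X = np.array([[1.0, 10.0], [np.e, 100.0]])
X_logged = SketchLogTransformer().fit_transform(X)
```

Because the class inherits from `TransformerMixin`, `fit_transform` comes for free, so a transformer of this shape can be used as a step in a scikit-learn `Pipeline`.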

Installation

Using the regular pip and venv tools:

python3 -m venv .venv
source .venv/bin/activate
pip install scikit-transformers

Usage

For very basic usage:

import pandas as pd

from sktransf import LogTransformer

df = pd.DataFrame(
    { "a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
      "b": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    }
)

logger = LogTransformer()
# Fit on df and transform it in a single call
df_transf = logger.fit_transform(df)

Using the common transformers:

import pandas as pd

from sktransf import LogTransformer, DropUniqueColumnTransformer, BoolColumnTransformer

df = pd.DataFrame(
    { "a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
      "b": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    }
)

df_bool = BoolColumnTransformer().fit_transform(df)
df_unique = DropUniqueColumnTransformer().fit_transform(df)
df_logged = LogTransformer().fit_transform(df)

Using a pipeline:

import pandas as pd
from sklearn.pipeline import Pipeline

from sktransf import LogTransformer, DropUniqueColumnTransformer, BoolColumnTransformer

pipe = Pipeline([
    ('bool', BoolColumnTransformer()),
    ('unique', DropUniqueColumnTransformer()),
    ('log', LogTransformer())
])

df = pd.DataFrame(
    { "a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
      "b": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    }
)

df_transf = pipe.fit_transform(df)

Using a pipeline with a scikit-learn model:

import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LinearRegression

from sktransf import LogTransformer, DropUniqueColumnTransformer, BoolColumnTransformer

pipe = Pipeline([
    ('bool', BoolColumnTransformer()),
    ('unique', DropUniqueColumnTransformer()),
    ('log', LogTransformer()),
    ('model', LinearRegression())
])

X = pd.DataFrame(
    { "a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
      "b": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    }
)

y = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

pipe.fit(X, y)

y_pred = pipe.predict(X)
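As a worked illustration of why a log step can help a linear model, the sketch below uses only scikit-learn built-ins. `FunctionTransformer(np.log)` is a stand-in for a log step, not the package's LogTransformer, and the toy data (a target that is linear in log(x)) is an assumption made for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer

# Toy data (assumed): y depends linearly on log(x), y = 2 * log(x) + 1
X = np.array([[1.0], [2.0], [4.0], [8.0], [16.0]])
y = 2.0 * np.log(X[:, 0]) + 1.0

pipe = Pipeline([
    ("log", FunctionTransformer(np.log)),  # built-in stand-in for a log step
    ("model", LinearRegression()),
])

pipe.fit(X, y)
y_pred = pipe.predict(X)

# The fitted linear model sees log(x), so it can recover slope and intercept.
coef = pipe.named_steps["model"].coef_[0]
intercept = pipe.named_steps["model"].intercept_
```

After the log step the relationship is exactly linear, so the regression should fit this toy data up to floating-point error.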

Documentation

For more specific information, please refer to the notebooks.

Complete documentation is available on the GitHub page.

Changelog, Releases and Roadmap

Please refer to the changelog page for more information.

Contributing

Pull requests are welcome.

For major changes, please open an issue first to discuss what you would like to change.

For more information, please refer to the contributing page.

License

GPLv3


Release history

0.3.2: Feb 12, 2024
0.3.1: Feb 9, 2024
0.2.1: Feb 8, 2024 (this version)
0.2.0: Feb 8, 2024
0.1.0: Jan 26, 2024

Download files

Source Distribution: scikit_transformers-0.2.1.tar.gz (17.3 kB), uploaded Feb 8, 2024

Built Distribution: scikit_transformers-0.2.1-py3-none-any.whl (18.8 kB), uploaded Feb 8, 2024, Python 3

Hashes for scikit_transformers-0.2.1.tar.gz
Algorithm	Hash digest
SHA256	`3a7925fa06a636e4b3d42b78054ddb448a34ac674464773239a91af038ff1d3a`
MD5	`dd5c0d742fdbd922df6a65fd1023192a`
BLAKE2b-256	`b3fc4cdedb1bb00888fdf80a51fa3b62e61b9237a326fc095c7c4360ad398489`

Hashes for scikit_transformers-0.2.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2cfdd2465edf024262f9a7c93350a75a39985fdc157c38895536b76cca2ac578`
MD5	`3a7444c1d03349b98534ee5bb076d7e7`
BLAKE2b-256	`32aca91be379f9692aa5fe31de9190c763b6a3062b6079655912357b615b0989`