cat2cat

Unifying an inconsistently coded categorical variable in a panel/longitudinal dataset.

Installation

$ pip install cat2cat

Usage

For more examples and descriptions, please visit the example notebook.

Load example data

# cat2cat datasets
from cat2cat.datasets import load_trans, load_occup
trans = load_trans()
occup = load_occup()

Low-level functions

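# Low-level functions
from cat2cat.mappings import get_mappings, get_freqs, cat_apply_freq

# build the mappings between the two encodings from the transition table
mappings = get_mappings(trans)
# frequencies of the codes observed in the 2010 period
codes_new = occup.code[occup.year == 2010].values
freqs = get_freqs(codes_new)
# combine the "to_new" mapping with the observed frequencies
mapp_new_p = cat_apply_freq(mappings["to_new"], freqs)
# inspect both results for the code "3481"
mappings["to_new"]["3481"]
mapp_new_p["3481"]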
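
To make the idea behind these helpers concrete, here is a minimal pure-Python sketch of frequency-weighted mappings. It is not the library's implementation: the functions `build_mappings` and `apply_freqs`, the toy codes, and the equal-weight fallback for unobserved candidates are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical transition table: (old_code, new_code) pairs.
transitions = [
    ("1111", "111101"),
    ("1111", "111102"),
    ("1112", "111102"),
    ("1112", "111201"),
]


def build_mappings(pairs):
    """Collect the candidate codes in both directions (illustrative helper)."""
    to_new, to_old = defaultdict(list), defaultdict(list)
    for old, new in pairs:
        if new not in to_new[old]:
            to_new[old].append(new)
        if old not in to_old[new]:
            to_old[new].append(old)
    return {"to_new": dict(to_new), "to_old": dict(to_old)}


def apply_freqs(mapping, freqs):
    """Weight each candidate in proportion to how often it was observed."""
    probs = {}
    for code, candidates in mapping.items():
        counts = [freqs.get(c, 0) for c in candidates]
        total = sum(counts)
        if total == 0:
            # Assumption: if no candidate was observed, use equal weights.
            probs[code] = [1 / len(candidates)] * len(candidates)
        else:
            probs[code] = [n / total for n in counts]
    return probs


# Hypothetical codes observed in the new period.
observed_new = ["111101", "111102", "111102", "111102", "111201"]
freqs = Counter(observed_new)

maps = build_mappings(transitions)
probs = apply_freqs(maps["to_new"], freqs)

print(maps["to_new"]["1111"])  # candidate new codes for the old code "1111"
print(probs["1111"])           # their weights, which sum to 1
```

In this toy data the old code "1111" can become either "111101" or "111102", and the second candidate receives the larger weight because it appears more often in the new period.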

cat2cat function

from cat2cat import cat2cat
from cat2cat.dataclass import cat2cat_data, cat2cat_mappings, cat2cat_ml

from pandas import concat

o_old = occup.loc[occup.year == 2008, :].copy()
o_new = occup.loc[occup.year == 2010, :].copy()

# dataclasses are the core arguments of the cat2cat function
# arguments are passed positionally, in the order used by the original call
data = cat2cat_data(o_old, o_new, "code", "code", "year")
mappings = cat2cat_mappings(trans, "backward")

c2c = cat2cat(data, mappings)
# stack both periods into one dataset
data_final = concat([c2c["old"], c2c["new"]])
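
The following pure-Python sketch illustrates the replication idea, assuming the "backward" direction means that old-period rows are re-expressed in the new encoding. It does not reproduce the library's output: the `replicate` and `weighted_counts` helpers, the record fields, and the toy weights are hypothetical. Each old-period row is copied once per candidate new code, and each copy carries a weight so that the weights of one original row sum to 1.

```python
from collections import defaultdict

# Hypothetical old-period observations: (row id, old code).
old_rows = [(0, "1111"), (1, "1112"), (2, "1111")]

# Hypothetical candidate new codes and their weights per old code.
candidates = {"1111": ["111101", "111102"], "1112": ["111102", "111201"]}
weights = {"1111": [0.25, 0.75], "1112": [0.75, 0.25]}


def replicate(rows, candidates, weights):
    """Copy each row once per candidate new code and attach its weight."""
    out = []
    for row_id, code in rows:
        for new_code, w in zip(candidates[code], weights[code]):
            out.append(
                {"id": row_id, "old_code": code, "new_code": new_code, "weight": w}
            )
    return out


def weighted_counts(rows):
    """Sum the weights per new code instead of counting replicated rows."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["new_code"]] += r["weight"]
    return dict(totals)


replicated = replicate(old_rows, candidates, weights)
counts = weighted_counts(replicated)

print(len(replicated))       # more rows than the original three
print(counts)                # weighted number of observations per new code
print(sum(counts.values()))  # equals the original number of rows
```

Because the replicated dataset has more rows than the original, any statistic computed on it should use the weights. The weighted total then matches the original number of observations.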

Contributing

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

License

cat2cat was created by Maciej Nasinski. It is licensed under the terms of the MIT license.

Credits

cat2cat was created with cookiecutter and the py-pkgs-cookiecutter template.
