Calculate weight factors for survey data to approximate a representative sample
Weight Factors
Installation
pip install weightfactors
Alternatively, clone the repository and install from source:
git clone https://github.com/markteffect/weightfactors
cd weightfactors
poetry install
Usage
Currently, the package implements a generalized raking algorithm.
If you'd like to see support for other algorithms, please open an issue or submit a pull request.
Let's use the following dataset as an example:
import numpy as np
import pandas as pd

# Ten respondents: four male, six female
sample = pd.DataFrame(
    {
        "Gender": [
            "Male",
            "Male",
            "Female",
            "Female",
            "Female",
            "Male",
            "Female",
            "Female",
            "Male",
            "Female",
        ],
        "Score": [7.0, 6.0, 8.5, 7.5, 8.0, 5.0, 9.5, 8.0, 4.5, 8.5],
    }
)
This sample comprises 40% males and 60% females.
If we were to calculate the average score, we would get:
np.average(sample["Score"])
# 7.25
Now, assuming a 50/50 gender distribution in the population,
let's calculate weight factors to approximate the population distribution:
from weightfactors import GeneralizedRaker
raker = GeneralizedRaker({"Gender": {"Male": 0.5, "Female": 0.5}})
weights = raker.rake(sample)
# [1.25000008 1.25000008 0.83333334 0.83333334 0.83333334 1.25000008
# 0.83333334 0.83333334 1.25000008 0.83333334]
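With a single target variable, weights like these can be reproduced by hand: each respondent's weight is the target share of their category divided by that category's share in the sample (0.5 / 0.4 = 1.25 for males, 0.5 / 0.6 ≈ 0.8333 for females). The following is only a sanity check in plain pandas, not the package's implementation; the names `target`, `observed`, and `manual_weights` are illustrative.

```python
import numpy as np
import pandas as pd

# Same gender column as in the example dataset
sample = pd.DataFrame(
    {
        "Gender": [
            "Male", "Male", "Female", "Female", "Female",
            "Male", "Female", "Female", "Male", "Female",
        ]
    }
)

# Target population shares, as passed to the raker in the example
target = {"Male": 0.5, "Female": 0.5}

# Observed share of each category in the sample (0.4 male, 0.6 female)
observed = sample["Gender"].value_counts(normalize=True)

# Weight per respondent = target share / observed share of their category
manual_weights = sample["Gender"].map(lambda g: target[g] / observed[g]).to_numpy()

print(manual_weights.round(4))
```

The hand-computed weights average to 1 and agree with the raker's output above up to its numerical tolerance.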
Let's calculate the average score again, this time applying the weight factors:
np.average(sample["Score"], weights=weights)
# 6.9791666284520835
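The example above balances a single variable. With several target variables, raking adjusts the weights for one variable at a time and repeats until every margin matches its target. The sketch below shows classical raking (iterative proportional fitting) to illustrate the idea. It is not the package's generalized raking implementation, and the two may give different weights. The function `rake_ipf`, the `Age` column, and its target shares are hypothetical and exist only for this illustration.

```python
import numpy as np
import pandas as pd


def rake_ipf(df, targets, max_iter=100, tol=1e-8):
    """Classical raking: rescale weights per variable until all margins match."""
    weights = np.ones(len(df), dtype=float)
    for _ in range(max_iter):
        max_change = 0.0
        for column, shares in targets.items():
            # Current weighted share of each category in this column
            totals = pd.Series(weights).groupby(df[column].to_numpy()).sum()
            current = totals / weights.sum()
            # Scale every row by target share / current share of its category
            factors = df[column].map(lambda c: shares[c] / current[c]).to_numpy()
            max_change = max(max_change, np.abs(factors - 1.0).max())
            weights = weights * factors
        if max_change < tol:
            break
    # Normalise so that the weights average to 1
    return weights * len(df) / weights.sum()


# Hypothetical data: the gender column from above plus an invented age column
people = pd.DataFrame(
    {
        "Gender": ["Male", "Male", "Female", "Female", "Female",
                   "Male", "Female", "Female", "Male", "Female"],
        "Age": ["18-39", "40+", "18-39", "40+", "40+",
                "18-39", "18-39", "40+", "40+", "18-39"],
    }
)

# Hypothetical population shares for both variables
targets = {
    "Gender": {"Male": 0.5, "Female": 0.5},
    "Age": {"18-39": 0.4, "40+": 0.6},
}

w = rake_ipf(people, targets)

# Weighted share of each category after raking
for column in targets:
    print((pd.Series(w).groupby(people[column].to_numpy()).sum() / w.sum()).round(4))
```

After convergence, the weighted shares of both `Gender` and `Age` agree with the targets, which is the property any raking procedure aims for.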
For more detailed information and customization options, please refer to the docstrings.
Hashes for weightfactors-0.0.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 5ea19a8038bbf0f52e1a28c117b7adb6f74a97410c6ed3546beb1f33c12cfa10
MD5 | 809a7baf9f8e17c648af380d43e0ccb1
BLAKE2b-256 | 8ca4d6d7d14faf33b77a48b33d646002be4b434fd2bd26ef21d80adef64c3f1f