A package for automatic clustering hyperparameter optimization
Hypercluster
A package for clustering optimization with sklearn.
Requirements:
pandas
numpy
scipy
matplotlib
seaborn
scikit-learn
hdbscan
Optional: snakemake
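To confirm that the requirements above are available before installing, a quick check along these lines may help. This is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical and not part of hypercluster. Note that scikit-learn is imported as `sklearn`.

```python
from importlib.util import find_spec

# Import names for the requirements listed above. scikit-learn is imported
# as `sklearn`; the other packages import under the names shown in the list.
REQUIRED = ["pandas", "numpy", "scipy", "matplotlib", "seaborn", "sklearn", "hdbscan"]
OPTIONAL = ["snakemake"]


def missing_modules(names):
    """Return the entries of `names` that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


# Report anything that still needs to be installed.
print("Missing required packages:", missing_modules(REQUIRED))
print("Missing optional packages:", missing_modules(OPTIONAL))
```

An empty list for the required packages suggests the environment is ready for the install step.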
Install
pip install hypercluster
or
conda install -c bioconda hypercluster
There are currently issues with the bioconda install on Linux. If you run into problems, use the pip install instead.
Docs
https://hypercluster.readthedocs.io/en/latest/index.html
Examples
https://github.com/liliblu/hypercluster/tree/dev/examples
Quickstart example
import pandas as pd
from sklearn.datasets import make_blobs
import hypercluster

# Generate toy data and wrap it in pandas objects that share an index
data, labels = make_blobs()
data = pd.DataFrame(data)
labels = pd.Series(labels, index=data.index, name='labels')

# With a single clustering algorithm
clusterer = hypercluster.utilities.AutoClusterer()
clusterer.fit(data).evaluate(
    methods=hypercluster.constants.need_ground_truth + hypercluster.constants.inherent_metrics,
    gold_standard=labels
)
hypercluster.visualize.visualize_evaluations(clusterer.evaluation_, multiple_clusterers=False)

# With a range of algorithms
# (optimize_clustering is assumed to live in hypercluster.utilities, next to AutoClusterer)
evals, labels_df, labels_dict = hypercluster.utilities.optimize_clustering(data)
hypercluster.visualize.visualize_evaluations(evals)
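For intuition about what a hyperparameter sweep like the one above involves, here is a minimal sketch written against scikit-learn directly. It is not hypercluster's implementation: the function `sweep_kmeans`, the choice of KMeans, the blob centers, and the two metrics are all assumptions made for illustration. It fits one model per candidate `n_clusters`, scores each labeling with a metric that needs no ground truth (silhouette) and, when a gold standard is supplied, with one that does (adjusted Rand index).

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, silhouette_score


def sweep_kmeans(data, k_values, gold_standard=None, random_state=0):
    """Fit KMeans once per candidate k and score each labeling.

    Returns a dict mapping k to a dict of metric name -> score.
    """
    results = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(data)
        # Silhouette only needs the data and the predicted labels.
        scores = {"silhouette_score": silhouette_score(data, labels)}
        # Adjusted Rand index compares predictions to known labels.
        if gold_standard is not None:
            scores["adjusted_rand_score"] = adjusted_rand_score(gold_standard, labels)
        results[k] = scores
    return results


# Three well-separated blobs, so the sweep has a clear best answer.
data, truth = make_blobs(
    n_samples=300,
    centers=[[-5, -5], [0, 5], [5, -5]],
    cluster_std=0.5,
    random_state=0,
)
results = sweep_kmeans(data, k_values=range(2, 7), gold_standard=truth)

# Pick the k with the highest silhouette score.
best_k = max(results, key=lambda k: results[k]["silhouette_score"])
print(best_k, results[best_k])
```

The package automates this kind of loop across algorithms and parameter combinations and collects the evaluations for visualization.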
Download files
Source Distribution
hypercluster-0.1.1.tar.gz (13.2 kB)
Built Distribution
Hashes for hypercluster-0.1.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 2e759797f5b0ec407988094e5717c947f264a2fcca9e740a700fdb13501204c2
MD5 | 4d1dff36636c35f26daccc1c4def5e34
BLAKE2b-256 | 92c987bea67760d620f06380d3dab4002995a36492df49d282654b507cd9308a