K-means clustering with weights
ek
To install: pip install ek
Overview
The ek package provides an implementation of the K-means clustering algorithm that incorporates sample weights. This is particularly useful in scenarios where certain data points are of more significance than others and should have a greater influence on the formation of clusters.
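To make the effect of weights concrete, here is a minimal sketch of the weighted mean that a weighted K-means update step would typically use as a cluster center. It is written in plain Python and is not taken from the ek source; the helper name `weighted_centroid` is hypothetical.

```python
def weighted_centroid(points, weights):
    """Return the weighted mean of `points` (a list of equal-length tuples)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(points[0])
    return tuple(
        sum(w * p[d] for p, w in zip(points, weights)) / total
        for d in range(dim)
    )


# Three points; the middle one counts twice as much as the others.
points = [(1, 2), (1, 4), (1, 0)]
centroid = weighted_centroid(points, [1, 2, 1])
unweighted = weighted_centroid(points, [1, 1, 1])
print(centroid, unweighted)
```

With weights 1, 2, 1 the center is pulled toward the heavier point (1, 4), so its second coordinate moves from 2.0 to 2.5.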
Main Features
- Weighted K-means Clustering: Allows clustering with weighted data points, which can be crucial for datasets where some instances are more important than others.
- Compatibility with Scikit-learn: The implementation is designed to be compatible with Scikit-learn's clustering framework, making it easy to integrate with existing codebases that use Scikit-learn for machine learning tasks.
- Support for Sparse Data: Efficiently handles sparse matrices, which is beneficial for high-dimensional data.
- Custom Initialization Methods: Supports various methods for initializing cluster centers, including a weighted version of the k-means++ initialization.
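As an illustration of what a weighted k-means++ initialization can look like, the sketch below uses one common formulation: the first center is drawn with probability proportional to the sample weight, and each later center with probability proportional to weight times squared distance to the nearest center chosen so far. Whether ek's 'k-means++_with_weights' option follows exactly this scheme is an assumption; the function names here are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_kmeans_pp(points, weights, k, seed=0):
    """Pick k initial centers from `points` (assumed formulation, see lead-in).

    Assumes there are at least k distinct points with positive weight;
    otherwise random.choices raises ValueError when all scores are zero.
    """
    rng = random.Random(seed)
    indices = range(len(points))
    # First center: sample a point with probability proportional to its weight.
    first = rng.choices(indices, weights=weights, k=1)[0]
    centers = [points[first]]
    while len(centers) < k:
        # Score each point by weight * squared distance to its nearest center.
        scores = [
            w * min(sq_dist(p, c) for c in centers)
            for p, w in zip(points, weights)
        ]
        nxt = rng.choices(indices, weights=scores, k=1)[0]
        centers.append(points[nxt])
    return centers


points = [(1, 2), (1, 4), (1, 0), (10, 2), (10, 4), (10, 0)]
weights = [1, 2, 1, 1, 1, 2]
centers = weighted_kmeans_pp(points, weights, k=2, seed=0)
print(centers)
```

A point that has already been chosen scores zero, so it cannot be drawn again, and a point with zero weight is never selected as a center.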
Installation
To install the package, use the following pip command:
pip install ek
Usage
Basic Example
Here is a simple example of how to use the ek package to perform weighted K-means clustering:
import numpy as np
from ek import KMeansWeighted
# Sample data
X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])
# Weights for each data point
weights = np.array([1, 2, 1, 1, 1, 2])
# Number of clusters
n_clusters = 2
# Create a KMeansWeighted instance
kmeans = KMeansWeighted(n_clusters=n_clusters)
# Fit the model
kmeans.fit(X, weights)
# Get cluster labels
labels = kmeans.labels_
# Print the labels
print(labels)
Advanced Usage
For more advanced usage, you can specify additional parameters such as init for the initialization method, max_iter for the maximum number of iterations, and tol for the convergence tolerance.
kmeans = KMeansWeighted(n_clusters=3, init='random', max_iter=100, tol=1e-4)
kmeans.fit(X, weights)
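To show how max_iter and tol typically interact, the following sketch runs a weighted Lloyd iteration in plain Python. It is not the ek implementation: the function name `weighted_lloyd` is hypothetical, and tol is treated here as an absolute threshold on the total squared movement of the centers, which may differ from how ek interprets it.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_lloyd(points, weights, centers, max_iter=100, tol=1e-4):
    """Alternate assignment and weighted-mean updates until centers settle."""
    centers = [tuple(float(v) for v in c) for c in centers]
    dim = len(centers[0])
    labels = []
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center becomes the weighted mean of its points.
        new_centers = []
        for j, old in enumerate(centers):
            members = [
                (p, w) for p, w, lab in zip(points, weights, labels) if lab == j
            ]
            total = sum(w for _, w in members)
            if total == 0:
                new_centers.append(old)  # leave an empty cluster's center in place
                continue
            new_centers.append(tuple(
                sum(w * p[d] for p, w in members) / total for d in range(dim)
            ))
        # Stop once the centers have (almost) stopped moving.
        shift = sum(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return labels, centers, iteration


points = [(1, 2), (1, 4), (1, 0), (10, 2), (10, 4), (10, 0)]
weights = [1, 2, 1, 1, 1, 2]
labels, centers, n_iter = weighted_lloyd(points, weights, [(1, 2), (10, 2)])
print(labels, centers, n_iter)
```

On this data the loop stops after the second pass, because the centers do not move between the first and second update; max_iter only matters when that never happens.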
Documentation
Classes and Functions
KMeansWeighted
A class for K-means clustering with weights.
- Parameters:
  - n_clusters: Number of clusters.
  - init: Method for initialization ('k-means++_with_weights', 'random', or an ndarray).
  - max_iter: Maximum number of iterations.
  - tol: Tolerance for convergence.
  - precompute_distances: Whether to precompute distances ('auto', True, False).
  - verbose: Verbosity mode.
  - random_state: Seed or numpy.random.RandomState instance.
  - copy_x: If True, input data is copied.
  - n_jobs: Number of parallel jobs to run.
- Methods:
  - fit(X, weights): Compute K-means clustering.
  - fit_predict(X, weights): Compute clustering and predict cluster indices.
  - fit_transform(X, weights): Compute clustering and transform X to cluster-distance space.
  - transform(X): Transform X to cluster-distance space.
  - predict(X): Predict the closest cluster each sample in X belongs to.
  - score(X): Opposite of the value of X on the K-means objective.
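The descriptions of transform, predict, and score can be read as the following computations on a set of fitted centers. This is a sketch in plain Python using hypothetical free functions that mirror the method names; it assumes Euclidean distance and an unweighted score, neither of which is confirmed for ek.

```python
import math


def transform(points, centers):
    """Distance from every point to every center (one row per point)."""
    return [[math.dist(p, c) for c in centers] for p in points]


def predict(points, centers):
    """Index of the nearest center for every point."""
    return [row.index(min(row)) for row in transform(points, centers)]


def score(points, centers):
    """Negative sum of squared distances to the nearest center."""
    return -sum(min(row) ** 2 for row in transform(points, centers))


# Example centers, chosen by hand for illustration.
centers = [(1.0, 2.5), (10.0, 1.5)]
new_points = [(0, 0), (12, 3)]

distances = transform(new_points, centers)
assigned = predict(new_points, centers)
objective = score(new_points, centers)
print(distances, assigned, objective)
```

A score closer to zero means the points lie nearer to their assigned centers, which is why it is reported as the opposite of the K-means objective.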
This package is designed to be easy to use while providing the flexibility needed for more complex clustering tasks. The implementation is optimized for performance and can handle large datasets efficiently.
Project details
Download files
File details
Details for the file ek-0.0.6.tar.gz.
File metadata
- Download URL: ek-0.0.6.tar.gz
- Upload date:
- Size: 16.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cf6cecdd542300e8490ea05a43e25c35d3a7fcc522e6eec8e30c723560657760 |
| MD5 | 5b0cc45a6ca7609be96f5ba2bd6ab9be |
| BLAKE2b-256 | f9a028aeda5870b6aac312cd710e976ab2044e8887f972aa213f37b51bf0a6ec |
File details
Details for the file ek-0.0.6-py3-none-any.whl.
File metadata
- Download URL: ek-0.0.6-py3-none-any.whl
- Upload date:
- Size: 15.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 67ea55912e920e4c8405468c9c63b818fb61f33c1d4ce817dfdf367fc2bc416a |
| MD5 | ace9653f90c2884339bf6df65fdba625 |
| BLAKE2b-256 | 7565a4867cd376772628fa25e63a7639cab5d05cb50127230c4dc13a3b6be0fc |