Association mining
nw
To install: pip install nw
Overview
The nw package provides a Python implementation of the FP-growth algorithm for frequent itemset mining, a common task in association rule learning. FP-growth finds frequent itemsets without generating candidate itemsets. The package includes functions to generate frequent itemsets, construct association rules from them, and calculate their support and confidence.
Main Features
- Frequent Itemset Generation: Using the FP-growth algorithm to efficiently find frequent itemsets in a dataset.
- Association Rule Learning: Generating association rules from the frequent itemsets that meet a user-defined minimum confidence.
- Support Calculation: Calculating the support metric for itemsets, which is the proportion of transactions in the dataset that contain the itemset.
- Verbose Output Options: Detailed logging of the algorithm's process for debugging or insight purposes.
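The support and confidence metrics mentioned above can be illustrated with a minimal pure-Python sketch. It does not use nw; the helper names `support` and `confidence` and the sample transactions are hypothetical, and the formulas are the usual textbook definitions rather than a description of the package internals.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# Hypothetical sample data, for illustration only.
transactions = [
    ["tea", "sugar"],
    ["tea", "sugar", "lemon"],
    ["tea"],
    ["coffee", "sugar"],
]

# "tea" and "sugar" appear together in 2 of 4 transactions.
print(support(["tea", "sugar"], transactions))
# Of the 3 transactions containing "tea", 2 also contain "sugar".
print(confidence(["tea"], ["sugar"], transactions))
```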
Installation
To install nw, use pip:
pip install nw
Usage
Importing the Module
import nw
Preparing Your Dataset
Your dataset should be a list of transactions, where each transaction is a list of items. For example:
dataset = [['milk', 'bread'], ['bread', 'butter'], ['milk', 'bread', 'butter']]
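Transaction data often arrives as one row per (transaction, item) pair rather than as nested lists. The sketch below shows one way to group such rows into the list-of-lists shape described above; the `rows` input and its layout are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical input: one (transaction_id, item) pair per row, e.g. from a CSV export.
rows = [
    (1, "milk"), (1, "bread"),
    (2, "bread"), (2, "butter"),
    (3, "milk"), (3, "bread"), (3, "butter"), (3, "bread"),  # note the repeated item
]

baskets = defaultdict(list)
for tid, item in rows:
    if item not in baskets[tid]:  # keep each item once per transaction
        baskets[tid].append(item)

# One inner list per transaction, ordered by transaction id.
dataset = [baskets[tid] for tid in sorted(baskets)]
print(dataset)
```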
Running the FP-growth Algorithm
To find frequent itemsets:
frequent_itemsets, support_data = nw.fpgrowth(dataset, min_support=0.5, include_support=True)
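On a dataset this small, the result can be cross-checked by brute force. The sketch below does not call nw and makes no claim about the exact types `nw.fpgrowth` returns; `brute_force_frequent_itemsets` is a hypothetical helper that enumerates every itemset and keeps those whose support meets the threshold.

```python
from itertools import combinations


def brute_force_frequent_itemsets(dataset, min_support):
    """Enumerate every itemset and keep those meeting min_support.

    Exponential in the number of distinct items, so only suitable
    for cross-checking tiny datasets.
    """
    transactions = [set(t) for t in dataset]
    items = sorted(set().union(*transactions))
    n = len(transactions)
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count / n >= min_support:
                result[frozenset(candidate)] = count / n
    return result


dataset = [['milk', 'bread'], ['bread', 'butter'], ['milk', 'bread', 'butter']]
for itemset, sup in brute_force_frequent_itemsets(dataset, 0.5).items():
    print(sorted(itemset), round(sup, 2))
```

With `min_support=0.5`, five itemsets qualify: the three single items, {bread, milk}, and {bread, butter}. The pair {butter, milk} appears in only one of three transactions and is dropped.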
Printing the Rules
If you want to generate and print rules based on the frequent itemsets:
rules = nw.generate_rules(frequent_itemsets, support_data, min_confidence=0.7)
nw.print_rules(rules)
Example Output
This will output rules such as:
milk --> bread (sup = 0.67)
butter --> bread (sup = 0.67)
Documentation
Functions and Classes
fpgrowth(dataset, min_support=0.5, include_support=False, verbose=False)
Implements the FP-growth algorithm to find frequent itemsets.
- dataset: List of transactions (each transaction is a list of items).
- min_support: Minimum support threshold for itemsets to be considered frequent.
- include_support: If True, returns a tuple of itemsets and their support values.
- verbose: If True, prints detailed logs of the algorithm's execution.
generate_rules(F, support_data, min_confidence=0.5, verbose=False)
Generates association rules from frequent itemsets.
- F: List of frequent itemsets.
- support_data: Dictionary with support data for itemsets.
- min_confidence: Minimum confidence threshold for rules to be considered.
- verbose: If True, prints each rule with its confidence and support.
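To show how a minimum confidence threshold filters rules, here is a minimal sketch of rule generation from a support dictionary. It is not the package's implementation: `rules_from_support` is a hypothetical name, the dictionary keyed by `frozenset` is an assumed format, and the support values are made up for illustration.

```python
from itertools import combinations


def rules_from_support(support_data, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples.

    `support_data` maps frozenset itemsets to their support (an assumed
    format). Every subset of a listed itemset must also be listed, which
    holds for frequent itemsets because subsets are at least as frequent.
    """
    rules = []
    for itemset, sup in support_data.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for size in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), size):
                antecedent = frozenset(combo)
                conf = sup / support_data[antecedent]
                if conf >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, sup, conf))
    return rules


# Hypothetical supports: "sugar" is in every transaction, "tea" in half of them.
support_data = {
    frozenset(["tea"]): 0.5,
    frozenset(["sugar"]): 1.0,
    frozenset(["tea", "sugar"]): 0.5,
}

for antecedent, consequent, sup, conf in rules_from_support(support_data, 0.7):
    print(sorted(antecedent), "-->", sorted(consequent), sup, conf)
```

Only tea --> sugar passes here (confidence 0.5 / 0.5 = 1.0); sugar --> tea has confidence 0.5 / 1.0 = 0.5 and is filtered out, even though both rules come from the same itemset.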
print_rules(rules_tuples)
Prints formatted association rules.
rules_tuples: List of tuples representing the rules, where each tuple is (antecedent, consequent, support).
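The output format shown under Example Output can be reproduced with a short formatter. This is a sketch only: `format_rule` is a hypothetical helper, and joining multi-item sides with ", " is an assumption, since the README shows only single-item rules.

```python
def format_rule(antecedent, consequent, support):
    """Render one rule in the style 'milk --> bread (sup = 0.67)'."""
    left = ", ".join(sorted(antecedent))
    right = ", ".join(sorted(consequent))
    return f"{left} --> {right} (sup = {support:.2f})"


# A rule tuple in the (antecedent, consequent, support) shape described above.
rule = (["milk"], ["bread"], 2 / 3)
print(format_rule(*rule))
```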
Classes
FPTree
A class representing an FP-tree structure for storing transactions and itemsets efficiently.
FPNode
A class representing a node in the FP-tree, which contains a count of occurrences and links to other nodes.
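To make the tree structure concrete, the following sketch builds a simplified FP-tree. It is not the package's `FPTree` or `FPNode`: the names `SketchNode` and `build_tree` are hypothetical, and the header table and node links that a full FP-growth implementation uses for mining are omitted.

```python
from collections import Counter


class SketchNode:
    """Illustrative FP-tree node; not the package's FPNode."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0        # transactions whose path passes through this node
        self.parent = parent
        self.children = {}    # item -> SketchNode


def build_tree(dataset, min_count):
    """Insert each transaction's frequent items, most frequent first."""
    freq = Counter(item for t in dataset for item in set(t))
    root = SketchNode(None)
    for t in dataset:
        kept = [i for i in set(t) if freq[i] >= min_count]
        # Sort by descending frequency, then name, so shared prefixes merge.
        kept.sort(key=lambda i: (-freq[i], i))
        node = root
        for item in kept:
            child = node.children.get(item)
            if child is None:
                child = SketchNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root


dataset = [['milk', 'bread'], ['bread', 'butter'], ['milk', 'bread', 'butter']]
root = build_tree(dataset, min_count=2)
bread = root.children['bread']
print(bread.count, sorted(bread.children))
```

Because every transaction contains bread, all three share a single bread node with count 3, which is the compression an FP-tree relies on.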
Contributing
Contributions to the nw package are welcome. Please ensure that any pull requests or issues are relevant to the FP-growth algorithm or associated functionalities.
For more details on the implementation and usage, refer to the in-line comments and documentation within the code.