Association mining

nw

To install: pip install nw

Overview

The nw package provides a Python implementation of the FP-growth algorithm for frequent itemset mining, a common task in association rule learning. FP-growth finds frequent itemsets without generating candidate itemsets, which is what makes it efficient. The package includes functions to generate frequent itemsets, construct association rules from those itemsets, and calculate their support and confidence.

Main Features

  • Frequent Itemset Generation: Uses the FP-growth algorithm to find frequent itemsets in a dataset efficiently.
  • Association Rule Learning: Generates association rules from the frequent itemsets, filtered by a user-defined minimum confidence.
  • Support Calculation: Calculating the support metric for itemsets, which is the proportion of transactions in the dataset that contain the itemset.
  • Verbose Output Options: Detailed logging of the algorithm's process for debugging or insight purposes.
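The support metric described above can be computed directly from its definition. The following is a minimal sketch, independent of nw; the function name `support` is hypothetical and is not part of the package's API.

```python
def support(itemset, transactions):
    """Return the fraction of transactions that contain every item in itemset."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    # Count the transactions that include all items of the itemset.
    hits = sum(1 for t in transactions if items.issubset(t))
    return hits / len(transactions)


dataset = [['milk', 'bread'], ['bread', 'butter'], ['milk', 'bread', 'butter']]
print(round(support(['milk', 'bread'], dataset), 2))  # 0.67 (2 of 3 transactions)
print(support(['bread'], dataset))                    # 1.0 (all 3 transactions)
```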

Installation

To install nw, use pip:

pip install nw

Usage

Importing the Module

import nw

Preparing Your Dataset

Your dataset should be a list of transactions, where each transaction is a list of items. For example:

dataset = [['milk', 'bread'], ['bread', 'butter'], ['milk', 'bread', 'butter']]

Running the FP-growth Algorithm

To find frequent itemsets:

frequent_itemsets, support_data = nw.fpgrowth(dataset, min_support=0.5, include_support=True)

Printing the Rules

If you want to generate and print rules based on the frequent itemsets:

rules = nw.generate_rules(frequent_itemsets, support_data, min_confidence=0.7)
nw.print_rules(rules)

Example Output

Rules are printed in the following format, for example:

milk --> bread (sup = 0.67)
bread --> butter (sup = 0.67)

Documentation

Functions

fpgrowth(dataset, min_support=0.5, include_support=False, verbose=False)

Implements the FP-growth algorithm to find frequent itemsets.

  • dataset: List of transactions (each transaction is a list of items).
  • min_support: Minimum support threshold, given as a fraction of transactions (for example 0.5), that an itemset must reach to be considered frequent.
  • include_support: If True, returns a tuple of itemsets and their support values.
  • verbose: If True, prints detailed logs of the algorithm's execution.
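On very small datasets, it can help to cross-check frequent itemsets against a brute-force enumeration. The sketch below is not FP-growth and does not use nw; it tests every candidate itemset directly, so it is exponential in the number of distinct items. The function name and the returned dictionary layout (frozenset to support) are assumptions made for illustration.

```python
from itertools import combinations


def brute_force_frequent_itemsets(transactions, min_support=0.5):
    """Enumerate candidate itemsets and keep those whose support >= min_support."""
    rows = [frozenset(t) for t in transactions]
    items = sorted(set().union(*rows)) if rows else []
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            # Support is the fraction of transactions containing the candidate.
            sup = sum(candidate <= row for row in rows) / len(rows)
            if sup >= min_support:
                result[candidate] = sup
                found_any = True
        if not found_any:
            # No frequent itemset of this size, so no larger one can be frequent.
            break
    return result


dataset = [['milk', 'bread'], ['bread', 'butter'], ['milk', 'bread', 'butter']]
for itemset, sup in brute_force_frequent_itemsets(dataset, min_support=0.5).items():
    print(sorted(itemset), round(sup, 2))
```

With the sample dataset and min_support=0.5, this keeps the three single items plus {bread, milk} and {bread, butter}; {butter, milk} appears in only one of three transactions and is dropped.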

generate_rules(F, support_data, min_confidence=0.5, verbose=False)

Generates association rules from frequent itemsets.

  • F: List of frequent itemsets.
  • support_data: Dictionary with support data for itemsets.
  • min_confidence: Minimum confidence threshold for rules to be considered.
  • verbose: If True, prints each rule with its confidence and support.
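To illustrate how a confidence threshold filters rules, here is a minimal sketch using the standard definition confidence(A --> B) = support(A ∪ B) / support(A). It does not reproduce nw's implementation: the function name `sketch_rules`, the assumption that support_data maps frozensets to support values, and the returned tuple layout are all hypothetical.

```python
from itertools import combinations


def sketch_rules(support_data, min_confidence=0.5):
    """Build (antecedent, consequent, support, confidence) tuples from supports."""
    rules = []
    for itemset, sup in support_data.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for k in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), k):
                antecedent = frozenset(combo)
                consequent = itemset - antecedent
                # Every subset of a frequent itemset is frequent, so its
                # support is assumed to be present in support_data.
                confidence = sup / support_data[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, consequent, sup, confidence))
    return rules


# Supports for the sample dataset with min_support=0.5 (assumed layout).
support_data = {
    frozenset({'bread'}): 1.0,
    frozenset({'milk'}): 2 / 3,
    frozenset({'butter'}): 2 / 3,
    frozenset({'bread', 'milk'}): 2 / 3,
    frozenset({'bread', 'butter'}): 2 / 3,
}
for a, c, sup, conf in sketch_rules(support_data, min_confidence=0.7):
    print(sorted(a), '-->', sorted(c), round(sup, 2), round(conf, 2))
```

Under this definition, milk --> bread and butter --> bread both have confidence 1.0, while bread --> milk and bread --> butter have confidence of about 0.67 and fall below a 0.7 threshold.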

print_rules(rules_tuples)

Prints formatted association rules.

  • rules_tuples: List of tuples representing the rules, where each tuple is (antecedent, consequent, support).
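A rule tuple of the shape described above could be rendered in the "antecedent --> consequent (sup = ...)" format roughly as follows. This is a sketch, not the package's code: `format_rule` is a hypothetical helper, and joining multi-item sides with a comma is an assumption.

```python
def format_rule(rule):
    """Format an (antecedent, consequent, support) tuple as a single line."""
    antecedent, consequent, support = rule
    # Sort items so the output is stable regardless of set ordering.
    left = ", ".join(sorted(antecedent))
    right = ", ".join(sorted(consequent))
    return f"{left} --> {right} (sup = {support:.2f})"


print(format_rule((frozenset({'milk'}), frozenset({'bread'}), 2 / 3)))
# milk --> bread (sup = 0.67)
```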

Classes

FPTree

A class representing an FP-tree structure for storing transactions and itemsets efficiently.

FPNode

A class representing a node in the FP-tree, which contains a count of occurrences and links to other nodes.
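The general idea of an FP-tree can be sketched as a prefix tree whose nodes carry counts, plus a header table that links nodes holding the same item. The code below is an illustrative sketch of that textbook structure, not the FPTree or FPNode classes shipped in nw; the names `SketchNode` and `build_tree`, and the use of an absolute `min_count`, are assumptions.

```python
from collections import Counter


class SketchNode:
    """One node of the sketch tree: an item, a count, and links to other nodes."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}   # item -> child SketchNode
        self.next = None     # next node in the tree that holds the same item


def build_tree(transactions, min_count):
    """Insert each transaction's frequent items, most frequent first."""
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_count}
    root = SketchNode(None)
    header = {}  # item -> most recently created node for that item
    for t in transactions:
        # Order by descending frequency (ties broken alphabetically) so that
        # transactions sharing frequent items share a path prefix.
        ordered = sorted((i for i in set(t) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = SketchNode(item, parent=node)
                node.children[item] = child
                child.next = header.get(item)  # chain nodes with the same item
                header[item] = child
            child.count += 1
            node = child
    return root, header


dataset = [['milk', 'bread'], ['bread', 'butter'], ['milk', 'bread', 'butter']]
root, header = build_tree(dataset, min_count=2)
print(root.children['bread'].count)  # 3: every transaction starts with bread
```

Because all three sample transactions contain bread, they share a single bread node under the root, which is where the compression comes from.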

Contributing

Contributions to the nw package are welcome. Please ensure that any pull requests or issues are relevant to the FP-growth algorithm or associated functionalities.

For more details on the implementation and usage, refer to the in-line comments and documentation within the code.
