Frequent itemsets -- fp-tree naeseth
og
To install: pip install og
Overview
The og package provides a Python implementation of the FP-growth algorithm for finding frequent itemsets in transactional data sets. FP-growth avoids the candidate-generation step of Apriori-style methods, which makes it a good fit for large data sets where those methods may be too slow. It first builds a compressed representation of the data set, the FP-tree, and then extracts frequent itemsets directly from that tree.
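As a rough illustration of the preprocessing that FP-tree construction typically relies on, the following sketch counts item support, drops infrequent items, and orders each transaction by descending support. It is not taken from og's source; the helper name order_transactions and the alphabetical tie-break are assumptions made for this example.

```python
from collections import Counter


def order_transactions(transactions, minimum_support):
    """Hypothetical helper (not part of og): preprocessing before tree building."""
    # Materialize the input so it can be scanned twice.
    transactions = [list(t) for t in transactions]

    # Pass 1: count how many transactions contain each item.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {item: n for item, n in counts.items() if n >= minimum_support}

    # Pass 2: keep only frequent items and order them by descending support.
    # Ties are broken alphabetically so the result is deterministic.
    ordered = []
    for t in transactions:
        kept = sorted(
            {item for item in t if item in frequent},
            key=lambda item: (-frequent[item], item),
        )
        ordered.append(kept)
    return frequent, ordered


frequent, ordered = order_transactions(
    [["milk", "bread", "butter"], ["beer", "bread"], ["milk", "bread"]],
    minimum_support=2,
)
print(frequent)
print(ordered)
```

Ordering every transaction the same way is what lets transactions with common frequent items share a prefix in the tree, which is where the compression comes from.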
Features
- Efficiently find frequent itemsets without candidate generation.
- Return itemsets along with their support counts if desired.
- Handle any iterable of iterables as input for transactions.
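Because any iterable of iterables is accepted, transactions can come from a file or another stream. The sketch below parses comma-separated lines into lists of items. The read_transactions helper and the sample text are hypothetical. FP-growth generally needs more than one pass over the data, and this page does not say whether og copies its input, so the sketch materializes the generator into a list before it would be handed to the library.

```python
import io


def read_transactions(lines, sep=","):
    """Hypothetical loader (not part of og): one list of items per non-empty line."""
    for line in lines:
        items = [field.strip() for field in line.split(sep) if field.strip()]
        if items:
            yield items


# Stand-in for an open file; any iterable of text lines would do.
raw = io.StringIO("milk,bread,butter\nbeer, bread\n\nmilk,bread\n")

# A generator can only be consumed once, so store the result in a list.
transactions = list(read_transactions(raw))
print(transactions)
```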
Usage
Basic Usage
To use the find_frequent_itemsets function, pass it the transactions and a minimum support threshold. Here is a simple example:
from og import find_frequent_itemsets
# Sample transactions
transactions = [
    ['milk', 'bread', 'butter'],
    ['beer', 'bread'],
    ['milk', 'bread'],
    ['butter', 'beer'],
    ['bread', 'butter'],
    ['milk', 'butter'],
    ['milk', 'bread', 'butter', 'beer'],
]
# Find itemsets that occur in at least 3 transactions
result = list(find_frequent_itemsets(transactions, minimum_support=3))
print(result)
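To check what a minimum support of 3 should yield for these sample transactions, the itemsets can be counted by brute force with the standard library alone. This sketch does not use og or FP-growth; it enumerates every subset of every transaction, which is only practical for tiny inputs. The function name brute_force_frequent_itemsets is made up for this example.

```python
from collections import Counter
from itertools import combinations


def brute_force_frequent_itemsets(transactions, minimum_support):
    """Hypothetical reference check: count every subset of every transaction."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {s: n for s, n in counts.items() if n >= minimum_support}


transactions = [
    ["milk", "bread", "butter"],
    ["beer", "bread"],
    ["milk", "bread"],
    ["butter", "beer"],
    ["bread", "butter"],
    ["milk", "butter"],
    ["milk", "bread", "butter", "beer"],
]

expected = brute_force_frequent_itemsets(transactions, 3)
for itemset, support in sorted(
    expected.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))
):
    print(sorted(itemset), support)
```

Seven itemsets meet the threshold: the four single items (beer 3, bread 5, butter 5, milk 4) and the pairs bread/butter, bread/milk, and butter/milk, each with support 3. No itemset of three or more items reaches a support of 3.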
Including Support Counts
If you want to include the support counts of the itemsets in the results, set include_support=True:
# Finding itemsets with support counts
result_with_support = list(find_frequent_itemsets(transactions, minimum_support=3, include_support=True))
print(result_with_support)
Classes and Functions
FPTree
This class represents an FP-tree that stores the items of each transaction. It supports adding transactions and provides methods for accessing its nodes and items.
FPNode
Represents a node in an FP-tree. Each node contains a count of occurrences, links to child nodes, and a reference to its parent node.
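The node layout described above (a count, links to children, and a parent reference) can be sketched in a few lines. The SketchNode class and the insert function below are hypothetical stand-ins written for this example; they are not og's FPTree or FPNode and do not reflect those classes' actual methods.

```python
class SketchNode:
    """Hypothetical stand-in for an FP-tree node (not og's FPNode)."""

    def __init__(self, item=None, parent=None):
        self.item = item        # item stored at this node (None for the root)
        self.count = 0          # number of transactions passing through the node
        self.parent = parent    # link back toward the root
        self.children = {}      # item -> child node


def insert(root, ordered_items):
    """Add one transaction, reusing any prefix already present in the tree."""
    node = root
    for item in ordered_items:
        child = node.children.get(item)
        if child is None:
            child = SketchNode(item, parent=node)
            node.children[item] = child
        child.count += 1
        node = child


root = SketchNode()
# Transactions are assumed to be pre-sorted by descending item support.
for transaction in (["bread", "milk"], ["bread"], ["bread", "milk"], ["butter"]):
    insert(root, transaction)

bread = root.children["bread"]
print(bread.count, bread.children["milk"].count, root.children["butter"].count)
```

The three transactions that start with bread share a single bread node with a count of 3, so the tree stores the common prefix once instead of three times.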
find_frequent_itemsets
A function to find frequent itemsets in the given transactions using the FP-growth algorithm. It can optionally return the support count for each itemset.
Parameters:
- transactions: An iterable of iterables of items. Each inner iterable represents one transaction.
- minimum_support: An integer specifying the minimum number of occurrences for an itemset to be considered frequent.
- include_support: A boolean flag that, when set to True, includes the support count of each itemset in the results.
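Since minimum_support is an absolute count rather than a fraction, a threshold stated as a share of all transactions has to be converted first. The helper below is a small sketch of that conversion; absolute_support is a hypothetical name and is not part of og.

```python
import math


def absolute_support(fraction, n_transactions):
    """Hypothetical helper: convert a relative threshold to a transaction count."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    # Round up so the itemset covers at least the requested share.
    return max(1, math.ceil(fraction * n_transactions))


# 40% of 7 transactions is 2.8, which rounds up to 3.
print(absolute_support(0.4, 7))
```

The returned integer can then be passed as minimum_support.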
Development and Contributions
Contributions to the og package are welcome. You can contribute by providing feedback, reporting bugs, and submitting feature requests or pull requests.
For more detailed contribution guidelines, please refer to the official repository documentation.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file og-0.0.5.tar.gz.
File metadata
- Download URL: og-0.0.5.tar.gz
- Upload date:
- Size: 10.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c5cd3f0566390adf9606808bc8aaa4f9c5e6ff72f6887a7d27641f6cf489ca4e |
| MD5 | b8fdc55f0d6de56edbc5e52844403204 |
| BLAKE2b-256 | 31344daf82925d23061e4e0be69f841c0b68fa76dcba53dbe861f0cf7161baf5 |
File details
Details for the file og-0.0.5-py3-none-any.whl.
File metadata
- Download URL: og-0.0.5-py3-none-any.whl
- Upload date:
- Size: 10.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2f1de5b11866f84e944be3c5e3c20f711852ff14c78adf4a0317ca8fc1292b6e |
| MD5 | 5f5addc4ea7ce095a3e831065c01fbb9 |
| BLAKE2b-256 | 09f366dc77d9ed3843f23e861ac35fbe1b7828d3fbc2a155b428369c3dc0743c |