Memory-frugal torch dataset from a CSV collection
Project description
csvsdataset
csvsdataset is a Python library designed to simplify the process of working with multiple CSV files as a single dataset. The primary functionality is provided by the CsvsDataset class in the csvsdataset.py module.
This library was written by ChatGPT4, as mentioned here. Issues will be cut and pasted into a session. It is an experiment in semi-autonomous code maintenance.
Installation
To install the csvsdataset library, simply run:
pip install csvsdataset
Usage
from csvsdataset.csvsdataset import CsvsDataset

# Initialize the CsvsDataset instance
dataset = CsvsDataset(folder_path="path/to/your/csv/folder",
                      file_pattern="*.csv",
                      x_columns=["column1", "column2"],
                      y_column="target_column")

# Iterate over the dataset
for x_data, y_data in dataset:
    # Your processing code here
    pass

# Access a specific item in the dataset
x_data, y_data = dataset[42]
Memory frugality
Only the data from a small number of CSV files is held in memory at any time. The rest is discarded on an LRU (least recently used) basis. This class is intended for cases where there are too many data files to load into memory at once.
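The LRU idea can be illustrated with a minimal standard-library sketch. This is not the CsvsDataset implementation: the class name LruCsvRows, the max_cached_files parameter, and the row-dict return format are all hypothetical, chosen only to show how a global row index can be mapped to a file while keeping a bounded number of parsed files in memory.

```python
import bisect
import csv
from collections import OrderedDict
from pathlib import Path


class LruCsvRows:
    """Hypothetical helper (not part of csvsdataset): rows of many CSV files
    addressed by one global index, with at most a few files cached."""

    def __init__(self, folder_path, file_pattern="*.csv", max_cached_files=2):
        self.files = sorted(Path(folder_path).glob(file_pattern))
        self.max_cached_files = max_cached_files
        self._cache = OrderedDict()  # file path -> list of row dicts
        # Cumulative row counts map a global index to (file, local row).
        self._ends = []
        total = 0
        for path in self.files:
            with open(path, newline="") as handle:
                total += sum(1 for _ in csv.DictReader(handle))
            self._ends.append(total)

    def __len__(self):
        return self._ends[-1] if self._ends else 0

    def _load(self, path):
        if path in self._cache:
            # Cache hit: mark this file as most recently used.
            self._cache.move_to_end(path)
            return self._cache[path]
        with open(path, newline="") as handle:
            rows = list(csv.DictReader(handle))
        self._cache[path] = rows
        if len(self._cache) > self.max_cached_files:
            # Evict the least recently used file.
            self._cache.popitem(last=False)
        return rows

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        file_no = bisect.bisect_right(self._ends, index)
        start = self._ends[file_no - 1] if file_no else 0
        return self._load(self.files[file_no])[index - start]
```

In this sketch the constructor reads every file once to count its rows but keeps none of them. Files are parsed only when one of their rows is requested, and the least recently used file is dropped once the cache exceeds max_cached_files.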
File details
Details for the file csvsdataset-0.0.7.tar.gz.
File metadata
- Download URL: csvsdataset-0.0.7.tar.gz
- Upload date:
- Size: 35.0 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | edbd1b5640a4a904014ed9476ea7ee3f551994a9d125f48d5d64500070c9161d |
| MD5 | 9f1f2706473b41c8e2d7c6fbd92b8afc |
| BLAKE2b-256 | 1cef7259452de864117bed0e0ec17ffc07117b901fe54ad5f5e51f0d8adf85b5 |
File details
Details for the file csvsdataset-0.0.7-py3-none-any.whl.
File metadata
- Download URL: csvsdataset-0.0.7-py3-none-any.whl
- Upload date:
- Size: 35.3 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 151e992427bc6969f52a5f93966b59d32c70fd71166f7b4e48f5b8c39704bcba |
| MD5 | 6e6c1b815810df06ec270efa43f25cfb |
| BLAKE2b-256 | 5e34610d9451ec9ad9100ea571d6d201e12b8a6594705507c688df053b7d0634 |