Tiny Block Operations for Data Pipelines

tiny-blocks

Tiny Blocks to build large and complex pipelines!

Tiny-Blocks is a library of streaming operations that are composed with the >> operator, which makes extract, transform, and load (ETL) pipelines easy to express.

Pipeline Components: Sources, Pipes, and Sinks

This library relies on a fundamental streaming abstraction consisting of three parts: extract, transform, and load. You can view a pipeline as a source (extraction), followed by zero or more pipes (transformations), followed by a sink (load). Visually, this looks like:

source >> pipe1 >> pipe2 >> pipe3 >> ... >> pipeN >> sink
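
To make the composition concrete, here is a minimal, self-contained sketch of how a `>>` chain can be wired up in plain Python using `__rshift__` and generators. The class names `Stream`, `Source`, `Pipe`, and `Sink`, and the `accept` method, are hypothetical and chosen only for illustration; this is not tiny-blocks' actual implementation or API.

```python
from typing import Callable, Iterable, Iterator, List


class Stream:
    """Wraps a lazy iterator so it can be chained with >> (hypothetical helper)."""

    def __init__(self, items: Iterable):
        self._items = iter(items)

    def __iter__(self) -> Iterator:
        return self._items

    def __rshift__(self, other):
        # stream >> pipe returns a new Stream; stream >> sink consumes the stream.
        return other.accept(self)


class Source(Stream):
    """Extract step: the start of a pipeline."""


class Pipe:
    """Transform step: applies a function to every item, lazily."""

    def __init__(self, func: Callable):
        self.func = func

    def accept(self, stream: Stream) -> Stream:
        return Stream(self.func(item) for item in stream)


class Sink:
    """Load step: drains the stream into a list."""

    def __init__(self):
        self.rows: List = []

    def accept(self, stream: Stream) -> "Sink":
        self.rows.extend(stream)
        return self


source = Source([1, 2, 3])
double = Pipe(lambda x: x * 2)
increment = Pipe(lambda x: x + 1)
sink = Sink()

# Nothing is computed until the sink pulls items through the chain.
source >> double >> increment >> sink
print(sink.rows)  # [3, 5, 7]
```

Because `>>` is left-associative in Python, the chain is evaluated as `((source >> double) >> increment) >> sink`, so each step only needs to know how to accept the stream produced by the step before it.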

Installation

Install it using pip:

pip install tiny-blocks

Basic usage example

from tiny_blocks.extract import FromCSV
from tiny_blocks.transform import DropDuplicates
from tiny_blocks.transform import Fillna
from tiny_blocks.load import ToSQL

# ETL Blocks
from_csv = FromCSV(path='/path/to/file.csv')
drop_duplicates = DropDuplicates()
fill_na = Fillna(value="Hola Mundo")
to_sql = ToSQL(dsn_conn='psycopg2+postgres://...')

# Run the Pipeline
from_csv >> drop_duplicates >> fill_na >> to_sql
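
For readers who want to see what such a pipeline does conceptually, the following sketch performs similar steps using only the Python standard library: read CSV rows, drop exact duplicates, fill missing values, and load the result into a SQL table. Everything here is an assumption made for illustration: the sample data, the function names, the `people` table, the choice to treat empty strings as missing, and the use of an in-memory SQLite database instead of PostgreSQL. The exact semantics of `DropDuplicates` and `Fillna` in tiny-blocks may differ.

```python
import csv
import io
import sqlite3

# In-memory stand-in for the CSV file (illustrative sample data).
csv_text = "name,city\nAna,Madrid\nAna,Madrid\nLuis,\n"


def read_rows(text):
    """Extract: yield one dict per CSV row."""
    yield from csv.DictReader(io.StringIO(text))


def drop_duplicates(rows):
    """Transform: skip rows that were already seen."""
    seen = set()
    for row in rows:
        key = tuple(row.items())
        if key not in seen:
            seen.add(key)
            yield row


def fill_na(rows, value):
    """Transform: replace empty or missing fields with a default value."""
    for row in rows:
        yield {k: (v if v not in ("", None) else value) for k, v in row.items()}


def to_sql(rows, conn, table):
    """Load: write the rows into a two-column table."""
    conn.execute(f"CREATE TABLE {table} (name TEXT, city TEXT)")
    conn.executemany(f"INSERT INTO {table} VALUES (:name, :city)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
to_sql(fill_na(drop_duplicates(read_rows(csv_text)), "Hola Mundo"), conn, "people")

result = conn.execute("SELECT name, city FROM people ORDER BY name").fetchall()
print(result)  # [('Ana', 'Madrid'), ('Luis', 'Hola Mundo')]
```

Each step is a generator, so rows flow through one at a time rather than being loaded into memory all at once, which is the streaming behaviour the `>>` pipeline is meant to express more compactly.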

Documentation

Please visit this link for documentation.

Download files

Source Distribution

tiny_blocks-0.1.2.tar.gz (12.1 kB)

Uploaded Source

Built Distribution

tiny_blocks-0.1.2-py3-none-any.whl (22.2 kB)

Uploaded Python 3
