A pandas extension built for OpenLineage

Project description

pandas-lineage

BEWARE: This project is in very early stages (as of 2022-09-12)

pandas-lineage is intended to extend pandas I/O and standard DataFrame transform operations so that they emit OpenLineage RunEvents. For now, only read/write operations are covered; they emit RunEvents with schema facets.
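To give a rough idea of what a RunEvent with a schema facet looks like, here is a minimal sketch using only the standard library. It is not the pandas-lineage implementation: the function names, the `PRODUCER` value, the job namespace, and the spec URLs are all assumptions chosen for illustration. The column-to-dtype mapping is passed in directly; with pandas it could be derived from something like `df.dtypes.astype(str).to_dict()`.

```python
import json
import uuid
from datetime import datetime, timezone

# Illustrative values only (assumptions, not taken from pandas-lineage).
PRODUCER = "https://example.com/pandas-lineage-sketch"
RUN_EVENT_SCHEMA_URL = (
    "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
)
SCHEMA_FACET_URL = (
    "https://openlineage.io/spec/facets/1-0-0/SchemaDatasetFacet.json"
)


def schema_facet(column_types):
    """Build a schema facet from a {column name: dtype string} mapping."""
    return {
        "_producer": PRODUCER,
        "_schemaURL": SCHEMA_FACET_URL,
        "fields": [
            {"name": name, "type": dtype} for name, dtype in column_types.items()
        ],
    }


def read_event(job_name, dataset_namespace, dataset_name, column_types):
    """Build a COMPLETE RunEvent describing a read of one input dataset."""
    return {
        "eventType": "COMPLETE",
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": PRODUCER,
        "schemaURL": RUN_EVENT_SCHEMA_URL,
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": "sketch", "name": job_name},
        "inputs": [
            {
                "namespace": dataset_namespace,
                "name": dataset_name,
                "facets": {"schema": schema_facet(column_types)},
            }
        ],
        "outputs": [],
    }


if __name__ == "__main__":
    # Hypothetical dataset and columns, standing in for a CSV read.
    event = read_event(
        "load_orders",
        "file",
        "/data/orders.csv",
        {"order_id": "int64", "amount": "float64"},
    )
    print(json.dumps(event, indent=2))
```

A write operation would look the same, except that the dataset would be listed under `outputs` instead of `inputs`.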

Installation

pip install pandas-lineage

Development Documentation

Examples:

  • marquez-examples
    • contains getting started code and a script for running Marquez locally in Docker
  • mock-api-example
    • contains getting started code and a simple Flask API that accepts lineage events and always returns a 200 status code
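The idea behind the mock API can be sketched as follows. This is not the code from mock-api-example: it swaps Flask for the standard library's `http.server` so that it runs without extra dependencies, and the names `AlwaysOkHandler`, `start_mock_server`, and `post_event` are hypothetical. The `/api/v1/lineage` path is an assumption as well; the handler accepts a POST on any path.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class AlwaysOkHandler(BaseHTTPRequestHandler):
    """Accept any POST, discard the body, and answer 200."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # consume the event payload without storing it
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_mock_server(port=0):
    """Start the mock in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), AlwaysOkHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def post_event(host, port, event):
    """POST a JSON event to the mock and return the HTTP status code."""
    connection = http.client.HTTPConnection(host, port, timeout=5)
    try:
        connection.request(
            "POST",
            "/api/v1/lineage",  # assumed path; the mock does not check it
            body=json.dumps(event).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        return connection.getresponse().status
    finally:
        connection.close()


if __name__ == "__main__":
    server = start_mock_server()
    host, port = server.server_address
    print(post_event(host, port, {"eventType": "COMPLETE"}))
    server.shutdown()
    server.server_close()
```

A mock like this is handy for checking that events are being sent at all, without needing a running Marquez instance.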

References:

Contributing:

Issues

I have not created a contribution guide yet, but I don't want that to stop anyone! If you are interested in contributing, fork this repository and open a PR. As the project becomes more feature-rich and useful, we will establish a contributor workflow. For now, please just use the pre-commit hooks.

Notes:

  • The pandas-lineage directory structure (for now) will mirror the directory structure of pandas for the components that it is extending.
