A scalable, fast, ACID-compliant Data Catalog powered by Ray.

DeltaCAT

DeltaCAT is a Pythonic Data Catalog powered by Ray.

Its data storage model lets you define and manage fast, scalable, ACID-compliant data catalogs through git-like stage/commit APIs. It has been used to host exabyte-scale enterprise data lakes.
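To make the git-like stage/commit idea concrete, here is a toy, in-memory sketch. It is not DeltaCAT's API: the class ToyCatalog and its stage, commit, and read methods are hypothetical names invented for illustration. It only shows the general pattern, in which staged changes stay invisible to readers until a single commit step publishes them together.

```python
class ToyCatalog:
    """Hypothetical in-memory catalog for illustration; NOT DeltaCAT's API."""

    def __init__(self):
        self._committed = {}  # table name -> rows visible to readers
        self._staged = {}     # table name -> rows waiting for commit

    def stage(self, table, rows):
        # Staged rows are recorded separately and are not yet readable.
        self._staged.setdefault(table, []).extend(rows)

    def commit(self):
        # Build the next snapshot off to the side, then publish it with a
        # single assignment so readers see either all staged rows or none.
        snapshot = {name: list(rows) for name, rows in self._committed.items()}
        for table, rows in self._staged.items():
            snapshot.setdefault(table, []).extend(rows)
        self._committed = snapshot
        self._staged = {}

    def read(self, table):
        # Return a copy so callers cannot mutate committed state.
        return list(self._committed.get(table, []))


if __name__ == "__main__":
    catalog = ToyCatalog()
    catalog.stage("events", [{"id": 1}, {"id": 2}])
    print(catalog.read("events"))  # nothing committed yet
    catalog.commit()
    print(catalog.read("events"))  # both staged rows are now visible
```

A real catalog would persist snapshots durably and handle concurrent writers; this sketch leaves both out.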

DeltaCAT uses the Ray distributed compute framework together with Apache Arrow for common table management tasks, including petabyte-scale change-data-capture, data consistency checks, and table repair.

Getting Started


Install

pip install deltacat


Download files

Source Distribution

deltacat-0.1.18b5.tar.gz (122.9 kB)

Uploaded Source

File details

Details for the file deltacat-0.1.18b5.tar.gz.

File metadata

  • Download URL: deltacat-0.1.18b5.tar.gz
  • Upload date:
  • Size: 122.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.6

File hashes

Hashes for deltacat-0.1.18b5.tar.gz
Algorithm    Hash digest
SHA256       bb3b3cbb8af430530661a412e29c07fb05deb1c84fcb66c66dbbca51d36d8379
MD5          d3c21d8a509c6ffba388a1aa1f173231
BLAKE2b-256  0c6110b43b6622ea31847fe01e31fd2e6a521f0149a30d9495de5ee18a534d18
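One way to use the digests above is to check a downloaded archive before installing it. The following is a minimal sketch using only the Python standard library. The helper names sha256_of_file and matches_expected are hypothetical, and the local path is an assumption about where the archive was saved.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 digest listed above for deltacat-0.1.18b5.tar.gz.
EXPECTED_SHA256 = "bb3b3cbb8af430530661a412e29c07fb05deb1c84fcb66c66dbbca51d36d8379"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals expected_hex."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())


if __name__ == "__main__":
    # Assumed location: the archive sits in the current directory.
    archive = Path("deltacat-0.1.18b5.tar.gz")
    if archive.exists():
        print("OK" if matches_expected(archive) else "MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

pip can enforce the same check at install time through its hash-checking mode, where a requirements file pins each package with a --hash=sha256:... option.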
