
Airbyte made easy (no UI, no database, no cluster)




🔍️ What is AirbyteServerless?

AirbyteServerless is a simple tool to manage Airbyte connectors: run them locally or deploy them in serverless mode.


💡 Why AirbyteServerless?

Airbyte is a must-have in your data stack, with its catalog of open-source connectors to move your data from any source to your data warehouse.

To manage these connectors, Airbyte offers the Airbyte-Open-Source-Platform, which includes a server, workers, a database, a UI, an orchestrator, connectors, a secret manager, a logs manager, etc.

AirbyteServerless aims to offer a lightweight alternative to the Airbyte-Open-Source-Platform to simplify connector management.


📝 Comparing Airbyte-Open-Source-Platform & AirbyteServerless

| Airbyte-Open-Source-Platform | AirbyteServerless |
| --- | --- |
| Has a UI | Has NO UI<br>Connection configurations are managed by documented yaml files |
| Has a database | Has NO database<br>- Configuration files are versioned in git<br>- The destination stores the state (the checkpoint of where the sync stopped) and the logs, which can then be visualized with your preferred BI tool |
| Has a transform layer<br>Airbyte loads your data in a raw format but then enables you to perform basic transforms such as replace, upsert and schema normalization | Has NO transform layer<br>- Data is appended to your destination in raw format<br>- airbyte_serverless is dedicated to doing one thing and doing it well: Extract-Load |
| NOT Serverless<br>- Can be deployed on a VM or a Kubernetes cluster<br>- The platform is made of tens of dependent containers that you CANNOT deploy in a serverless way | Serverless<br>- An Airbyte source docker image is upgraded with a destination connector<br>- The upgraded docker image can then be deployed as an isolated Cloud Run Job (or Cloud Run Service)<br>- Cloud Run is natively monitored with metrics, dashboards, logs, error reporting, alerting, etc.<br>- It can be scheduled or triggered by events |
| Is scalable with conditions<br>Scalable if deployed on an autoscaled Kubernetes cluster and if you are skilled enough.<br>👉 Check that you are skilled enough with Kubernetes by watching this video 😁 | Is scalable<br>Each connector is deployed independently of the others; you can have as many as you want |
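To make the serverless side concrete, here is a minimal hand-rolled sketch of deploying such an upgraded connector image as a Cloud Run Job with gcloud. The image name and region are hypothetical, and the abs deploy command (see the Deploy section below) is the intended way to automate this:

    # Hypothetical sketch: deploy an upgraded connector image as a Cloud Run Job
    # (image name and region are illustrative, not from this project)
    gcloud run jobs create my-first-connection \
      --image=gcr.io/my_project/my_first_connection \
      --region=europe-west1 \
      --max-retries=0

    # Execute on demand (Cloud Scheduler or an event trigger could do the same)
    gcloud run jobs execute my-first-connection --region=europe-west1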

💥 Getting Started with abs CLI

abs is the CLI (command-line interface) of AirbyteServerless, which facilitates connector management.

Install abs 🛠️

pip install airbyte-serverless

Create your first Connection 👨‍💻

abs create my_first_connection --source="airbyte/source-faker:0.1.4" --destination="bigquery:my_project.my_dataset"
  1. Docker is required. Make sure you have it installed.
  2. source param can be any Public Docker Airbyte Source (here is the list). We recommend that you use the faker source to get started.
  3. destination param must be one of the following:
    • print
    • bigquery:my_project.my_dataset with my_project a GCP project where you can run BigQuery queries and my_dataset a BigQuery dataset where you have dataEditor permission.
    • contributions are welcome to offer more destinations 🤗
  4. The command will create a configuration file ./connections/my_first_connection.yaml with initialized configuration.
  5. Update this configuration file to suit your needs (an illustrative sketch follows this list).
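As an illustration, the generated file might look roughly like the sketch below. The exact keys are defined by the file that abs create generates, so the field names here are assumptions, not the exact schema:

    # Illustrative sketch of ./connections/my_first_connection.yaml
    # (key names are assumptions; refer to the generated file itself)
    source:
      docker_image: "airbyte/source-faker:0.1.4"  # source connector image
      config:
        count: 100                                # source-specific option to edit
    destination:
      connector: "bigquery"
      config:
        dataset: "my_project.my_dataset"          # dataset where raw data is appended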

Run it! ⚡

abs run my_first_connection
  1. The run command will only work if you have correctly edited the ./connections/my_first_connection.yaml configuration file.
  2. If you chose the bigquery destination, you must have gcloud installed on your machine, with default credentials initialized with the command gcloud auth application-default login (full sequence shown after this list).
  3. Data is always appended at destination (not replaced nor upserted). It will be in raw format.
  4. If the connector supports incremental extract (extract only new or recently modified data) then this mode is chosen.
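Putting items 1 to 4 together for the bigquery destination, the full sequence is:

    # One-time setup: initialize the default credentials used by the BigQuery destination
    gcloud auth application-default login

    # Launch the extract-load; data is appended at destination in raw format
    abs run my_first_connection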

Select only some streams 🧛🏼

You may not want to copy all the data that the source can provide. To see all available streams, run:

abs list-streams my_first_connection

Run extract-load for only stream1 and stream2 with:

abs run my_first_connection --streams="stream1,stream2"

If you want to persist the choice of these streams for all future extract-loads, run:

abs set-config my_first_connection --streams="stream1,stream2"
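Under the hood, set-config persists the selection in ./connections/my_first_connection.yaml; conceptually the effect is something like the sketch below (the key name and placement are assumptions):

    # Hypothetical effect of `abs set-config` on the connection yaml
    source:
      streams: stream1,stream2   # only these streams will be extracted from now on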

Get help 📙

$ abs --help
Usage: abs [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  create        Create a connection configuration file
  deploy        Deploy a connection as a serverless job
  list-streams  List the streams available at the source
  run           Run the extract-load job of a connection
  set-config    Update a connection configuration

Deploy 🚀

...


👋 Contribute

Any contribution is more than welcome 🤗!

  • Add a ⭐ on the repo to show your support
  • Open an issue to report a bug or suggest improvements
  • Open a PR! Below are some suggestions of work to be done:
    • improve secrets management
    • implement a CLI
    • manage configurations as yaml files
    • implement the get_logs method of BigQueryDestination
    • add a new destination connector (Cloud Storage?)
    • add more serverless deployment examples.
    • implement optional post-processing (replace, upsert data at destination instead of append?)

🏆 Credits

  • Big kudos to Airbyte for all the hard work on connectors!
  • The generation of the sample connector configuration in yaml is heavily inspired by the code of the octavia CLI developed by Airbyte.
