Airflow provider for Versatile Data Kit.
Versatile Data Kit Airflow provider
A set of Airflow operators, sensors and a connection hook intended to help schedule Versatile Data Kit jobs using Apache Airflow.
Usage
To install the provider, run:
pip install airflow-provider-vdk
Then you can create a workflow of data jobs (deployed by VDK Control Service) like this:
from datetime import datetime

from airflow import DAG
from vdk_provider.operators.vdk import VDKOperator

with DAG(
    "airflow_example_vdk",
    schedule_interval=None,
    start_date=datetime(2022, 1, 1),
    catchup=False,
    tags=["example", "vdk"],
) as dag:
    trino_job1 = VDKOperator(
        conn_id="vdk-default",
        job_name="airflow-trino-job1",
        team_name="taurus",
        task_id="trino-job1",
    )

    trino_job2 = VDKOperator(
        conn_id="vdk-default",
        job_name="airflow-trino-job2",
        team_name="taurus",
        task_id="trino-job2",
    )

    transform_job = VDKOperator(
        conn_id="vdk-default",
        job_name="airflow-transform-job",
        team_name="taurus",
        task_id="transform-job",
    )

    [trino_job1, trino_job2] >> transform_job
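The last line of the workflow, `[trino_job1, trino_job2] >> transform_job`, makes the transform job wait for both Trino jobs. A Python list has no `>>` operator of its own, so Python falls back to the reflected operator on the right-hand task. The sketch below illustrates that mechanism in plain Python. The `Task` class is a hypothetical stand-in written for this illustration, not Airflow's or VDK's actual implementation, and it needs neither package installed.

```python
class Task:
    """Hypothetical stand-in for an Airflow task; not part of Airflow or VDK."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.upstream = set()    # ids of tasks that must finish before this one
        self.downstream = set()  # ids of tasks that run after this one

    def set_downstream(self, other):
        # Record the edge self -> other on both ends.
        self.downstream.add(other.task_id)
        other.upstream.add(self.task_id)

    def __rshift__(self, other):
        # Handles `task >> other` and `task >> [a, b]`.
        targets = other if isinstance(other, (list, tuple)) else [other]
        for target in targets:
            self.set_downstream(target)
        return other

    def __rrshift__(self, other):
        # Handles `[a, b] >> task`: list defines no __rshift__,
        # so Python calls the reflected method on the right operand.
        sources = other if isinstance(other, (list, tuple)) else [other]
        for source in sources:
            source.set_downstream(self)
        return self


trino_job1 = Task("trino-job1")
trino_job2 = Task("trino-job2")
transform_job = Task("transform-job")

# Same fan-in shape as the workflow above.
[trino_job1, trino_job2] >> transform_job

print(sorted(transform_job.upstream))  # ['trino-job1', 'trino-job2']
```

In the real workflow the same expression means that `airflow-transform-job` is triggered only after both `airflow-trino-job1` and `airflow-trino-job2` have completed.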
Demo
A demo from one of the community meetings is available here: https://www.youtube.com/watch?v=c3j1aOALjVU&t=690s