Use dask to run the DVC graph

Dask4DVC - Distributed Node Execution

DVC provides tools for building and executing a computational graph locally. The dask4dvc package combines Dask Distributed with DVC so that the graph is easier to run through HPC workload managers such as Slurm.

The dask4dvc repro command runs the DVC graph in parallel where possible. Currently, dask4dvc run does not run the stages of each experiment sequentially.
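To illustrate what "in parallel where possible" means for a dependency graph, here is a minimal standard-library sketch. It is not how dask4dvc is implemented (dask4dvc hands the work to Dask); the stage names, `run_graph`, and `run_stage` are hypothetical and exist only for this example. A stage is submitted as soon as all of its dependencies have finished, so independent stages can run at the same time.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_graph(graph, run_stage, max_workers=4):
    """Run each stage once all of its dependencies have finished.

    graph maps a stage name to the set of stage names it depends on.
    Returns the stage names in the order in which they completed.
    """
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    finished = []
    pending = {}  # future -> stage name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every stage whose dependencies are all done.
            for stage in sorter.get_ready():
                pending[pool.submit(run_stage, stage)] = stage
            # Block until at least one running stage completes.
            done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise the exception of a failed stage
                stage = pending.pop(future)
                finished.append(stage)
                sorter.done(stage)  # may unlock downstream stages
    return finished


# Hypothetical graph: two training stages depend only on "prepare",
# so they may run concurrently; "evaluate" waits for both.
GRAPH = {
    "prepare": set(),
    "train_a": {"prepare"},
    "train_b": {"prepare"},
    "evaluate": {"train_a", "train_b"},
}

order = run_graph(GRAPH, run_stage=lambda stage: None)
```

In this toy graph, "prepare" always completes first and "evaluate" last, while the completion order of the two training stages is not fixed.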

:warning: This is an experimental package and is not affiliated in any way with Iterative or DVC.

Usage

Dask4DVC provides a CLI similar to DVC.

  • dvc repro becomes dask4dvc repro.
  • dvc queue start becomes dask4dvc run.

You can follow the progress using dask4dvc <cmd> --dashboard.

SLURM Cluster

You can use dask4dvc with a Slurm cluster. This requires a running Dask scheduler:

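from dask_jobqueue import SLURMCluster

# Each Slurm job started by this cluster hosts one Dask worker process.
cluster = SLURMCluster(
    cores=1,
    memory="128GB",
    queue="gpu",
    processes=1,
    walltime="8:00:00",
    job_cpu=1,
    # Extra lines for the job script; newer dask-jobqueue releases
    # name this argument job_extra_directives.
    job_extra=["-N 1", "--cpus-per-task=1", "--tasks-per-node=64", "--gres=gpu:1"],
    # Fixed scheduler port, so that dask4dvc can connect to a known address.
    scheduler_options={"port": 31415},
)
# Let Dask start and stop Slurm jobs depending on the workload.
cluster.adapt()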

With this setup you can then run dask4dvc repro --address 127.0.0.1:31415, where 31415 is the example port chosen above.
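Before starting a long run it can help to confirm that something is listening at the scheduler address. The following is a minimal standard-library sketch; `split_address` and `port_is_open` are hypothetical helpers, not part of dask4dvc or Dask. It only checks that the TCP port accepts connections, not that the process behind it is a Dask scheduler.

```python
import socket


def split_address(address):
    """Split a 'host:port' string into (host, port)."""
    host, _, port = address.rpartition(":")
    return host, int(port)


def port_is_open(address, timeout=2.0):
    """Return True if a TCP connection to 'host:port' succeeds."""
    host, port = split_address(address)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    address = "127.0.0.1:31415"  # the example address used above
    state = "reachable" if port_is_open(address) else "not reachable"
    print(f"{address} is {state}")
```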

You can also use config files with dask4dvc repro --config myconfig.yaml. All dask.distributed Clusters should be supported.

default:
  SGECluster:
    queue: regular
    cores: 10
    memory: 16 GB
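One plausible reading of this file is that the key under default names a cluster class and the nested keys are its keyword arguments. The sketch below works under that assumption and does not reflect how dask4dvc actually parses the file; `cluster_call_from_config` is a hypothetical helper, and the dict stands in for what a YAML parser would return for the config above.

```python
def cluster_call_from_config(config, profile="default"):
    """Return (cluster_class_name, keyword_arguments) for one profile.

    Assumes each profile holds exactly one cluster entry.
    """
    section = config[profile]
    if len(section) != 1:
        raise ValueError("expected exactly one cluster entry per profile")
    ((name, kwargs),) = section.items()
    return name, dict(kwargs)


# The YAML config above as a parsed mapping (assumption: "16 GB" stays a string).
CONFIG = {
    "default": {
        "SGECluster": {"queue": "regular", "cores": 10, "memory": "16 GB"},
    }
}

name, kwargs = cluster_call_from_config(CONFIG)
arguments = ", ".join(f"{key}={value!r}" for key, value in kwargs.items())
equivalent_call = f"{name}({arguments})"
print(equivalent_call)  # SGECluster(queue='regular', cores=10, memory='16 GB')
```

Under this reading, the config corresponds to constructing dask_jobqueue's SGECluster with those three keyword arguments.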
