Library and command line interface for darwin.v7labs.com

Darwin

Official library for managing datasets on the V7 Darwin annotation platform.

Darwin-py can be used both from the command line and as a Python library.

Its main functions include:

  • Client authentication
  • Listing local and remote datasets
  • Create/remove datasets
  • Upload/download data to/from remote datasets
  • Direct integration with PyTorch dataloaders (see torch/README.md)

Tested on Python 3.7.

Installation

pip install darwin-py

You can now type darwin in your terminal and access the command line interface.


Usage as a Command Line Interface (CLI)

Once installed, darwin is accessible as a command line tool. The -h/--help flag is a useful way to explore the CLI: it prints additional information for each available command.

Client Authentication

To perform remote operations on Darwin, you first need to authenticate. This requires a team-specific API key.
If you do not already have a Darwin account, contact us and we will set one up for you.

To start the authentication process:

$ darwin authenticate
API key: 
Make example-team the default team? [y/N] y
Datasets directory [~/.darwin/datasets]: 
Authentication succeeded.

You will be prompted to enter your API key, to choose whether the corresponding team should become the default, and finally to choose the location on the local file system for that team's datasets. This process creates a configuration file at ~/.darwin/config.yaml. The file is updated by later authentications for other teams.

Listing local and remote datasets

List a summary of the datasets that exist locally:

$ darwin dataset local
NAME            IMAGES     SYNC_DATE         SIZE
mydataset       112025     yesterday     159.2 GB

List a summary of the remote datasets accessible to the current user:

$ darwin dataset remote
NAME                       IMAGES     PROGRESS
example-team/mydataset     112025        73.0%

Create/remove a dataset

To create an empty dataset remotely:

$ darwin dataset create test
Dataset 'test' (example-team/test) has been created.
Access at https://darwin.v7labs.com/datasets/579

The dataset will be created in the team you're authenticated for.

To delete the dataset on the server:

$ darwin dataset remove test
About to delete example-team/test on darwin.
Do you want to continue? [y/N] y

Upload/download data to/from a remote dataset

To upload data to an existing remote dataset, pass the dataset name and either a single image or a directory of images/videos.

The -e/--exclude argument specifies file extensions to ignore in the data directory, e.g. -e .jpg

For videos, the frame extraction rate can be specified by adding --fps <frame_rate>

Supported extensions:

  • Video files: .mp4, .bpm, .mov
  • Image files: .jpg, .jpeg, .png

$ darwin dataset push test /path/to/folder/with/images
100%|████████████████████████| 2/2 [00:01<00:00,  1.27it/s] 
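
The file selection described above can be sketched in Python. This is only an illustration of the idea, not darwin-py's actual implementation: the helper name find_uploadable_files is hypothetical, and the recursive directory walk and case-insensitive extension matching are assumptions.

```python
from pathlib import Path

# Extensions listed in this README as supported for upload.
SUPPORTED_EXTENSIONS = {".mp4", ".bpm", ".mov", ".jpg", ".jpeg", ".png"}


def find_uploadable_files(data_dir, exclude=()):
    """Return the supported files under data_dir, skipping excluded extensions.

    Hypothetical helper for illustration; not part of darwin-py.
    Assumes a recursive walk and case-insensitive extension matching.
    """
    excluded = {ext.lower() for ext in exclude}
    allowed = SUPPORTED_EXTENSIONS - excluded
    return sorted(
        path
        for path in Path(data_dir).rglob("*")
        if path.is_file() and path.suffix.lower() in allowed
    )
```

For example, find_uploadable_files("/path/to/folder/with/images", exclude=[".jpg"]) would return every supported file in that folder except the .jpg ones, mirroring the effect of -e .jpg.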

Before a dataset can be downloaded, a release needs to be generated:

$ darwin dataset export test 0.1
Dataset test successfully exported to example-team/test:0.1

A release is immutable: if new images or annotations have been added since, you will have to create a new release to include them.

To list all available releases:

$ darwin dataset releases test
NAME                           IMAGES     CLASSES                   EXPORT_DATE
example-team/test:0.1               4           0     2019-12-07 11:37:35+00:00

Finally, to download a release:

$ darwin dataset pull test:0.1
Dataset example-team/test:0.1 downloaded at /directory/choosen/at/authentication/time.
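
The commands above refer to a release as dataset:version (test:0.1), and the output shows the fully qualified form team/dataset:version (example-team/test:0.1). A minimal sketch of splitting such an identifier is shown below. The names ReleaseIdentifier and parse_release_identifier are hypothetical and not part of darwin-py, and treating the team and version as optional is an assumption based only on the examples in this README.

```python
from typing import NamedTuple, Optional


class ReleaseIdentifier(NamedTuple):
    team: Optional[str]
    dataset: str
    version: Optional[str]


def parse_release_identifier(identifier):
    """Split '[team/]dataset[:version]' into its parts.

    Hypothetical helper for illustration; not part of darwin-py.
    """
    # Everything after the first ':' is the version, if present.
    name, _, version = identifier.partition(":")
    # Everything before the last '/' is the team, if present.
    team, _, dataset = name.rpartition("/")
    if not dataset:
        raise ValueError(f"missing dataset name in {identifier!r}")
    return ReleaseIdentifier(team or None, dataset, version or None)
```

With this sketch, "example-team/test:0.1" splits into team "example-team", dataset "test" and version "0.1", while a bare "test" leaves both team and version unset.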

Usage as a Python library

The framework is designed to be usable as a standalone Python library. Usage can be inferred from the operations performed in darwin/cli_functions.py. A minimal example that downloads a dataset is provided below; a more extensive one can be found in darwin_demo.py.

from darwin.client import Client

client = Client.local() # use the configuration in ~/.darwin/config.yaml
dataset = client.get_remote_dataset("example-team/test")
dataset.pull() # downloads annotations and images for the latest exported version
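
The minimal example relies on the configuration file written by darwin authenticate. One possible way to wrap it, sketched below, is to check for that file first so that a missing configuration produces a clear message. The function pull_latest and its config_path parameter are hypothetical; the parameter is used only for the existence check, and Client.local() is assumed to read ~/.darwin/config.yaml as the comment in the example above states. Only the three darwin-py calls already shown in this README are used.

```python
from pathlib import Path

# Location of the configuration file, as described in the authentication section.
DEFAULT_CONFIG = Path.home() / ".darwin" / "config.yaml"


def pull_latest(dataset_identifier, config_path=DEFAULT_CONFIG):
    """Download the latest exported release of a dataset.

    Hypothetical wrapper for illustration; not part of darwin-py.
    config_path is only checked for existence, it is not passed to darwin-py.
    """
    if not Path(config_path).is_file():
        raise FileNotFoundError(
            f"{config_path} not found; run `darwin authenticate` first"
        )

    # Imported lazily so the check above works even without darwin-py installed.
    from darwin.client import Client

    client = Client.local()  # uses the configuration in ~/.darwin/config.yaml
    dataset = client.get_remote_dataset(dataset_identifier)
    dataset.pull()  # downloads annotations and images for the latest exported version
    return dataset
```

Calling pull_latest("example-team/test") would then behave like the minimal example, with an earlier and more explicit failure when no authentication has been done on the machine.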

See torch/README.md for how to integrate darwin datasets directly in torch.
