Script of Scripts (SoS): an interactive, cross-platform, and cross-language workflow system for reproducible data analysis

Project description

Exploratory data analysis in computationally intensive disciplines such as computational biology often requires one to use a variety of tools implemented in different programming languages and to analyze large datasets on high-performance computing systems (e.g. computer clusters). On top of the difficulties of exchanging data between languages and computing systems and of analyzing data on different platforms, it becomes challenging to keep track of such fragmented workflows and to reproduce prior analyses.

With strong emphasis on readability, practicality, and reproducibility, we have developed a workflow system called “Script of Scripts” (SoS) with a web front-end and a notebook format based on Jupyter. Major features of SoS for exploratory analysis include multi-language support, explicit and automatic data exchange between running sessions (kernels) in different languages, cell-specific kernel switching through the front-end UI or cell magics, a side panel that allows scratch execution of statements, preview of files and expressions, and line-by-line execution of statements in cells. In particular, variable and file preview in the side panel makes it possible to troubleshoot scripts in multiple languages without contaminating the main notebook or interrupting the logical flow of the analysis. For large-scale data analysis, the SoS workflow engine provides a unified interface for executing and managing tasks on a variety of computing platforms such as PBS/Torque/LSF/Slurm clusters and RQ and Celery task queues. Specified files are automatically synchronized between file systems, which enables a single workflow to use multiple remote computing environments.

With SoS, researchers gain the flexibility to use their preferred languages and tools for each task without having to worry about data flow, and can perform light interactive analysis while simultaneously executing heavy remote tasks in the same notebook in a neat and organized fashion. SoS is available at http://vatlab.github.io/SOS/ and is distributed freely under a GPL3 license. A live Jupyter server and several Docker containers are available for testing and running SoS without a local installation.

Please refer to http://vatlab.github.io/SOS/ for more details on SoS.


Download files


Source Distribution

sos-0.9.8.6.tar.gz (4.0 MB)

Uploaded Source

File details

Details for the file sos-0.9.8.6.tar.gz.

File metadata

  • Download URL: sos-0.9.8.6.tar.gz
  • Upload date:
  • Size: 4.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for sos-0.9.8.6.tar.gz

Algorithm    Hash digest
SHA256       a766b5ab87f02a0bb5fd871201c1dde32e154e5542f8889d0cf7700ce6d2511a
MD5          4e13e9732a114ed4bf3afc54801dd8e1
BLAKE2b-256  dfbfff8e3b2224c8f8698acfdec740987b1b3a4b2fee13c25443127e6de15d4b
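
The published digests can be used to check that a downloaded archive is intact. The following is a minimal sketch using only Python's standard `hashlib` module. The helper names `sha256_of` and `matches_published_hash` and the local path `sos-0.9.8.6.tar.gz` are illustrative assumptions, not part of SoS or of the package index.

```python
import hashlib
from pathlib import Path

# SHA256 digest copied from the hash listing for sos-0.9.8.6.tar.gz.
EXPECTED_SHA256 = "a766b5ab87f02a0bb5fd871201c1dde32e154e5542f8889d0cf7700ce6d2511a"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large archive is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Compare a file's SHA256 digest with the expected hex string (case-insensitive)."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical location: assumes the archive was saved in the current directory.
    archive = Path("sos-0.9.8.6.tar.gz")
    if archive.exists():
        print("OK" if matches_published_hash(archive) else "MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch indicates that the file differs from the one the digest was computed for, for example because of a truncated download, so the archive should be fetched again before installing.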
