Slurm Experiment Management Library

SEML: Slurm Experiment Management Library

SEML is the missing link between the open-source workload scheduling system Slurm, the experiment management tool sacred, and a MongoDB experiment database. It is lightweight, hackable, written in pure Python, and scales to thousands of experiments.

Keeping track of computational experiments can be annoying, and failing to do so can lead to lost results, duplicate runs of the same experiment, and lots of headaches. While workload scheduling systems such as Slurm make it easy to run many experiments in parallel on a cluster, it can be hard to keep track of which parameter configurations are running, failed, or completed. sacred is a great tool to collect and manage experiments and their results, especially when used with MongoDB. However, it lacks integration with workload schedulers.

SEML enables you to

  • very easily define hyperparameter search spaces using YAML files,
  • run these hyperparameter configurations on a compute cluster using Slurm,
  • and track the experimental results using sacred and MongoDB.
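
To make the idea of a hyperparameter search space concrete, here is a minimal sketch in plain Python that expands a grid of parameter values into individual configurations. It is an illustration only, not SEML's implementation or its YAML schema; the function name `expand_grid` and all parameter names are hypothetical.

```python
from itertools import product


def expand_grid(fixed, grid):
    """Return one config dict per combination of grid values, merged with the fixed values."""
    names = sorted(grid)  # fix the parameter order so the result is deterministic
    configs = []
    for values in product(*(grid[name] for name in names)):
        config = dict(fixed)               # start from the shared, fixed parameters
        config.update(zip(names, values))  # add this combination of grid values
        configs.append(config)
    return configs


# Hypothetical search space: 2 learning rates x 3 hidden sizes.
fixed = {"max_epochs": 500}
grid = {"learning_rate": [1e-3, 1e-2], "hidden_size": [16, 32, 64]}
configs = expand_grid(fixed, grid)
print(len(configs))
```

Each resulting dictionary corresponds to one experiment run. A tool like SEML takes over the bookkeeping of which of these runs are pending, running, failed, or completed.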

In addition, SEML offers many more features to make your life easier, such as

  • automatically saving and loading your source code for reproducibility,
  • easy debugging on Slurm or locally,
  • automatically checking your experiment configurations,
  • extending Slurm with local workers,
  • and keeping track of resource usage (experiment runtime, RAM, etc.).

Get started

To get started, install SEML either via pip:

pip install seml

or via conda (the conda version may be outdated; we highly recommend the PyPI version):

conda install -c conda-forge seml

Then configure your MongoDB via:

seml configure --mongodb  # provide your MongoDB credentials

Development

If you want to develop SEML, clone the repository and install it in editable mode via

pip install -e ".[dev]"

and install pre-commit hooks via

pre-commit install

Documentation

Documentation is available in our docs.md or via the CLI:

seml --help

Example

See our simple example to get familiar with how SEML works.

CLI completion

SEML supports command line completion. To install this feature run:

seml --install-completion {shell}

If you are using the zsh shell, you might have to append compinit -D to the ~/.zshrc file (see this issue).

Slurm version

SEML should work with Slurm 18.08 and above out of the box. Slurm 17.11 and earlier do not have a SIGNALING job state, so on those versions you have to remove it from the SLURM_STATES defined in SEML's settings (seml/settings.py). Earlier versions have not been tested and might have other issues.
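
As a rough aid for deciding whether the adjustment above applies to your cluster, the following sketch parses a Slurm version string and strips SIGNALING from a state mapping when the version is older than 18.08. The helper names are hypothetical and not part of SEML. It assumes a version string of the form `slurm 18.08.5` (as typically printed by `sinfo --version`), and the example mapping is only a stand-in, not the actual contents of seml/settings.py.

```python
import re


def parse_slurm_version(version_output):
    """Extract (major, minor) from text such as 'slurm 18.08.5'."""
    match = re.search(r"(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1)), int(match.group(2))


def has_signaling_state(version_output):
    """True for Slurm 18.08 and above, following the rule stated above."""
    return parse_slurm_version(version_output) >= (18, 8)


def drop_signaling(states):
    """Return a copy of a state mapping with 'SIGNALING' removed from every list."""
    return {key: [s for s in values if s != "SIGNALING"] for key, values in states.items()}


# Stand-in mapping for illustration; check seml/settings.py for the real SLURM_STATES.
example_states = {"RUNNING": ["RUNNING", "SIGNALING"], "PENDING": ["PENDING"]}
if not has_signaling_state("slurm 17.11.2"):
    example_states = drop_signaling(example_states)
print(example_states)
```

The sketch only decides whether the edit is needed and shows its effect on a copy; the actual change still has to be made in SEML's settings.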

Contact

Contact us at zuegnerd@in.tum.de, johannes.gasteiger@tum.de, or n.gao@tum.de for any questions.

Cite

If you use SEML in your own work, please cite the software along the lines of the following BibTeX entry:

@software{seml_2023,
  author = {Z{\"u}gner, Daniel and Gasteiger, Johannes and Gao, Nicholas and Fuchsgruber, Dominik},
  title = {{SEML: Slurm Experiment Management Library}},
  url = {https://github.com/TUM-DAML/seml},
  version = {0.4.0},
  year = {2023}
}

Copyright (C) 2023 Daniel Zügner, Johannes Gasteiger, Nicholas Gao, Dominik Fuchsgruber, Technical University of Munich
