tools for comparing DNA sequences with MinHash sketches

sourmash

License: 3-Clause BSD

🦀 Rust API Documentation on docs.rs


Compute MinHash signatures for nucleotide (DNA/RNA) and protein sequences.

Usage:

sourmash compute *.fq.gz
sourmash compare *.sig -o distances
sourmash plot distances
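To make the idea behind these commands concrete, here is a minimal sketch of a bottom-k MinHash signature and the Jaccard similarity estimated from two such signatures. It is an illustration under stated assumptions, not sourmash's implementation: the function names, the default k and sketch size, and the use of SHA-1 from hashlib as the hash function are all assumptions chosen to keep the example self-contained, and it assumes Python 3.

```python
import hashlib

VALID_BASES = set("ACGT")
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string made of A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def canonical_kmers(seq, k):
    """Yield the canonical form of every overlapping k-mer in seq.

    The canonical form is the lexicographically smaller of a k-mer and its
    reverse complement, so both strands give the same result.
    K-mers containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= VALID_BASES:
            yield min(kmer, reverse_complement(kmer))


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (illustrative stand-in hash: SHA-1)."""
    digest = hashlib.sha1(kmer.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_sketch(seq, k=21, num=500):
    """Return the num smallest distinct k-mer hashes of seq, sorted."""
    hashes = {hash_kmer(kmer) for kmer in canonical_kmers(seq, k)}
    return sorted(hashes)[:num]


def jaccard_estimate(sketch_a, sketch_b, num=500):
    """Estimate Jaccard similarity from two bottom-k sketches.

    Takes the num smallest hashes of the merged sketches and returns the
    fraction of them that occur in both sketches.
    """
    set_a, set_b = set(sketch_a), set(sketch_b)
    merged = sorted(set_a | set_b)[:num]
    if not merged:
        return 0.0
    shared = sum(1 for h in merged if h in set_a and h in set_b)
    return shared / len(merged)


if __name__ == "__main__":
    seq = "ACGTTGCAAGGCTTAACCGGTTAGCATCGA"
    a = minhash_sketch(seq, k=5, num=10)
    b = minhash_sketch(reverse_complement(seq), k=5, num=10)
    # Canonical k-mers make a sequence and its reverse complement identical.
    print(jaccard_estimate(a, b, num=10))
```

Keeping only the smallest hashes gives a fixed-size signature regardless of input length, which is what makes pairwise comparison of many samples cheap.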

sourmash 1.0 is published in JOSS; please cite that paper if you use sourmash (doi: 10.21105/joss.00027).


The name is a riff on Mash, combined with @ctb's love of whiskey. (Sour mash is used in making whiskey.)

Primary authors: C. Titus Brown (@ctb) and Luiz C. Irber, Jr (@luizirber).

sourmash is a product of the Lab for Data-Intensive Biology at the UC Davis School of Veterinary Medicine.

Installation

We recommend using bioconda to install sourmash:

conda install -c conda-forge -c bioconda sourmash

This will install the latest stable version of sourmash.

You can also use pip to install sourmash:

pip install sourmash

A quickstart tutorial is available.

Requirements

sourmash runs under both Python 2.7.x and Python 3.5+. The base requirements are screed and ijson, together with a Rust environment (for the extension code). We suggest using rustup to install the Rust environment:

curl https://sh.rustup.rs -sSf | sh

The comparison code (sourmash compare) uses numpy, and the plotting code uses matplotlib and scipy, but most of the code is usable without these.

For search and gather you also need khmer version 2.1+.
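Because several of these dependencies are optional, it can help to check which ones are importable before running a given subcommand. The following is a small sketch, assuming Python 3 (it relies on importlib.util.find_spec); the feature-to-module grouping mirrors the text above, while the names OPTIONAL_MODULES, missing_modules, and dependency_report are hypothetical helpers and not part of sourmash.

```python
import importlib.util

# Grouping taken from the requirements described above; names are illustrative.
OPTIONAL_MODULES = {
    "compare": ["numpy"],
    "plot": ["matplotlib", "scipy"],
    "search/gather": ["khmer"],
}


def missing_modules(names):
    """Return the module names from names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def dependency_report(optional=OPTIONAL_MODULES):
    """Map each feature to the list of its modules that are not installed."""
    return {feature: missing_modules(mods) for feature, mods in optional.items()}


if __name__ == "__main__":
    for feature, missing in dependency_report().items():
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(feature + ": " + status)
```

This check only confirms that a module can be located; it does not verify version constraints such as khmer 2.1+.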

Installation with conda

Bioconda is a channel for the conda package manager with a focus on bioinformatics software. After installing conda, you will need to add the bioconda channel as well as the other channels bioconda depends on. Once you have set up bioconda, you can install sourmash by running:

$ conda create -n sourmash_env -c conda-forge -c bioconda sourmash python=3.7
$ source activate sourmash_env
$ sourmash compute -h

which will install the latest alpha release.

Support

Please ask questions and file issues on GitHub.

Development

Development happens on GitHub at dib-lab/sourmash.

After installation, sourmash is the main command-line entry point. You can also run it with python -m sourmash, or run pip install -e /path/to/repo for a developer install in a virtual environment.

The sourmash/ directory contains the library code.

Tests require py.test and can be run with make test.

Please see the developer notes for more information.


CTB Dec 2018

This version

3.0.0
