Cython bindings and Python interface to FastANI.

Project description

🐍⏩🧬 pyFastANI

Cython bindings and Python interface to FastANI, a method for fast whole-genome similarity estimation.

🗺️ Overview

FastANI is a method published in 2018 by Jain et al. for high-throughput computation of whole-genome Average Nucleotide Identity (ANI). It uses MashMap to compute orthologous mappings without the need for expensive alignments.
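To make the quantity concrete, here is a toy sketch of what "average nucleotide identity" means: the mean identity over matched fragment pairs. It is not FastANI's algorithm, which estimates identities from MashMap mappings without aligning anything; the helper names (`identity`, `toy_ani`) and the sequences are made up for illustration, and the fragments are assumed to be already aligned and of equal length.

```python
def identity(a, b):
    """Fraction of identical positions between two equal-length fragments."""
    if len(a) != len(b):
        raise ValueError("fragments must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def toy_ani(pairs):
    """Mean identity, in percent, over (query_fragment, reference_fragment) pairs."""
    scores = [identity(q, r) for q, r in pairs]
    return 100.0 * sum(scores) / len(scores)


# made-up, pre-aligned fragment pairs (illustration only)
pairs = [
    (b"ACGTACGTAC", b"ACGTACGTAC"),  # 10 of 10 positions identical
    (b"ACGTACGTAC", b"ACGTTCGTAC"),  # 9 of 10 positions identical
    (b"GGGTACGTAC", b"GGCTACGTAA"),  # 8 of 10 positions identical
]
print(round(toy_ani(pairs), 1))  # 90.0
```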

pyfastani is a Python module, implemented using the Cython language, that provides bindings to FastANI. It directly interacts with the FastANI internals, which has the following advantages over CLI wrappers:

  • simpler compilation: FastANI requires several additional libraries, which make compilation of the original binary non-trivial. In pyFastANI, libraries that were needed for threading or I/O are provided as stubs, so you only need to have boost::math to build. Or even better, just install from one of the provided wheels!
  • single dependency: If your software or your analysis pipeline is distributed as a Python package, you can add pyfastani as a dependency to your project, and stop worrying about the FastANI binary being present on the end-user machine.
  • sans I/O: Everything happens in memory, in Python objects you control, making it easier to pass your sequences to FastANI without needing to write them to a temporary file.
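As a rough illustration of the "sans I/O" point, the sketch below parses FASTA records from an in-memory buffer into bytes objects using only the standard library. The `read_fasta` helper and the sequences are hypothetical and not part of pyfastani; the point is only that contigs can be held as plain Python objects rather than temporary files.

```python
import io


def read_fasta(handle):
    """Yield (name, sequence) pairs from a binary FASTA handle, sequences as bytes."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(b">"):
            if name is not None:
                yield name, b"".join(chunks)
            # keep only the identifier, up to the first whitespace
            header = line[1:].split(None, 1)
            name, chunks = (header[0].decode() if header else ""), []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, b"".join(chunks)


# a two-contig draft genome held entirely in memory (made-up sequences)
data = io.BytesIO(b">contig1 example\nACGT\nACGT\n>contig2\nTTGA\n")
contigs = list(read_fasta(data))
print(contigs)  # [('contig1', b'ACGTACGT'), ('contig2', b'TTGA')]
```

The bytes in `contigs` are the kind of objects the examples in this document pass to the mapper, for instance as a generator `(seq for _, seq in contigs)`.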

This library is still an experimental work in progress, but it should already pack enough features to run one-to-one computations.

🔧 Installing

pyFastANI can be installed directly from PyPI, which hosts some pre-built CPython wheels for x86-64 Unix platforms, as well as the code required to compile from source with Cython:

$ pip install pyfastani

Note that if you compile from source, you will need the headers and libraries for boost::math to be available.

💡 Example

The following snippets show how to compute the ANI between two genomes, with the reference being a draft genome. For one-to-many or many-to-many searches, simply add additional references with m.add_draft before indexing. Note that any name can be given to the reference sequences; it only affects the name attribute of the hits returned for a query.

🔬 Biopython

Biopython does not let us access the sequence directly, so we need to convert it to bytes first with the bytes builtin function. For older versions of Biopython (earlier than 1.79), use record.seq.encode() instead of bytes(record.seq).

import pyfastani
import Bio.SeqIO

m = pyfastani.Mapper()

# add a single draft genome to the mapper, and index it
ref = list(Bio.SeqIO.parse("vendor/FastANI/data/Shigella_flexneri_2a_01.fna", "fasta"))
m.add_draft("Shigella_flexneri_2a_01", (bytes(record.seq) for record in ref))
m.index()

# read the query and query the mapper
query = Bio.SeqIO.read("vendor/FastANI/data/Escherichia_coli_str_K12_MG1655.fna", "fasta")
hits = m.query_sequence(bytes(query.seq))

for hit in hits:
    print("Escherichia_coli_str_K12_MG1655", hit.name, hit.identity, hit.matches, hit.fragments)
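When several references are indexed, a query can return more than one hit. The sketch below sorts hits by identity and formats them as tab-separated lines, using the same column order as the print call above. The `Hit` namedtuple is a hypothetical stand-in that only mirrors the four attributes used in this document (name, identity, matches, fragments), and the values are made up.

```python
from collections import namedtuple

# hypothetical stand-in exposing the attributes used in the loop above
Hit = namedtuple("Hit", ["name", "identity", "matches", "fragments"])


def format_hits(query_name, hits):
    """Return one tab-separated line per hit, highest identity first."""
    ordered = sorted(hits, key=lambda hit: hit.identity, reverse=True)
    return [
        f"{query_name}\t{hit.name}\t{hit.identity:.4f}\t{hit.matches}\t{hit.fragments}"
        for hit in ordered
    ]


# made-up values, for illustration only
hits = [Hit("ref_A", 91.25, 800, 1500), Hit("ref_B", 97.5, 1300, 1500)]
for line in format_hits("query_1", hits):
    print(line)
```

Because the function only reads attributes, it should work the same way with any objects exposing those four fields.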

🧪 Scikit-bio

Scikit-bio lets us access the sequence directly as a numpy array, but shows the values as byte strings by default. To make them readable as char (for compatibility with the C code), they must be cast with seq.values.view('B').

import pyfastani
import skbio.io

m = pyfastani.Mapper()

ref = list(skbio.io.read("vendor/FastANI/data/Shigella_flexneri_2a_01.fna", "fasta"))
m.add_draft("Shigella_flexneri_2a_01", (seq.values.view('B') for seq in ref))
m.index()

# read the query and query the mapper
query = next(skbio.io.read("vendor/FastANI/data/Escherichia_coli_str_K12_MG1655.fna", "fasta"))
hits = m.query_genome(query.values.view('B'))

for hit in hits:
    print("Escherichia_coli_str_K12_MG1655", hit.name, hit.identity, hit.matches, hit.fragments)
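The cast used in the Scikit-bio example reinterprets the same buffer as unsigned 8-bit integers instead of one-character byte strings. As a standard-library analogy only (this uses memoryview, not NumPy or scikit-bio), the sketch below shows the same idea: the bytes stay unchanged while the item type they are exposed as changes.

```python
raw = b"ACGT"

# expose the buffer as length-1 bytes objects
as_chars = memoryview(raw).cast("c")
# reinterpret the same buffer as unsigned 8-bit integers
as_uint8 = as_chars.cast("B")

print(as_chars.tolist())  # [b'A', b'C', b'G', b'T']
print(as_uint8.tolist())  # [65, 67, 71, 84]

# the underlying bytes are identical in both views
assert bytes(as_uint8) == raw
```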

💭 Feedback

⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing

Contributions are more than welcome! See CONTRIBUTING.md for more details.

⚖️ License

This library is provided under the MIT License. The FastANI code was written by Chirag Jain and is distributed under the terms of the Apache License 2.0, unless otherwise specified in vendored sources. See vendor/FastANI/LICENSE for more information.

This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original FastANI authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.

Download files

Source Distribution

  • pyfastani-0.1.0.tar.gz (18.7 kB): Source

Built Distributions

  • pyfastani-0.1.0-pp37-pypy37_pp73-manylinux_2_24_x86_64.whl (198.8 kB): PyPy, manylinux: glibc 2.24+ x86-64
  • pyfastani-0.1.0-pp37-pypy37_pp73-macosx_10_7_x86_64.whl (178.9 kB): PyPy, macOS 10.7+ x86-64
  • pyfastani-0.1.0-cp39-cp39-manylinux_2_24_x86_64.whl (1.1 MB): CPython 3.9, manylinux: glibc 2.24+ x86-64
  • pyfastani-0.1.0-cp39-cp39-macosx_10_14_x86_64.whl (219.4 kB): CPython 3.9, macOS 10.14+ x86-64
  • pyfastani-0.1.0-cp38-cp38-manylinux_2_24_x86_64.whl (1.1 MB): CPython 3.8, manylinux: glibc 2.24+ x86-64
  • pyfastani-0.1.0-cp38-cp38-macosx_10_14_x86_64.whl (216.9 kB): CPython 3.8, macOS 10.14+ x86-64
  • pyfastani-0.1.0-cp37-cp37m-manylinux_2_24_x86_64.whl (1.1 MB): CPython 3.7m, manylinux: glibc 2.24+ x86-64
  • pyfastani-0.1.0-cp37-cp37m-macosx_10_14_x86_64.whl (215.9 kB): CPython 3.7m, macOS 10.14+ x86-64
  • pyfastani-0.1.0-cp36-cp36m-manylinux_2_24_x86_64.whl (1.1 MB): CPython 3.6m, manylinux: glibc 2.24+ x86-64
  • pyfastani-0.1.0-cp36-cp36m-macosx_10_14_x86_64.whl (215.9 kB): CPython 3.6m, macOS 10.14+ x86-64
