Cython bindings and Python interface to HMMER3.

🐍🟡♦️🟦 PyHMMER

🗺️ Overview

HMMER is a biological sequence analysis tool that uses profile hidden Markov models to search for sequence homologs. HMMER3 is developed and maintained by the Eddy/Rivas Laboratory at Harvard University.

pyhmmer is a Python package, implemented using the Cython language, that provides bindings to HMMER3. It directly interacts with the HMMER internals, which has the following advantages over CLI wrappers (like hmmer-py):

  • single dependency: If your software or your analysis pipeline is distributed as a Python package, you can add pyhmmer as a dependency to your project, and stop worrying about the HMMER binaries being properly set up on the end-user machine.
  • no intermediate files: Everything happens in memory, in Python objects you control, making it easier to pass your inputs to HMMER without needing to write them to a temporary file. Output retrieval is also done in memory, via instances of the pyhmmer.plan7.TopHits class.
  • no input formatting: The Easel object model is exposed in the pyhmmer.easel module, and you can build a DigitalSequence object yourself to pass to the HMMER pipeline. This is useful if your sequences are already loaded in memory, for instance because you obtained them from another Python library (such as Pyrodigal or Biopython).
  • no output formatting: HMMER3 is notorious for its numerous output files and its fixed-width tabular output, which is hard to parse (even Bio.SearchIO.HmmerIO struggles on some sequences).
  • efficient: Using pyhmmer to launch hmmsearch on sequences and HMMs in disk storage is typically as fast as directly using the hmmsearch binary (see the Benchmarks section). pyhmmer.hmmer.hmmsearch uses a different parallelisation strategy compared to the hmmsearch binary from HMMER, which can help get the most out of multiple CPUs when annotating smaller sequence databases.
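
To illustrate the "no input formatting" point, here is a minimal sketch of turning a protein string that is already in memory into a DigitalSequence. The helper name make_digital_sequence and the example residues are hypothetical. The sketch assumes the pyhmmer.easel.TextSequence constructor accepts name and sequence keyword arguments and that its digitize method returns a DigitalSequence; check the API reference for the version you have installed.

```python
def make_digital_sequence(name: str, residues: str):
    """Build a pyhmmer DigitalSequence from a protein string held in memory.

    Hypothetical helper, not part of pyhmmer itself.
    """
    # Imported inside the function so this sketch can be loaded even on a
    # machine where pyhmmer is not installed.
    import pyhmmer.easel

    # Standard amino-acid alphabet from Easel.
    alphabet = pyhmmer.easel.Alphabet.amino()
    # Sequence names are bytes in the Easel object model.
    text_seq = pyhmmer.easel.TextSequence(name=name.encode(), sequence=residues)
    # digitize() encodes the residues with the given alphabet.
    return text_seq.digitize(alphabet)
```

A list of such objects should be able to play the same role as the sequences list read from a file in the Example section.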

This library is still an experimental work in progress, but it should already pack enough features to run biological analyses or workflows involving hmmsearch, hmmscan, nhmmer, phmmer, hmmbuild and hmmalign.

🔧 Installing

pyhmmer can be installed from PyPI, which hosts some pre-built CPython wheels for x86-64 Linux, as well as the code required to compile from source with Cython:

$ pip install pyhmmer

Compilation for UNIX PowerPC is not tested in CI, but should work out of the box. Other architectures (e.g. Arm) and OSes (e.g. Windows) are not supported by HMMER.

A Bioconda package is also available:

$ conda install -c bioconda pyhmmer

🔖 Citation

PyHMMER is scientific software, with a paper published in Bioinformatics. Please cite both PyHMMER and HMMER if you are using it in an academic work, for instance as:

PyHMMER (Larralde et al., 2023), a Python library binding to HMMER (Eddy, 2011).

Detailed references are available on the Publications page of the online documentation.

📖 Documentation

A complete API reference can be found in the online documentation, or directly from the command line using pydoc:

$ pydoc pyhmmer.easel
$ pydoc pyhmmer.plan7

💡 Example

Use pyhmmer to run hmmsearch to search for Type 2 PKS domains (t2pks.hmm) inside proteins extracted from the genome of Anaerococcus provencensis (938293.PRJEB85.HG003687.faa). This will produce an iterable over TopHits that can be used for further sorting/querying in Python. Processing happens in parallel using Python threads, and a TopHits object is yielded for every HMM passed in the input iterable.

import pyhmmer

# Load the target proteins as digital sequences.
with pyhmmer.easel.SequenceFile("pyhmmer/tests/data/seqs/938293.PRJEB85.HG003687.faa", digital=True) as seq_file:
    sequences = list(seq_file)

# Search each HMM in the file against the sequences; one TopHits object is yielded per HMM.
with pyhmmer.plan7.HMMFile("pyhmmer/tests/data/hmms/txt/t2pks.hmm") as hmm_file:
    for hits in pyhmmer.hmmsearch(hmm_file, sequences, cpus=4):
        print(f"HMM {hits.query_name.decode()} found {len(hits)} hits in the target sequences")

Have a look at more in-depth examples such as building an HMM from an alignment, analysing the active site of a hit, or fetching marker genes from a genome on the Examples page of the online documentation.
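
As a small illustration of the "further sorting/querying in Python" mentioned for the hmmsearch example, the sketch below keeps only hits under an E-value cutoff and orders them from most to least significant. It assumes each hit exposes a name attribute (bytes) and an evalue attribute (float), as pyhmmer.plan7.Hit objects are expected to. The significant_hits helper, the 1e-5 default cutoff and the FakeHit stand-in are hypothetical and only serve to keep the snippet runnable without HMMER data.

```python
from typing import Iterable, List, NamedTuple, Tuple


def significant_hits(hits: Iterable, max_evalue: float = 1e-5) -> List[Tuple[str, float]]:
    """Return (name, evalue) pairs for hits at or below the cutoff, best first."""
    kept = [(hit.name.decode(), hit.evalue) for hit in hits if hit.evalue <= max_evalue]
    # Lower E-values are more significant, so sort in ascending order.
    return sorted(kept, key=lambda pair: pair[1])


class FakeHit(NamedTuple):
    """Hypothetical stand-in for a pyhmmer hit, used only for this demo."""
    name: bytes
    evalue: float


demo = [FakeHit(b"protA", 3e-12), FakeHit(b"protB", 0.5), FakeHit(b"protC", 1e-40)]
print(significant_hits(demo))
```

Inside the search loop, the same function could be called on each TopHits object in place of the demo list.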

💭 Feedback

⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing

Contributions are more than welcome! See CONTRIBUTING.md for more details.

⏱️ Benchmarks

Benchmarks were run on an i7-10710U CPU running at 1.10 GHz with 6 physical / 12 logical cores, using a FASTA file containing 4,489 protein sequences extracted from the genome of Escherichia coli (562.PRJEB4685) and version 33.1 of the Pfam HMM library containing 18,259 domains. Commands were run 3 times on a warm SSD. Solid lines show the times for pressed HMMs, and dashed lines the times for HMMs in text format.

[Figure: Benchmarks]

Raw numbers can be found in the benches folder. They suggest that phmmer should be run with the number of logical cores, while hmmsearch should be run with the number of physical cores (or fewer). A possible explanation for this observation would be that HMMER's platform-specific code requires too many SIMD registers per thread to benefit from simultaneous multi-threading.
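
The rule of thumb above can be written down as a small helper for picking a cpus value. This is only a sketch of the observation from these benchmarks, not a feature of pyhmmer: the recommended_cpus name is hypothetical, and the number of physical cores has to be supplied by the caller because os.cpu_count() reports logical cores.

```python
import os
from typing import Optional


def recommended_cpus(tool: str, physical_cores: int, logical_cores: Optional[int] = None) -> int:
    """Suggest a thread count following the benchmark observation above."""
    if logical_cores is None:
        # os.cpu_count() counts logical cores; fall back to the physical count
        # if the value cannot be determined.
        logical_cores = os.cpu_count() or physical_cores
    if tool == "phmmer":
        # phmmer appeared to benefit from simultaneous multi-threading.
        return logical_cores
    # hmmsearch (and, as a cautious default, other tools) stays at the
    # physical core count.
    return min(physical_cores, logical_cores)
```

On the benchmark machine described above, this would suggest 12 threads for phmmer and 6 for hmmsearch.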

To read more about how PyHMMER achieves better parallelism than HMMER for many-to-many searches, have a look at the Performance page of the documentation.

🔍 See Also

Building an HMM from scratch? Then you may be interested in the pyfamsa package, providing bindings to FAMSA, a very fast multiple sequence aligner. In addition, you may want to trim alignments: in that case, consider pytrimal, which wraps trimAl 2.0.

If, despite all the advantages listed earlier, you would rather use HMMER through its CLI, this package will not be of great help. You can instead check the hmmer-py package developed by Danilo Horta at the EMBL-EBI.

⚖️ License

This library is provided under the MIT License. The HMMER3 and Easel code is available under the BSD 3-clause license. See vendor/hmmer/LICENSE and vendor/easel/LICENSE for more information.

This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original HMMER authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.

This version

0.8.0
