updated standalone version of dbCAN annotation tool for automated CAZyme annotation
Project description
run_dbcan - Standalone Tool of dbCAN3
Announcement
⚠️ Important Notice:
Due to a recent cyberattack, our primary dbCAN web server is currently offline, and you will not be able to access the online database. Our IT team is actively working to resolve the issue. We apologize for any inconvenience this may cause.
In the meantime, you can still obtain the dbCAN database using our AWS S3 backup. Recommended methods:
1. Use the run_dbcan database command (recommended):
run_dbcan database --db_dir db --aws_s3
This command will download and organize the database files automatically.
2. Download via wget (not for folders):
Please note that wget cannot download an entire folder from an S3 bucket; it can only fetch individual files. To get every file, either list the files and download them one by one, or use the AWS CLI. If you still want to use wget, you must specify each file's URL directly, for example:
wget https://dbcan.s3.us-west-2.amazonaws.com/db_v5-2_9-13-2025/some_file
If you want to download the entire folder, please use the AWS CLI as follows:
aws s3 cp s3://dbcan/db_v5-2_9-13-2025/ ./db --recursive
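For the per-file route described above, the following is a minimal Python sketch that builds one URL per file name and downloads each file in turn. The bucket prefix is taken from the wget example; the function names `build_url` and `download_files` are hypothetical helpers, and you must supply the real file names yourself (`some_file` is only the placeholder used above).

```python
from pathlib import Path
from urllib.request import urlretrieve

# Bucket prefix copied from the wget example above.
BASE_URL = "https://dbcan.s3.us-west-2.amazonaws.com/db_v5-2_9-13-2025"


def build_url(file_name, base_url=BASE_URL):
    """Join the bucket prefix and a single file name into a download URL."""
    return f"{base_url.rstrip('/')}/{file_name.lstrip('/')}"


def download_files(file_names, dest_dir="db"):
    """Download each named file into dest_dir, one request per file.

    file_names must be supplied by the caller; this sketch does not
    list the bucket contents.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for name in file_names:
        target = dest / Path(name).name
        urlretrieve(build_url(name), str(target))
        saved.append(target)
    return saved


# Example call (requires network access and real file names):
# download_files(["some_file"], dest_dir="db")
```

This mirrors what repeated wget calls would do. For a full copy of the folder, the AWS CLI command above remains the simpler option.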
For more details on database downloads, please refer to our documentation.
If you have any questions or need help, feel free to open an issue.
Update
10/20/2025:
- SignalP6.0 Topology Annotation: Added support for SignalP6.0 signal peptide prediction. Use --run_signalp flag in CAZyme_annotation command to enable topology annotation. Results are automatically added to the overview.tsv file.
- Global Logging System: Implemented comprehensive logging system with --log-level, --log-file, and --verbose options for better debugging and monitoring.
- Database Download Command: Added new database command for easy database downloading. Supports both HTTP and AWS S3 sources (use --aws_s3 flag for faster downloads). Use --cgc/--no-cgc to control CGC-related database downloads.
- Code Structure Improvements: Continued refactoring with object-oriented programming, improved modularity, and centralized configuration management.
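As an illustration of the database command options listed in this update, here is a small Python sketch that assembles the command line and runs it with `subprocess`. It uses only the flags named in this document (`--db_dir`, `--aws_s3`, `--cgc`, `--no-cgc`). The helper names `build_database_command` and `download_database` are hypothetical and are not part of run_dbcan.

```python
import subprocess


def build_database_command(db_dir="db", use_aws_s3=True, include_cgc=None):
    """Return the argument list for a `run_dbcan database` call.

    include_cgc=None leaves the tool's default in place; True adds --cgc
    and False adds --no-cgc.
    """
    command = ["run_dbcan", "database", "--db_dir", db_dir]
    if use_aws_s3:
        command.append("--aws_s3")
    if include_cgc is True:
        command.append("--cgc")
    elif include_cgc is False:
        command.append("--no-cgc")
    return command


def download_database(**options):
    """Run the assembled command; raises CalledProcessError on failure."""
    command = build_database_command(**options)
    subprocess.run(command, check=True)
    return command


# Example call (requires run_dbcan to be installed and on PATH):
# download_database(db_dir="db", use_aws_s3=True, include_cgc=False)
```

Building the argument list separately from running it makes it easy to print or log the exact command before any download starts.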
5/12/2025:
The dev-dbcan branch is used to test new functions and fix issues. After testing, it will be merged into the main branch, and the Docker, conda, and PyPI releases will be updated. If you want to use these beta functions, please replace the code folder (dbcan) in your current package with the one from that branch.
3/16/2025:
- Rewrote the structure of run_dbcan 4.0 (suggested by Haidong), using object-oriented programming (OOP) to improve maintainability and readability.
- Added a new function, cgc_circle, which visualizes CGCs in a genome.
Future plans
Add prediction of food consumption through CAZymes. If you have new suggestions, please contact Dr. Yanbin Yin (yyin@unl.edu), Xinpeng Zhang (xzhang55@huskers.unl.edu), or Dr. Haidong Yi (hyi@stjude.org).
Introduction
Notice
This is the updated version of run_dbcan 4.0. Many changes have been made; they are described at https://run-dbcan.readthedocs.io/en/latest/. From now on, this repo is the official run_dbcan site, and the run_dbcan 4.0 site will no longer be maintained.
run_dbcan is the standalone version of the dbCAN3 annotation tool for automated CAZyme annotation. It incorporates pyHMMER (replacing HMMER for better performance), Diamond, and dbCAN_sub for annotating CAZyme families, and integrates CAZyme Gene Clusters (CGCs) and substrate predictions.
Main Commands
The tool provides the following main commands:
- database - Download dbCAN databases (supports HTTP and AWS S3)
- CAZyme_annotation - Annotate CAZymes using Diamond, pyHMMER, and dbCAN-sub
- gff_process - Generate GFF files for CGC identification
- cgc_finder - Identify CAZyme Gene Clusters (CGCs)
- substrate_prediction - Predict substrate specificities of CGCs
- cgc_circle_plot - Generate circular plots for CGCs
- easy_CGC - Complete CGC analysis pipeline (annotation + GFF processing + CGC identification)
- easy_substrate - Complete CGC analysis with substrate prediction
- Pfam_null_cgc - Annotate null genes in CGCs using Pfam
All commands support global logging options: --log-level, --log-file, and --verbose.
For usage discussions, visit our issue tracker. To learn more, read the dbcan doc. If you're interested in contributing, whether through issues or pull requests, please review our contribution guide.
Reference
Please cite the following dbCAN publications if you use run_dbcan in your research:
dbCAN3: automated carbohydrate-active enzyme and substrate annotation
Jinfang Zheng, Qiwei Ge, Yuchen Yan, Xinpeng Zhang, Le Huang, Yanbin Yin
Nucleic Acids Research, 2023, gkad328, doi: 10.1093/nar/gkad328.
dbCAN2: a meta server for automated carbohydrate-active enzyme annotation
Han Zhang, Tanner Yohe, Le Huang, Sarah Entwistle, Peizhi Wu, Zhenglu Yang, Peter K Busk, Ying Xu, Yanbin Yin
Nucleic Acids Research, Volume 46, Issue W1, 2 July 2018, Pages W95–W101, doi: 10.1093/nar/gky418.
dbCAN-seq: a database of carbohydrate-active enzyme (CAZyme) sequence and annotation
Le Huang, Han Zhang, Peizhi Wu, Sarah Entwistle, Xueqiong Li, Tanner Yohe, Haidong Yi, Zhenglu Yang, Yanbin Yin
Nucleic Acids Research, Volume 46, Issue D1, 4 January 2018, Pages D516–D521, doi: 10.1093/nar/gkx894.
File details
Details for the file dbcan-5.2.7.tar.gz.
File metadata
- Download URL: dbcan-5.2.7.tar.gz
- Upload date:
- Size: 3.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 14e36bb82a088d9c8dfa0ab6c37f58247e30b8666480f6b5999ca8d9567dc9c0 |
| MD5 | 01fda5c709a339b5d523ca3d4cb069da |
| BLAKE2b-256 | 31c876fa92950f384ce43118990be71ce304af7813c7b85fe2b10db362e6c328 |
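To check a downloaded file against the published digest, a short sketch using Python's standard `hashlib` module is shown here. The SHA256 value is the one listed above for dbcan-5.2.7.tar.gz; the function names and the local file path in the usage comment are assumptions for illustration.

```python
import hashlib

# SHA256 digest published above for dbcan-5.2.7.tar.gz.
EXPECTED_SHA256 = "14e36bb82a088d9c8dfa0ab6c37f58247e30b8666480f6b5999ca8d9567dc9c0"


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True when the file's digest equals the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example call (assumes the archive was saved in the current directory):
# matches_published_hash("dbcan-5.2.7.tar.gz", EXPECTED_SHA256)
```

Reading in chunks keeps memory use flat regardless of file size. The same helper works for the wheel if you pass the wheel's own published digest.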
Provenance
The following attestation bundles were made for dbcan-5.2.7.tar.gz:
Publisher: pypi_release.yml on bcb-unl/run_dbcan
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: dbcan-5.2.7.tar.gz
  - Subject digest: 14e36bb82a088d9c8dfa0ab6c37f58247e30b8666480f6b5999ca8d9567dc9c0
- Sigstore transparency entry: 912482968
- Sigstore integration time:
- Permalink: bcb-unl/run_dbcan@8da2d8ebe09fa7196fbd61e939f2d9ea43bb8b34
- Branch / Tag: refs/tags/v5.2.7
- Owner: https://github.com/bcb-unl
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi_release.yml@8da2d8ebe09fa7196fbd61e939f2d9ea43bb8b34
- Trigger Event: release
File details
Details for the file dbcan-5.2.7-py3-none-any.whl.
File metadata
- Download URL: dbcan-5.2.7-py3-none-any.whl
- Upload date:
- Size: 148.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 372a80d0671310ae6cc125224ddb19b106eee8b6d4cd597d20bd89553bb9b4a8 |
| MD5 | 8ac8eca59d2cf8a1b6b3feb207c83489 |
| BLAKE2b-256 | 3652e2d6de5c5e7e089d88f1b30709b98571196d7883dfee907a5d1887415c32 |
Provenance
The following attestation bundles were made for dbcan-5.2.7-py3-none-any.whl:
Publisher: pypi_release.yml on bcb-unl/run_dbcan
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: dbcan-5.2.7-py3-none-any.whl
  - Subject digest: 372a80d0671310ae6cc125224ddb19b106eee8b6d4cd597d20bd89553bb9b4a8
- Sigstore transparency entry: 912482996
- Sigstore integration time:
- Permalink: bcb-unl/run_dbcan@8da2d8ebe09fa7196fbd61e939f2d9ea43bb8b34
- Branch / Tag: refs/tags/v5.2.7
- Owner: https://github.com/bcb-unl
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi_release.yml@8da2d8ebe09fa7196fbd61e939f2d9ea43bb8b34
- Trigger Event: release