BABACHI: Background Allelic Dosage Bayesian Checkpoint Identification
BABACHI is a tool for calling genomic regions of Background Allelic Dosage (BAD) from
non-phased heterozygous SNVs. It is aimed at estimating BAD from low-coverage sequencing data, where
precise estimation of allelic copy numbers is not possible.
BAD is the ratio of the major copy number to the minor copy number.
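As a small illustration of the definition above, the ratio can be computed from a pair of allelic copy numbers. This is only a sketch: the function name `bad_from_copy_numbers` is hypothetical and is not part of the BABACHI package, and the requirement that both copy numbers be at least 1 is an assumption based on the SNVs being heterozygous.

```python
from fractions import Fraction


def bad_from_copy_numbers(copies_a: int, copies_b: int) -> Fraction:
    """Return major copy number / minor copy number (hypothetical helper)."""
    # Assumption: a heterozygous site has at least one copy of each allele.
    if copies_a < 1 or copies_b < 1:
        raise ValueError("both allelic copy numbers must be at least 1")
    major = max(copies_a, copies_b)
    minor = min(copies_a, copies_b)
    return Fraction(major, minor)


print(bad_from_copy_numbers(1, 1))  # balanced alleles
print(bad_from_copy_numbers(2, 3))  # three copies of one allele, two of the other
```

Because the SNVs are non-phased, the order of the two arguments does not matter: the larger value is always treated as the major copy number.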
BABACHI takes a VCF-like .tsv file with heterozygous SNVs sorted by genomic position (ascending). The first 7 columns of the input file must be: chromosome, position, ID, reference base, alternative base, reference read count, alternative read count. All lines starting with # are ignored.
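The input layout described above can be checked before running the tool. The following is a minimal sketch using only the Python standard library; it is not BABACHI's own reader, the helper names `read_snvs` and `is_sorted_within_chromosomes` are hypothetical, and the two SNV records are invented purely for illustration.

```python
import csv
import io

COLUMNS = ["chr", "pos", "ID", "ref_base", "alt_base",
           "ref_read_count", "alt_read_count"]


def read_snvs(handle):
    """Parse the first 7 columns of every non-comment line (hypothetical helper)."""
    records = []
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        # Lines starting with "#" are ignored, as are empty lines.
        if not row or row[0].startswith("#"):
            continue
        if len(row) < len(COLUMNS):
            raise ValueError(f"expected at least 7 columns, got {len(row)}")
        record = dict(zip(COLUMNS, row[:7]))
        for key in ("pos", "ref_read_count", "alt_read_count"):
            record[key] = int(record[key])
        records.append(record)
    return records


def is_sorted_within_chromosomes(records):
    """Check that positions ascend between neighbouring SNVs of one chromosome."""
    for prev, cur in zip(records, records[1:]):
        if prev["chr"] == cur["chr"] and prev["pos"] > cur["pos"]:
            return False
    return True


# Invented example input; real files come from your own variant calling.
example = io.StringIO(
    "#chr\tpos\tID\tref\talt\tref_count\talt_count\n"
    "chr1\t1000\trs1\tA\tG\t12\t9\n"
    "chr1\t2500\trs2\tC\tT\t7\t15\n"
)
snvs = read_snvs(example)
print(len(snvs), is_sorted_within_chromosomes(snvs))
```

Any columns after the seventh are simply dropped in this sketch.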
The output is a .bed file with BAD annotations.
System Requirements
Hardware requirements
The BABACHI package requires only a standard computer with enough RAM to support the in-memory operations.
Software requirements
OS Requirements
The package can be installed on all major platforms (e.g. BSD, GNU/Linux, OS X, Windows) from the Python Package Index (PyPI) and GitHub. The package has been tested on the following systems:
- Windows: Windows 10
- Linux: Ubuntu 18.04
Python Dependencies
BABACHI mainly depends on the following Python 3 packages:
docopt>=0.6.2
numpy>=1.18.0
schema>=0.7.2
contextlib2>=0.5.5
pandas>=1.0.4
matplotlib>=3.2.1
seaborn>=0.10.1
Installation
Install from PyPI
pip3 install babachi
Install from GitHub
git clone https://github.com/autosome-ru/BABACHI
cd BABACHI
python3 setup.py install
Use sudo if required.
The package should take less than 1 minute to install.
Requirements
python >= 3.6
Usage
babachi <options>...
To see the full usage description, run:
babachi --help
This will produce the following message:
Usage:
babachi <file> [-O <path> |--output <path>] [-q | --quiet] [--allele_reads_tr <int>] [--force-sort] [--visualize] [--boundary-penalty <float>] [--states <string>]
babachi (--test) [-O <path> |--output <path>] [-q | --quiet] [--allele_reads_tr <int>] [--force-sort] [--visualize] [--boundary-penalty <float>]
babachi visualize <file> (-b <badmap>| --badmap <badmap>) [-q | --quiet] [--allele_reads_tr <int>]
babachi -h | --help
Arguments:
<file> Path to input file in tsv format with columns:
chr pos ID ref_base alt_base ref_read_count alt_read_count.
<badmap> Path to badmap .bed format file
<int> Non negative integer
<float> Non negative number
<states_string> String of states separated with "," (to provide fraction use "/", e.g. 4/3). Each state must be >= 1
Options:
-h, --help Show help.
-q, --quiet Less log messages during work time.
-b <badmap>, --badmap <badmap> Input badmap file
-O <path>, --output <path> Output directory or file path. [default: ./]
--allele_reads_tr <int> Allelic reads threshold. Input SNPs will be filtered by ref_read_count >= x and
alt_read_count >= x. [default: 5]
--force-sort Do chromosomes need to be sorted
--visualize Perform visualization of SNP-wise AD and BAD for each chromosome.
Will create a directory in output path for the .svg visualizations.
--boundary-penalty <float> Boundary penalty coefficient [default: 9]
--states <states_string> States string [default: 1,2,3,4,5,6,1.5,2.5]
--test Run segmentation on test file
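The format of the states string described in the help message (comma-separated values, "/" for fractions, each state at least 1) can be sketched as follows. This is an illustration of the format only, not BABACHI's actual parser; the function name `parse_states` is hypothetical, and returning the states in ascending order is a choice made for this example.

```python
from fractions import Fraction


def parse_states(states_string: str) -> list:
    """Turn a string such as "1,2,4/3,1.5" into a sorted list of floats."""
    states = []
    for token in states_string.split(","):
        # Fraction accepts integers ("2"), decimals ("1.5") and ratios ("4/3").
        value = Fraction(token.strip())
        if value < 1:
            raise ValueError(f"state {token!r} must be >= 1")
        states.append(float(value))
    return sorted(states)


print(parse_states("1,2,3,4,5,6,1.5,2.5"))  # the default from the help message
print(parse_states("1,4/3,2"))
```

On the command line, the same string would be passed with the --states option shown in the usage message.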
Demo
To perform a test run:
babachi --test
The test run takes approximately 2 minutes on a standard computer.
The result is a file named test.bed, written to the root directory of the project unless the -O option is used.
The contents of the test.bed file should start as follows:
#chr start end BAD Q1.00 Q1.33 Q1.50 Q2.00 Q2.50 Q3.00 Q4.00 Q5.00 Q6.00 SNP_count sum_cover
chr1 1 125183196 2 -63.47825919524621 -24.598710473939718 -8.145646624117944 -2.000888343900442e-11 -30.773041699645546 -78.80480783186977 -189.88685134708248 -299.82657588596703 -401.6012195141575 1325 17280
Each row represents a single segment with a constant estimated BAD. The columns are as follows:
- #chr: chromosome
- start: segment start position
- end: segment end position
- BAD: estimated BAD
- QX: the log-likelihood of the segment having BAD = X
- SNP_count: the number of SNPs in the segment
- sum_cover: the total read coverage of all SNPs in the segment
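The columns listed above can be read back with a few lines of Python. The sketch below parses the example header and row shown earlier and reports which QX column has the highest log-likelihood. The helper name `parse_segment` is hypothetical, and whitespace splitting is an assumption about the delimiter. In this example row the highest-scoring column matches the reported BAD; whether that holds for every segment is not stated here and should be treated as an assumption.

```python
HEADER = ("#chr start end BAD Q1.00 Q1.33 Q1.50 Q2.00 Q2.50 Q3.00 "
          "Q4.00 Q5.00 Q6.00 SNP_count sum_cover")
ROW = ("chr1 1 125183196 2 -63.47825919524621 -24.598710473939718 "
       "-8.145646624117944 -2.000888343900442e-11 -30.773041699645546 "
       "-78.80480783186977 -189.88685134708248 -299.82657588596703 "
       "-401.6012195141575 1325 17280")


def parse_segment(header: str, row: str) -> dict:
    """Map column names to typed values for one segment (hypothetical helper)."""
    names = header.lstrip("#").split()
    values = row.split()
    segment = {}
    for name, value in zip(names, values):
        if name == "chr":
            segment[name] = value
        elif name in ("start", "end", "SNP_count", "sum_cover"):
            segment[name] = int(value)
        else:  # BAD and the QX log-likelihood columns
            segment[name] = float(value)
    return segment


segment = parse_segment(HEADER, ROW)
# Collect the per-state log-likelihoods and pick the largest one.
log_likelihoods = {k: v for k, v in segment.items() if k.startswith("Q")}
best_column = max(log_likelihoods, key=log_likelihoods.get)
print(segment["BAD"], best_column)
```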
The BABACHI tool is maintained by Sergey Abramov and Alexandr Boytsov.