convert between bioinformatics formats

Bioconvert

Bioconvert is a collaborative project to facilitate the interconversion of life science data from one format to another.

Contributions: want to add a converter? Please join https://github.com/bioconvert/bioconvert/issues/1

Overview

Life science uses many different data formats. Some are old or have complex syntax, so converting between them can be a challenge. Bioconvert aims to provide a common tool and interface for converting life science data from one format to another.

Many conversion tools already exist, but they may be dispersed, focused on a few specific formats, difficult to install, or not optimised. With Bioconvert, we plan to cover a wide spectrum of format conversions; we re-use existing tools when possible and provide facilities to compare different conversion tools or methods via benchmarking. New implementations are provided when they are considered better than existing ones.

As of January 2023, 50 formats and 100 direct conversions were available.

https://raw.githubusercontent.com/bioconvert/bioconvert/main/doc/conversion.png

Installation

Bioconvert is developed in Python. Please use conda or any Python environment manager to install Bioconvert using the pip command:

pip install bioconvert

50% of the conversions should work out of the box. However, many conversions require external tools, which is why we recommend using a conda environment: most external tools are available on the bioconda channel. For instance, to convert a SAM file to a BAM file, you need to install samtools as follows:

conda install -c bioconda samtools
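Because many conversions rely on external programs, it can help to check that they are on the PATH before starting a conversion. The following is a minimal sketch using only the Python standard library; the helper name `missing_tools` is hypothetical and is not part of Bioconvert.

```python
import shutil


def missing_tools(tools):
    """Return the subset of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # samtools is the external tool needed for the SAM to BAM example above.
    missing = missing_tools(["samtools"])
    if missing:
        print("Missing external tools:", ", ".join(missing))
    else:
        print("All external tools found.")
```

If a tool is reported missing, installing it from bioconda as shown above should make it available in the active environment.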

Since Bioconvert is available on bioconda, one solution that installs Bioconvert and all its dependencies is to use conda/mamba:

conda create --name bioconvert mamba
conda activate bioconvert
mamba install bioconvert
bioconvert --help

See the Installation section for more details and alternative solutions (docker, singularity).

Quick Start

There are many conversions available. Type:

bioconvert --help

to get a list of valid conversion methods. Taking the example of a conversion from a FastQ file into a FastA file, you could do the conversion as follows:

bioconvert fastq2fasta input.fastq output.fasta
bioconvert fastq2fasta input.fq    output.fasta
bioconvert fastq2fasta input.fq.gz output.fasta.gz
bioconvert fastq2fasta input.fq.gz output.fasta.bz2
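To make concrete what such a conversion involves, here is a minimal pure-Python sketch of a FastQ to FastA conversion that picks gzip or bzip2 handling from the file extension. It is an illustration only, not Bioconvert's implementation: the function names are hypothetical, and it assumes strictly four-line FastQ records.

```python
import bz2
import gzip


def open_by_extension(path, mode):
    """Open a plain, gzip (.gz) or bzip2 (.bz2) file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, mode + "t")
    if path.endswith(".bz2"):
        return bz2.open(path, mode + "t")
    return open(path, mode)


def fastq_to_fasta(infile, outfile):
    """Write one FastA record per four-line FastQ record; return the record count."""
    count = 0
    with open_by_extension(infile, "r") as fin, open_by_extension(outfile, "w") as fout:
        while True:
            header = fin.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # skip stray blank lines
            sequence = fin.readline()
            fin.readline()  # '+' separator line, ignored
            fin.readline()  # quality line, ignored
            # Replace the leading '@' of the FastQ header with '>'.
            fout.write(">" + header[1:].rstrip("\n") + "\n")
            fout.write(sequence.rstrip("\n") + "\n")
            count += 1
    return count
```

For example, `fastq_to_fasta("input.fq.gz", "output.fasta.bz2")` would read a gzip-compressed input and write a bzip2-compressed output, mirroring the last command above.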

When there is no ambiguity, you can be implicit:

bioconvert input.fastq output.fasta
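One way to picture the implicit mode is that the converter name can be derived from the two file extensions. The sketch below illustrates that idea under stated assumptions and is not Bioconvert's actual resolution logic: the `fq` alias and the list of compression suffixes are taken from the examples above, and the function names are hypothetical.

```python
from pathlib import Path

# Compression suffixes to skip when looking for the format extension.
COMPRESSION = {".gz", ".bz2"}
# Hypothetical alias table mapping an alternative extension to a format name.
ALIASES = {"fq": "fastq"}


def format_of(filename):
    """Return the format name implied by a filename, ignoring compression suffixes."""
    suffixes = [s.lower() for s in Path(filename).suffixes]
    while suffixes and suffixes[-1] in COMPRESSION:
        suffixes.pop()
    if not suffixes:
        raise ValueError(f"cannot infer a format from {filename!r}")
    ext = suffixes[-1].lstrip(".")
    return ALIASES.get(ext, ext)


def guess_converter(infile, outfile):
    """Build a converter name such as 'fastq2fasta' from two filenames."""
    return f"{format_of(infile)}2{format_of(outfile)}"


print(guess_converter("input.fastq", "output.fasta"))
print(guess_converter("input.fq.gz", "output.fasta.bz2"))
```

When an extension is shared by several formats, this kind of inference is ambiguous, which is when naming the converter explicitly is needed.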

The default method of conversion is used, but you may use another one. Check out the available methods with:

bioconvert fastq2fasta --show-methods

For more help about a conversion, just type:

bioconvert fastq2fasta --help

and more generally:

bioconvert --help

You may also call Bioconvert from a Python shell:

# Import a converter
from bioconvert.fastq2fasta import FASTQ2FASTA

# Input and output file names
infile = "input.fastq"
outfile = "output.fasta"

# Instantiate with infile/outfile names
convert = FASTQ2FASTA(infile, outfile)

# Run the conversion itself
convert()
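When the converter is only known at run time, the class can be looked up by name. The sketch below assumes that every converter follows the pattern shown above for fastq2fasta, that is, a module `bioconvert.<name>` exposing a class named `<NAME>` in upper case; this is an assumption to verify for the converter you need, and the helper names are hypothetical.

```python
import importlib


def converter_location(name):
    """Map 'fastq2fasta' to ('bioconvert.fastq2fasta', 'FASTQ2FASTA')."""
    return f"bioconvert.{name}", name.upper()


def run_conversion(name, infile, outfile):
    """Import the converter class for `name` and run it on the given files."""
    module_name, class_name = converter_location(name)
    module = importlib.import_module(module_name)  # requires bioconvert to be installed
    converter = getattr(module, class_name)(infile, outfile)
    converter()  # same call as in the example above
    return outfile
```

With Bioconvert installed, `run_conversion("fastq2fasta", "input.fastq", "output.fasta")` would be equivalent to the explicit import shown above.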

Available Converters

Conversion table

Each converter has its own CI workflow; its status badge is at https://github.com/bioconvert/bioconvert/actions/workflows/<converter>.yml/badge.svg (replace <converter> with the name in the first column).

Converter          Default method
abi2fasta          BIOPYTHON
abi2fastq          BIOPYTHON
abi2qual           BIOPYTHON
bam2bedgraph       BEDTOOLS
bam2bigwig         DEEPTOOLS
bam2cov            BEDTOOLS
bam2cram           SAMTOOLS
bam2fasta          SAMTOOLS
bam2fastq          SAMTOOLS
bam2json           BAMTOOLS
bam2sam            SAMBAMBA
bam2tsv            SAMTOOLS
bam2wiggle         WIGGLETOOLS
bcf2vcf            BCFTOOLS
bcf2wiggle         WIGGLETOOLS
bed2wiggle         WIGGLETOOLS
bedgraph2bigwig    UCSC
bedgraph2cov       BIOCONVERT
bedgraph2wiggle    WIGGLETOOLS
bigbed2bed         DEEPTOOLS
bigbed2wiggle      WIGGLETOOLS
bigwig2bedgraph    DEEPTOOLS
bigwig2wiggle      WIGGLETOOLS
bplink2plink       PLINK
bplink2vcf         PLINK
bz22gz             Unix commands
clustal2fasta      BIOPYTHON
clustal2nexus      GOALIGN
clustal2phylip     BIOPYTHON
clustal2stockholm  BIOPYTHON
cram2bam           SAMTOOLS
cram2fasta         SAMTOOLS
cram2fastq         SAMTOOLS
cram2sam           SAMTOOLS
csv2tsv            BIOCONVERT
csv2xls            Pandas
dsrc2gz            DSRC software
embl2fasta         BIOPYTHON
embl2genbank       BIOPYTHON
fasta2clustal      BIOPYTHON
fasta2faa          BIOCONVERT
fasta2fasta_agp    BIOCONVERT
fasta2fastq        PYSAM
fasta2genbank      BIOCONVERT
fasta2nexus        GOALIGN
fasta2phylip       BIOPYTHON
fasta2twobit       UCSC
fasta_qual2fastq   PYSAM
fastq2fasta        BIOCONVERT available
fastq2fasta_qual   BIOCONVERT
fastq2qual         READFQ
genbank2embl       BIOPYTHON
genbank2fasta      BIOPYTHON
genbank2gff3       BIOCODE
gfa2fasta          BIOCONVERT
gff22gff3          BIOCONVERT
gff32gff2          BIOCONVERT
gff32gtf           BIOCONVERT
gz2bz2             pigz/pbzip2 software
gz2dsrc            DSRC software
json2yaml          Python
maf2sam            BIOCONVERT
newick2nexus       GOTREE
newick2phyloxml    GOTREE
nexus2clustal      GOALIGN
nexus2fasta        BIOPYTHON
nexus2newick       GOTREE
nexus2phylip       GOALIGN
nexus2phyloxml     GOTREE
ods2csv            pyexcel library
pdb2faa            BIOCONVERT
phylip2clustal     BIOPYTHON
phylip2fasta       BIOPYTHON
phylip2nexus       GOALIGN
phylip2stockholm   BIOPYTHON
phylip2xmfa        BIOPYTHON
phyloxml2newick    GOTREE
phyloxml2nexus     GOTREE
plink2bplink       PLINK
plink2vcf          PLINK
sam2bam            SAMTOOLS
sam2cram           SAMTOOLS
sam2paf            BIOCONVERT
scf2fasta          BIOCONVERT
scf2fastq          BIOCONVERT
sra2fastq          FASTQDUMP
stockholm2clustal  BIOPYTHON
stockholm2phylip   BIOPYTHON
tsv2csv            BIOCONVERT
twobit2fasta       DEEPTOOLS
vcf2bcf            BCFTOOLS
vcf2bed            BIOCONVERT
vcf2bplink         PLINK
vcf2plink          PLINK
vcf2wiggle         WIGGLETOOLS
wig2bed            BEDOPS
xls2csv
xlsx2csv           Pandas library
xmfa2phylip        BIOPYTHON
yaml2json          Pandas library

Contributors

Setting up and maintaining Bioconvert has been possible thanks to users and contributors. Thanks to all:

https://contrib.rocks/image?repo=bioconvert/bioconvert

Changes

Version

Description

1.1.1

  • Fix benchmark labels.

  • NEW: fast52pod5 conversion

  • FIX: set goalign and gotree instead of go requirements

1.1.0

  • Implement ability to benchmark CPU and memory usage, not just time

1.0.0

0.6.3

  • add picard method in bam2sam

  • Fixed all CI workflows to use mamba

  • drop python3.7 support and add 3.10 support

  • update bedops test file to fit the latest bedops 2.4.41 version

  • revisit logging system

0.6.2

  • added gff3 to gtf conversion.

  • Added pdb to faa conversion

  • Added missing --reference argument to the cram2sam conversion

0.6.1

  • output file can be in sub-directories allowing syntax such as ‘bioconvert fastq2fasta test.fastq outputs/test.fasta’

  • fix all CI actions

  • add more examples as notebooks in ./examples

  • add a Snakefile for the paper in ./doc/Snakefile_paper

0.6.0

  • Fix bug in bam2sam (method sambamba)

  • Fix graph layout

  • add threading in fastq2fasta (seqkit method)

  • multibenchmark feature added

  • stable version used for web interface

0.5.2

  • Update requirements and environment.yml and add a conda spec-file.txt file

0.5.1

  • add genbank2gff3 requirement material in bioconvert.utils.biocode

0.5.0

  • Add CI actions for all converters

  • remove sniffer (now in biosniff on pypi https://pypi.org/project/biosniff/)

  • A complete benchmarking suite (see doc/Snakefile_benchmark file and benchmarking)

  • documentation and tests for all converters

  • removed the validators (we assume inputs are correct)

0.4.X

  • (Aug 2019) added nexus2fasta, cram2fasta, fasta2faa … ; 1-to-many and many-to-one converters are now part of the API.

0.3.X

  • (May 2019) new methods abi2qual, bigbed2bed, etc.; added --threads option

0.2.X

  • (Aug 2018) abi2fastx, bioconvert_stats tool added

0.1.X

  • major refactoring to have subcommands with implicit/explicit mode
