# AbStar

VDJ assignment and antibody sequence annotation. Scalable from a single sequence to billions of sequences.

### install

`pip install abstar`

### use

To run AbStar on a single FASTA or FASTQ file:

`abstar -i <input-file> -o <output-directory> -t <temp-directory>`

To iteratively run AbStar on all files in an input directory:

`abstar -i <input-directory> -o <output-directory> -t <temp-directory>`
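For scripted pipelines, the invocations above can be wrapped from Python. The following is a minimal sketch and not part of AbStar: `build_abstar_command` and `run_abstar` are hypothetical helpers that only assemble the `-i`, `-o`, and `-t` flags shown above, and the paths passed to them are placeholders.

```python
import subprocess


def build_abstar_command(input_path, output_dir, temp_dir):
    """Assemble the abstar command line from the three documented flags.

    Hypothetical helper for illustration; it is not part of AbStar.
    """
    return [
        "abstar",
        "-i", str(input_path),   # a single FASTA/FASTQ file or an input directory
        "-o", str(output_dir),
        "-t", str(temp_dir),
    ]


def run_abstar(input_path, output_dir, temp_dir):
    """Run abstar; raises CalledProcessError on a non-zero exit status."""
    subprocess.check_call(build_abstar_command(input_path, output_dir, temp_dir))
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.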

### additional options

`-l, --log` Change the log file location. Default is `<output_directory>/mongo.log`.

`-m, --merge` Input directory should contain paired FASTQ (or gzipped FASTQ) files. Paired files will be merged with PANDAseq prior to processing with AbAnalysis.

`-b, --basespace` Download a sequencing run from BaseSpace, which is Illumina’s cloud storage environment. Since Illumina sequencers produce paired-end reads, `--merge` is also set.

`-u N, --uaid N` Sequences contain a unique antibody ID (UAID, or molecular barcode) of length N. The UAID will be parsed from the beginning of each input sequence and added to the JSON output. Negative values result in the UAID being parsed from the end of the sequence.

`-s, --species` Select the species from which the input sequences are derived. Supported options are ‘human’, ‘mouse’, and ‘macaque’. Default is ‘human’.

`-c, --cluster` Runs AbStar in distributed mode on a Celery cluster.

`-h, --help` Prints detailed information about all runtime options.

`-D, --debug` Much more verbose logging.
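The sign convention for the UAID length N can be illustrated with a short sketch. This is not AbStar's actual implementation: `split_uaid` is a hypothetical function, and returning the remaining sequence separately from the UAID is an assumption made here for illustration.

```python
def split_uaid(sequence, n):
    """Split a sequence into (uaid, remainder).

    Positive n: the UAID is the first n characters.
    Negative n: the UAID is the last abs(n) characters.
    n == 0: no UAID is parsed.
    """
    if n > 0:
        return sequence[:n], sequence[n:]
    if n < 0:
        return sequence[n:], sequence[:n]
    return "", sequence


# UAID of length 8 at the start of the read
uaid_start, read_start = split_uaid("ACGTACGTTTGGCC", 8)

# UAID of length 4 at the end of the read
uaid_end, read_end = split_uaid("ACGTACGTTTGGCC", -4)
```

With these inputs, `uaid_start` is `ACGTACGT` and `uaid_end` is `GGCC`.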

### helper scripts

Two helper scripts are included:

`batch_mongoimport` automates the import of multiple JSON output files into a MongoDB database.

`make_basespace_credfile` makes a credentials file for BaseSpace, which is required if downloading sequences from BaseSpace with AbStar. Developer credentials are required, and the process for obtaining them is explained here: https://support.basespace.illumina.com/knowledgebase/articles/403618-python-run-downloader
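To give a sense of what batch importing JSON output into MongoDB involves, here is a rough sketch that builds one `mongoimport` command per JSON file in a directory. It is not the code of `batch_mongoimport`: the function name, the database name, and the choice to name each collection after its source file are all assumptions.

```python
import glob
import os


def build_mongoimport_commands(json_dir, database):
    """Return one mongoimport argument list per .json file in json_dir.

    Assumption: each collection is named after the file it came from.
    """
    commands = []
    for path in sorted(glob.glob(os.path.join(json_dir, "*.json"))):
        collection = os.path.splitext(os.path.basename(path))[0]
        commands.append([
            "mongoimport",
            "--db", database,
            "--collection", collection,
            "--file", path,
        ])
    return commands
```

Each returned list can be passed to `subprocess.check_call` once a MongoDB server is running. Whether additional `mongoimport` options are needed depends on how the JSON output files are laid out, which this sketch does not assume.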

### requirements

- Python 2.7 (3.x probably doesn’t work, but hasn’t been tested)
- abtools
- biopython
- celery
- pymongo
- scikit-bio

All of the above dependencies can be installed with pip, and will be installed automatically when installing AbStar with pip. If you’re new to Python, a great way to get started is to install the Anaconda Python distribution (https://www.continuum.io/downloads), which includes pip as well as a ton of useful scientific Python packages.

Some features have additional requirements that are not installed by pip:

- Sequence merging requires PANDAseq (https://github.com/neufeld/pandaseq)
- `batch_mongoimport` requires MongoDB (http://www.mongodb.org/)
- BaseSpace downloading requires the BaseSpace Python SDK (https://github.com/basespace/basespace-python-sdk)
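Because PANDAseq and MongoDB are installed separately, it can help to check for them before starting a run. A minimal sketch, assuming the executables are named `pandaseq` and `mongoimport` and are expected on the `PATH`; `missing_tools` is a hypothetical helper, and the BaseSpace Python SDK is left out because it is a Python package rather than a command-line tool.

```python
try:
    from shutil import which  # Python 3.3+
except ImportError:
    from distutils.spawn import find_executable as which  # Python 2.7

# Assumed executable names for the optional external tools.
OPTIONAL_TOOLS = ["pandaseq", "mongoimport"]


def missing_tools(tool_names):
    """Return the names from tool_names that are not found on the PATH."""
    return [name for name in tool_names if which(name) is None]


missing = missing_tools(OPTIONAL_TOOLS)
if missing:
    print("Not found on PATH: " + ", ".join(missing))
```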
