receptor_utils

Some tools I find useful for working with Ig receptor sequences.

Installation

pip install receptor_utils

The module requires Biopython.

Overview

The source files themselves contain slightly more detailed documentation.

simple_bio_seq

Contains convenience functions that are backed by Biopython but simplified for my use case. The module keeps things simple, at the expense of some flexibility and scalability, by following these conventions:

  • store sequences as strings, use dicts for collections
  • convert sequences to upper case on input
  • coerce iterators into lists for ease of debugging

from receptor_utils import simple_bio_seq as simple
seqs = simple.read_fasta('seqfile.fasta')  # read sequences into a dict with names as keys
seq = simple.read_single_fasta('seqfile.fasta')  # reads the first or only sequence into a string
seq = simple.reverse_complement(seq)

See the file for other functions.
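
The conventions listed above (strings for sequences, dicts for collections, upper case on input) can be illustrated with a small pure-Python sketch. The helpers below are hypothetical and are not the package's own code, which is backed by Biopython. Taking the first word of the header as the sequence name is an assumption made here for illustration.

```python
# Hypothetical helpers illustrating the conventions; NOT the package's implementation.

def parse_fasta_text(text):
    """Parse FASTA-formatted text into a dict of {name: upper-case sequence string}."""
    seqs = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith('>'):
            # Assumption: the name is the first word of the header line
            name = line[1:].split()[0]
            seqs[name] = ''
        elif name is not None:
            seqs[name] += line.upper()  # convert to upper case on input
    return seqs


_COMPLEMENT = str.maketrans('ACGTN', 'TGCAN')


def reverse_complement(seq):
    """Reverse-complement a nucleotide string (handles A, C, G, T and N only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


example = ">seq1 first record\nacgt\nTTGA\n>seq2\nggcc\n"
seqs = parse_fasta_text(example)
print(seqs)                               # {'seq1': 'ACGTTTGA', 'seq2': 'GGCC'}
print(reverse_complement(seqs['seq1']))   # TCAAACGT
```

Because everything is a plain string or dict, results can be inspected directly in a debugger or printed without any conversion.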

novel_allele_name

Contains the function name_novel(), which will generate a name for a 'previously undocumented' allele, given its sequence. The name will consist of the name of the nearest allele in a reference set provided to the function, suffixed by the SNPs that differentiate it, for example:

IGHV1-69*01_a29g_c113t

Numbering of V-sequences uses the IMGT alignment. The naming convention follows that used by TIgGER and VDJbase.
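
To make the naming scheme concrete, here is a minimal sketch of the idea: find the nearest reference allele, then append one suffix per differing base. It is not the code behind name_novel(). The function name_from_nearest, the reference names and the sequences are all made up for illustration, and the sketch assumes ungapped sequences of equal length numbered from 1, whereas the package numbers V-sequences by the IMGT alignment.

```python
# Hypothetical sketch of SNP-suffix naming; NOT the package's name_novel().

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def name_from_nearest(seq, refs):
    """Name seq after the nearest same-length reference, suffixed by its SNPs.

    Assumption: positions are simple 1-based offsets, not IMGT numbering.
    """
    seq = seq.upper()
    candidates = {n: r.upper() for n, r in refs.items() if len(r) == len(seq)}
    if not candidates:
        raise ValueError('no reference sequence has the same length as the query')

    nearest = min(candidates, key=lambda n: hamming(seq, candidates[n]))
    ref = candidates[nearest]

    # One suffix per difference: reference base, position, observed base
    snps = [
        f'{r.lower()}{pos}{s.lower()}'
        for pos, (r, s) in enumerate(zip(ref, seq), start=1)
        if r != s
    ]
    return '_'.join([nearest] + snps)


# Made-up reference set and query, for illustration only
refs = {'IGHV-EX*01': 'ACGTACGTAC', 'IGHV-EX*02': 'ACGTTTTTAC'}
print(name_from_nearest('ACGAACGTAT', refs))   # IGHV-EX*01_t4a_c10t
```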

number_ighv

Contains various functions for working with V-sequences according to the IMGT numbering scheme. The most useful is gap_sequence(), which gaps the provided V-sequence using the closest sequence in a reference set as a template.
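
The idea of gapping against a template can be sketched as follows. This is not gap_sequence() itself: the helper gap_like_template is hypothetical, the sequences are made up, and the sketch assumes the template has already been chosen and that the query has no insertions or deletions relative to it. IMGT-gapped sequences mark gaps with '.', which is the default used here.

```python
# Hypothetical sketch of template-based gapping; NOT the package's gap_sequence().

def gap_like_template(seq, gapped_template, gap_char='.'):
    """Insert gaps into seq at the positions where gapped_template has gaps.

    Assumption: seq has no indels relative to the (ungapped) template.
    """
    bases = iter(seq.upper())
    gapped = []
    for t in gapped_template:
        if t == gap_char:
            gapped.append(gap_char)      # copy the gap from the template
        else:
            base = next(bases, None)     # consume one base of the query
            if base is None:
                break                    # query is shorter than the template
            gapped.append(base)
    gapped.extend(bases)                 # keep any bases beyond the template
    return ''.join(gapped)


template = 'ACG...TTA.GC'                # made-up gapped reference
print(gap_like_template('acgtcagc', template))   # ACG...TCA.GC
```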

Example scripts

These may be useful in their own right, but they also show how to use some of the functions mentioned above. Once the package is installed, you should be able to run them at the command line without the .py extension. For example, to display help for extract_refs, type:

$ extract_refs --help

extract_refs

A script which uses simple_bio_seq to extract files for particular loci and species from an IMGT reference file.
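
As a rough illustration of this kind of filtering, the sketch below selects records by species and locus from a dict keyed by full FASTA header. It assumes the usual IMGT header layout, in which fields are separated by '|' with the allele name second and the species third. The function select_imgt_records, the accessions and the sequences are placeholders, and the script's actual arguments and behaviour may differ.

```python
# Hypothetical sketch of filtering IMGT-style records; NOT the extract_refs script.

def select_imgt_records(records, species, locus):
    """Return {allele_name: sequence} for records matching species and locus.

    records maps full FASTA headers to sequences. Assumption: headers are
    '|'-separated, with the allele name in field 2 and the species in field 3.
    """
    selected = {}
    for header, seq in records.items():
        fields = header.split('|')
        if len(fields) < 3:
            continue                     # not an IMGT-style header
        allele, rec_species = fields[1], fields[2]
        if rec_species == species and allele.startswith(locus):
            selected[allele] = seq.upper()
    return selected


# Placeholder accessions and sequences, for illustration only
records = {
    'ACC0001|IGHV1-18*01|Homo sapiens|F|V-REGION': 'caggtt',
    'ACC0002|IGKV1-5*01|Homo sapiens|F|V-REGION': 'gacatc',
    'ACC0003|IGHV1-4*01|Mus musculus|F|V-REGION': 'caggtc',
}
print(select_imgt_records(records, 'Homo sapiens', 'IGHV'))
# {'IGHV1-18*01': 'CAGGTT'}
```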

gap_inferred

A script which gaps a set of sequences listed in a FASTA file, using the closest sequences found in a reference set as templates.

identical_seqs

A script which uses simple_bio_seq to list identical sequences and sub-sequences in a FASTA file.
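
With sequences held as a dict of strings, this kind of comparison can be sketched in a few lines. The functions below are hypothetical and are not taken from the script. The pairwise sub-sequence check is quadratic in the number of sequences, which is assumed to be acceptable for small reference sets.

```python
# Hypothetical sketch of finding identical sequences and sub-sequences;
# NOT the identical_seqs script.
from collections import defaultdict


def find_identical(seqs):
    """Return lists of names that share exactly the same sequence."""
    by_seq = defaultdict(list)
    for name, seq in seqs.items():
        by_seq[seq.upper()].append(name)
    return [sorted(names) for names in by_seq.values() if len(names) > 1]


def find_subsequences(seqs):
    """Return (shorter_name, longer_name) pairs where one sequence contains the other."""
    pairs = []
    for short_name, short in seqs.items():
        for long_name, long_seq in seqs.items():
            if len(short) < len(long_seq) and short.upper() in long_seq.upper():
                pairs.append((short_name, long_name))
    return pairs


# Made-up sequences, for illustration only
seqs = {'a': 'ACGTAC', 'b': 'acgtac', 'c': 'CGTA', 'd': 'GGGG'}
print(find_identical(seqs))      # [['a', 'b']]
print(find_subsequences(seqs))   # [('c', 'a'), ('c', 'b')]
```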
