Package to load genes from GENCODE GTF files

GENCODEGenes

This package loads genes from GENCODE GTF/GFF files, groups transcripts by gene, and provides methods for transcripts, so you can find exon coordinates, CDS distances and sequences.

Install

pip install gencodegenes

Usage

from gencodegenes import Gencode

gencode = Gencode(GTF_PATH)  # GTF_PATH: path to a GENCODE GTF annotation file
# full function arguments are Gencode(gtf_path, fasta_path=None, coding_only=True)
#  - fasta_path: pass in path to fasta file to get gene transcripts with sequence
#  - coding_only: pass in False to include all transcripts, not just protein coding

# get gene by HGNC symbol
gene = gencode['OR5A1']
transcripts = gene.transcripts
canonical = gene.canonical  # canonical transcript, chosen in this order:
                            #  1. the MANE transcript, if one is tagged
                            #  2. the one tagged as appris_principal (the
                            #     longest CDS if multiple are tagged)
                            #  3. the longest protein coding transcript
                            #  4. the longest cDNA, if none are protein coding
gene.start, gene.end, gene.chrom, gene.strand, gene.symbol # other attributes available


# find gene nearest a genomic position, or overlapping a genomic region
gencode.nearest('chr1', 1000000)
gencode.in_region('chr1', 1000000, 2000000)

# transcripts provide a number of methods; below, pos is a genomic position
# and cds_pos is a position within the CDS
tx = gene.canonical
tx.in_exons(pos)                         # check if pos in exons
tx.in_coding_region(pos)                 # check if pos in CDS
tx.get_coding_distance(pos)              # get distance along the CDS from CDS start to pos
tx.get_closest_exon(pos)                 # find exon closest to position
tx.get_position_on_chrom(cds_pos)        # convert CDS pos to genomic pos
tx.get_codon_info(pos)                   # get info about codon for a site
tx.get_codon_number_for_cds_pos(cds_pos) # convert CDS pos to codon number
tx.translate(seq)                        # translate DNA to AA (if opened with Fasta)

# the transcript also has associated data fields
tx.name         # transcript ID
tx.chrom        # transcript chromosome
tx.start        # transcript start (TSS)
tx.end          # transcript end
tx.cds_start    # CDS start position
tx.cds_end      # CDS end position 
tx.type         # transcript type e.g. protein_coding
tx.strand       # strand (+ or -)
tx.exons        # list of exon coordinates
tx.cds          # list of CDS coordinates
tx.cds_sequence # get cDNA sequence (if Gencode was opened with fasta)
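
The selection order described for gene.canonical can be sketched in plain Python. This is an illustration only, not the package's implementation: the Tx class, its fields, and pick_canonical are hypothetical names. Matching tags by the prefixes 'MANE' and 'appris_principal' is an assumption, as is measuring "longest protein coding" by CDS length.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Tx:
    """Hypothetical transcript record (not the gencodegenes transcript class)."""
    name: str
    type: str                 # e.g. 'protein_coding'
    cdna_length: int          # summed exon length
    cds_length: int = 0       # summed CDS length, 0 if non-coding
    tags: List[str] = field(default_factory=list)


def has_tag(tx: Tx, prefix: str) -> bool:
    """True if any tag on the transcript starts with the given prefix."""
    return any(tag.startswith(prefix) for tag in tx.tags)


def pick_canonical(transcripts: List[Tx]) -> Optional[Tx]:
    """Pick one transcript using the priority order from the usage notes."""
    if not transcripts:
        return None
    # 1. a MANE-tagged transcript takes priority
    mane = [t for t in transcripts if has_tag(t, 'MANE')]
    if mane:
        return mane[0]
    # 2. otherwise an appris_principal transcript, longest CDS if several
    principal = [t for t in transcripts if has_tag(t, 'appris_principal')]
    if principal:
        return max(principal, key=lambda t: t.cds_length)
    # 3. otherwise the longest protein coding transcript
    #    (assumption: "longest" is measured by CDS length)
    coding = [t for t in transcripts if t.type == 'protein_coding']
    if coding:
        return max(coding, key=lambda t: t.cds_length)
    # 4. otherwise the transcript with the longest cDNA
    return max(transcripts, key=lambda t: t.cdna_length)
```

Each rule is only consulted when the previous one matches nothing, so a MANE-tagged transcript is chosen even when another transcript has a longer CDS.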
