
Package for loading data from bgen files


Another bgen reader


This is a package for reading bgen files.

This package uses Cython to wrap C++ code for parsing bgen files. It is reasonably fast: it can parse genotypes from 500,000 individuals at more than 100 variants per second within Python.

This package has primarily been designed around UK Biobank bgen files (i.e. bgen version 1.2 with zlib-compressed genotype probabilities), but the other versions and compression schemes have also been tested using example bgen files.

Install

pip install bgen

Usage

from bgen import BgenFile

# BGEN_PATH is the path to a bgen file; the sample file path is optional
bfile = BgenFile(BGEN_PATH, SAMPLE_PATH=None)
rsids = bfile.rsids()

# select a variant by indexing
var = bfile[1000]

# pull out genotype probabilities
probs = var.probabilities()  # returns 2D numpy array
dosage = var.alt_dosage()  # requires biallelic variant, returns numpy array

# exclude variants from analyses by passing in indices
to_drop = [1, 3, 500]
bfile.drop_variants(to_drop)

# pickle variants for easy message passing
import pickle
dumped = pickle.dumps(var)
var = pickle.loads(dumped)

# iterate through every variant in the file, without preloading every variant
with BgenFile(BGEN_PATH, SAMPLE_PATH=None, delay_parsing=True) as bfile:
    for var in bfile:
        probs = var.probabilities()
        dosage = var.alt_dosage()
        ploidy = var.ploidy
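
The arrays returned above can be summarised without any further bgen-specific code. The following is a minimal sketch, not part of this package: the helper names `alt_dosage_from_probs` and `alt_allele_frequency` are hypothetical. It assumes a biallelic, diploid, unphased variant whose probability columns are ordered as homozygous first allele, heterozygous, homozygous second allele, and it assumes missing values appear as NaN. The input data is synthetic.

```python
import math


def alt_dosage_from_probs(probs):
    """Expected alt-allele count per sample.

    Assumes each row is [P(hom first allele), P(het), P(hom second allele)].
    """
    return [p_het + 2.0 * p_hom_alt for _, p_het, p_hom_alt in probs]


def alt_allele_frequency(dosages, ploidy=2):
    """Mean alt-allele dosage divided by ploidy, skipping NaN entries."""
    observed = [float(d) for d in dosages if not math.isnan(d)]
    if not observed:
        return float("nan")
    return sum(observed) / (ploidy * len(observed))


# synthetic genotype probabilities for three diploid samples (not real data)
example_probs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
example_dosage = alt_dosage_from_probs(example_probs)
example_freq = alt_allele_frequency(example_dosage)
```

Because both helpers only iterate over their input, the rows of `var.probabilities()` or the values of `var.alt_dosage()` could be passed in place of the synthetic lists, provided the assumptions above hold for the variant in question.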
