A C implementation of a mesh-based atomic pairwise distance computing engine, with docking pose generation capabilities and fast solvent accessible surface estimation

A Python package and C Library for fast molecular contact map computation

Current Version 2.1.3

This package was designed as a tool to quickly compute thousands of sets of atomic or residue molecular contacts. The contacts can be evaluated inside a single body or across two bodies. The library scales well with native Python multithreading. The module also supports docking pose evaluation by applying triplets of Euler angles and translation vectors to initial unbound conformations.

Installing and using the python module

Installation

Should be as simple as pip install ccmap. Alternatively, you can clone this repo and run python setup.py install at the root folder. The current release was successfully installed through pip on the following interpreter/platform combinations.

  • python3.8/OSX.10.14.6
  • python3.8/Ubuntu LTS

Usage

From there you can load the package and display its help.

import ccmap
help(ccmap)

Functions

Four functions are available:

  • cmap: computes the contacts of one single/two body molecule
  • lcmap: computes the contacts of a list of single/two body molecules
  • zmap: computes the contacts between a receptor and a ligand molecule after applying transformations to the ligand coordinates
  • lzmap: computes many sets of contacts between a receptor and a ligand molecule, one for each applied ligand transformation

Parameters

All module functions take molecular object coordinates as dictionaries, where keys are atom descriptors and values are lists.

  • 'x' : list of float x coordinates
  • 'y' : list of float y coordinates
  • 'z' : list of float z coordinates
  • 'seqRes' : list of strings
  • 'chainID' : list of one-letter string
  • 'resName' : list of strings
  • 'name' : list of strings
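
As an illustration of the layout listed above, such a dictionary can also be assembled by hand. This is a minimal sketch: the helper name make_body and the atom records are hypothetical, and residue numbers are stored as strings only because 'seqRes' is described above as a list of strings.

```python
# Hypothetical helper: build a ccmap-style coordinate dictionary from atom records.
KEYS = ("x", "y", "z", "seqRes", "chainID", "resName", "name")

def make_body(atoms):
    """atoms: iterable of (x, y, z, seqRes, chainID, resName, name) tuples."""
    body = {key: [] for key in KEYS}
    for record in atoms:
        if len(record) != len(KEYS):
            raise ValueError("each atom record needs %d fields" % len(KEYS))
        # Append each field to the list stored under its key
        for key, value in zip(KEYS, record):
            body[key].append(value)
    return body

# Two made-up atoms of one alanine residue (illustrative values only)
toy_body = make_body([
    (1.0, 2.0, 3.0, "1", "A", "ALA", "N"),
    (2.5, 2.0, 3.0, "1", "A", "ALA", "CA"),
])
```

All seven lists must have the same length, one entry per atom.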

Additional arguments

Contact threshold distance

In Angstrom units, its default value is 4.5. It can be redefined with the named parameter d.

encode : Boolean

If True, contacts are returned as integers, each integer encoding one pair of atom/residue positions in contact. An integer k is decoded into a pair of positions with this simple formula:

def K2IJ(k, sizeBody1, sizeBody2):
    # Number of columns: size of the second body, or of the first for a single-body map
    nCol = sizeBody2 if sizeBody2 else sizeBody1
    return k // nCol, k % nCol

If False, contacts are returned as strings of JSON objects.
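
The decoding rule can be checked on a small case. The sketch below restates K2IJ with integer division and adds a hypothetical inverse, IJ2K, which is derived from the formula and is not part of the ccmap API.

```python
def K2IJ(k, sizeBody1, sizeBody2=None):
    # The number of columns is the size of the second body,
    # or of the first body for a single-body map.
    nCol = sizeBody2 if sizeBody2 else sizeBody1
    return k // nCol, k % nCol

def IJ2K(i, j, sizeBody1, sizeBody2=None):
    # Hypothetical inverse of K2IJ (not provided by ccmap)
    nCol = sizeBody2 if sizeBody2 else sizeBody1
    return i * nCol + j

# Two-body map with 100 and 40 positions: index 85 maps to row 2, column 5
print(K2IJ(85, 100, 40))  # (2, 5)
```

The first index refers to a position in the first body, the second to a position in the second body (or in the same body for a single-body map).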

atomic : Boolean

If True, compute contacts at the atomic level. By default, this is False and the contacts are computed at the residue level.

apply : Boolean

If True, the passed dictionaries of coordinates will be modified according to the Euler/translation parameters. This is useful to generate a single docking conformation. This argument is only available for the cmap function.

offsetRec and offsetLig

When working with protein docking data, unbound conformations are often centered on the origin of the coordinate system. Specify the translation vectors for each body with the offsetRec and offsetLig named arguments. Only available for the zmap and lzmap functions.
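
When the offsets are not supplied by the docking program, one way to obtain a centering translation is to negate the barycenter of the coordinates. This is a minimal sketch under that assumption: the helper centering_offset is hypothetical, it uses an unweighted barycenter, and whether this matches the centering used to produce a given set of docking poses must be checked against the docking software.

```python
def centering_offset(body):
    """Return the translation moving the unweighted barycenter of `body` to the origin."""
    n = len(body["x"])
    if n == 0:
        raise ValueError("empty coordinate set")
    # Negated mean of each coordinate list
    return [-sum(body[axis]) / n for axis in ("x", "y", "z")]

# Toy coordinates (illustrative values only)
toy = {"x": [1.0, 3.0], "y": [0.0, 4.0], "z": [-2.0, -2.0]}
offset = centering_offset(toy)
print(offset)  # [-2.0, -2.0, 2.0]
```

The resulting vectors could then be passed as offsetRec and offsetLig.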

Working with PDB coordinates files

Parsing coordinate data

We usually work with molecules in the PDB format. We can use the pyproteinsExt package to handle the boilerplate.

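# The module path below is an assumption; adjust it to your pyproteinsExt install
import pyproteinsExt.structure.coordinates as PDB
parser = PDB.Parser()
pdbREC = parser.load(file="dummy_A.pdb")
pdbDictREC = pdbREC.atomDictorize
pdbDictREC.keys()
#dict_keys(['x', 'y', 'z', 'seqRes', 'chainID', 'resName', 'name'])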

By convention, the following examples use two molecules named REC(eptor) and LIG(and).

pdbLIG = parser.load(file="dummy_B.pdb")
pdbDictLIG = pdbLIG.atomDictorize
pdbDictLIG.keys()
#dict_keys(['x', 'y', 'z', 'seqRes', 'chainID', 'resName', 'name'])

Examples

Computing single body contact map

Computing one map

Setting contact distance of 6.0 and recovering residue-residue contact as an integer list.

ccmap.cmap(pdbDictLIG, d=6.0, encode=True)

Computing many maps

Using default contact distance and recovering atomic contact maps as JSON object string. The first positional argument specifies a list of bodies to process independently.

import json
json.loads( ccmap.lcmap([ pdbDictLIG, pdbDictREC ], atomic=True) )

Computing two-body contact map

Straight computation of one map

The second positional argument of cmap is optional and defines the second body.

ccmap.cmap(pdbDictREC, pdbDictLIG, d=6.0, encode=True)

Straight computation of many maps

The second positional argument of lcmap is an optional list of second bodies. The first two arguments must be of the same size, as the i-th element of the first will be processed with the i-th element of the second.

ccmap.lcmap([pdbDictREC_1, ..., pdbDictREC_n], [pdbDictLIG_1, ..., pdbDictLIG_n], d=6.0, encode=True)

Computation of one map after conformational change

Use the zmap function with the third and fourth positional arguments specifying, respectively:

  • the Euler angles triplet
  • the translation vector

ccmap.zmap(pdbDictREC, pdbDictLIG, (e1, e2, e3), (t1, t2, t3))

Transformations are always applied to the coordinates provided as the second argument, i.e. pdbDictLIG.
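
To make the idea of an Euler angles/translation transformation concrete, here is a pure-Python sketch that rotates and then translates a coordinate dictionary. It is not the ccmap implementation: the rotation convention (R = Rz(e3)·Ry(e2)·Rx(e1), angles in radians) and the helper names rotation_matrix and transform are assumptions, since the convention used by ccmap is not stated here.

```python
import math

def rotation_matrix(e1, e2, e3):
    """R = Rz(e3) * Ry(e2) * Rx(e1), angles in radians (assumed convention)."""
    c1, s1 = math.cos(e1), math.sin(e1)
    c2, s2 = math.cos(e2), math.sin(e2)
    c3, s3 = math.cos(e3), math.sin(e3)
    return [
        [c3 * c2, c3 * s2 * s1 - s3 * c1, c3 * s2 * c1 + s3 * s1],
        [s3 * c2, s3 * s2 * s1 + c3 * c1, s3 * s2 * c1 - c3 * s1],
        [-s2, c2 * s1, c2 * c1],
    ]

def transform(body, euler, translation):
    """Return a copy of `body` with each atom rotated, then translated."""
    R = rotation_matrix(*euler)
    new_body = {key: list(values) for key, values in body.items()}
    for n, point in enumerate(zip(body["x"], body["y"], body["z"])):
        for axis, row, shift in zip("xyz", R, translation):
            rotated = sum(r * p for r, p in zip(row, point))
            new_body[axis][n] = rotated + shift
    return new_body

# One atom on the x axis, rotated by 90 degrees around z, then shifted by (1, 1, 1)
toy = {"x": [1.0], "y": [0.0], "z": [0.0], "name": ["CA"]}
moved = transform(toy, (0.0, 0.0, math.pi / 2), (1.0, 1.0, 1.0))
```

The input dictionary is left untouched; only the 'x', 'y' and 'z' lists of the copy change.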

Computation of many maps after conformational changes

Use the lzmap function. Its arguments are similar, except that the Euler angles and translation vectors must be supplied as lists.

ccmap.lzmap(pdbDictREC, pdbDictLIG , [(e1, e2, e3),], [(t1, t2, t3),] )

Generating docking conformations

The conformations obtained by coordinate transformation can be mapped back to PDB files. Here, the offset vectors [u1, u2, u3] and [v1, v2, v3] respectively center pdbDictREC and pdbDictLIG, and one transformation defined by the [e1, e2, e3] Euler angles and the [t1, t2, t3] translation vector is applied to pdbDictLIG. The resulting two-body conformation is finally applied to the provided pdbDictREC and pdbDictLIG. These updated coordinates are then used to update the original PDB objects for later writing to file.

# Perform computation & alter provided dictionaries
ccmap.zmap( pdbDictREC, pdbDictLIG,
            [e1, e2, e3], [t1, t2, t3],
            offsetRec=[u1, u2, u3],
            offsetLig=[v1, v2, v3],
            apply=True)
# Update PDB containers from previous examples
pdbREC.setCoordinateFromDictorize(pdbDictREC)
pdbLIG.setCoordinateFromDictorize(pdbDictLIG)
# Dump to coordinate files
with open("new_receptor.pdb", "w") as fp:
    fp.write( str(pdbREC) )
with open("new_ligand.pdb", "w") as fp:
    fp.write( str(pdbLIG) )

Multithreading

The C implementation makes it possible for the ccmap functions to release the Python Global Interpreter Lock. Hence, "actual" multithreading can be achieved and performance scales decently with the number of workers. For this benchmark, up to 50000 docking poses were generated and processed for three coordinate sets with increasing numbers of atoms: 1974 (1GL1), 3424 (1F34), 10677 (2VIS).

[Figure: benchmark]

A simple example of a multithread implementation can be found in the provided script. The tests folder allows for the reproduction of the above benchmark.
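
Independently of the provided script, the general pattern can be sketched with the standard library: split the list of transformations into chunks and hand each chunk to a thread. The helper names chunk and threaded_poses are hypothetical, and the sketch assumes the worker returns a list with one result per pose.

```python
from concurrent.futures import ThreadPoolExecutor

def chunk(items, n_chunks):
    """Split a list into at most n_chunks contiguous, near-equal slices."""
    size, extra = divmod(len(items), n_chunks)
    slices, start = [], 0
    for i in range(n_chunks):
        stop = start + size + (1 if i < extra else 0)
        if stop > start:
            slices.append(items[start:stop])
        start = stop
    return slices

def threaded_poses(worker, eulers, translations, n_threads=4):
    """Run worker(euler_chunk, translation_chunk) on each chunk in its own thread."""
    euler_chunks = chunk(eulers, n_threads)
    trans_chunks = chunk(translations, n_threads)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # pool.map preserves chunk order, so results come back in input order
        results = pool.map(worker, euler_chunks, trans_chunks)
        return [pose for part in results for pose in part]
```

With ccmap installed, the worker could be something like lambda e, t: ccmap.lzmap(pdbDictREC, pdbDictLIG, e, t, encode=True), assuming that call returns one list entry per transformation. Threads only help here because the C code releases the Global Interpreter Lock during the computation.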

Installing and using the C Library

The C executable can be generated with the provided makefile. The low-level functions are the same, but the following limitations exist:

  • One computation per executable call
  • No multithreading.

Finding the optimal molecular path connecting two atoms

Using a finer mesh size, it is possible to obtain the shortest path connecting two atoms. The atoms to connect must be solvent accessible, and the path search operates over the surface and solvent accessible cells. The solvent excluded volume is computed for each atom as the sum of its van der Waals radius and the radius of a water molecule.

Installation

 git clone -b fibonacci git@github.com:MMSB-MOBI/ccmap.git
 cd ccmap/ccmap
 make pathfinder

Usage

./bin/linky -i ../tests/structures/gil_input.pdb -x 'LI1:B:502:CA' -y 'LI2:C:502:CA' -s 1

will display:

Applying H20 probe radius of 1.4 A. to atomic solvant volume exclusion
User atom selection:
	Start atom:	" CA  LI1  502 B 14.024000 15.909000 -5.891000"
	End atom :	" CA  LI2  502 C 8.770000 25.230000 -5.380000"
Mesh [459x408x316] created: 3959 cells contain atoms
Building surfaces w/ mesh unit of 0.2 A. ...
	Total of 12749100 voxels constructed
Searching for start/stop cells at start/stop atoms surfaces...
	start/stop surfaces contain 391/1058 voxels, picking closest to the other volume center cell ...
Starting from cell (240,281, 134) (b=0)
Trying to reachcell (230,307, 139) (b=0)
	---Best walk is made of 47 moves---
Theoritical distance from 1st vox_path to start atom
	voxel [240 281 134] 11.018,16.813,-6.889
	atom:  CA  LI1  502 B 14.024000 15.909000 -5.891000
	d=3.29382 A.
Start/Stop atoms exclusion radius: 3.2 / 3.2 A.
Best pathway -- aprox. polyline lengths 15.8 A
Trailing space equals 0.2A
Threading of 7 atoms w/ 1A spacing completed
Approximate linker length 12.6 A.

where the last line is a fair approximation of the length of the shortest curve linking the desired atoms.

Effect of parameters on search

The path is guaranteed to be optimal but may take some time to run depending on structure topology and command line parameters.

cell size

Too large a value may lead to an unrealistic path. Smaller values are better but increase the path search space and therefore the computation time.

water probe

The water probe radius affects the solvent excluded volume and therefore the path.

bead spacing

Bead spacing doesn't affect the optimal path, but it does affect the evaluation of the linker length.
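
The reason spacing can change a length estimate while leaving the path unchanged can be illustrated with a toy calculation. This is not the algorithm used by linky: the functions polyline_length and bead_estimate are hypothetical, and the sketch simply assumes beads are placed at regular intervals along a fixed polyline, so the bead-based length is rounded down to a multiple of the spacing.

```python
import math

def polyline_length(points):
    """Sum of Euclidean distances between consecutive 3D points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def bead_estimate(points, spacing):
    """Number of beads placed every `spacing` along the path, and the length they span."""
    length = polyline_length(points)
    n_beads = int(length // spacing) + 1  # first bead sits at the path start
    return n_beads, (n_beads - 1) * spacing

# Toy path of total length 7.0 (made-up coordinates)
path = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
print(polyline_length(path))     # 7.0
print(bead_estimate(path, 2.0))  # (4, 6.0)
print(bead_estimate(path, 1.0))  # (8, 7.0)
```

The same path yields 6.0 with a spacing of 2.0 and 7.0 with a spacing of 1.0, which is the kind of dependence described above.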
