Create 3D images from atomic coordinates

Macromolecular Voxelization

Macromol Voxelize is a high-performance library for converting atomic structures into 3D images, where each channel might represent a different element type and each voxel might be on the order of 1Å on each side. The intended use case is machine learning: specifically, allowing image-based model architectures (such as CNNs) to be applied to macromolecular data.

Some noteworthy aspects of this library:

  • Algorithm: The voxelization procedure implemented by this library is to (i) treat each atom as a sphere and (ii) fill each voxel in proportion to the amount it overlaps that sphere. Although this procedure may seem intuitive, it's actually unusual. Macromolecular structures are typically voxelized in one of two ways: either by assigning the entire density for each atom to a single voxel, or by modeling each atom as a 3D Gaussian distribution. The advantage of the overlap-based procedure is that the image changes more smoothly as atoms move around. It also makes it easier to infer the exact position of each atom from just the image. The disadvantage is that calculating sphere/cube overlap volumes turns out to be quite difficult. Here, the overlap library is used to make this calculation.

  • Performance: Because voxelization can be a bottleneck during training, most of this library is implemented in C++. However, the API is in Python, for compatibility with common machine learning frameworks such as PyTorch and JAX. Note that the voxelization algorithm is deliberately single-threaded. This is a bit counter-intuitive, since voxelization is an embarrassingly parallel problem. However, in the context of loading training examples, it's more efficient to have a larger number of single-threaded data loader subprocesses than a smaller number of multi-threaded ones.
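To make the overlap-based idea concrete, here is a minimal pure-Python sketch that estimates how much of a cubic voxel lies inside a spherical atom. It is only an illustration: the `overlap_fraction` helper and its subsampling approach are assumptions made for this example, not the library's code, which computes the overlap exactly using the overlap library. How the library normalizes the resulting values is also not shown here.

```python
import itertools

def overlap_fraction(voxel_center, voxel_size, atom_center, atom_radius, n=8):
    """Estimate the fraction of a cubic voxel that lies inside a sphere.

    Hypothetical helper for illustration only.  The voxel is split into
    n**3 sub-cells, and the fraction of sub-cell centers that fall inside
    the sphere is returned.  This is an approximation, not an exact
    sphere/cube overlap calculation.
    """
    step = voxel_size / n
    # Offsets of the sub-cell centers from the voxel center along one axis.
    offsets = [(i + 0.5) * step - voxel_size / 2 for i in range(n)]
    r2 = atom_radius ** 2

    inside = 0
    for dx, dy, dz in itertools.product(offsets, repeat=3):
        point = (
            voxel_center[0] + dx,
            voxel_center[1] + dy,
            voxel_center[2] + dz,
        )
        d2 = sum((p - a) ** 2 for p, a in zip(point, atom_center))
        if d2 <= r2:
            inside += 1

    return inside / n ** 3

# Slide an atom of radius 0.75 Å along the x-axis and watch how two
# neighboring 1 Å voxels (centered at x=0 and x=1) fill up.
for i in range(6):
    x = 0.2 * i
    left = overlap_fraction((0, 0, 0), 1.0, (x, 0, 0), 0.75)
    right = overlap_fraction((1, 0, 0), 1.0, (x, 0, 0), 0.75)
    print(f"atom x={x:.1f}  voxel at 0: {left:.3f}  voxel at 1: {right:.3f}")
```

As the atom moves, density shifts gradually from one voxel to the next. Assigning the whole atom to a single voxel would instead produce an abrupt jump when the atom crosses a voxel boundary.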

Here's an example showing how to voxelize a set of atoms:

import polars as pl
import macromol_voxelize as mmvox

# Load the atoms in question.  These particular coordinates are for a 
# methionine amino acid.
atoms = pl.DataFrame([
        dict(element='N', x= 1.052, y=-1.937, z=-1.165, occupancy=1.0),
        dict(element='C', x= 1.540, y=-0.561, z=-1.165, occupancy=1.0),
        dict(element='C', x= 3.049, y=-0.521, z=-1.165, occupancy=1.0),
        dict(element='O', x= 3.733, y=-1.556, z=-1.165, occupancy=1.0),
        dict(element='C', x= 0.965, y= 0.201, z= 0.059, occupancy=1.0),
        dict(element='C', x=-0.570, y= 0.351, z= 0.100, occupancy=1.0),
        dict(element='S', x=-1.037, y= 1.495, z= 1.409, occupancy=1.0),
        dict(element='C', x=-2.800, y= 1.146, z= 1.451, occupancy=1.0),
])

# Add a "radius" column to the dataframe.  This function simply gives each atom 
# the same radius, but you could calculate radii however you want.
atoms = mmvox.set_atom_radius_A(atoms, 0.75)

# Add a "channels" column to the dataframe.  This function assigns channels by 
# matching a series of regular expressions against each atom's element name, 
# but again you could do this however you want.
atoms = mmvox.set_atom_channels_by_element(atoms, ['C', 'N', 'O', 'S|Se'])

# Create the 3D image.  Note that this step is not specific to macromolecules 
# in any way.  It just expects a data frame with "x", "y", "z", "radius", 
# "occupancy", and "channels" columns.
img_params = mmvox.ImageParams(
        channels=4,
        grid=mmvox.Grid(
            length_voxels=8,
            resolution_A=1,
            center_A=[0, 0, 0],
        ),
)
img = mmvox.image_from_atoms(atoms, img_params)
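The channel-assignment step above can be pictured with a small standalone sketch. The `channels_by_element` helper below is hypothetical and written only for illustration. It assumes that each pattern must match the whole element name (`re.fullmatch`) and that each atom gets a list of channel indices; the library's actual matching rules and column format may differ.

```python
import re

def channels_by_element(elements, patterns):
    """Map each element name to the indices of the patterns it matches.

    Hypothetical helper, not part of macromol_voxelize.  Full-string
    matching is an assumption made for this sketch.
    """
    compiled = [re.compile(p) for p in patterns]
    return [
        [i for i, rx in enumerate(compiled) if rx.fullmatch(element)]
        for element in elements
    ]

# Element names of the methionine atoms from the example above.
elements = ['N', 'C', 'C', 'O', 'C', 'C', 'S', 'C']
channels = channels_by_element(elements, ['C', 'N', 'O', 'S|Se'])

for element, chans in zip(elements, channels):
    print(element, chans)
```

Under these assumptions, carbon maps to channel 0, nitrogen to 1, oxygen to 2, and sulfur or selenium to 3, which is why the example image has four channels.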

Here's a rendering of this image:
