Skip to main content

Connected components on 3D images, supports multiple labels.

Project description

[![Build Status](https://travis-ci.org/seung-lab/connected-components-3d.svg?branch=master)](https://travis-ci.org/seung-lab/connected-components-3d)

Connected Components 3D
=======================

Implementation of connected components in three dimensions using a 26-connected neighborhood. This package uses a 3D variant of the two pass method by Rosenfeld and Pflatz augmented with Union-Find. This implementation is compatible with images containing many different labels, not just binary images. It can be used with 2D or 3D images.

I wrote this package because I was working on densely labeled 3D biomedical images of brain tissue (e.g. 512x512x512 voxels). Other off the shelf implementations I reviewed were limited to binary images. This rendered these other packages too slow for my use case as it required masking each label and running the connected components algorithm once each time. For reference, there are often between hundreds to thousands of labels in a given volume. The benefit of this package is that it labels all connected components in one shot, improving performance by one or more orders of magnitude.

## Python `pip` Installaction

If binaries are available for your platform:

```bash
pip install cc3d
```

Otherwise:

```bash
pip install numpy
pip install cc3d
```

## Python Manual Installation

*Requires a C++ compiler.*

```bash
pip install -r requirements.txt
python setup.py develop
```

## Python Use

```python
from cc3d import connected_components
import numpy as np

labels_in = np.ones((512, 512, 512), dtype=np.int32)
labels_out = connected_components(labels_in)
```

If you know approximately how many labels you are going to generate, you can save substantial memory by specifying a number a bit above that range. The max label ID in your input labels must be less than `max_labels`.

```python
labels_out = connected_components(labels_in, max_labels=20000)
```

## C++ Use

```cpp
#include "cc3d.hpp"

// 3d array represented as 1d array
int* labels = new int[512*512*512]();

path = cc3d::connected_components3d<int>(
labels, /*sx=*/512, /*sy=*/512, /*sz=*/512
);
```

## Algorithm Description

The algorithm contained in this package is an elaboration into 3D images of the 2D image connected components algorithm described by Rosenfeld and Pflatz in 1968 [1] (which is well illustrated by [this youtube video](https://www.youtube.com/watch?v=ticZclUYy88)) using an equivalency list implemented as a Union-Find disjoint set with path compression [2].

In RP's two-pass method for 2D images, you raster scan and every time you first encounter a foreground pixel, mark it with a new label if the pixels to its top and left are background. If there is a preexisting label in its neighborhood, use that label instead. Whenever you see that two labels are adjacent, record they are equivalent as this will be used in the second pass. This equivalency table can be constructed in several ways, but some popular approaches are Union-Find with path compression and Selkow's algorithm (which can avoid pipeline stalls). However, Selkow's algorithm is designed for two trees of depth two, appropriate for binary images. We would like to process multiple labels at the same time, making union-find mandatory.

In the second pass, the pixels are relabeled using the equivalency table. Union-Find (disjoint sets) establishes one label as the root label of a tree, and the root is considered the representative label. Each pixel is then labeled with the representative label.

To move to a 3D 26-connected neighborhood, we must first note that RP's method is 4-connected, in that they only examine the pixel to the top and left. In 2D, the 8-connected version would have looked at the top, left, top-left, and top-right. In a 3D 26-connected case, we have to look at the top, left, top-left, top-right, the same pattern shifted by z-1, and the voxel located at (x, y, z-1).

In the literature, modern connected components algorithms appear to do better than the simple one I selected by about 2x-5x depending on the data.
There appear to be some modern competing approaches involving decision trees, and an approach called "Light Speed Labeling". I picked this algorithm mainly because it is easy to understand and implement.

In order to make a reasonably fast implementation, I implemented union-find with union by rank and path compression. I conservatively used two arrays (a uint32 rank array, and the IDs as the image data type) equal to the size of the image for the union-find data structure instead of a sparse map. The union-find data structure plus the output labels means the memory consumption will be input + output + rank + equivalences. If your input labels are 32-bit, the memory usage will be 4x the input size. This becomes more problematic when 64-bit labels are used, but if you know something about your data, you can decrease the size of the union-find data structure.

## References

1. A. Rosenfeld and J. Pfaltz. "Sequential Operations in Digital Picture Processing". Journal of the ACM. Vol. 13, Issue 4, Oct. 1966, Pg. 471-494. doi: 10.1145/321356.321357 ([link](https://dl.acm.org/citation.cfm?id=321357))
2. R. E. Tarjan. "Efficiency of a good but not linear set union algorithm". Journal of the ACM, 22:215-225, 1975. ([link](https://dl.acm.org/citation.cfm?id=321884))



Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

connected-components-3d-1.0.0.tar.gz (151.2 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

connected_components_3d-1.0.0-cp37-cp37m-win_amd64.whl (89.2 kB view details)

Uploaded CPython 3.7mWindows x86-64

connected_components_3d-1.0.0-cp36-cp36m-macosx_10_13_x86_64.whl (100.7 kB view details)

Uploaded CPython 3.6mmacOS 10.13+ x86-64

File details

Details for the file connected-components-3d-1.0.0.tar.gz.

File metadata

  • Download URL: connected-components-3d-1.0.0.tar.gz
  • Upload date:
  • Size: 151.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.21.0 setuptools/40.6.3 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.5

File hashes

Hashes for connected-components-3d-1.0.0.tar.gz
Algorithm Hash digest
SHA256 5d5df807b4750dd3f3473c1367fb2ce21aa7a940b8dea4d75693dfd7703f44ab
MD5 cbdda656d42687850a09a6e9257c9ba0
BLAKE2b-256 fe7b5186450e797d3ddef1bcfbc2fbbf36f16735acfb5593277dc22e8cd9da05

See more details on using hashes here.

File details

Details for the file connected_components_3d-1.0.0-cp37-cp37m-win_amd64.whl.

File metadata

  • Download URL: connected_components_3d-1.0.0-cp37-cp37m-win_amd64.whl
  • Upload date:
  • Size: 89.2 kB
  • Tags: CPython 3.7m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.21.0 setuptools/40.6.3 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.5

File hashes

Hashes for connected_components_3d-1.0.0-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 9ab82e8e8b2cf42f05c3e45642f6227e451b87b2911855512b5f12ac4b47b4e2
MD5 46f059c418a8bbc3482fa2382ae81fd4
BLAKE2b-256 a89c98a90da06a3363a80c50857a4047f4928150755402246808f86dd66d615d

See more details on using hashes here.

File details

Details for the file connected_components_3d-1.0.0-cp37-cp37m-manylinux1_x86_64.whl.

File metadata

File hashes

Hashes for connected_components_3d-1.0.0-cp37-cp37m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 bbb738d27a9713a5b3a9eef4f49c515644b82468697868bc7e3dd40e79d0d353
MD5 2f7ae85fa787fce65e15a8a36b887528
BLAKE2b-256 4d4afd278b684bfe49f2d420a0b7c871193919b50fa60fb04f97b1e7bad6196d

See more details on using hashes here.

File details

Details for the file connected_components_3d-1.0.0-cp36-cp36m-manylinux1_x86_64.whl.

File metadata

File hashes

Hashes for connected_components_3d-1.0.0-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 9142b5ebbf0881854baac5f6c1b22a104c5db410691a991a164846509c2e3639
MD5 77c6a3085f34f08fc66fd75163bb2ae6
BLAKE2b-256 e6896cfcdfd2d2ede732c681996d82ae9be2e712267cf4e27873566e61a21537

See more details on using hashes here.

File details

Details for the file connected_components_3d-1.0.0-cp36-cp36m-macosx_10_13_x86_64.whl.

File metadata

  • Download URL: connected_components_3d-1.0.0-cp36-cp36m-macosx_10_13_x86_64.whl
  • Upload date:
  • Size: 100.7 kB
  • Tags: CPython 3.6m, macOS 10.13+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.21.0 setuptools/40.6.3 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.5

File hashes

Hashes for connected_components_3d-1.0.0-cp36-cp36m-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 6bcfa157116874a7674740df6d1fb9413fb05b9219d8dd9398ea076d450cb7e6
MD5 0dd2e8dc89d9bb1d511f3ce9d4a3f01b
BLAKE2b-256 2b1dc3e4f314612fa2efe76eaea2116b37587edfa54504ea793f1998d06f619a

See more details on using hashes here.

File details

Details for the file connected_components_3d-1.0.0-cp35-cp35m-manylinux1_x86_64.whl.

File metadata

File hashes

Hashes for connected_components_3d-1.0.0-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 bc76da195c35809424dcee2e9115c4a7097ff0bb16199eb07de751c0cf8a9ede
MD5 e6d4a669252fc99b30b7d0e745d2543b
BLAKE2b-256 f4f07897a06b6b395a256bb1f669bcf012cc3bc0d239e8cad6124ae7ca0f35f4

See more details on using hashes here.

File details

Details for the file connected_components_3d-1.0.0-cp34-cp34m-manylinux1_x86_64.whl.

File metadata

File hashes

Hashes for connected_components_3d-1.0.0-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 63ccdb9c68a902e9ef569c462e8d0d7a0b47f23ce8fa27d6a512e6ee0bdc8f78
MD5 9ce6f9ce712deb4e0f4ce0ab833dfd3b
BLAKE2b-256 7f22415e276a4321c6c82f67b98f40cb341344e0128edb1717737542d0ce5be2

See more details on using hashes here.

File details

Details for the file connected_components_3d-1.0.0-cp27-cp27m-manylinux1_x86_64.whl.

File metadata

File hashes

Hashes for connected_components_3d-1.0.0-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 6d0c16cc2dfd21b794d15f3acff114453d2d6f972baf97b465f59bca8e250ccf
MD5 ecdd0cb5062570a1fea141348ba0866b
BLAKE2b-256 6513de6b166daa72332ff9b4fb7c21134597426b403e2b4d4b83ef8bf5557ba6

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page