Highly optimized inference engine for binarized neural networks.

Project description

Larq Compute Engine

Larq Compute Engine (LCE) is a highly optimized inference engine for deploying extremely quantized neural networks, such as Binarized Neural Networks (BNNs). It currently supports various mobile platforms and has been benchmarked on a Pixel 1 phone and a Raspberry Pi. LCE provides a collection of hand-optimized TensorFlow Lite custom operators for supported instruction sets, developed in inline assembly or in C++ using compiler intrinsics. LCE leverages optimization techniques such as tiling to maximize the number of cache hits, vectorization to maximize the computational throughput, and multi-threaded parallelization to take advantage of modern multi-core desktop and mobile CPUs.
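To make concrete why binarized networks lend themselves to this kind of optimization, the sketch below shows the commonly described bit-packing idea: a dot product of two +1/-1 vectors can be computed with one XOR and one population count. This is a conceptual illustration in plain Python, not LCE's actual kernels (which are written in assembly and C++); the helper names pack_signs and binary_dot are hypothetical.

```python
def pack_signs(values):
    """Pack a sequence of +1/-1 values into an int: bit i is set where values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v not in (1, -1):
            raise ValueError("binarized values must be +1 or -1")
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n."""
    # XOR marks the positions where the signs differ. Each such position
    # contributes -1 to the dot product and every other position contributes +1.
    mismatches = bin(a_bits ^ b_bits).count("1")
    return n - 2 * mismatches


a = [1, -1, -1, 1, 1, -1, 1, 1]
b = [1, 1, -1, -1, 1, -1, -1, 1]

packed = binary_dot(pack_signs(a), pack_signs(b), len(a))
reference = sum(x * y for x, y in zip(a, b))

# The bitwise result matches the ordinary multiply-accumulate result.
assert packed == reference
```

On real hardware the same idea operates on whole machine words or vector registers at once, which is where vectorization pays off.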

Larq Compute Engine is part of a family of libraries for BNN development; you can also check out Larq for building and training BNNs and Larq Zoo for pre-trained models.

Key Features

  • Effortless end-to-end integration from training to deployment:

    • Tight integration of LCE with Larq and TensorFlow provides a smooth end-to-end training and deployment experience.

    • A collection of Larq pre-trained BNN models for common machine learning tasks is available in Larq Zoo and can be used out-of-the-box with LCE.

    • LCE provides a custom MLIR-based model converter which is fully compatible with TensorFlow Lite and performs additional network level optimizations for Larq models.

  • Lightning fast deployment on a variety of mobile platforms:

    • LCE enables high performance, on-device machine learning inference by providing hand-optimized kernels and network level optimizations for BNN models.

    • LCE currently supports 64-bit ARM-based mobile platforms such as Android phones and Raspberry Pi boards.

    • Thread parallelism support in LCE is essential for modern mobile devices with multi-core CPUs.

Performance

The table below presents the single-threaded performance of Larq Compute Engine on different versions of a novel BNN model called QuickNet (trained on the ImageNet dataset and released on Larq Zoo) on a Raspberry Pi 4 Model B (BCM2711) board at 1.5GHz, a Pixel 1 Android phone (2016), and a Mac Mini with an M1 ARM CPU:

Model         | Top-1 Accuracy | RPi 4B 1.5GHz, 1 thread (ms) | Pixel 1, 1 thread (ms) | Mac Mini M1, 1 thread (ms)
QuickNetSmall | 59.4%          | 27.7                         | 16.8                   | 4.0
QuickNet      | 63.3%          | 45.0                         | 25.5                   | 5.8
QuickNetLarge | 66.9%          | 77.0                         | 44.2                   | 9.9

For reference, dabnn (the other main BNN library) reports an inference time of 61.3 ms for Bi-RealNet (56.4% accuracy) on the Pixel 1 phone, while LCE achieves an inference time of 41.6 ms for Bi-RealNet on the same device. The dabnn authors additionally present a modified version, BiRealNet-Stem, which achieves the same accuracy of 56.4% in 43.2 ms.

The following table presents the multi-threaded performance of Larq Compute Engine on a Raspberry Pi 4 Model B (BCM2711) board at 1.5GHz, a Pixel 1 phone, and a Mac Mini with an M1 ARM CPU:

Model         | Top-1 Accuracy | RPi 4B 1.5GHz, 4 threads (ms) | Pixel 1, 4 threads (ms) | Mac Mini M1, 4 threads (ms)
QuickNetSmall | 59.4%          | 12.1                          | 8.9                     | 1.8
QuickNet      | 63.3%          | 20.8                          | 12.6                    | 2.5
QuickNetLarge | 66.9%          | 31.7                          | 22.8                    | 3.9

Benchmarked on 2021-06-11 (Pixel 1), 2021-06-13 (Mac Mini M1), and 2022-04-20 (RPi 4B) using the LCE custom TFLite Model Benchmark Tool with XNNPack enabled, on BNN models with randomized inputs.
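As a quick way to compare the two tables, the following sketch computes the speedup from one thread to four threads for each model and device. The latencies are copied from the tables above; the function name thread_speedups is a hypothetical helper. With these numbers, four threads are roughly 1.9x to 2.5x faster than one thread, so scaling is sublinear in the thread count.

```python
# Column order used by both latency tables.
DEVICES = ("RPi 4B", "Pixel 1", "Mac Mini M1")

# Latencies in milliseconds, copied from the single-threaded table.
ONE_THREAD_MS = {
    "QuickNetSmall": (27.7, 16.8, 4.0),
    "QuickNet": (45.0, 25.5, 5.8),
    "QuickNetLarge": (77.0, 44.2, 9.9),
}

# Latencies in milliseconds, copied from the multi-threaded table.
FOUR_THREADS_MS = {
    "QuickNetSmall": (12.1, 8.9, 1.8),
    "QuickNet": (20.8, 12.6, 2.5),
    "QuickNetLarge": (31.7, 22.8, 3.9),
}


def thread_speedups(one_thread, four_threads):
    """Return, per model, the ratio of 1-thread to 4-thread latency for each device."""
    return {
        model: tuple(s / m for s, m in zip(one_thread[model], four_threads[model]))
        for model in one_thread
    }


speedups = thread_speedups(ONE_THREAD_MS, FOUR_THREADS_MS)

for model, row in speedups.items():
    cells = ", ".join(f"{device}: {value:.2f}x" for device, value in zip(DEVICES, row))
    print(f"{model}: {cells}")
```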

Getting started

Follow these steps to deploy a BNN with LCE:

  1. Pick a Larq model

    You can use Larq to build and train your own model or pick a pre-trained model from Larq Zoo.

  2. Convert the Larq model

    LCE is built on top of TensorFlow Lite and uses the TensorFlow Lite FlatBuffer format to convert and serialize Larq models for inference. We provide an LCE Converter with additional optimization passes to increase the speed of execution of Larq models on supported target platforms.

  3. Build LCE

    The LCE documentation provides the build instructions for Android and 64-bit ARM-based boards such as Raspberry Pi. Please follow the provided instructions to create a native LCE build or cross-compile for one of the supported targets.

  4. Run inference

    LCE uses the TensorFlow Lite Interpreter to perform inference. In addition to the built-in TensorFlow Lite operators, optimized LCE operators are registered with the interpreter to execute the Larq-specific subgraphs of the model. An example of how to create and build an LCE-compatible TensorFlow Lite interpreter for your own applications is provided in the LCE documentation.
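Steps 1 and 2 above can be sketched in Python as follows. This is a minimal sketch, assuming the larq_compute_engine and larq_zoo packages are installed: it relies on lce.convert_keras_model from the LCE Python package and larq_zoo.sota.QuickNet from Larq Zoo. The helper names convert_to_lce_flatbuffer and save_flatbuffer and the output file name quicknet.tflite are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def convert_to_lce_flatbuffer(keras_model) -> bytes:
    """Convert a Keras/Larq model to a TensorFlow Lite FlatBuffer with the LCE converter."""
    # Imported lazily so this module can be loaded without the heavy dependency.
    import larq_compute_engine as lce

    return lce.convert_keras_model(keras_model)


def save_flatbuffer(flatbuffer: bytes, path: str) -> int:
    """Write the serialized model to disk and return the number of bytes written."""
    return Path(path).write_bytes(flatbuffer)


def main() -> None:
    # Assumption: larq_zoo is installed. QuickNet is the model family
    # used in the benchmark tables of this document.
    import larq_zoo

    model = larq_zoo.sota.QuickNet(weights="imagenet")
    flatbuffer = convert_to_lce_flatbuffer(model)
    save_flatbuffer(flatbuffer, "quicknet.tflite")
```

Calling main() downloads the pre-trained weights and writes the converted model, which can then be loaded by an LCE build on the target device (steps 3 and 4).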

About

Larq Compute Engine is being developed by a team of deep learning researchers and engineers at Plumerai to help accelerate both our own research and the general adoption of Binarized Neural Networks.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

larq_compute_engine-0.11.1-cp310-cp310-win_amd64.whl (44.9 MB)
    CPython 3.10, Windows x86-64

larq_compute_engine-0.11.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (59.6 MB)
    CPython 3.10, manylinux: glibc 2.17+ x86-64

larq_compute_engine-0.11.1-cp310-cp310-macosx_11_0_arm64.whl (47.5 MB)
    CPython 3.10, macOS 11.0+ ARM64

larq_compute_engine-0.11.1-cp310-cp310-macosx_10_14_x86_64.whl (57.2 MB)
    CPython 3.10, macOS 10.14+ x86-64

larq_compute_engine-0.11.1-cp39-cp39-win_amd64.whl (44.9 MB)
    CPython 3.9, Windows x86-64

larq_compute_engine-0.11.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (59.6 MB)
    CPython 3.9, manylinux: glibc 2.17+ x86-64

larq_compute_engine-0.11.1-cp39-cp39-macosx_11_0_arm64.whl (47.5 MB)
    CPython 3.9, macOS 11.0+ ARM64

larq_compute_engine-0.11.1-cp39-cp39-macosx_10_14_x86_64.whl (57.2 MB)
    CPython 3.9, macOS 10.14+ x86-64

larq_compute_engine-0.11.1-cp38-cp38-win_amd64.whl (44.9 MB)
    CPython 3.8, Windows x86-64

larq_compute_engine-0.11.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (59.6 MB)
    CPython 3.8, manylinux: glibc 2.17+ x86-64

larq_compute_engine-0.11.1-cp38-cp38-macosx_11_0_arm64.whl (47.5 MB)
    CPython 3.8, macOS 11.0+ ARM64

larq_compute_engine-0.11.1-cp38-cp38-macosx_10_14_x86_64.whl (57.2 MB)
    CPython 3.8, macOS 10.14+ x86-64

larq_compute_engine-0.11.1-cp37-cp37m-win_amd64.whl (44.9 MB)
    CPython 3.7m, Windows x86-64

larq_compute_engine-0.11.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (59.6 MB)
    CPython 3.7m, manylinux: glibc 2.17+ x86-64

larq_compute_engine-0.11.1-cp37-cp37m-macosx_10_14_x86_64.whl (57.2 MB)
    CPython 3.7m, macOS 10.14+ x86-64
