Highly optimized inference engine for binarized neural networks.

Larq Compute Engine

Larq Compute Engine (LCE) is a highly optimized inference engine for deploying extremely quantized neural networks, such as Binarized Neural Networks (BNNs). It currently supports various mobile platforms and has been benchmarked on a Pixel 1 phone and a Raspberry Pi. LCE provides a collection of hand-optimized TensorFlow Lite custom operators for supported instruction sets, developed in inline assembly or in C++ using compiler intrinsics. LCE leverages optimization techniques such as tiling to maximize the number of cache hits, vectorization to maximize the computational throughput, and multi-threaded parallelization to take advantage of modern multi-core desktop and mobile CPUs.
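
As a rough illustration of why binarized layers lend themselves to this kind of low-level optimization, the sketch below computes a dot product of two {-1, +1} vectors using only bitwise XOR and a bit count. It is a toy example in pure Python, not LCE's actual kernels (which are written in assembly and C++ intrinsics); the helper names `pack_signs` and `binary_dot` are hypothetical.

```python
def pack_signs(values):
    """Pack a sequence of +1/-1 values into an int, one bit per value (bit set means +1)."""
    word = 0
    for i, v in enumerate(values):
        if v > 0:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed {-1, +1} vectors of length n."""
    # Matching signs contribute +1 and differing signs contribute -1,
    # so the result is n minus twice the number of differing positions.
    mismatches = bin(a_bits ^ b_bits).count("1")
    return n - 2 * mismatches


a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]

packed = binary_dot(pack_signs(a), pack_signs(b), len(a))
reference = sum(x * y for x, y in zip(a, b))
print(packed, reference)
```

On real hardware the XOR and bit count each operate on a whole machine word or vector register at once, which is where vectorization pays off.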

Larq Compute Engine is part of a family of libraries for BNN development; you can also check out Larq for building and training BNNs and Larq Zoo for pre-trained models.

Key Features

Effortless end-to-end integration from training to deployment:
- Tight integration of LCE with Larq and TensorFlow provides a smooth end-to-end training and deployment experience.
- A collection of Larq pre-trained BNN models for common machine learning tasks is available in Larq Zoo and can be used out-of-the-box with LCE.
- LCE provides a custom MLIR-based model converter which is fully compatible with TensorFlow Lite and performs additional network level optimizations for Larq models.

Lightning-fast deployment on a variety of mobile platforms:
- LCE enables high performance, on-device machine learning inference by providing hand-optimized kernels and network level optimizations for BNN models.
- LCE currently supports 64-bit ARM-based mobile platforms such as Android phones and Raspberry Pi boards.
- Thread parallelism support in LCE is essential for modern mobile devices with multi-core CPUs.

Performance

The table below presents the single-threaded performance of Larq Compute Engine on different versions of a novel BNN model called QuickNet (trained on the ImageNet dataset and released on Larq Zoo) on a Raspberry Pi 4 Model B board (BCM2711, 1.5GHz), a Pixel 1 Android phone (2016), and a Mac Mini with an M1 ARM CPU:

Model	Top-1 Accuracy	RPi 4B 1.5GHz, 1 thread (ms)	Pixel 1, 1 thread (ms)	Mac Mini M1, 1 thread (ms)
QuickNetSmall	59.4%	27.7	16.8	4.0
QuickNet	63.3%	45.0	25.5	5.8
QuickNetLarge	66.9%	77.0	44.2	9.9

For reference, dabnn (the other main BNN library) reports an inference time of 61.3 ms for Bi-RealNet (56.4% accuracy) on the Pixel 1 phone, while LCE achieves an inference time of 41.6 ms for Bi-RealNet on the same device. The dabnn authors furthermore present a modified version, BiRealNet-Stem, which achieves the same accuracy of 56.4% in 43.2 ms.

The following table presents the multi-threaded (4 threads) performance of Larq Compute Engine on a Raspberry Pi 4 Model B board (BCM2711, 1.5GHz), a Pixel 1 phone, and a Mac Mini M1:

Model	Top-1 Accuracy	RPi 4B 1.5GHz, 4 threads (ms)	Pixel 1, 4 threads (ms)	Mac Mini M1, 4 threads (ms)
QuickNetSmall	59.4%	12.1	8.9	1.8
QuickNet	63.3%	20.8	12.6	2.5
QuickNetLarge	66.9%	31.7	22.8	3.9

Benchmarked on 2021-06-11 (Pixel 1), 2021-06-13 (Mac Mini M1), and 2022-04-20 (RPi 4B) using the custom LCE TFLite Model Benchmark Tool (see here), with XNNPack enabled and BNN models fed randomized inputs.
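
To make the effect of multi-threading easier to read off, the short sketch below computes the 4-thread speedup and parallel efficiency from the latencies in the two tables. The numbers are copied from the tables; the function names `speedup` and `parallel_efficiency` are only illustrative.

```python
# Latencies in ms copied from the tables above: (1 thread, 4 threads).
LATENCY_MS = {
    "QuickNetSmall": {"RPi 4B": (27.7, 12.1), "Pixel 1": (16.8, 8.9), "Mac Mini M1": (4.0, 1.8)},
    "QuickNet": {"RPi 4B": (45.0, 20.8), "Pixel 1": (25.5, 12.6), "Mac Mini M1": (5.8, 2.5)},
    "QuickNetLarge": {"RPi 4B": (77.0, 31.7), "Pixel 1": (44.2, 22.8), "Mac Mini M1": (9.9, 3.9)},
}


def speedup(single_ms, multi_ms):
    """How many times faster the multi-threaded run is."""
    return single_ms / multi_ms


def parallel_efficiency(single_ms, multi_ms, threads=4):
    """Fraction of the ideal linear speedup that is actually achieved."""
    return speedup(single_ms, multi_ms) / threads


for model, devices in LATENCY_MS.items():
    for device, (t1, t4) in devices.items():
        print(
            f"{model:14s} {device:12s} "
            f"speedup {speedup(t1, t4):.2f}x, "
            f"efficiency {parallel_efficiency(t1, t4):.0%}"
        )
```

With these figures, four threads give a speedup of roughly 1.9x to 2.5x over a single thread, depending on the model and device.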

Getting started

Follow these steps to deploy a BNN with LCE:

1. Pick a Larq model

   You can use Larq to build and train your own model or pick a pre-trained model from Larq Zoo.

2. Convert the Larq model

   LCE is built on top of TensorFlow Lite and uses the TensorFlow Lite FlatBuffer format to convert and serialize Larq models for inference. We provide an LCE Converter with additional optimization passes to increase the speed of execution of Larq models on supported target platforms.

3. Build LCE

   The LCE documentation provides the build instructions for Android and 64-bit ARM-based boards such as Raspberry Pi. Please follow the provided instructions to create a native LCE build or cross-compile for one of the supported targets.

4. Run inference

   LCE uses the TensorFlow Lite Interpreter to perform an inference. In addition to the already available built-in TensorFlow Lite operators, optimized LCE operators are registered to the interpreter to execute the Larq specific subgraphs of the model. An example to create and build an LCE compatible TensorFlow Lite interpreter for your own applications is provided here.
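
The conversion step ("Convert the Larq model") can be sketched in Python as follows. This is a minimal sketch that assumes the `larq_compute_engine` package is installed and relies on its `convert_keras_model` function with default arguments; the wrapper name `convert_to_flatbuffer` and the output path are hypothetical.

```python
from pathlib import Path


def convert_to_flatbuffer(keras_model, output_path):
    """Convert a Keras/Larq model to a TFLite FlatBuffer file and return its size in bytes."""
    # Imported lazily so this helper can be defined without TensorFlow
    # or LCE being present at import time.
    import larq_compute_engine as lce

    # convert_keras_model returns the serialized FlatBuffer as bytes.
    flatbuffer_bytes = lce.convert_keras_model(keras_model)
    Path(output_path).write_bytes(flatbuffer_bytes)
    return len(flatbuffer_bytes)
```

Calling `convert_to_flatbuffer(model, "model.tflite")` with a trained Keras model (for example one built with Larq or loaded from Larq Zoo) would write a file that can then be loaded by an LCE-enabled TensorFlow Lite interpreter on the target device.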

Next steps

- Explore Larq pre-trained models.
- Learn how to build and train BNNs for your own application with Larq.
- If you're a mobile developer, visit Android quickstart.
- See our build instructions for Raspberry Pi and 64-bit ARM-based boards here.
- Try our example programs.

About

Larq Compute Engine is being developed by a team of deep learning researchers and engineers at Plumerai to help accelerate both our own research and the general adoption of Binarized Neural Networks.

Project details

Release history

Version	Release date
0.16.0	Jun 21, 2024
0.13.0	Aug 10, 2023
0.11.1 (this version)	Jul 12, 2023
0.11.0	Jan 20, 2023
0.8.0	Aug 25, 2022
0.7.0	Apr 25, 2022
0.6.2	Sep 8, 2021
0.6.1	Jul 9, 2021
0.6.0	Jun 11, 2021
0.5.0	Jan 30, 2021
0.4.3	Sep 21, 2020
0.4.2	Sep 10, 2020
0.4.0	Aug 28, 2020
0.3.1	May 26, 2020
0.3.0	May 12, 2020
0.2.1	Apr 20, 2020
0.2.0	Mar 23, 2020
0.1.2	Feb 26, 2020
0.1.1	Feb 20, 2020
0.1.0	Feb 17, 2020
0.1.0rc1 (pre-release)	Feb 14, 2020

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

File	Size	Uploaded	Interpreter	Platform
larq_compute_engine-0.11.1-cp310-cp310-win_amd64.whl	44.9 MB	Jul 13, 2023	CPython 3.10	Windows x86-64
larq_compute_engine-0.11.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl	59.6 MB	Jul 13, 2023	CPython 3.10	manylinux: glibc 2.17+ x86-64
larq_compute_engine-0.11.1-cp310-cp310-macosx_11_0_arm64.whl	47.5 MB	Jul 13, 2023	CPython 3.10	macOS 11.0+ ARM64
larq_compute_engine-0.11.1-cp310-cp310-macosx_10_14_x86_64.whl	57.2 MB	Jul 12, 2023	CPython 3.10	macOS 10.14+ x86-64
larq_compute_engine-0.11.1-cp39-cp39-win_amd64.whl	44.9 MB	Jul 13, 2023	CPython 3.9	Windows x86-64
larq_compute_engine-0.11.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl	59.6 MB	Jul 13, 2023	CPython 3.9	manylinux: glibc 2.17+ x86-64
larq_compute_engine-0.11.1-cp39-cp39-macosx_11_0_arm64.whl	47.5 MB	Jul 13, 2023	CPython 3.9	macOS 11.0+ ARM64
larq_compute_engine-0.11.1-cp39-cp39-macosx_10_14_x86_64.whl	57.2 MB	Jul 12, 2023	CPython 3.9	macOS 10.14+ x86-64
larq_compute_engine-0.11.1-cp38-cp38-win_amd64.whl	44.9 MB	Jul 13, 2023	CPython 3.8	Windows x86-64
larq_compute_engine-0.11.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl	59.6 MB	Jul 13, 2023	CPython 3.8	manylinux: glibc 2.17+ x86-64
larq_compute_engine-0.11.1-cp38-cp38-macosx_11_0_arm64.whl	47.5 MB	Jul 13, 2023	CPython 3.8	macOS 11.0+ ARM64
larq_compute_engine-0.11.1-cp38-cp38-macosx_10_14_x86_64.whl	57.2 MB	Jul 12, 2023	CPython 3.8	macOS 10.14+ x86-64
larq_compute_engine-0.11.1-cp37-cp37m-win_amd64.whl	44.9 MB	Jul 13, 2023	CPython 3.7m	Windows x86-64
larq_compute_engine-0.11.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl	59.6 MB	Jul 13, 2023	CPython 3.7m	manylinux: glibc 2.17+ x86-64
larq_compute_engine-0.11.1-cp37-cp37m-macosx_10_14_x86_64.whl	57.2 MB	Jul 12, 2023	CPython 3.7m	macOS 10.14+ x86-64
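
The file names listed above follow the standard wheel naming scheme, `{distribution}-{version}-{python tag}-{abi tag}-{platform tag}.whl`, where a platform tag may contain several dot-separated alternatives. As a rough aid for picking the right file, the sketch below splits a name into those fields. The `WheelInfo` type and `parse_wheel_filename` helper are hypothetical names written for this example, and the optional build tag is simply dropped.

```python
from typing import NamedTuple, Tuple


class WheelInfo(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tags: Tuple[str, ...]


def parse_wheel_filename(filename):
    """Split a wheel file name into its naming-scheme components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    distribution, version, python_tag, abi_tag, platform = parts
    # Several compatible platforms can be listed, separated by dots.
    return WheelInfo(distribution, version, python_tag, abi_tag, tuple(platform.split(".")))


info = parse_wheel_filename(
    "larq_compute_engine-0.11.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
print(info)
```

In practice `pip` performs this matching automatically; for programmatic use, the `packaging` library's `packaging.utils.parse_wheel_filename` is a more robust alternative to a hand-written parser.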

File details

Details for the file larq_compute_engine-0.11.1-cp310-cp310-win_amd64.whl.

File metadata

Download URL: larq_compute_engine-0.11.1-cp310-cp310-win_amd64.whl
Upload date: Jul 13, 2023
Size: 44.9 MB
Tags: CPython 3.10, Windows x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.17

File hashes

Hashes for larq_compute_engine-0.11.1-cp310-cp310-win_amd64.whl
Algorithm	Hash digest
SHA256	`9f3910de59cc01f8c24e94cedfc7329c42fc5e401a0039343b16c2138e94bd06`
MD5	`95ae95f6353aef15009e65242ef31882`
BLAKE2b-256	`d0c1d0a9f04a7051c35e4cbc8a67b26a7b5f545bc640875a1ce84bf23e4c6943`
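
To check a downloaded wheel against the digests listed above, the file's SHA256 can be recomputed locally and compared. The sketch below uses only Python's standard `hashlib`; the helper names and the local file path in the usage note are assumptions, while the expected digest is the SHA256 value listed for `larq_compute_engine-0.11.1-cp310-cp310-win_amd64.whl`.

```python
import hashlib

# SHA256 digest listed above for larq_compute_engine-0.11.1-cp310-cp310-win_amd64.whl.
EXPECTED_SHA256 = "9f3910de59cc01f8c24e94cedfc7329c42fc5e401a0039343b16c2138e94bd06"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_expected("larq_compute_engine-0.11.1-cp310-cp310-win_amd64.whl", EXPECTED_SHA256)` should return `True` for an intact download of that file. `pip` can also enforce this itself through hash-checking mode (`--require-hashes` with pinned hashes in a requirements file).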

Algorithm	Hash digest
SHA256	`23fef1a1b2d4c5fb6510400df949468d68cfb8a289a6702f4df0b03b8551023b`
MD5	`f8311a3a5fb4a8730ba9993fb8134241`
BLAKE2b-256	`e314f1b0a1bd512f0f53b3a54d6e1b490c0a76d38d19adabe9aae5c172248dd1`

Algorithm	Hash digest
SHA256	`3ca5085f66c8387bf215e868809c57815d46b0a29c152818c21dd29470977788`
MD5	`ce4aaca8a842c3501d881fad9ea5b3ac`
BLAKE2b-256	`efcfe9bad7e19b2d69bbfa151518c7b4f040d8144860a140a1757b2190d1c595`

Algorithm	Hash digest
SHA256	`97adc1356d651ebb7405af82bba264cd4e36433642356c7d120151008d9581fa`
MD5	`48047365efd5de3a96040369fd65a512`
BLAKE2b-256	`93c3eae5ac8c85fdef5042bc512f8502630f535846435865b1376cbc0c87a6b6`

Algorithm	Hash digest
SHA256	`f6eef134158ad8f4b6b17efbf9983490d2151b0166af0c032bc4ac9ed892add7`
MD5	`6bcd05ba4031e0df4efcd7f0817acdbb`
BLAKE2b-256	`37eed67dbf4efca19e08f23ee14e104ffc4333ff3af76474acfb176f3f00da82`

Algorithm	Hash digest
SHA256	`1b420787b6437f9605a4fd6089f7c9017056a9aba42ad802f1a9765a790200ba`
MD5	`0def3b94e9049f907694aeeff54fc14f`
BLAKE2b-256	`4ce6c539fff111c5f139d10e0779bc90780ac46733c62bbfdfca47a31d831c89`

Algorithm	Hash digest
SHA256	`387d16a6105f7cadd162b38b0927d8eddaac230891a5401a1470d848143c1bda`
MD5	`a4c942b7c567e5bd13a94c66abe9df0c`
BLAKE2b-256	`589aa336ffe745be9555a10d561a4ac01fe7d06544dc71bdf45f47732e49f455`

Algorithm	Hash digest
SHA256	`8e255b8225fdcb32182078beb4dcd1e91cf7122caea6c740ffbc930e45012697`
MD5	`eae0b9a63f0cdfa6c707272ee7f97fbb`
BLAKE2b-256	`fd23160e8e16ada0b7758591cd8189794946941a87b6417f6102fe4f43c79557`

Algorithm	Hash digest
SHA256	`1aa74c6f9d377d8e015bd7408f5cd116beacb9d4a633e3dcd0d0c3ffea6a8e93`
MD5	`a54c0341c7a18aa710a3a9100d0a48d3`
BLAKE2b-256	`2b4ba8c70eb0b72f69822d84667650932c0011415ae32dfad527be1407c17be1`

Algorithm	Hash digest
SHA256	`98ea32f80c7c7cad4384022295284c71046f712961ac73b3b58f680584c485c6`
MD5	`06c2d7563806e01007b88aabcf934947`
BLAKE2b-256	`e5e5529b9f75d433d141ee55fa964a03a89a882ed42447d429b6cba157d7b850`

Algorithm	Hash digest
SHA256	`f7dd84107df606d229867dded927b72cc259185314a814316e876ae5b54ce4af`
MD5	`d2ab5c929e7c92778fe709aabd2eda27`
BLAKE2b-256	`4cdc4ad77728104908e5e68901260801b140e870e603361f655bc0f611aeb986`

Algorithm	Hash digest
SHA256	`56ca00290984e8242454f5afbf354f214b8d87d18a64116c12ca6a2071bbfb61`
MD5	`0efc51b64a9e24047e13041a38be9d0e`
BLAKE2b-256	`38cd91fd5314631570363a41d777472c294816977f610d18ef07a0057d5754ff`

Algorithm	Hash digest
SHA256	`aa0f872faf54b86e925e105ab71ed25949360244a5128c191bfd44d04453af64`
MD5	`687960c9bd002e5a41e8bb595cefb6a7`
BLAKE2b-256	`41cd0e8bff359ee450f77786170e7165f1d716aae683783eb50f195e1e20e679`

Algorithm	Hash digest
SHA256	`769c07c125b63d6fc179ebdf0758bf818d1b0ba7979b7bc5f4c538b15d15d03f`
MD5	`03dcc36fdb86dce2107b771b9ac0a17c`
BLAKE2b-256	`f443fef2febf13ac8431cfb4086bbd1133f6a76bde91ddcfdc48ed5cca8c3918`

Algorithm	Hash digest
SHA256	`5f1db185e6918c900a05b65d8b834675be7ec42023eaaa2770bb86a287856267`
MD5	`99f247184423504fca2a8202a6237f94`
BLAKE2b-256	`9276fea417f656716f1d05b1aa0cd18439e5feb3b69e3e43ee25fbe77670ba34`

larq-compute-engine 0.11.1
