Pixelwise binarization with selectional auto-encoders in Keras
Binarization
Binarization for document images
Introduction
This tool performs document image binarization (i.e. it transforms colour or grayscale images into black-and-white pixels) for OCR, using multiple trained models.
The method is based on Calvo-Zaragoza and Gallego (2018), "A selectional auto-encoder approach for document image binarization".
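As a rough illustration of what "pixelwise" binarization produces, the sketch below turns a per-pixel foreground score map into a black-and-white image. It is not this package's code: the function name `select_foreground`, the 0.5 threshold, the black-for-foreground convention, and the toy score map standing in for a model output are all assumptions made for the example.

```python
import numpy as np


def select_foreground(prob_map, threshold=0.5):
    """Map per-pixel foreground scores in [0, 1] to a black-and-white image.

    Hypothetical helper for illustration only; the threshold value and the
    output convention (0 = ink, 255 = paper) are assumptions.
    """
    prob_map = np.asarray(prob_map, dtype=np.float32)
    # Pixels scored at or above the threshold become black, the rest white.
    return np.where(prob_map >= threshold, 0, 255).astype(np.uint8)


# Toy 2x2 score map standing in for the output of a trained model.
prob_map = np.array([[0.9, 0.2],
                     [0.4, 0.7]])
binary = select_foreground(prob_map)
```

In a trained system the score map would come from the network rather than being written by hand; only the final selection step is sketched here.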
Installation
Clone the repository, enter it, and run:
pip install .
Models
Pre-trained models can be downloaded from here:
https://qurator-data.de/sbb_binarization/
Usage
sbb_binarize \
--patches \
-m <directory with models> \
<input image> \
<output image>
Note: In virtually all cases, the --patches flag will improve results.
To use the OCR-D interface:
ocrd-sbb-binarize --overwrite -I INPUT_FILE_GRP -O OCR-D-IMG-BIN -P model "/var/lib/sbb_binarization"