Finds differences between two PDF documents
pdf-diff
Finds differences between two PDF documents:
- Compares the text layers of two PDF documents and outputs the bounding boxes of changed text in JSON.
- Rasterizes the changed pages in the PDFs to a PNG and draws red outlines around changed text.
The script is written in Python 3 and relies on the pdftotext program, which ships with poppler.
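The text-layer comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not pdf-diff's actual implementation: it assumes the word boxes come from `pdftotext -bbox` (which emits XHTML with `word` elements carrying `xMin`, `yMin`, `xMax` and `yMax` attributes), and it uses the standard library's `difflib` for the diff step, which may differ from what pdf-diff does internally. The function names, the sample markup and the JSON layout are hypothetical.

```python
import difflib
import json
import subprocess
import xml.etree.ElementTree as ET

# pdftotext -bbox writes XHTML, so element names carry this namespace.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def extract_bbox_xhtml(pdf_path):
    """Run `pdftotext -bbox` and return its XHTML output (requires poppler)."""
    result = subprocess.run(
        ["pdftotext", "-bbox", pdf_path, "-"],  # "-" sends output to stdout
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_words(bbox_xhtml):
    """Turn bbox XHTML into a list of {page, text, box} records."""
    root = ET.fromstring(bbox_xhtml)
    words = []
    for page_number, page in enumerate(root.iter(XHTML_NS + "page"), start=1):
        for word in page.iter(XHTML_NS + "word"):
            box = [float(word.get(k)) for k in ("xMin", "yMin", "xMax", "yMax")]
            words.append({"page": page_number, "text": word.text or "", "box": box})
    return words


def changed_boxes(before_words, after_words):
    """Diff the two word sequences and collect the words that differ."""
    matcher = difflib.SequenceMatcher(
        a=[w["text"] for w in before_words],
        b=[w["text"] for w in after_words],
        autojunk=False,
    )
    changes = {"before": [], "after": []}
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes["before"].extend(before_words[i1:i2])
            changes["after"].extend(after_words[j1:j2])
    return changes


# Hand-written sample input (hypothetical), so the sketch runs without a PDF.
BEFORE = """<html xmlns="http://www.w3.org/1999/xhtml"><body><doc>
<page width="612.0" height="792.0">
<word xMin="72.0" yMin="70.0" xMax="100.0" yMax="82.0">Hello</word>
<word xMin="104.0" yMin="70.0" xMax="140.0" yMax="82.0">world</word>
</page></doc></body></html>"""
AFTER = BEFORE.replace(">world<", ">there<")

print(json.dumps(changed_boxes(parse_words(BEFORE), parse_words(AFTER)), indent=2))
```

With real documents, the two sample strings would be replaced by `extract_bbox_xhtml("before.pdf")` and `extract_bbox_xhtml("after.pdf")`. The boxes collected for each side are what a rasterizing step would then outline on the rendered pages.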
Requirements
libxml2 >= 2.7.0, libxslt >= 1.1.23, poppler
Requirements installation for Ubuntu:
sudo apt-get install python3-lxml poppler-utils
Requirements installation for OS X:
brew install libxml2 libxslt poppler
Installation
From PyPI:
pip install pdf-diff
From source:
sudo python3 setup.py install
Running
Turn two PDFs into one large PNG image showing the differences:
pdf-diff before.pdf after.pdf > comparison_output.png
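The same command can be driven from a Python script, for example in a batch job. The sketch below assumes only what the command line above shows: that pdf-diff takes two PDF paths and writes the PNG to standard output. The helper names `build_command` and `write_comparison` are hypothetical.

```python
import subprocess


def build_command(before_pdf, after_pdf):
    """Return the argument list for the pdf-diff command shown above."""
    return ["pdf-diff", str(before_pdf), str(after_pdf)]


def write_comparison(before_pdf, after_pdf, output_png):
    """Run pdf-diff and save the PNG it writes to standard output."""
    with open(output_png, "wb") as png:
        # check=True raises CalledProcessError if pdf-diff exits with an error.
        subprocess.run(build_command(before_pdf, after_pdf), stdout=png, check=True)


if __name__ == "__main__":
    # Show the command that would be executed; call write_comparison to run it.
    print(build_command("before.pdf", "after.pdf"))
```

Calling `write_comparison("before.pdf", "after.pdf", "comparison_output.png")` mirrors the shell redirection. Note that the output file is created before pdf-diff starts, so a failed run can leave an empty PNG behind.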
Download files
File details
Details for the file pdf-diff-0.9.1.tar.gz.
File metadata
- Download URL: pdf-diff-0.9.1.tar.gz
- Upload date:
- Size: 8.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.4.2 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.5.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6d31fe792a7fe3278e20a3a56c71c50cf895d650d4053cd02f12324137d352fe |
| MD5 | d988f178c1f03c84ecb123b8440b7b50 |
| BLAKE2b-256 | 31e0efcd2a80d5a2ca58265a26598e24b17f0b838ec2677bdab5232ba4e72abe |
File details
Details for the file pdf_diff-0.9.1-py3.5.egg.
File metadata
- Download URL: pdf_diff-0.9.1-py3.5.egg
- Upload date:
- Size: 15.6 kB
- Tags: Egg
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.4.2 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.5.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 105a8d147552866da6d32860377ec84625044b43407bb59bf1f7ceb66d1f0c93 |
| MD5 | 332a7af33443036d87a763db607223b3 |
| BLAKE2b-256 | 8744577264de99646cc14786c572b9a15827f5c4ea2d6040c04050c4850334c4 |
File details
Details for the file pdf_diff-0.9.1-py3-none-any.whl.
File metadata
- Download URL: pdf_diff-0.9.1-py3-none-any.whl
- Upload date:
- Size: 11.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.4.2 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.5.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ffc2bcc8a0db1cfb4a6728f3374f8e025cfcffc1e3a3ab9b1245d4964a42ddeb |
| MD5 | 70a052342883854422b2479748570f38 |
| BLAKE2b-256 | d668f212aa12ca9c9b2654b9a42957752b38374f5e61255f6fc8355bf0881b86 |