Undouble is a Python package to detect (near-)identical images.

undouble

undouble is a Python library to detect (near-)identical images. It works in several steps: pre-processing the images (grayscaling, normalizing, and scaling), computing the image hash, and grouping images by their hash. A threshold of 0 groups only images with an identical image hash. The results can be explored with the plotting functionality, and images can be moved with the move functionality. When moving images, the image with the largest resolution in each group is copied, and all other images in the group are moved to the **undouble** subdirectory. ⭐️Star it if you like it⭐️


The following steps are taken in the undouble library:

  • Read all images from the directory recursively with the specified extensions.
  • Compute image hash.
  • Group similar images.
  • Automatically organize the images in your folder if desired.
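
The hashing and grouping steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: it uses a simple average hash on tiny, already grayscaled and scaled pixel grids, and the helper names (`average_hash`, `hamming`, `group_by_hash`) are hypothetical. The hash functions and grouping logic in undouble itself may differ.

```python
from itertools import combinations


def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


def group_by_hash(hashes, threshold=0):
    """Group names whose hashes differ by at most `threshold` bits.

    Pairs within the threshold are merged with a small union-find,
    and only groups with more than one member are returned.
    """
    names = list(hashes)
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if hamming(hashes[a], hashes[b]) <= threshold:
            parent[find(b)] = find(a)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return [members for members in groups.values() if len(members) > 1]


# Toy 2x2 "images" (illustrative data only).
images = {
    "a.png": [[10, 200], [220, 30]],
    "a_copy.png": [[12, 198], [219, 33]],   # slightly different pixel values
    "b.png": [[200, 10], [30, 220]],        # inverted pattern
    "a_edit.png": [[10, 200], [220, 180]],  # one region brightened
}
hashes = {name: average_hash(pixels) for name, pixels in images.items()}

exact = group_by_hash(hashes, threshold=0)  # identical hashes only
near = group_by_hash(hashes, threshold=1)   # allow one differing bit
print(exact)
print(near)
```

With a threshold of 0 only images with the same hash end up together; raising the threshold also pulls in images whose hashes differ in a few bits.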

Blogs

Documentation pages

On the documentation pages you can find detailed information about how undouble works, with many examples.

Installation

It is advisable to create a new environment (e.g. with Conda).
conda create -n env_undouble python=3.8
conda activate env_undouble
Install undouble from PyPI
pip install undouble            # new install
pip install -U undouble         # update to latest version
Install directly from the GitHub source
pip install git+https://github.com/erdogant/undouble
Import Undouble package
from undouble import Undouble

Examples:

Example: Grouping similar images of the flower dataset

Example: List all file names that are identical

Example: Moving similar images in the flower dataset
# -------------------------------------------------
# >You are at the point of physically moving files.
# -------------------------------------------------
# >[7] similar images are detected over [3] groups.
# >[4] images will be moved to the [undouble] subdirectory.
# >[3] images will be copied to the [undouble] subdirectory.

# >[C]ontinue moving all files.
# >[W]ait in each directory.
# >[Q]uit
# >Answer: w
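
The counts in the prompt above follow from the rule that one image per group (the one with the largest resolution) is copied and the rest are moved. The sketch below reproduces that bookkeeping only; it does not touch the file system, the function name `plan_moves` and the file names are hypothetical, and how undouble breaks ties between equally sized images is an assumption here (the first one listed is kept).

```python
def plan_moves(groups):
    """Split each group into one image to copy and the remaining images to move.

    `groups` is a list of groups; each group is a list of
    (filename, width, height) tuples.
    """
    to_copy, to_move = [], []
    for group in groups:
        # Largest resolution = largest pixel count; max() keeps the first on ties.
        keeper = max(group, key=lambda item: item[1] * item[2])
        to_copy.append(keeper[0])
        to_move.extend(name for name, _, _ in group if name != keeper[0])
    return to_copy, to_move


# Illustrative data: 7 images over 3 groups.
groups = [
    [("rose_1.png", 640, 480), ("rose_2.png", 1280, 960), ("rose_3.png", 320, 240)],
    [("tulip_1.png", 800, 600), ("tulip_2.png", 400, 300)],
    [("daisy_1.png", 1024, 768), ("daisy_2.png", 1024, 768)],
]

to_copy, to_move = plan_moves(groups)
total = len(to_copy) + len(to_move)
print(f"[{total}] similar images are detected over [{len(groups)}] groups.")
print(f"[{len(to_move)}] images will be moved.")
print(f"[{len(to_copy)}] images will be copied.")
```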

Example: Plot the image hashes

Example: Three different imports

The input can be the following three types:

* Path to directory
* List of file locations
* Numpy array containing images
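
As a rough illustration of the first two input types, a directory and a list of file locations can both be reduced to a flat list of image paths before any hashing happens. This is a minimal sketch, not undouble's own import code: `collect_image_paths` and the extension set are assumptions, and the NumPy array case is left out because it has no file paths to collect.

```python
from pathlib import Path

# Assumed set of extensions for illustration; the library's defaults may differ.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tiff", ".bmp"}


def collect_image_paths(source, extensions=IMAGE_EXTENSIONS):
    """Return a sorted list of existing image files.

    `source` is either a directory (searched recursively) or an
    iterable of file locations.
    """
    if isinstance(source, (str, Path)):
        candidates = Path(source).rglob("*")
    else:
        candidates = (Path(item) for item in source)
    return sorted(
        path
        for path in candidates
        if path.suffix.lower() in extensions and path.is_file()
    )
```

Calling `collect_image_paths("path/to/images")` walks the directory tree, while `collect_image_paths(["a.png", "b.jpg"])` simply filters the given list.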

Example: Finding identical mnist digits


Citation

Please cite in your publications if this is useful for your research (see citation).

Maintainers

Contribute

  • All kinds of contributions are welcome!
  • If you wish to buy me a Coffee for this work, it is very appreciated :)

Licence

See LICENSE for details.

Other interesting stuff
