Simple utility for extracting data from images

Small utility for retrieving data from figures. Inspired by the Java package of the same name.

Installation

The usual: pip install datathief.

Usage

Unlike the Java DataThief package and similar online tools, here the user manually annotates the figure with the data points of their choosing. This makes it more transparent how the data are being read and makes the results more reproducible. However, manual annotation becomes tedious when there are many data points.

If you want to extract a lot of data, or extract data from a continuous line, you are better off using the original Java DataThief package, or one of the many online tools that do exactly this.

To use this tool, first annotate the plot by adding a single pixel at the start and another at the end of the x-axis, in a specified color that does not exist anywhere else in the image (default color: pure blue). Do the same for the y-axis (default color: pure red). Then add one pixel for each data point you wish to extract (default color: pure green). The datathief function will then return the x and y coordinates of each data point. It will warn you if too many or too few pixels are detected.
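To make the idea concrete, here is a minimal sketch of how marker pixels could be turned into data coordinates. It is not the datathief implementation: the helper names find_pixels and pixels_to_data are hypothetical, and the sketch assumes an RGB image array, exact color matches, and linear axes whose two marker pixels sit at the axis limits.

```python
import numpy as np

# Default marker colors described above (RGB). All names in this sketch are
# hypothetical and are not taken from the datathief source code.
BLUE = (0, 0, 255)   # x-axis start/end markers
RED = (255, 0, 0)    # y-axis start/end markers
GREEN = (0, 255, 0)  # data-point markers


def find_pixels(img, color):
    """Return (rows, cols) of every pixel that exactly matches `color`."""
    mask = np.all(img[:, :, :3] == np.array(color), axis=-1)
    return np.nonzero(mask)


def pixels_to_data(img, xlim, ylim):
    """Map green marker pixels to data coordinates by linear interpolation."""
    _, x_cols = find_pixels(img, BLUE)
    y_rows, _ = find_pixels(img, RED)
    d_rows, d_cols = find_pixels(img, GREEN)
    if len(x_cols) != 2 or len(y_rows) != 2:
        raise ValueError("Expected exactly two marker pixels per axis")

    # Columns grow to the right, so the smaller column is the x-axis start.
    x0, x1 = x_cols.min(), x_cols.max()
    # Rows grow downward, so the larger row is the bottom (start) of the y-axis.
    y0, y1 = y_rows.max(), y_rows.min()

    x = xlim[0] + (d_cols - x0) / (x1 - x0) * (xlim[1] - xlim[0])
    y = ylim[0] + (d_rows - y0) / (y1 - y0) * (ylim[1] - ylim[0])
    order = np.argsort(x)  # report points from left to right
    return x[order], y[order]


# Synthetic 101x101 white "figure" with markers placed by hand.
img = np.full((101, 101, 3), 255, dtype=np.uint8)
img[100, 10] = BLUE   # x-axis start
img[100, 90] = BLUE   # x-axis end
img[90, 5] = RED      # y-axis start (bottom)
img[10, 5] = RED      # y-axis end (top)
img[50, 50] = GREEN   # first data point
img[30, 70] = GREEN   # second data point

x, y = pixels_to_data(img, xlim=[-10, 20], ylim=[0, 15])
print(x, y)  # x is [5.0, 12.5], y is [7.5, 11.25]
```

Exact color matching is why the marker colors must not appear anywhere else in the image: a stray pure-blue pixel would be counted as a third axis marker.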

For example, running this code:

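import datathief as dt

filename = 'du_fig1a_annotated.png'  # figure with the marker pixels already added
xlim = [-10, 20]  # x-axis values at the two x-axis marker pixels
ylim = [0, 15]  # y-axis values at the two y-axis marker pixels
data = dt.datathief(filename, xlim=xlim, ylim=ylim)  # extract the annotated data points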

On this input (NB: you might need to zoom in to see the individual pixels):

[Image: annotated input figure]

Extracts the data for this plot:

[Image: plot of the extracted data]

See the examples folder for more information. (Figure courtesy Du et al., https://www.medrxiv.org/content/10.1101/2020.02.19.20025452v4)

More questions? Email info@sciris.org.
