
svgdigitizer is a Python library and command line tool to recover the measured data underlying plots in scientific publications.

svgdigitizer

License: GPL 3.0 or later

Extract (x,y) data points from SVG files


svgdigitizer recovers the data behind a curve plotted in a 2D coordinate system, as typically found in figures of scientific publications. The data is extracted from a specifically prepared Scalable Vector Graphics (SVG) file, either through the command line interface or through the Python API. The result can be stored as a frictionless datapackage (CSV and JSON), which can be used with unitpackage to access the plot's metadata or to build a database of such datapackages.

Features

The svgdigitizer has additional features compared to other plot digitizers, such as:

  • supports multiple y (x) values per x (y) value
  • uses splines for very precise retracing of distinct features
  • digitizes splines with specific sampling intervals
  • supports plots with distorted/skewed axes
  • extracts units from axis labels
  • reconstructs time series with a given scan rate
  • supports scale bars
  • supports scaling factors
  • extracts metadata associated with the plot in the SVG
  • saves data as frictionless datapackage (CSV + JSON) allowing for FAIR data usage
  • includes metadata in the datapackage
  • Python API to interact with the retraced data
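To illustrate what digitizing a spline at a specific sampling interval can mean, the following is a minimal sketch that samples one cubic Bézier segment at equally spaced x values. It is not taken from svgdigitizer's implementation. The function names, the bisection approach, and the assumption that x increases monotonically along the segment are all chosen for illustration only.

```python
def bezier_point(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier segment at parameter t in [0, 1]."""
    s = 1.0 - t
    x = s**3 * p0[0] + 3 * s**2 * t * p1[0] + 3 * s * t**2 * p2[0] + t**3 * p3[0]
    y = s**3 * p0[1] + 3 * s**2 * t * p1[1] + 3 * s * t**2 * p2[1] + t**3 * p3[1]
    return x, y


def sample_at_x_interval(p0, p1, p2, p3, interval, tol=1e-9):
    """Sample the segment at equally spaced x values.

    Assumption (illustrative only): x increases monotonically with t,
    so each target x can be located by bisection on t.
    """
    x_start, x_end = p0[0], p3[0]
    # Small offset guards against floating point round-off in the division.
    steps = int((x_end - x_start) / interval + 1e-9)
    samples = []
    for i in range(steps + 1):
        target = x_start + i * interval
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if bezier_point(p0, p1, p2, p3, mid)[0] < target:
                lo = mid
            else:
                hi = mid
        samples.append(bezier_point(p0, p1, p2, p3, (lo + hi) / 2))
    return samples


# Hypothetical control points of a single arch-shaped segment.
points = sample_at_x_interval((0, 0), (1, 2), (2, 2), (3, 0), interval=0.5)
```

A smaller interval yields more points along the same curve, which is why a traced spline can resolve sharp features more finely than a fixed set of clicked points.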

Refer to our documentation for more details.
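As a rough illustration of reconstructing a time series from a scan rate, the sketch below assumes a measurement in which the x-axis quantity (here a potential) is swept at a constant rate, so the elapsed time between two consecutive points is the absolute change in potential divided by the scan rate. The function name and the example values are hypothetical, and svgdigitizer's own implementation may differ.

```python
def reconstruct_time(potentials, scan_rate):
    """Return the elapsed time at each point, assuming a constant sweep rate.

    potentials: sequence of x values in the order they were traced (e.g. in V)
    scan_rate:  sweep rate in the same unit per second (e.g. in V/s)
    """
    if scan_rate <= 0:
        raise ValueError("scan_rate must be positive")
    if not potentials:
        return []
    times = [0.0]
    for previous, current in zip(potentials, potentials[1:]):
        # Time advances with the distance travelled along the x axis,
        # regardless of the sweep direction.
        times.append(times[-1] + abs(current - previous) / scan_rate)
    return times


# Hypothetical forward and backward sweep between 0 V and 1 V at 0.05 V/s.
potentials = [0.0, 0.5, 1.0, 0.5, 0.0]
times = reconstruct_time(potentials, scan_rate=0.05)
```

Because the time keeps increasing when the sweep reverses, this also shows how a curve with multiple y values per x value maps onto a single-valued time series.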

Installation

This package is available on PyPI and can be installed with pip:

pip install svgdigitizer

The package is also available on conda-forge and can be installed with conda:

conda install -c conda-forge svgdigitizer

or mamba:

mamba install -c conda-forge svgdigitizer

Please consult our documentation for more detailed installation instructions.

Command Line Interface

The CLI creates SVG files from PDFs and digitizes the prepared SVG files. For certain plot types, dedicated commands recover additional kinds of metadata. Refer to the CLI documentation for more information.

$ svgdigitizer
Usage: svgdigitizer [OPTIONS] COMMAND [ARGS]...

  The svgdigitizer suite.

Options:
  --help  Show this message and exit.

Commands:
  cv        Digitize a cyclic voltammogram and create a frictionless datapackage.
  digitize  Digitize a 2D plot.
  figure    Digitize a figure with units on the axis and create a frictionless datapackage.
  paginate  Render PDF pages as individual SVG files with linked PNG images.
  plot      Display a plot of the data traced in an SVG.

$ svgdigitizer figure doc/files/others/looping_scan_rate.svg --sampling-interval 0.01
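A frictionless datapackage pairs a CSV file holding the data with a JSON descriptor holding the metadata. The following sketch shows how such a pair could be read with the Python standard library alone. The file names, column names, and values are made up for illustration, and the `unit` key on each field is an assumption. The files written by svgdigitizer may be structured differently, so consult the documentation or use unitpackage for real output.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_example_package(directory):
    """Write a tiny, made-up CSV and JSON descriptor pair for illustration."""
    directory = Path(directory)
    (directory / "example.csv").write_text(
        "E,j\n0.0,0.1\n0.5,0.4\n", encoding="utf-8"
    )
    descriptor = {
        "resources": [
            {
                "name": "example",
                "path": "example.csv",
                "schema": {
                    "fields": [
                        # The "unit" key is an assumption for this sketch.
                        {"name": "E", "type": "number", "unit": "V"},
                        {"name": "j", "type": "number", "unit": "A / m2"},
                    ]
                },
            }
        ]
    }
    descriptor_path = directory / "example.json"
    descriptor_path.write_text(json.dumps(descriptor, indent=2), encoding="utf-8")
    return descriptor_path


def read_package(descriptor_path):
    """Return the rows of the first resource and the unit of each column."""
    descriptor_path = Path(descriptor_path)
    descriptor = json.loads(descriptor_path.read_text(encoding="utf-8"))
    resource = descriptor["resources"][0]
    units = {field["name"]: field.get("unit") for field in resource["schema"]["fields"]}
    # Resource paths are resolved relative to the descriptor file.
    csv_path = descriptor_path.parent / resource["path"]
    with open(csv_path, newline="", encoding="utf-8") as csv_file:
        rows = [
            {name: float(value) for name, value in row.items()}
            for row in csv.DictReader(csv_file)
        ]
    return rows, units


with tempfile.TemporaryDirectory() as tmp:
    rows, units = read_package(write_example_package(tmp))
```

Keeping the units in the descriptor rather than in the CSV header leaves the data file trivially machine-readable while the metadata stays attached to it.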

API

You can also use the svgdigitizer package directly from Python to access properties of the SVG or additional properties associated with the figure.

>>> from svgdigitizer.svg import SVG
>>> from svgdigitizer.svgplot import SVGPlot
>>> from svgdigitizer.svgfigure import SVGFigure


>>> figure = SVGFigure(SVGPlot(SVG(open('doc/files/others/looping.svg', 'rb')), sampling_interval=0.01))

Examples:

  • figure.df provides a dataframe of the digitized curve.
  • figure.plot() shows a plot of the digitized curve.
  • figure.metadata provides a dict with metadata of the original plot, such as the original units of the axes.
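The properties above can be combined in a small helper. This is a sketch rather than part of the package: the function name `summarize_figure` is hypothetical, and it uses only the classes shown above together with the `df` and `metadata` properties. The file is opened in a `with` block, and the properties are read inside it, so the file handle is closed afterwards regardless of when the SVG is parsed.

```python
def summarize_figure(svg_path, sampling_interval=0.01):
    """Return the digitized data and the metadata of a prepared SVG file."""
    # Imports are kept inside the function so that the sketch can be
    # defined even where svgdigitizer is not installed.
    from svgdigitizer.svg import SVG
    from svgdigitizer.svgplot import SVGPlot
    from svgdigitizer.svgfigure import SVGFigure

    with open(svg_path, "rb") as svg_file:
        figure = SVGFigure(
            SVGPlot(SVG(svg_file), sampling_interval=sampling_interval)
        )
        # Read the properties while the file is still open.
        data = figure.df
        metadata = figure.metadata
    return data, metadata
```

With svgdigitizer installed and run from a checkout of the repository, a call could look like `data, metadata = summarize_figure('doc/files/others/looping.svg')`.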

The svgdigitizer can be enhanced with submodules, which are designed to digitize specific plot types, such as the submodule electrochemistry.cv.

This submodule allows digitizing cyclic voltammograms commonly found in the field of electrochemistry.

>>> from svgdigitizer.svg import SVG
>>> from svgdigitizer.svgplot import SVGPlot
>>> from svgdigitizer.electrochemistry.cv import CV

>>> cv_svg = 'doc/files/mustermann_2021_svgdigitizer_1/mustermann_2021_svgdigitizer_1_f2a_blue.svg'
>>> cv = CV(SVGPlot(SVG(open(cv_svg, 'rb')), sampling_interval=0.01))

The resulting cv object has the same properties as the figure object above.
