
Extract, clean, transform, hyphenate and get metadata for ISBNs (International Standard Book Number).

Project description


Info

isbnlib is a (pure) Python library that provides several useful methods and functions to validate, clean, transform, hyphenate and get metadata for ISBN strings. It originated as the core of isbntools.

This short version is suitable to be included as a dependency in other projects. It has a straightforward setup and a very easy programmatic API.

Typical usage (as library):

#!/usr/bin/env python
import isbnlib
...

Main Functions

is_isbn10(isbn10like)

Validate as ISBN-10.

is_isbn13(isbn13like)

Validate as ISBN-13.

to_isbn10(isbn13)

Transform ISBN-13 to ISBN-10.

to_isbn13(isbn10)

Transform ISBN-10 to ISBN-13.

canonical(isbnlike)

Keep only numbers and X. You will get strings like 9780321534965.

clean(isbnlike)

Clean ISBN (only legal characters).

notisbn(isbnlike, level='strict')

Check with the goal of invalidating an ISBN-like string.

get_isbnlike(text, level='normal')

Extract all substrings that seem like ISBNs (very useful for scraping).

get_canonical_isbn(isbnlike, output='bouth')

Extract ISBNs and transform them to the canonical form.

EAN13(isbnlike)

Transform an isbnlike string into an EAN13 number (validated canonical ISBN-13).

info(isbn)

Get language or country assigned to this ISBN.

mask(isbn, separator='-')

Mask (hyphenate) a canonical ISBN.

meta(isbn, service='default', cache='default')

Gives you the main metadata associated with the ISBN. As the service parameter you can use:

  • 'wcat' uses worldcat.org (no key is needed),

  • 'goob' uses the Google Books service (no key is needed),

  • 'isbndb' uses the isbndb.com service (an API key is needed),

  • 'openl' uses the OpenLibrary.org API (no key is needed),

  • 'merge' uses a merged record of the wcat and goob records (no key is needed) and is the default option.

You can get an API key from the isbndb.com service and enter API keys with config.add_apikey(service, apikey). The output can be formatted as bibtex, msword, endnote, refworks, opf or json (BibJSON) bibliographic formats with isbnlib.registry.bibformatters. cache only allows two values: ‘default’ or None. You can change the kind of cache by using isbnlib.registry.set_cache (see below).

editions(isbn)

Return the list of ISBNs of editions related to this ISBN.

isbn_from_words(words)

Return the most probable ISBN from a list of words (for your geographic area).

goom(words)

Return a list of references from Google Books when the words match multiple books.

doi(isbn)

Return a DOI’s ISBN-A from an ISBN-13.

ren(filename)

Rename a file using metadata from an ISBN in its filename.
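The cleaning, validation, conversion and extraction functions listed above can be pictured with a small standard-library sketch. This is not isbnlib's implementation: the helper names (strip_to_canonical, valid_isbn10, valid_isbn13, isbn10_to_isbn13, extract_isbn13s) and the regular expression are assumptions made for illustration. Only the check-digit rules are the standard ISBN ones.

```python
import re

# Loose pattern for ISBN-like runs: a digit, then 8-15 digits/hyphens/spaces,
# then a final digit or X. This is an illustrative assumption, not isbnlib's regex.
ISBN_LIKE = re.compile(r"\b\d[\d\- ]{8,15}[\dX]\b")


def strip_to_canonical(isbnlike):
    """Keep only digits and X, e.g. '978-0-321-53496-5' -> '9780321534965'."""
    return re.sub(r"[^0-9X]", "", isbnlike.upper())


def valid_isbn10(code):
    """ISBN-10 rule: sum of value * weight (10 down to 1) is divisible by 11."""
    if not re.fullmatch(r"\d{9}[\dX]", code):
        return False
    values = [10 if ch == "X" else int(ch) for ch in code]
    return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0


def valid_isbn13(code):
    """ISBN-13 rule: digits weighted 1, 3, 1, 3, ... sum to a multiple of 10."""
    if not re.fullmatch(r"97[89]\d{10}", code):
        return False
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(code))
    return total % 10 == 0


def isbn10_to_isbn13(isbn10):
    """Prefix 978, drop the old check digit and compute the ISBN-13 one."""
    body = "978" + isbn10[:9]
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(body))
    return body + str((10 - total % 10) % 10)


def extract_isbn13s(text):
    """Find ISBN-like substrings and return the valid ones as unique ISBN-13s."""
    found = []
    for candidate in ISBN_LIKE.findall(text):
        code = strip_to_canonical(candidate)
        if valid_isbn10(code):
            code = isbn10_to_isbn13(code)
        if valid_isbn13(code) and code not in found:
            found.append(code)
    return found


text = "See ISBN 978-0-321-53496-5 or the older 0-321-53496-4."
print(extract_isbn13s(text))  # ['9780321534965']
```

Both spellings in the sample text describe the same book, so the sketch reports a single ISBN-13. For real work, prefer the library functions, which handle many more edge cases.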

Install

From the command line enter (in some cases you have to precede the command with sudo):

$ pip install isbnlib

or:

$ easy_install isbnlib

For Devs

API’s Main Namespaces

In the namespace isbnlib you have access to the core methods: is_isbn10, is_isbn13, to_isbn10, to_isbn13, canonical, clean, notisbn, get_isbnlike, get_canonical_isbn, mask, meta, info, editions, ren, doi, EAN13 and isbn_from_words.

The exceptions raised by these methods can all be caught using ISBNLibException.

You can extend the lib by using the classes and functions exposed in namespace isbnlib.dev, namely:

  • WEBService a class that handles the access to web services (just by passing a URL) and supports gzip. You can subclass it to extend the functionality… but probably you don’t need to use it! It is used in the next class.

  • WEBQuery a class that uses WEBService to retrieve and parse data from a web service. You can build a new provider of metadata by subclassing this class. Its main methods allow passing custom functions (handlers) that specialize them to specific needs (data_checker and parser).

  • Metadata a class that structures, cleans and ‘validates’ records of metadata. Its merge method implements a simple merging procedure for records from different sources. The main features of this class can be used through a call to the stdmeta function instead!

  • vias exposes several functions to make calls to services, just by passing the name and a pointer to the service’s query function. vias.parallel allows threaded calls, but it doesn’t implement throttling! You can use vias.serial to make serial calls and vias.multi to use several cores. The default is vias.serial.

  • bouth23 a small module that makes it possible for the code to run in both Python 2 and Python 3.

The exceptions raised by these methods can all be caught using ISBNLibDevException. You shouldn’t raise this exception in your code; only raise the specific exceptions exposed in isbnlib.dev whose names end in Error.
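To make the difference between serial and threaded calls in vias more concrete, here is a minimal standard-library sketch of the idea: each service is given as a name plus a query function, and the results come back keyed by name. The function names call_serial and call_parallel and the stand-in query functions are hypothetical; this is not the code of isbnlib.dev.vias, and the records are made-up placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def call_serial(named_tasks, arg):
    """Call each (name, function) pair in turn and return {name: result}."""
    return {name: task(arg) for name, task in named_tasks}


def call_parallel(named_tasks, arg):
    """Call every function in its own thread and return {name: result}.

    Like the threaded mode described above, this does no throttling.
    """
    workers = max(1, len(named_tasks))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(task, arg) for name, task in named_tasks}
        return {name: future.result() for name, future in futures.items()}


# Stand-in query functions; real ones would call a web service.
def fake_wcat(isbn):
    return {"ISBN-13": isbn, "Title": "Example Title"}


def fake_goob(isbn):
    return {"ISBN-13": isbn, "Authors": ["Jane Doe"]}


tasks = [("wcat", fake_wcat), ("goob", fake_goob)]
serial_results = call_serial(tasks, "9780321534965")
parallel_results = call_parallel(tasks, "9780321534965")
print(serial_results == parallel_results)  # True
```

Threads mostly help when the query functions spend their time waiting on the network. For a single fast request, the serial version is simpler and often just as quick.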

In isbnlib.dev.helpers you can find several methods that we found very useful; some of them are only used in isbntools (an app and framework that uses isbnlib).

With isbnlib.registry you can change the metadata service to be used by default (setdefaultservice), add a new service (add_service), access bibliographic formatters for metadata (bibformatters), set the default formatter (setdefaultbibformatter), add new formatters (add_bibformatter) and set a new cache (set_cache) (e.g. to switch off the cache, use set_cache(None)). The cache only works for calls through isbnlib.meta. These changes only work for the ‘current session’, so they should always be made before calling other methods.
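The effect of using a cache versus switching it off with None can be sketched with a small wrapper around a lookup function. The class CachedLookup, its key format and the stand-in query function are assumptions for illustration only; they do not describe how isbnlib stores or keys its cache.

```python
class CachedLookup:
    """Wrap a query function with an optional dict-like cache."""

    def __init__(self, query, cache=None):
        self.query = query
        self.cache = cache  # None means "no caching at all"
        self.calls = 0      # how many times the real query ran

    def __call__(self, isbn, service="default"):
        key = isbn + "/" + service
        if self.cache is not None and key in self.cache:
            return self.cache[key]
        self.calls += 1
        record = self.query(isbn, service)
        if self.cache is not None:
            self.cache[key] = record
        return record


# Stand-in for a metadata query; a real one would contact a web service.
def fake_query(isbn, service):
    return {"ISBN-13": isbn, "service": service}


cached = CachedLookup(fake_query, cache={})
cached("9780321534965")
cached("9780321534965")
print(cached.calls)  # 1

uncached = CachedLookup(fake_query, cache=None)
uncached("9780321534965")
uncached("9780321534965")
print(uncached.calls)  # 2
```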

Finally, from isbnlib.config you can read and set configuration options: change timeouts with setsocketstimeout and setthreadstimeout, access API keys with apikeys and add a new one with add_apikey, and access and set generic and user-defined options with options and set_option.

Merge Metadata

The original quality of metadata at the several services is not very good! If you need high-quality metadata in your app, the only solution is to poll several providers, merge their records, and do a lot of cleaning and standardization for fields like Authors and Publisher.

A merge provider is now the default in meta. It gives priority to wcat but overwrites the Authors field with the value from goob. It uses the merge method of Metadata and, by default, serial calls to services (faster for single calls to services over fast internet connections). You can change that by using one of vias’s other methods (e.g. isbnlib.config.set_option('VIAS_MERGE', 'multi')).
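The merging rule described above (one source has priority, but Authors comes from the other) can be sketched with plain dictionaries. The function merge_records, its overrides parameter and the sample records are hypothetical and made up for this example; this is not the Metadata.merge implementation.

```python
def merge_records(primary, secondary, overrides=("Authors",)):
    """Merge two metadata records.

    Non-empty fields of `primary` win, gaps are filled from `secondary`,
    and the fields named in `overrides` are taken from `secondary`
    whenever it has a value for them.
    """
    merged = dict(secondary)
    merged.update({key: value for key, value in primary.items() if value})
    for field in overrides:
        if secondary.get(field):
            merged[field] = secondary[field]
    return merged


# Made-up records standing in for the wcat and goob answers.
wcat = {"Title": "Example Title", "Authors": ["Doe, J."],
        "Publisher": "", "Year": "2008"}
goob = {"Title": "Example title: a subtitle", "Authors": ["Jane Doe"],
        "Publisher": "Example Press", "Year": ""}

merged = merge_records(wcat, goob)
print(merged["Title"])    # Example Title
print(merged["Authors"])  # ['Jane Doe']
```

Here the title and year come from the priority record, the empty publisher is filled from the second record, and the authors are overwritten as the text describes.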

Caveats

  1. These classes are optimized for one-off calls to services and not for batch calls. However, it is very easy to build a high-volume processing system using these classes (use vias.multi) and Redis.

  2. If you inspect the library, you will see that there are a lot of private modules (their names start with ‘_’). These modules should not be accessed directly since, with high probability, your program will break with a future version of the library!


Read isbnlib code in a very structured way at sourcegraph.

