programmatic curation of concepticon-data

pyconcepticon

Tooling to access and curate Concepticon data.

Installation

pyconcepticon can be installed from PyPI by running

pip install pyconcepticon

Note that pyconcepticon requires a clone or export of the concepticon data repository.

Usage

To use pyconcepticon you must have a local copy of the Concepticon data, i.e. one of

  • the sources of a released version, as provided in the Downloads section of a release,
  • a clone of this repository (or your personal fork of it), or
  • a released version of the data as archived on Zenodo.

Python API

Assuming you have downloaded release 1.2.0 and unpacked the sources to a directory clld-concepticon-data-41d2bf0, you can access the data as follows:

>>> from pyconcepticon import Concepticon
>>> api = Concepticon('clld-concepticon-data-41d2bf0')
>>> conceptlist = list(api.conceptlists.values())[0]
>>> conceptlist.author
'Perrin, Loïc-Michel'
>>> conceptlist.tags
['annotated']
>>> len(conceptlist.concepts)
110
>>> list(conceptlist.concepts.values())[0]
Concept(
    id='Perrin-2010-110-1', number='1', concepticon_id='1906', concepticon_gloss='SOUR', gloss=None, 
    english='ACID', attributes={'german': 'sauer', 'french': 'acide'}, 
    _list=Conceptlist(
        _api=<pyconcepticon.api.Concepticon object at 0x7f31693be518>, 
        id='Perrin-2010-110', author='Perrin, Loïc-Michel', year=2010, list_suffix='', items=110, 
        tags=['annotated'], source_language=['english', 'french', 'german'], 
        target_language='Global', 
        url='https://journals.dartmouth.edu/cgi-bin/WebObjects/Journals.woa/xmlpage/1/article/353?htmlOnce=yes', 
        refs=['Perrin2010'], pdf=['Perrin2010'], 
        note='This list was used as an initial questionnaire for colexification studies on a world-wide sample of languages.', 
        pages='276f', alias=[], local=False))
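
The attributes shown above can be combined into a quick overview of all concept lists. The following is a minimal sketch that relies only on the `author`, `tags` and `concepts` attributes seen in the session above. The helper name `summarize_conceptlists` and the stand-in objects are hypothetical and not part of pyconcepticon; with a real local copy of the data you would pass `api.conceptlists` instead of `demo`.

```python
from types import SimpleNamespace


def summarize_conceptlists(conceptlists):
    """Return (id, author, number of concepts, tags) rows, sorted by list ID.

    `conceptlists` is a mapping from list ID to objects exposing `author`,
    `tags` and `concepts`, such as `api.conceptlists` in the session above.
    """
    rows = []
    for list_id, conceptlist in sorted(conceptlists.items()):
        rows.append(
            (list_id, conceptlist.author, len(conceptlist.concepts), list(conceptlist.tags))
        )
    return rows


# Stand-in data (hypothetical) so the sketch runs without the Concepticon data.
demo = {
    "Perrin-2010-110": SimpleNamespace(
        author="Perrin, Loïc-Michel",
        tags=["annotated"],
        concepts={"Perrin-2010-110-1": object()},
    ),
}

for row in summarize_conceptlists(demo):
    print(row)
```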

Command line interface

Having installed pyconcepticon, you can also query concept lists directly from the terminal with the concepticon command. To learn about the functionality it provides, run

$ concepticon -h
usage: concepticon [-h] [--log-level LOG_LEVEL] [--repos REPOS]
                   [--repos-version REPOS_VERSION]
                   COMMAND ...

optional arguments:
  -h, --help            show this help message and exit
  --log-level LOG_LEVEL
                        log level [ERROR|WARN|INFO|DEBUG] (default: 20)
  --repos REPOS         clone of concepticon/concepticon-data
  --repos-version REPOS_VERSION
                        version of repository data. Requires a git clone!
                        (default: None)

available commands:
  Run "COMAMND -h" to get help for a specific command.

  COMMAND
    attributes          Print all columns in concept lists that contain
                        surplus information.
...

To learn about individual subcommands, run concepticon COMMAND -h, e.g.

$ concepticon intersection -h
usage: concepticon intersection [-h] CONCEPTLIST [CONCEPTLIST ...]

Compute the intersection of concepts for a number of concept lists.

Notes
-----
This takes concept relations into account by searching for each concept
set for broader concept sets in the depth of two edges on the network. If
one concept A in one list is broader than concept B in another list, the
concept A will be retained, and this will be marked in output. If two lists
share the same broader concept, they will also be retained, but only, if
none of the narrower concepts match. As a default we use a depth of 2 for
the search.

positional arguments:
  CONCEPTLIST  Path to (or ID of) concept list in TSV format

optional arguments:
  -h, --help   show this help message and exit

An example of the intersection between two lists looks as follows:

$ concepticon --repos=clld-concepticon-data-41d2bf0 intersection Swadesh-1955-100 Swadesh-1952-200

This yields 93 lines of output; an excerpt looks as follows:

 69  SKIN                    [763 ] SKIN (HUMAN) (1, Swadesh-1952-200)
 70  SLEEP                   [1585]
 71  SMALL                   [1246]
 72  SMOKE (EXHAUST)         [778 ]

The output can be interpreted as follows: The first number is the running index of the item in the intersection (items are ordered alphabetically by Concepticon gloss). The Concepticon gloss follows. If the gloss is preceded by an asterisk, the mapping was not complete because it involves concept relations; the alternative concept sets are then listed at the end of the line. The number in square brackets is the Concepticon concept set ID.
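
If you want to post-process this output, the line layout described above can be parsed with a regular expression. This is only a sketch based on the four sample lines shown here: the exact column layout, and the assumption that the asterisk sits directly before the gloss, are inferred rather than taken from the pyconcepticon source, and `parse_intersection_line` is a hypothetical helper.

```python
import re

# Layout inferred from the sample lines above:
# running number, optional "*", gloss, "[ID]", optional alternatives.
LINE = re.compile(
    r"^\s*(?P<number>\d+)\s+"
    r"(?P<star>\*)?\s*"
    r"(?P<gloss>.+?)\s+"
    r"\[(?P<id>\d+)\s*\]"
    r"\s*(?P<alternatives>.*)$"
)


def parse_intersection_line(line):
    """Split one output line into its parts; return None if it does not match."""
    match = LINE.match(line)
    if match is None:
        return None
    return {
        "number": int(match.group("number")),
        "gloss": match.group("gloss"),
        "concepticon_id": match.group("id"),
        # True when the gloss was marked with an asterisk (incomplete mapping).
        "incomplete": match.group("star") is not None,
        "alternatives": match.group("alternatives").strip(),
    }


sample = " 69  SKIN                    [763 ] SKIN (HUMAN) (1, Swadesh-1952-200)"
print(parse_intersection_line(sample))
```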

You can use the same technique with the union command to obtain the union of two concept lists.
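
For comparison, here is what a plain set-based intersection and union over Concepticon IDs could look like through the Python API. Unlike the CLI commands, this sketch ignores concept relations (no search for broader concept sets), so its results can differ from the command output. It assumes only the `concepts` mapping and the `concepticon_id` attribute seen in the Python session above; all function names and the stand-in data are hypothetical.

```python
from types import SimpleNamespace


def concept_set_ids(conceptlist):
    """Collect the Concepticon concept set IDs a concept list is mapped to."""
    return {
        concept.concepticon_id
        for concept in conceptlist.concepts.values()
        if concept.concepticon_id  # skip concepts without a mapping
    }


def plain_intersection(*conceptlists):
    """IDs present in every list; concept relations are NOT considered."""
    id_sets = [concept_set_ids(cl) for cl in conceptlists]
    return set.intersection(*id_sets) if id_sets else set()


def plain_union(*conceptlists):
    """IDs present in at least one list; concept relations are NOT considered."""
    return set().union(*(concept_set_ids(cl) for cl in conceptlists))


def _stub(*ids):
    # Hypothetical stand-in for a concept list, so the sketch runs without data.
    return SimpleNamespace(
        concepts={str(i): SimpleNamespace(concepticon_id=cid) for i, cid in enumerate(ids)}
    )


list_a = _stub("763", "1585", None)
list_b = _stub("1585", "1246")
print(sorted(plain_intersection(list_a, list_b)))
print(sorted(plain_union(list_a, list_b)))
```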

To create a user interface which allows you to explore concepticon concepts in the browser, run

$ concepticon --repos=clld-concepticon-data-41d2bf0 app

Configuration

Both the Python API and the CLI can look up the location of the data from a cldfcatalog config file, under the key concepticon.

Such a config file (and the repository clone) can be created automatically by installing cldfbench and running cldfbench config.
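
To illustrate the lookup, the sketch below reads the concepticon key from an INI-style file with Python's standard configparser. The file format, the section name `clones`, the file name `catalog.ini` and the helper `lookup_repos` are assumptions made for illustration; check the cldfcatalog documentation for the actual file location and layout. The key name concepticon is the one mentioned above.

```python
import configparser
import tempfile
from pathlib import Path


def lookup_repos(config_path, key="concepticon", section="clones"):
    """Return the configured repository path, or None if it is not set.

    The section name "clones" is an assumption about the config layout.
    """
    parser = configparser.ConfigParser()
    parser.read(config_path, encoding="utf-8")  # missing files are ignored
    value = parser.get(section, key, fallback=None)
    return Path(value) if value else None


# Demo with a temporary, hypothetical config file.
with tempfile.TemporaryDirectory() as tmp:
    config_file = Path(tmp) / "catalog.ini"
    config_file.write_text(
        "[clones]\nconcepticon = /path/to/concepticon-data\n", encoding="utf-8"
    )
    print(lookup_repos(config_file))
```

The returned path could then be passed to `Concepticon(...)` in the same way as the directory name in the Python API example.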
