Interface to WormBase (www.wormbase.org) curation data, including literature management and NLP functions

Project description

WBtools

Interface to WormBase curation database and Text Mining functions

Access WormBase paper corpus information by loading PDF files (converted to plain text) and curation information from the WormBase database. The package also exposes text mining functions that operate on the papers' full text.

Installation

pip install wbtools
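
To confirm that the install succeeded, one option is to query the installed distribution's version through the standard library. This is a minimal sketch; `installed_version` is a helper name introduced here for illustration and is not part of wbtools.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if wbtools is installed, otherwise None.
print(installed_version("wbtools"))
```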

Usage example

Get sentences from a WormBase paper

from wbtools.literature.corpus import CorpusManager

paper_id = "000050564"
cm = CorpusManager()
cm.load_from_wb_database(db_name="wb_dbname", db_user="wb_dbuser", db_password="wb_dbpasswd", db_host="wb_dbhost",
                         paper_ids=[paper_id])
sentences = cm.get_paper(paper_id).get_text_docs(split_sentences=True)
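
Once sentences are extracted, a simple text mining step is to keep only those that mention a term of interest. The sketch below assumes that `get_text_docs(split_sentences=True)` returns a list of plain strings, which is not stated here; `find_sentences_mentioning` and the sample sentences are hypothetical and stand in for real paper text.

```python
import re
from typing import Iterable, List


def find_sentences_mentioning(sentences: Iterable[str], term: str) -> List[str]:
    """Return the sentences containing `term` as a whole word, ignoring case."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


# Made-up sentences standing in for the output of get_text_docs(split_sentences=True).
sample_sentences = [
    "Mutations in daf-16 shorten lifespan.",
    "We imaged animals at the L4 stage.",
    "DAF-16 localizes to the nucleus under stress.",
]
hits = find_sentences_mentioning(sample_sentences, "daf-16")
```

The word-boundary anchors keep a search for `daf-16` from also matching a longer token such as `daf-160`.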

Get the latest papers (up to 50) added to WormBase or modified in the last month

from wbtools.literature.corpus import CorpusManager
import datetime

# Start of the window: 30 days before now, approximating "the last month"
one_month_ago = datetime.datetime.now() - datetime.timedelta(days=30)

cm = CorpusManager()
cm.load_from_wb_database(db_name="wb_dbname", db_user="wb_dbuser", db_password="wb_dbpasswd", db_host="wb_dbhost",
                         from_date=one_month_ago, max_num_papers=50)
paper_ids = [paper.paper_id for paper in cm.get_all_papers()]

Get the latest 50 papers added to or modified in WormBase that have a final PDF version and have been flagged by the WB paper classification pipeline, excluding reviews and papers with temp files only (proofs)

from wbtools.literature.corpus import CorpusManager
import datetime

cm = CorpusManager()
cm.load_from_wb_database(db_name="wb_dbname", db_user="wb_dbuser", db_password="wb_dbpasswd", db_host="wb_dbhost",
                         from_date=datetime.datetime.now(), max_num_papers=50, exclude_not_autclass_flagged=True,
                         exclude_pap_types=['Review'], exclude_temp_pdf=True)
paper_ids = [paper.paper_id for paper in cm.get_all_papers()]
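
The examples above pass database credentials as string literals. A common alternative is to read them from environment variables so that they stay out of source control. This sketch builds the keyword arguments that `load_from_wb_database` takes in the examples above (`db_name`, `db_user`, `db_password`, `db_host`); the environment variable names and the `db_kwargs_from_env` helper are assumptions made for illustration and are not part of wbtools.

```python
import os
from typing import Dict, Mapping

# Hypothetical environment variable names; choose whatever suits your deployment.
ENV_TO_KWARG = {
    "WB_DB_NAME": "db_name",
    "WB_DB_USER": "db_user",
    "WB_DB_PASSWORD": "db_password",
    "WB_DB_HOST": "db_host",
}


def db_kwargs_from_env(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Map environment variables to db_* keyword arguments, failing on gaps."""
    missing = [var for var in ENV_TO_KWARG if var not in env]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {kwarg: env[var] for var, kwarg in ENV_TO_KWARG.items()}
```

With the variables set, a call could then look like `cm.load_from_wb_database(**db_kwargs_from_env(), max_num_papers=50)`.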

Download files

Source Distribution

wbtools-1.0.18.tar.gz (26.9 kB)

Uploaded Source

Built Distribution

wbtools-1.0.18-py3-none-any.whl (39.3 kB)

Uploaded Python 3

File details

Details for the file wbtools-1.0.18.tar.gz.

File metadata

  • Download URL: wbtools-1.0.18.tar.gz
  • Upload date:
  • Size: 26.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.21.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.51.0 CPython/3.8.5

File hashes

Hashes for wbtools-1.0.18.tar.gz
Algorithm Hash digest
SHA256 52dab5e371a7d9c6560891b0c5afb0eaa75a857b1fad80660ae2ea8e58dee6aa
MD5 c453b7dbe554a552819a840e1916b0be
BLAKE2b-256 d95b32931d45d34061301ceaa1ffb458991180c5b519d08ca32fea2536304e91
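
The SHA256 digest listed above can be checked against a downloaded copy of the file. This is a minimal sketch using only the standard library; `sha256_of_file` and `matches_published_hash` are helper names introduced here, and the path you pass is assumed to be wherever you saved the archive.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for wbtools-1.0.18.tar.gz.
EXPECTED_SHA256 = "52dab5e371a7d9c6560891b0c5afb0eaa75a857b1fad80660ae2ea8e58dee6aa"


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's digest equals the expected one."""
    return sha256_of_file(path) == expected.lower()
```

Reading in chunks keeps memory use flat regardless of file size; a mismatch suggests a corrupted or altered download.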

File details

Details for the file wbtools-1.0.18-py3-none-any.whl.

File metadata

  • Download URL: wbtools-1.0.18-py3-none-any.whl
  • Upload date:
  • Size: 39.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.21.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.51.0 CPython/3.8.5

File hashes

Hashes for wbtools-1.0.18-py3-none-any.whl
Algorithm Hash digest
SHA256 1d0a704e8791c059c6849469aeb8c4d15ef6573977d6d3100643ddee8d6cdec7
MD5 42be1192618df01d33a1dabb8f1dab9b
BLAKE2b-256 696fad8e84c8fcc5a3c492a63abadbcb9034bd3760c43fa9128c7087f17a9c64
