
Interface to WormBase (www.wormbase.org) curation data, including literature management and NLP functions


WBtools

Interface to WormBase curation database and Text Mining functions

Access WormBase paper corpus information by loading PDF files (converted to plain text) and curation information from the WormBase database. The package also exposes text mining functions that operate on the full text of papers.

Installation

pip install wbtools

Usage example

Get sentences from a WormBase paper

from wbtools.literature.corpus import CorpusManager

paper_id = "000050564"
cm = CorpusManager()
cm.load_from_wb_database(db_name="wb_dbname", db_user="wb_dbuser", db_password="wb_dbpasswd", db_host="wb_dbhost",
                         paper_ids=[paper_id])
sentences = cm.get_paper(paper_id).get_text_docs(split_sentences=True)
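
Once sentences are loaded, a common next step is to search them for a term of interest. The following is a minimal sketch, assuming that `get_text_docs(split_sentences=True)` returns a list of plain strings; the helper `find_sentences` and the sample sentences are hypothetical and not part of wbtools.

```python
import re
from typing import Iterable, List


def find_sentences(sentences: Iterable[str], keyword: str) -> List[str]:
    """Return the sentences containing `keyword` as a whole word (case-insensitive)."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


# Stand-in data for illustration; in practice, pass the `sentences` list
# obtained from the paper instead.
example_sentences = [
    "The daf-16 gene regulates lifespan in C. elegans.",
    "Worms were grown at 20 degrees on NGM plates.",
    "Loss of DAF-16 suppresses the dauer phenotype.",
]

matches = find_sentences(example_sentences, "daf-16")
for sentence in matches:
    print(sentence)
```

The word-boundary match keeps a search for `daf-1` from also matching `daf-16`, which matters for gene names that share a prefix.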

Get the latest papers (up to 50) added to WormBase or modified in the last month

from wbtools.literature.corpus import CorpusManager
import datetime

# Start of the time window: 30 days ago (used here as an approximation of one month)
one_month_ago = datetime.datetime.now() - datetime.timedelta(days=30)

cm = CorpusManager()
cm.load_from_wb_database(db_name="wb_dbname", db_user="wb_dbuser", db_password="wb_dbpasswd", db_host="wb_dbhost",
                         from_date=one_month_ago, max_num_papers=50)
paper_ids = [paper.paper_id for paper in cm.get_all_papers()]

Get the latest 50 papers added to or modified in WormBase that have a final PDF version and have been flagged by the WB paper classification pipeline, excluding reviews and papers that only have temporary files (proofs)

from wbtools.literature.corpus import CorpusManager
import datetime

cm = CorpusManager()
cm.load_from_wb_database(db_name="wb_dbname", db_user="wb_dbuser", db_password="wb_dbpasswd", db_host="wb_dbhost",
                         from_date=datetime.datetime.now(), max_num_papers=50, exclude_not_autclass_flagged=True,
                         exclude_pap_types=['Review'], exclude_temp_pdf=True)
paper_ids = [paper.paper_id for paper in cm.get_all_papers()]
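
The examples above pass the database credentials as literal strings. One way to avoid hardcoding them is to read them from environment variables, sketched below. The `db_*` argument names come from the examples above; the `WB_DB_*` variable names and the helper `db_kwargs_from_env` are hypothetical and chosen only for illustration.

```python
import os
from typing import Dict, Mapping


def db_kwargs_from_env(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build the db_* keyword arguments from environment variables.

    The WB_DB_* variable names are hypothetical, not defined by wbtools.
    """
    mapping = {
        "db_name": "WB_DB_NAME",
        "db_user": "WB_DB_USER",
        "db_password": "WB_DB_PASSWORD",
        "db_host": "WB_DB_HOST",
    }
    missing = [var for var in mapping.values() if var not in env]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {arg: env[var] for arg, var in mapping.items()}


# Demonstration with an explicit mapping instead of the real environment.
demo_env = {
    "WB_DB_NAME": "wb_dbname",
    "WB_DB_USER": "wb_dbuser",
    "WB_DB_PASSWORD": "wb_dbpasswd",
    "WB_DB_HOST": "wb_dbhost",
}
db_kwargs = db_kwargs_from_env(demo_env)

# Intended use (not executed here, since it needs a live database):
# cm.load_from_wb_database(**db_kwargs, max_num_papers=50)
```

Failing early with the list of missing variables makes a misconfigured environment easier to diagnose than a connection error raised later.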


