Interface to WormBase (www.wormbase.org) curation data, including literature management and NLP functions

WBtools

Interface to WormBase curation database and Text Mining functions

Access the WormBase paper corpus by loading PDF files (converted to plain text) and curation information from the WormBase database. The package also provides text mining functions that operate on the papers' full text.

Installation

pip install wbtools

Usage example

Get sentences from a WormBase paper

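from wbtools.literature.corpus import CorpusManager

# WormBase paper ID to load
paper_id = "000050564"
cm = CorpusManager()
# The db_* values are placeholders; replace them with your WormBase database credentials
cm.load_from_wb_database(db_name="wb_dbname", db_user="wb_dbuser", db_password="wb_dbpasswd", db_host="wb_dbhost",
                         paper_ids=[paper_id])
# Full text of the paper, split into sentences
sentences = cm.get_paper(paper_id).get_text_docs(split_sentences=True)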
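
Once the sentences are loaded, a common next step is to keep only those that mention a term of interest. The following is a minimal sketch, assuming that get_text_docs(split_sentences=True) yields plain Python strings; the helper name sentences_mentioning and the sample sentences are hypothetical and not part of wbtools.

```python
import re
from typing import Iterable, List


def sentences_mentioning(sentences: Iterable[str], keywords: Iterable[str]) -> List[str]:
    """Return the sentences containing any keyword as a whole word (case-insensitive)."""
    terms = [re.escape(k) for k in keywords]
    if not terms:
        return []
    # \b anchors keep "daf-16" from matching inside a longer token such as "daf-160"
    pattern = re.compile(r"\b(?:" + "|".join(terms) + r")\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


# Stand-in data (hypothetical); in practice pass the `sentences` list loaded from WormBase
example_sentences = [
    "The daf-16 mutant shows extended lifespan.",
    "Worms were grown at 20 degrees.",
    "Expression of DAF-16 was observed in the intestine.",
]
hits = sentences_mentioning(example_sentences, ["daf-16"])
```

The helper is independent of wbtools, so it can be tried on any list of strings before connecting to the database.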

Get the latest papers (up to 50) added to WormBase or modified in the last month

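from wbtools.literature.corpus import CorpusManager
import datetime

# Cutoff date: 30 days before now, approximating "the last month"
one_month_ago = datetime.datetime.now() - datetime.timedelta(days=30)

cm = CorpusManager()
cm.load_from_wb_database(db_name="wb_dbname", db_user="wb_dbuser", db_password="wb_dbpasswd", db_host="wb_dbhost",
                         from_date=one_month_ago, max_num_papers=50)
# Collect the IDs of the loaded papers
paper_ids = [paper.paper_id for paper in cm.get_all_papers()]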

Get the latest papers (up to 50) added to WormBase or modified that have a final PDF version and have been flagged by the WB paper classification pipeline, excluding reviews and papers with only temporary files (proofs)

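from wbtools.literature.corpus import CorpusManager
import datetime

# Cutoff date: 30 days before now, so that recently added or modified papers are included
one_month_ago = datetime.datetime.now() - datetime.timedelta(days=30)

cm = CorpusManager()
cm.load_from_wb_database(db_name="wb_dbname", db_user="wb_dbuser", db_password="wb_dbpasswd", db_host="wb_dbhost",
                         from_date=one_month_ago, max_num_papers=50, exclude_not_autclass_flagged=True,
                         exclude_pap_types=['Review'], exclude_temp_pdf=True)
# Collect the IDs of the papers that passed the filters
paper_ids = [paper.paper_id for paper in cm.get_all_papers()]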
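
All of the examples above pass database credentials as literal strings. One way to keep them out of source code is to read them from environment variables, as in this sketch. The variable names (WB_DB_NAME and so on) and the helper db_kwargs_from_env are hypothetical choices and are not defined by wbtools; only the db_name, db_user, db_password and db_host keyword names come from the examples above.

```python
import os
from typing import Dict, Mapping

# Hypothetical environment variable names mapped to the keyword arguments used above
_ENV_TO_KWARG = {
    "WB_DB_NAME": "db_name",
    "WB_DB_USER": "db_user",
    "WB_DB_PASSWORD": "db_password",
    "WB_DB_HOST": "db_host",
}


def db_kwargs_from_env(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build the db_* keyword arguments from environment variables.

    Raises KeyError naming every missing variable, so misconfiguration fails early.
    """
    missing = [name for name in _ENV_TO_KWARG if name not in environ]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {kwarg: environ[name] for name, kwarg in _ENV_TO_KWARG.items()}
```

With the variables set, a call could then look like cm.load_from_wb_database(**db_kwargs_from_env(), max_num_papers=50).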
