Use python to access all U.S. caselaw through the Harvard Law School Library.

PyClerk

Use python to access all U.S. caselaw through the Harvard Law School Library Caselaw Access Project.

PyClerk is a Python package that simplifies access to the Caselaw Access Project's Web API (CAPAPI). Its goal is to reduce the overhead of using CAPAPI: instead of reading the project's detailed but dense documentation, you import a Python package and try out a few lines of code.

The current alpha version provides this simplicity for the Cases endpoint, especially for the single case API. While this is somewhat limited functionality compared to the full CAPAPI, this initial release will enable users to access the core data of CAP without leaving Python.

Trying Out PyClerk

The completed project is hosted on the Python Package Index (PyPI).

Installing and trying PyClerk is easy. You'll need Python 3 and its package manager, pip. Some machines ship with Python 2 as the default because the OS depends on it; on these, you may need to replace pip with pip3 in the instructions below. In a terminal window, or in a Python virtual environment if you prefer:

  • install PyClerk: pip install pyclerk

  • start a Python console: python3

  • import pyclerk and create a PyClerk instance:

    import pyclerk
    pc = pyclerk.PyClerk()
    json, body = pc.cases.single_case(435800)  # This returns a specific case, with internal id # 435800
    
  • the final command returns two data structures: json and body

  • json is the reply content from the Caselaw Access Project API, provided the API returns a valid status code. If it doesn't, the appropriate error is raised. It holds the case metadata as a JSON object; the content of the case is one entry in that structure, represented as a bytestring.

  • body is the content of the case, parsed into a form that is easier to manipulate in Python.
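
The error behavior described above (a reply is handed back only when the API answers with a valid status code) can be pictured with a small sketch. This is not PyClerk's actual implementation: `CapApiError` and `check_status` are hypothetical names introduced here for illustration, and treating only 2xx codes as valid is an assumption.

```python
from http import HTTPStatus


class CapApiError(Exception):
    """Hypothetical error type; PyClerk's real exceptions may differ."""


def check_status(status_code: int) -> int:
    """Return the status code if it is 2xx, otherwise raise CapApiError."""
    # Assumption: any 2xx code counts as a valid reply.
    if 200 <= status_code < 300:
        return status_code
    try:
        # Look up the standard reason phrase, e.g. "Not Found" for 404.
        reason = HTTPStatus(status_code).phrase
    except ValueError:
        # Non-standard codes have no registered phrase.
        reason = "Unknown status"
    raise CapApiError(f"CAPAPI request failed: {status_code} {reason}")
```

In your own code, wrapping the single_case call in a try/except block is a reasonable way to handle a failed request, whichever exception type PyClerk actually raises.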

Getting More Advanced

  • from there, explore the docs to add new parameters, try new types of searches, and more!
  • or, browse the CAPAPI root to identify additional functionality that you need for your project.
  • you can also interface directly with the API through PyClerk without using the custom functions or classes:
    import pyclerk
    pc = pyclerk.PyClerk()

    # Write out a custom API query
    custom_url = "https://api.case.law/v1/YOUR CUSTOM REQUEST HERE"
    # Send that request to CAPAPI and get a text response
    response = pc.custom_endpoint.send_request(custom_url)
    # Parse that response into json and a custom body class
    json, body = pc.custom_endpoint.format_response(response)
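
When writing a custom query by hand, it is easy to get the escaping of spaces and special characters wrong. The sketch below builds the URL string with the standard library instead. `build_capapi_url` is a hypothetical helper, not part of PyClerk, and the `search` and `page_size` parameter names are assumptions that should be checked against the CAPAPI documentation.

```python
from typing import Mapping, Optional
from urllib.parse import urlencode

# Base URL taken from the custom-request example above.
BASE_URL = "https://api.case.law/v1/"


def build_capapi_url(endpoint: str, params: Optional[Mapping[str, object]] = None) -> str:
    """Join the base URL, an endpoint name, and URL-encoded query parameters."""
    url = f"{BASE_URL}{endpoint.strip('/')}/"
    if params:
        # urlencode escapes spaces and special characters in the values.
        url += "?" + urlencode(params)
    return url


# Parameter names here are assumptions, used only for illustration.
custom_url = build_capapi_url("cases", {"search": "habeas corpus", "page_size": 5})
```

The resulting string can be passed to pc.custom_endpoint.send_request in place of the hand-written URL.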

Expanding PyClerk

PyClerk is still under active development. That means you might find a bug or identify new functionality your project needs. The Caselaw Access Project might also update their API to a new version or change various functionality.

  • If you find a bug, please file an issue here.
  • If you need a new feature, you can file a feature request as an issue, or you can implement it yourself! Just fork the project, add the feature, and submit a pull request. I'd love some help.
  • If the problem is with the CAPAPI itself, please let them know here.

Documentation

Documentation is super important to this project--the whole goal is ease of use for new coders. That requires good documentation!

Rendered versions of the documentation are available through ReadTheDocs. They include both high-level descriptions and overviews (like the installation and first-use instructions above) and rendered versions of the docstrings that accompany the classes and functions in the package.

To rebuild the documentation:

  • Generate latest raw API docs:

sphinx-apidoc -e -o docs/source/api pyclerk

sphinx-apidoc -e -o docs/source/api/endpoints pyclerk/endpoint_types

  • Build the docs: make html
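
If you rebuild the docs often, the steps above can be wrapped in a small script. This is only a sketch: `rebuild_docs` is a hypothetical helper, it assumes the sphinx-apidoc commands are run from the project root, and it assumes the Sphinx Makefile lives in docs/ (the instructions above do not say where make html should be run).

```python
import subprocess
from typing import List, Optional, Tuple

Step = Tuple[List[str], Optional[str]]  # (command, working directory)


def rebuild_docs(dry_run: bool = True, make_dir: str = "docs") -> List[Step]:
    """Return the doc-build steps; run them too unless dry_run is True."""
    steps: List[Step] = [
        # Generate the latest raw API docs (run from the project root).
        (["sphinx-apidoc", "-e", "-o", "docs/source/api", "pyclerk"], None),
        (["sphinx-apidoc", "-e", "-o", "docs/source/api/endpoints",
          "pyclerk/endpoint_types"], None),
        # Build the HTML docs; make_dir is an assumed Makefile location.
        (["make", "html"], make_dir),
    ]
    if not dry_run:
        for command, cwd in steps:
            # check=True stops at the first failing command.
            subprocess.run(command, cwd=cwd, check=True)
    return steps


plan = rebuild_docs(dry_run=True)  # inspect the commands without running them
```

Calling rebuild_docs(dry_run=False) runs the three commands in order and raises subprocess.CalledProcessError if any of them fails.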

Future Growth

The obvious case for future growth is the inclusion of all available endpoints, such as:

  • bulk
  • citations
  • courts
  • jurisdictions
  • ngrams
  • reporters
  • user_history
  • volumes
  • and any others CAP may choose to unveil.

Additionally, as we better define the best uses for this kind of data, PyClerk should grow to include pipelines for processing the API data into the formats users want most. This might be in line with some of the sample processing functions I've outlined, or it could be something end users create that I could never have imagined.

PyClerk will also probably need an update whenever CAP decides to move to v2 of their API.

Areas in the source code ripe for expansion are marked with #FUTURE.
