Utility functions to work with TEI/XML documents
acdh-tei-pyutils
Utility functions to work with TEI documents
install
run pip install acdh-tei-pyutils
usage
some examples of how to use this package
parse an XML/TEI document from a URL, a string, or a file:
from acdh_tei_pyutils.tei import TeiReader
doc = TeiReader("https://raw.githubusercontent.com/acdh-oeaw/acdh-tei-pyutils/main/acdh_tei_pyutils/files/tei.xml")
print(doc.tree)
>>> <Element {http://www.tei-c.org/ns/1.0}TEI at 0x7ffb926f9c40>
doc = TeiReader("./acdh_tei_pyutils/files/tei.xml")
doc.tree
>>> <Element {http://www.tei-c.org/ns/1.0}TEI at 0x7ffb926f9c40>
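The root element printed above carries the TEI namespace in its tag, so queries against the tree have to be namespace-aware. The following is a minimal sketch of that idea using only Python's standard library, not this package's API; the SAMPLE document and the tei prefix mapping are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Prefix mapping used only in the query below (the prefix name is our choice).
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical, minimal TEI document held in a string.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title level="a">A sample letter</title>
      </titleStmt>
    </fileDesc>
  </teiHeader>
</TEI>"""

root = ET.fromstring(SAMPLE)

# The namespace is part of the tag name, as in the output shown above.
print(root.tag)  # {http://www.tei-c.org/ns/1.0}TEI

# Without the namespace mapping this query would find nothing.
title = root.find(".//tei:title[@level='a']", TEI_NS)
print(title.text)  # A sample letter
```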
write the current XML/TEI tree object to a file:
doc.tree_to_file("out.xml")
>>> 'out.xml'
see acdh_tei_pyutils/cli.py for further examples
command line scripts
To batch-process a collection of XML documents by adding xml:id, xml:base, next, and prev attributes to each document's root element, run:
add-attributes -g "/path/to/your/xmls/*.xml" -b "https://value/of-your/base.com"
add-attributes -g "../../xml/grundbuecher/gb-data/data/editions/*.xml" -b "https://id.acdh.oeaw.ac.at/grundbuecher"
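To make the effect of such a batch run more concrete, here is a rough sketch of the general idea using only the standard library. It is not the implementation behind add-attributes: the function name add_attributes, the use of the file name as xml:id, and the way prev and next are built from the base URL are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"


def add_attributes(docs, base):
    """Set xml:id, xml:base, prev and next on each root element.

    docs is a list of (file_name, root_element) pairs, already in the
    order in which the documents should be chained (an assumption).
    """
    names = [name for name, _ in docs]
    for i, (name, root) in enumerate(docs):
        root.set(f"{{{XML_NS}}}id", name)
        root.set(f"{{{XML_NS}}}base", base)
        # The first document has no predecessor, the last no successor.
        if i > 0:
            root.set("prev", f"{base}/{names[i - 1]}")
        if i < len(names) - 1:
            root.set("next", f"{base}/{names[i + 1]}")
    return docs


# Three empty, hypothetical documents standing in for files on disk.
docs = [(name, ET.Element("TEI")) for name in ("a.xml", "b.xml", "c.xml")]
add_attributes(docs, "https://example.org/base")

middle = docs[1][1]
print(middle.get("prev"))  # https://example.org/base/a.xml
print(middle.get("next"))  # https://example.org/base/c.xml
```

Passing the documents as an ordered list keeps the sketch independent of the file system; a real run would first have to decide how the globbed files are sorted.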
Write mentions as listEvents into index-files:
mentions-to-indices -t "erwähnt in " -i "/path/to/your/xmls/indices/*.xml" -f "/path/to/your/xmls/editions/*.xml"
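As a rough illustration of what writing mentions into an index can look like, the sketch below uses only the standard library. It is not the code behind mentions-to-indices: the function name write_mentions, the reliance on ref attributes of the form #id, the exact listEvent/event/desc layout, and the use of the -t value as a text prefix are all assumptions.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def write_mentions(index_root, editions, prefix="erwähnt in "):
    """Append one event per mentioning document to each index entry.

    editions is a list of (title, root_element) pairs (hypothetical shape).
    """
    # Look up index entries by their xml:id.
    entries = {el.get(XML_ID): el for el in index_root.iter() if el.get(XML_ID)}
    for title, edition_root in editions:
        # Collect each referenced id once per document.
        targets = {
            el.get("ref").lstrip("#") for el in edition_root.iter() if el.get("ref")
        }
        for target in sorted(targets):
            entry = entries.get(target)
            if entry is None:
                continue  # reference to something not in this index
            list_event = entry.find(f"{{{TEI}}}listEvent")
            if list_event is None:
                list_event = ET.SubElement(entry, f"{{{TEI}}}listEvent")
            event = ET.SubElement(list_event, f"{{{TEI}}}event", {"type": "mentioned"})
            desc = ET.SubElement(event, f"{{{TEI}}}desc")
            desc.text = f"{prefix}{title}"
    return index_root


# Made-up index with two persons and one edition mentioning p1 twice.
index = ET.fromstring(
    f'<listPerson xmlns="{TEI}"><person xml:id="p1"/><person xml:id="p2"/></listPerson>'
)
edition = ET.fromstring(
    f'<TEI xmlns="{TEI}"><text><body><p><rs ref="#p1">Anna</rs> and '
    f'<rs ref="#p1">she</rs></p></body></text></TEI>'
)
write_mentions(index, [("Letter 1", edition)])

descs = [d.text for d in index.iter(f"{{{TEI}}}desc")]
print(descs)  # ['erwähnt in Letter 1']
```

Collecting the targets in a set means a document that mentions the same entity several times still produces a single event for it.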
Write mentions as listEvents into the index files and copy the enriched index entries into the files:
# docs
uv run denormalize-indices --help
# examples
uv run denormalize-indices -f "../../xml/schnitzler/schnitzler-tagebuch-data-public/editions/*.xml" -i "../../xml/schnitzler/schnitzler-tagebuch-data-public/indices/*.xml"
uv run denormalize-indices -f "./data/*/*.xml" -i "./data/indices/*.xml" -m ".//*[@key]/@key" -x ".//tei:title[@level='a']/text()"
uv run denormalize-indices -f "./data/*/*.xml" -i "./data/indices/*.xml" -m ".//*[@key]/@key" -x ".//tei:title[@level='a']/text()" -b pmb2121 -b pmb10815 -b pmb50
uv run denormalize-indices -f "./data/*/*.xml" -i "./data/indices/*.xml" --standoff # writes entity-lists into a tei:standOff element and not in a back element.
develop
- project uses uv
- linting/formatting
  - uv run ruff check .
  - uv run ruff format .
- before committing, run flake8 to check linting and uv run coverage run -m pytest -v to run the tests
bump version
uv version --bump minor