A simple, pure-Python WikiText parsing tool.

Project description

The purpose is to allow users to easily extract and/or manipulate templates, template parameters, parser functions, tables, external links, wikilinks, etc. in wikitext.

Installation

Use pip install wikitextparser

Usage

Here is a short demo of some of its features:

>>> import wikitextparser as wtp
>>> # wikitextparser can detect sections, parser functions, templates,
>>> # wikilinks, external links, arguments, and HTML comments in
>>> # your wikitext:
>>> wt = wtp.parse("""
== h2 ==
t2

=== h3 ===
t3

== h22 ==
t22

{{text|value1{{text|value2}}}}

[[A|B]]""")
>>>
>>> wt.templates
[Template('{{text|value2}}'), Template('{{text|value1{{text|value2}}}}')]
>>> wt.templates[1].arguments
[Argument('|value1{{text|value2}}')]
>>> wt.templates[1].arguments[0].value = 'value3'
>>> print(wt)

== h2 ==
t2

=== h3 ===
t3

== h22 ==
t22

{{text|value3}}

[[A|B]]
>>> # It provides easy-to-use properties so you can get or set
>>> # name or value of templates, arguments, wikilinks, etc.
>>> wt.wikilinks
[WikiLink('[[A|B]]')]
>>> wt.wikilinks[0].target = 'Z'
>>> wt.wikilinks[0].text = 'X'
>>> wt.wikilinks[0]
WikiLink('[[Z|X]]')
>>>
>>> from pprint import pprint
>>> pprint(wt.sections)
[Section('\n'),
 Section('== h2 ==\nt2\n\n=== h3 ===\nt3\n\n'),
 Section('=== h3 ===\nt3\n\n'),
 Section('== h22 ==\nt22\n\n{{text|value3}}\n\n[[Z|X]]')]
>>>
>>> wt.sections[1].title = 'newtitle'
>>> print(wt)

==newtitle==
t2

=== h3 ===
t3

== h22 ==
t22

{{text|value3}}

[[Z|X]]
>>> # There is a pprint function that pretty-prints templates.
>>> p = wtp.parse('{{t1 |b=b|c=c| d={{t2|e=e|f=f}} }}')
>>> t2, t1 = p.templates
>>> print(t2.pprint())
{{t2
    |e=e
    |f=f
}}
>>> print(t1.pprint())
{{t1
    |b=b
    |c=c
    |d={{t2
        |e=e
        |f=f
    }}
}}
>>> # If you are dealing with
>>> # [[Category:Pages using duplicate arguments in template calls]],
>>> # there are two functions that may be helpful:
>>> t = wtp.Template('{{t|a=a|a=b|a=a}}')
>>> t.rm_dup_args_safe()
>>> t
Template('{{t|a=b|a=a}}')
>>> t = wtp.Template('{{t|a=a|a=b|a=a}}')
>>> t.rm_first_of_dup_args()
>>> t
Template('{{t|a=a}}')
>>> # Extract cell values of a table
>>> p = wtp.parse("""{|
|  Orange    ||   Apple   ||   more
|-
|   Bread    ||   Pie     ||   more
|-
|   Butter   || Ice cream ||  and more
|}""")
>>> pprint(p.tables[0].getdata())
[['Orange', 'Apple', 'more'],
 ['Bread', 'Pie', 'more'],
 ['Butter', 'Ice cream', 'and more']]
>>> # It can even rearrange cells according to rowspan and colspan values.
>>> t = wtp.Table("""{| class="wikitable sortable"
|-
! a !! b !! c
|-
!colspan = "2" | d || e
|-
|}""")
>>> t.getdata(span=True)
[['a', 'b', 'c'], ['d', 'd', 'e']]
>>> # Have a look at the test modules for more details and possible pitfalls.
>>>
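
The two duplicate-argument methods in the demo differ in how aggressive they are. The following is a minimal pure-Python sketch of that difference, using plain (name, value) tuples instead of the library's Argument objects. The function names are hypothetical and this is not wikitextparser's implementation; it only reproduces the behaviour shown in the demo, and the real methods may handle further cases.

```python
# Hypothetical helpers, NOT part of wikitextparser: they only illustrate
# the behaviour shown in the demo, on plain (name, value) tuples.

def remove_duplicates_safe(args):
    """Drop an argument only if a later one has the same name AND value."""
    seen = set()
    kept = []
    # Walk backwards so that the last occurrence is the one that survives.
    for name, value in reversed(args):
        if (name, value) not in seen:
            seen.add((name, value))
            kept.append((name, value))
    kept.reverse()
    return kept


def remove_first_of_duplicates(args):
    """Keep only the last occurrence of each name, whatever its value."""
    seen = set()
    kept = []
    for name, value in reversed(args):
        if name not in seen:
            seen.add(name)
            kept.append((name, value))
    kept.reverse()
    return kept


args = [("a", "a"), ("a", "b"), ("a", "a")]   # models {{t|a=a|a=b|a=a}}
print(remove_duplicates_safe(args))       # [('a', 'b'), ('a', 'a')]
print(remove_first_of_duplicates(args))   # [('a', 'a')]
```

Both variants keep later occurrences because MediaWiki uses the last value when an argument is repeated. The "safe" variant therefore never changes what the template renders, while the second one can silently discard a differing earlier value.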

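To make the span-aware table output in the demo easier to picture, here is a small sketch of how cells with rowspan and colspan values can be expanded into a rectangular grid. It works on hand-written (text, rowspan, colspan) tuples rather than parsed wikitext, the function name expand_spans is an assumption, and it is not how wikitextparser implements getdata(span=True).

```python
# Hypothetical helper, NOT part of wikitextparser: expands spanned cells
# into a rectangular list of rows.

def expand_spans(rows):
    """rows: list of rows, each cell a (text, rowspan, colspan) tuple."""
    grid = {}  # maps (row index, column index) to cell text
    for r, row in enumerate(rows):
        c = 0
        for text, rowspan, colspan in row:
            # Skip positions already occupied by a rowspan from above.
            while (r, c) in grid:
                c += 1
            # Copy the cell's text into every position it spans.
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan
    if not grid:
        return []
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    # Positions that no cell covers are filled with None.
    return [[grid.get((r, c)) for c in range(n_cols)] for r in range(n_rows)]


# The table from the demo: the cell "d" has colspan 2.
rows = [
    [("a", 1, 1), ("b", 1, 1), ("c", 1, 1)],
    [("d", 1, 2), ("e", 1, 1)],
]
print(expand_spans(rows))  # [['a', 'b', 'c'], ['d', 'd', 'e']]
```

Repeating the text of a spanned cell keeps every row the same length, which makes the result easy to index by column.
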
Download files

Source Distribution

wikitextparser-0.5.8.zip (38.5 kB view details)

Uploaded Source

Built Distribution

wikitextparser-0.5.8.win32.exe (175.2 kB view details)

Uploaded Source

File details

Details for the file wikitextparser-0.5.8.zip.

File metadata

  • Download URL: wikitextparser-0.5.8.zip
  • Upload date:
  • Size: 38.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for wikitextparser-0.5.8.zip
Algorithm    Hash digest
SHA256       dff8e60ff016458214bfd97cf0a6dad5b00593f31ca5d06090a929e88dc7f6af
MD5          e68128e6020548603f154ff3dbdff19e
BLAKE2b-256  d3484b34d6259572f8d2dc313e06c19886810cd147026a9c7975a579d7b1ca91

File details

Details for the file wikitextparser-0.5.8.win32.exe.

File metadata

File hashes

Hashes for wikitextparser-0.5.8.win32.exe
Algorithm    Hash digest
SHA256       7ace3559a290adbe52300fba270ca97456c704005a6c23c1cf2271982cee7c37
MD5          681c74590aac94ae5f1edc82271a402c
BLAKE2b-256  eec8747fe951d941f71e3b2083a84c3f2b4e90c9d8ecac3381cb73ead0e68ae2
