WeTextProcessing

WeTextProcessing, including TN & ITN

Project description

Text Normalization & Inverse Text Normalization

1. How To Use

1.1 pip install

pip install WeTextProcessing

# Text normalization (TN): written form -> spoken form
from tn.chinese.normalizer import Normalizer
normalizer = Normalizer()
print(normalizer.normalize("2.5平方电线"))
# Inverse text normalization (ITN): spoken form -> written form
from itn.chinese.inverse_normalizer import InverseNormalizer
invnormalizer = InverseNormalizer()
print(invnormalizer.normalize("二点五平方电线"))
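
When several strings need processing, a natural pattern is to construct each normalizer once and reuse it. The sketch below is a minimal illustration and not part of the package: `SupportsNormalize`, `normalize_all`, and `load_normalizers` are hypothetical helper names. It assumes only what the usage above shows, namely that `Normalizer` and `InverseNormalizer` can be constructed without arguments and expose `normalize(text)`, plus the assumption that `normalize` returns a string.

```python
from typing import Iterable, List, Optional, Protocol, Tuple


class SupportsNormalize(Protocol):
    """Hypothetical helper type: any object with normalize(str) -> str."""

    def normalize(self, text: str) -> str: ...


def normalize_all(normalizer: SupportsNormalize, texts: Iterable[str]) -> List[str]:
    """Apply one already-constructed normalizer to every input string."""
    return [normalizer.normalize(text) for text in texts]


def load_normalizers() -> Optional[Tuple[SupportsNormalize, SupportsNormalize]]:
    """Build the TN and ITN objects once; return None if the package is missing."""
    try:
        # Import paths are taken from the usage shown in this README.
        from tn.chinese.normalizer import Normalizer
        from itn.chinese.inverse_normalizer import InverseNormalizer
    except ImportError:
        return None
    return Normalizer(), InverseNormalizer()


if __name__ == "__main__":
    loaded = load_normalizers()
    if loaded is None:
        print("WeTextProcessing is not installed; run: pip install WeTextProcessing")
    else:
        tn_normalizer, itn_normalizer = loaded
        # Inputs reuse the two example strings from this README.
        print(normalize_all(tn_normalizer, ["2.5平方电线"]))
        print(normalize_all(itn_normalizer, ["二点五平方电线"]))
```

Because `normalize_all` depends only on the `normalize` method, the same helper serves both TN and ITN, and it can be exercised with a stub object when the package is not installed.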

1.2 Run from source

git clone https://github.com/wenet-e2e/WeTextProcessing.git
cd WeTextProcessing

python normalize.py --text "2.5平方电线"
python inverse_normalize.py --text "二点五平方电线"

2. TN Pipeline

Please refer to TN.README

3. ITN Pipeline

Please refer to ITN.README

Acknowledgements

Thanks to the authors of foundational libraries such as OpenFst and Pynini.
Thanks to the NeMo team and the NeMo open-source community.
Thanks to Zhenxiang Ma, Jiayu Du, and the SpeechColab organization.
We referred to Pynini for reading the FAR and for printing the shortest path of a lattice in the C++ runtime.
We referred to the TN of NeMo for the data used to build the tagger graph.
We referred to the ITN of chinese_text_normalization for the data used to build the tagger graph.

Release history

Version	Release date
1.0.1	Jun 6, 2024
1.0.0	Jun 5, 2024
0.2.1	Jun 5, 2024
0.2.0	Jun 5, 2024
0.1.12	Mar 10, 2024
0.1.11	Dec 27, 2023
0.1.10	Nov 15, 2023
0.1.9	Nov 13, 2023
0.1.8	Nov 6, 2023
0.1.7	Oct 30, 2023
0.1.6	Oct 26, 2023
0.1.5	Oct 12, 2023
0.1.4	Sep 26, 2023
0.1.3	Sep 18, 2023
0.1.2	Jul 11, 2023
0.1.1	May 17, 2023
0.1.0	Dec 7, 2022
0.0.8	Dec 7, 2022
0.0.7	Dec 5, 2022
0.0.6	Oct 18, 2022
0.0.5	Oct 10, 2022
0.0.4	Sep 29, 2022
0.0.3	Sep 27, 2022 (this version)
0.0.2	Sep 19, 2022
0.0.1	Sep 13, 2022

Download files

Download the file for your platform.

Source Distribution

WeTextProcessing-0.0.3.tar.gz (1.2 MB), uploaded Sep 27, 2022, source

Built Distribution

WeTextProcessing-0.0.3-py3-none-any.whl (1.3 MB), uploaded Sep 27, 2022, Python 3

Hashes for WeTextProcessing-0.0.3.tar.gz
Algorithm	Hash digest
SHA256	`29f9cf652f1946691af845c1d2760a5cf92fdd259fe86e82fd559c9c6a53eab2`
MD5	`109b4e3c396c26322950a6d7c3c9a033`
BLAKE2b-256	`4c7e7f21687da3a3692fbc726cad11fe6dfdecbcec5526380442d49e8ac8080d`

Hashes for WeTextProcessing-0.0.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`39986bead86dae0e63529dd95a0b810ca5bb8118e11d724ed0f8b2c66647de9b`
MD5	`df358c0e0778ee41ffd01631ff965112`
BLAKE2b-256	`cb5946691312b5cc29c3ec784d9131fc874f2328c47d12cb5408e3e0b7107a50`
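
The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using only the Python standard library; `sha256_of` and `matches_published_digest` are hypothetical helper names, and the expected values are the SHA256 digests listed above, keyed by file name.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the tables above, keyed by file name.
EXPECTED_SHA256 = {
    "WeTextProcessing-0.0.3.tar.gz":
        "29f9cf652f1946691af845c1d2760a5cf92fdd259fe86e82fd559c9c6a53eab2",
    "WeTextProcessing-0.0.3-py3-none-any.whl":
        "39986bead86dae0e63529dd95a0b810ca5bb8118e11d724ed0f8b2c66647de9b",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path):
    """True only if the file name is listed above and its digest matches."""
    expected = EXPECTED_SHA256.get(Path(path).name)
    return expected is not None and sha256_of(path) == expected
```

For example, `matches_published_digest("WeTextProcessing-0.0.3.tar.gz")` would be called on the downloaded archive; a file whose name is not in the table is treated as unverified rather than raising an error.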