Parse and convert numbers written in French into their digit representation.

text2num is a Python package that provides functions and parser classes for:

  • parsing numbers expressed as words in French and converting them to integer values;

  • detecting ordinal, cardinal and decimal numbers in a stream of French words and getting their decimal digit representations.

Compatibility

Tested on Python 3.6 and 3.7.

License

This software is distributed under the MIT license, of which you should have received a copy (see the LICENSE file in this repository).

Installation

text2num does not depend on any third-party package.

To install text2num in your (virtual) environment:

pip install text2num

That’s all folks!

Usage examples

Parse and convert

>>> from text_to_num import text2num
>>> text2num('quatre-vingt-quinze')
95

>>> text2num('nonante-cinq')
95

>>> text2num('mille neuf cent quatre-vingt dix-neuf')
1999

>>> text2num('dix-neuf cent quatre-vingt dix-neuf')
1999

>>> text2num("cinquante et un million cinq cent soixante dix-huit mille trois cent deux")
51578302

>>> text2num('mille mille deux cents')
ValueError: invalid literal for text2num: 'mille mille deux cents'
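
Since invalid input raises ValueError, as in the last example, callers that process untrusted text may want to catch it. The following is a minimal sketch and not part of text2num: the helper name parse_or_none is hypothetical, and the parser is passed in as an argument so that the sketch does not assume anything about the library beyond the ValueError shown above.

```python
from typing import Callable, Optional


def parse_or_none(text: str, parser: Callable[[str], int]) -> Optional[int]:
    """Return the parsed integer, or None when the parser rejects the text.

    `parser` is any callable that raises ValueError on invalid input,
    for example text2num as imported in the examples above (assumption:
    it is called with the text as its only argument).
    """
    try:
        return parser(text)
    except ValueError:
        # The parser could not interpret the text as a number.
        return None
```

With text2num passed as the parser, parse_or_none('quatre-vingt-quinze', text2num) would be expected to return 95, and the 'mille mille' input above would give None instead of raising.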

Find and transcribe

alpha2digit finds any numbers in a text, including ordinals and decimals, and replaces them with digits.

>>> from text_to_num import alpha2digit
>>> sentence = (
...         "Huit cent quarante-deux pommes, vingt-cinq chiens, mille trois chevaux, "
...         "douze mille six cent quatre-vingt-dix-huit clous.\n"
...         "Quatre-vingt-quinze vaut nonante-cinq. On tolère l'absence de tirets avant les unités : "
...         "soixante seize vaut septante six.\n"
...         "Nombres en série : douze quinze zéro zéro quatre vingt cinquante-deux cent trois cinquante deux "
...         "trente et un.\n"
...         "Ordinaux: cinquième troisième vingt et unième centième mille deux cent trentième.\n"
...         "Décimaux: douze virgule quatre-vingt dix-neuf, cent vingt virgule zéro cinq ; "
...         "mais soixante zéro deux."
...     )
>>> print(alpha2digit(sentence))
842 pommes, 25 chiens, 1003 chevaux, 12698 clous.
95 vaut 95. On tolère l'absence de tirets avant les unités : 76 vaut 76.
Nombres en série : 12 15 004 20 52 103 52 31.
Ordinaux: 5ème 3ème 21ème 100ème 1230ème.
Décimaux: 12,99, 120,05 ; mais 60 02.
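
Once a sentence has been transcribed, the digits can be pulled out with the standard library. The following is a minimal sketch, not something text2num provides: extract_integers is a hypothetical helper that works on the string returned by alpha2digit. It is deliberately naive: a decimal such as 12,99 would come out as two integers (12 and 99) and an ordinal such as 5ème as 5.

```python
import re
from typing import List


def extract_integers(transcribed: str) -> List[int]:
    """Return every run of ASCII digits in the text as an int, in order."""
    # [0-9]+ matches maximal digit runs; separators such as ',' split them.
    return [int(match) for match in re.findall(r"[0-9]+", transcribed)]


# First line of the transcribed output shown above.
print(extract_integers("842 pommes, 25 chiens, 1003 chevaux, 12698 clous."))
# → [842, 25, 1003, 12698]
```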

Read the complete documentation on ReadTheDocs.

Contribute

Join us on https://github.com/allo-media/text2num
