Skip to main content

OpenCC made with Python

Project description

開放中文轉換(Pure Python)

Open Chinese convert (OpenCC) in pure Python.

Introduction 簡介

opencc-python 是用純 Python 所寫,使用由 BYVoid(byvoid.kcp@gmail.com) 所開發的 OpenCC 中的字典檔案。

opencc-python 可以支援 Python2.7 及 Python3.x。

opencc-python is made by pure Python with the dictionary files of OpenCC which is developed by BYVoid(byvoid.kcp@gmail.com).

opencc-python can run with Python2.7 and Python3.x.

Installation 安裝

opencc 這個目錄複製到你正在開發的專案中即可,或是執行(需要管理者權限):

python setup.py install

套件也可從 PyPI 安裝,使用指令:

pip install opencc-python-reimplemented

Copy the opencc folder to your project, or run (admin required)

python setup.py install

The package can also be installed from PyPI by issuing:

pip install opencc-python-reimplemented

Usage 使用方式

Code

from opencc import OpenCC

cc = OpenCC('s2t')  # convert from Simplified Chinese to Traditional Chinese

# can also set conversion by calling set_conversion

# cc.set_conversion('s2tw')

to_convert = '开放中文转换'

converted = cc.convert(to_convert)

Command Line

usage: python -m opencc [-h] [-i <file>] [-o <file>] [-c <conversion>]

                        [--in-enc <encoding>] [--out-enc <encoding>]



optional arguments:

  -h, --help            show this help message and exit

  -i <file>, --input <file>

                        Read original text from <file>. (default: None = STDIN)

  -o <file>, --output <file>

                        Write converted text to <file>. (default: None = STDOUT)

  -c <conversion>, --config <conversion>

                        Conversion (default: None)

  --in-enc <encoding>   Encoding for input (default: UTF-8)

  --out-enc <encoding>  Encoding for output (default: UTF-8)



example with UTF-8 encoded file:



  python -m opencc -c s2t -i my_simplified_input_file.txt -o my_traditional_output_file.txt



See https://docs.python.org/3/library/codecs.html#standard-encodings for list of encodings.

Conversions 轉換

  • hk2s: Traditional Chinese (Hong Kong standard) to Simplified Chinese

  • s2hk: Simplified Chinese to Traditional Chinese (Hong Kong standard)

  • s2t: Simplified Chinese to Traditional Chinese

  • s2tw: Simplified Chinese to Traditional Chinese (Taiwan standard)

  • s2twp: Simplified Chinese to Traditional Chinese (Taiwan standard, with phrases)

  • t2hk: Traditional Chinese to Traditional Chinese (Hong Kong standard)

  • t2s: Traditional Chinese to Simplified Chinese

  • t2tw: Traditional Chinese to Traditional Chinese (Taiwan standard)

  • tw2s: Traditional Chinese (Taiwan standard) to Simplified Chinese

  • tw2sp: Traditional Chinese (Taiwan standard) to Simplified Chinese (with phrases)

Issues 問題

當轉換有兩個以上的字詞可能時,程式只會使用第一個。

When there is more than one conversion available, only the first one is taken.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

opencc-python-reimplemented-0.1.7.tar.gz (482.6 kB view hashes)

Uploaded Source

Built Distribution

opencc_python_reimplemented-0.1.7-py2.py3-none-any.whl (481.8 kB view hashes)

Uploaded Python 2 Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page