A Cython wrapper for MeCab

fugashi

Fugashi by Irasutoya

Fugashi is a Cython wrapper for MeCab, a Japanese tokenizer and morphological analysis tool. Wheels are provided for Linux, macOS, and 64-bit Windows, and UniDic is easy to install (see "Installing a Dictionary" below).

See the blog post for background on why Fugashi exists and some of the design decisions.

If you are on a platform without prebuilt wheels (such as PowerPC), you'll need to install MeCab first. Installing it from source is recommended.

Usage

from fugashi import Tagger

tagger = Tagger('-Owakati')
text = "麩菓子(ふがし)は、麩を主材料とした日本の菓子。"
tagger.parse(text)
# => '麩 菓子 ( ふ が し ) は 、 麩 を 主材 料 と し た 日本 の 菓子 。'
for word in tagger(text):
    print(word, word.feature.lemma, word.pos, sep='\t')
    # "feature" is the UniDic feature data as a named tuple
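
The wakati output shown above is a plain string with tokens separated by single spaces, so it can be post-processed with ordinary string methods. The following is a minimal sketch that does not call Fugashi at all: it reuses the output string from the example and counts token frequencies. The variable names are illustrative only.

```python
from collections import Counter

# Output string copied from the wakati example above.
wakati = '麩 菓子 ( ふ が し ) は 、 麩 を 主材 料 と し た 日本 の 菓子 。'

# Wakati output is space-delimited, so str.split() recovers the tokens.
tokens = wakati.split()

# Count how often each token appears.
counts = Counter(tokens)

print(len(tokens))
print(counts['麩'], counts['菓子'])
```

In real use the string would come from tagger.parse(text) rather than a literal.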

Installing a Dictionary

Fugashi requires a dictionary. UniDic is recommended, and two easy-to-install versions are provided.

  • unidic-lite, a 2013 version of UniDic that's relatively small
  • unidic, the latest UniDic 2.3.0, which is 1GB on disk and requires a separate download step

If you just want to make sure things work you can start with unidic-lite, but for more serious processing unidic is recommended. For production use you'll generally want to generate your own dictionary too; for details see the MeCab documentation.

To get either of these dictionaries, install the package directly with pip, or install it together with Fugashi as an extra:

pip install fugashi[unidic-lite]

# The full version of UniDic requires a separate download step
pip install fugashi[unidic]
python -m unidic download
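
After installing, it can be handy to check from Python which dictionary package is available. The sketch below assumes the pip packages unidic and unidic-lite are importable as the modules unidic and unidic_lite (note the underscore); the helper name installed_dictionaries is hypothetical and not part of Fugashi.

```python
from importlib.util import find_spec

# Assumed import names for the two pip packages (note the underscore).
CANDIDATES = ("unidic", "unidic_lite")


def installed_dictionaries(candidates=CANDIDATES):
    """Return the candidate dictionary modules that can be imported."""
    return [name for name in candidates if find_spec(name) is not None]


found = installed_dictionaries()
print(found or "no UniDic package found; install one with pip")
```

This only checks that the package is importable. For the full unidic package, it does not confirm that the separate download step has been run.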

Dictionary Use

Fugashi is written with the assumption you'll use UniDic to process Japanese, but it supports arbitrary dictionaries.

If you're using a dictionary besides UniDic you can use the GenericTagger like this:

from fugashi import GenericTagger
tagger = GenericTagger()

# parse can be used as normal
tagger.parse('something')

text = "麩菓子(ふがし)は、麩を主材料とした日本の菓子。"
# features from the dictionary can be accessed by field numbers
for word in tagger(text):
    print(word.surface, word.feature[0])
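
To make "field numbers" concrete: MeCab's default output puts one token per line, with the surface form, a tab, and then a comma-separated feature string. The sketch below splits such a line by hand in plain Python. It is an illustration only, not how Fugashi parses features; the sample line uses IPAdic-style fields, and the naive split on commas would mishandle dictionaries that quote commas inside a field.

```python
def split_node_line(line):
    """Split one 'surface<TAB>f0,f1,...' line into (surface, feature list)."""
    surface, _, feature_str = line.partition('\t')
    # Naive split: assumes no field contains a quoted comma.
    return surface, feature_str.split(',')


# Illustrative line in MeCab's default output format (IPAdic-style fields).
sample = 'すもも\t名詞,一般,*,*,*,*,すもも,スモモ,スモモ'

surface, feature = split_node_line(sample)
print(surface, feature[0])
# => すもも 名詞
```

Here feature[0] is whatever the dictionary puts in its first column, which is why field meanings depend on the dictionary in use.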

You can also create a dictionary wrapper to get feature information as a named tuple.

from fugashi import GenericTagger, create_feature_wrapper

# 'alpha beta gamma' are placeholder names for your dictionary's fields
CustomFeatures = create_feature_wrapper('CustomFeatures', 'alpha beta gamma')
tagger = GenericTagger(wrapper=CustomFeatures)

text = "麩菓子(ふがし)は、麩を主材料とした日本の菓子。"
for word in tagger.parseToNodeList(text):
    print(word.surface, word.feature.alpha)
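
For readers unfamiliar with named tuples, the sketch below shows the same idea with the standard library's collections.namedtuple: positional fields gain names while index access keeps working. It does not use Fugashi, and it is not a claim about how create_feature_wrapper is implemented; the field values are made up for illustration.

```python
from collections import namedtuple

# Same placeholder field names as in the example above.
CustomFeatures = namedtuple('CustomFeatures', 'alpha beta gamma')

# Hypothetical raw feature fields, in dictionary column order.
raw_fields = ['a-value', 'b-value', 'c-value']
feature = CustomFeatures(*raw_fields)

# Access by name and by field number refer to the same data.
print(feature.alpha, feature[0])
print(feature._asdict())
```

Named access makes code that depends on a particular dictionary layout easier to read than bare indices such as feature[0].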

Alternatives

If you have a problem with Fugashi feel free to open an issue. However, there are some cases where it might be better to use a different library.

  • If you want to use MeCab on a platform we don't have wheels for, but don't have a C compiler, use natto-py.
  • If you don't want to deal with installing MeCab at all, try SudachiPy.
  • If you need to work with Korean, try KoNLPy.

Notice

MeCab is copyrighted free software by Taku Kudo <taku@chasen.org> and Nippon Telegraph and Telephone Corporation, and is redistributed under the BSD License.

This version

0.2.3

Download files

Download the file for your platform.

Source Distribution

  • fugashi-0.2.3.tar.gz (334.1 kB)

Built Distributions

  • fugashi-0.2.3-cp38-cp38-win_amd64.whl (498.1 kB), CPython 3.8, Windows x86-64
  • fugashi-0.2.3-cp38-cp38-manylinux1_x86_64.whl (479.7 kB), CPython 3.8
  • fugashi-0.2.3-cp38-cp38-macosx_10_14_x86_64.whl (280.0 kB), CPython 3.8, macOS 10.14+ x86-64
  • fugashi-0.2.3-cp37-cp37m-win_amd64.whl (497.2 kB), CPython 3.7m, Windows x86-64
  • fugashi-0.2.3-cp37-cp37m-manylinux1_x86_64.whl (467.2 kB), CPython 3.7m
  • fugashi-0.2.3-cp37-cp37m-macosx_10_14_x86_64.whl (279.2 kB), CPython 3.7m, macOS 10.14+ x86-64
  • fugashi-0.2.3-cp36-cp36m-win_amd64.whl (497.1 kB), CPython 3.6m, Windows x86-64
  • fugashi-0.2.3-cp36-cp36m-manylinux1_x86_64.whl (467.0 kB), CPython 3.6m
  • fugashi-0.2.3-cp36-cp36m-macosx_10_14_x86_64.whl (280.1 kB), CPython 3.6m, macOS 10.14+ x86-64
  • fugashi-0.2.3-cp35-cp35m-win_amd64.whl (495.9 kB), CPython 3.5m, Windows x86-64
  • fugashi-0.2.3-cp35-cp35m-manylinux1_x86_64.whl (463.2 kB), CPython 3.5m
  • fugashi-0.2.3-cp35-cp35m-macosx_10_14_x86_64.whl (278.2 kB), CPython 3.5m, macOS 10.14+ x86-64
