A Cython MeCab wrapper for fast, pythonic Japanese tokenization.

fugashi

Fugashi by Irasutoya

Fugashi is a Cython wrapper for MeCab, a Japanese tokenizer and morphological analysis tool. Wheels are provided for Linux, OSX, and Win64, and UniDic is easy to install.

Issues do not need to be written in English (issueを英語で書く必要はありません).

See the blog post for background on why Fugashi exists and some of the design decisions, or see this guide for a basic introduction to Japanese tokenization.

If you are on an unsupported platform (like PowerPC), you'll need to install MeCab first. It's recommended you install from source.

Usage

from fugashi import Tagger

tagger = Tagger('-Owakati')
text = "麩菓子は、麩を主材料とした日本の菓子。"
tagger.parse(text)
# => '麩 菓子 は 、 麩 を 主材 料 と し た 日本 の 菓子 。'
for word in tagger(text):
    # "feature" is the UniDic feature data as a named tuple
    print(word, word.feature.lemma, word.pos, sep='\t')
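With the -Owakati option, parse returns a single string in which tokens are separated by spaces, so a plain split() recovers a token list. The sketch below works on the sample output from the usage example and does not need MeCab installed. The helper name wakati_tokens is illustrative and not part of fugashi.

```python
from typing import List


def wakati_tokens(wakati_output: str) -> List[str]:
    """Split MeCab wakati output into a list of token strings."""
    # str.split() with no arguments also drops any leading or trailing
    # whitespace (such as a final newline) around the tokens.
    return wakati_output.split()


# Sample output copied from the usage example.
sample = '麩 菓子 は 、 麩 を 主材 料 と し た 日本 の 菓子 。'
tokens = wakati_tokens(sample)
print(len(tokens))
print(tokens[:3])
```

If you need more than surface forms, iterating over tagger(text) as in the usage example is the better route, since each word then carries its feature data.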

Installing a Dictionary

Fugashi requires a dictionary. UniDic is recommended, and two easy-to-install versions are provided.

  • unidic-lite, a 2013 version of UniDic that's relatively small
  • unidic, the latest UniDic 2.3.0, which is 1 GB on disk and requires a separate download step

If you just want to make sure things work you can start with unidic-lite, but for more serious processing unidic is recommended. For production use you'll generally want to generate your own dictionary too; for details see the MeCab documentation.

To get either dictionary, install it directly with pip, or install it as a fugashi extra as shown below:

pip install fugashi[unidic-lite]

# The full version of UniDic requires a separate download step
pip install fugashi[unidic]
python -m unidic download
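To check which of the two dictionary packages is present in the current environment, something like the following may help. It is a minimal sketch that assumes the pip packages unidic-lite and unidic are imported under the module names unidic_lite and unidic. The function name installed_unidic_packages is made up for this example.

```python
import importlib.util


def installed_unidic_packages():
    """Return the names of the UniDic packages importable in this environment."""
    # Assumption: unidic-lite and unidic are imported as `unidic_lite`
    # and `unidic` respectively.
    candidates = ["unidic_lite", "unidic"]
    return [name for name in candidates
            if importlib.util.find_spec(name) is not None]


found = installed_unidic_packages()
if found:
    print("Available dictionary packages:", ", ".join(found))
else:
    print("No UniDic package found; try: pip install fugashi[unidic-lite]")
```

This only confirms that a package is installed. For the full unidic package it does not tell you whether the separate download step has been run.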

Dictionary Use

Fugashi is written with the assumption you'll use UniDic to process Japanese, but it supports arbitrary dictionaries.

If you're using a dictionary besides UniDic you can use the GenericTagger like this:

from fugashi import GenericTagger
tagger = GenericTagger()
text = "麩菓子は、麩を主材料とした日本の菓子。"

# parse can be used as normal
tagger.parse('something')
# features from the dictionary can be accessed by field numbers
for word in tagger(text):
    print(word.surface, word.feature[0])

You can also create a dictionary wrapper to get feature information as a named tuple.

from fugashi import GenericTagger, create_feature_wrapper
# field names are given as one space-separated string
CustomFeatures = create_feature_wrapper('CustomFeatures', 'alpha beta gamma')
tagger = GenericTagger(wrapper=CustomFeatures)
text = "麩菓子は、麩を主材料とした日本の菓子。"
for word in tagger.parseToNodeList(text):
    print(word.surface, word.feature.alpha)
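MeCab stores each entry's features as one comma-separated string, and a feature wrapper gives those fields names. The stdlib-only sketch below illustrates the idea with collections.namedtuple. It is not fugashi's implementation, and both the sample feature string and the helper parse_features are made up for illustration.

```python
import csv
from collections import namedtuple

# Same field names as the wrapper example; missing fields default to None.
CustomFeatures = namedtuple(
    "CustomFeatures", "alpha beta gamma", defaults=(None, None, None)
)


def parse_features(raw: str) -> CustomFeatures:
    """Turn a comma-separated feature string into a named tuple."""
    # csv.reader copes with quoted fields that themselves contain commas.
    fields = next(csv.reader([raw]))
    return CustomFeatures(*fields)


feat = parse_features('noun,"a,b"')
print(feat.alpha)   # field access by name
print(feat[1])      # field access by number still works
print(feat.gamma)   # absent trailing field falls back to the default
```

Note that this sketch raises a TypeError if the string has more fields than the tuple declares, so the field list needs to match the dictionary in use.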

Alternatives

If you have a problem with Fugashi feel free to open an issue. However, there are some cases where it might be better to use a different library.

  • If you don't want to deal with installing MeCab at all, try SudachiPy.
  • If you need to work with Korean, try KoNLPy.

License and Copyright Notice

Fugashi is released under the terms of the MIT license. Please copy it far and wide.

Fugashi is a wrapper for MeCab, and Fugashi wheels include MeCab binaries. MeCab is copyrighted free software by Taku Kudo <taku@chasen.org> and Nippon Telegraph and Telephone Corporation, and is redistributed under the BSD License.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.
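Wheel file names follow the pattern distribution-version(-build tag)-python tag-abi tag-platform tag.whl, so the components can be recovered by splitting on dashes. The sketch below is a simplified illustration of that convention. The names WheelName and parse_wheel_name are hypothetical, and for real tooling the packaging library's parse_wheel_filename is likely the more robust choice.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelName:
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:      # no build tag
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:    # optional build tag present
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    return WheelName(dist, version, build, py, abi, plat)


info = parse_wheel_name("fugashi-1.0.5a5-cp37-cp37m-manylinux1_x86_64.whl")
print(info.python_tag, info.abi_tag, info.platform_tag)
```

Here cp37 is the Python tag, cp37m the ABI tag, and manylinux1_x86_64 the platform tag, which is how pip decides whether a given wheel fits the running interpreter.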

  • fugashi-1.0.5a5-cp39-cp39-win_amd64.whl (500.1 kB): CPython 3.9, Windows x86-64
  • fugashi-1.0.5a5-cp39-cp39-manylinux1_x86_64.whl (477.0 kB): CPython 3.9
  • fugashi-1.0.5a5-cp39-cp39-macosx_10_14_x86_64.whl (284.5 kB): CPython 3.9, macOS 10.14+ x86-64
  • fugashi-1.0.5a5-cp38-cp38-win_amd64.whl (500.1 kB): CPython 3.8, Windows x86-64
  • fugashi-1.0.5a5-cp38-cp38-manylinux1_x86_64.whl (487.9 kB): CPython 3.8
  • fugashi-1.0.5a5-cp38-cp38-macosx_10_14_x86_64.whl (283.1 kB): CPython 3.8, macOS 10.14+ x86-64
  • fugashi-1.0.5a5-cp37-cp37m-win_amd64.whl (499.1 kB): CPython 3.7m, Windows x86-64
  • fugashi-1.0.5a5-cp37-cp37m-manylinux1_x86_64.whl (477.3 kB): CPython 3.7m
  • fugashi-1.0.5a5-cp37-cp37m-macosx_10_14_x86_64.whl (282.4 kB): CPython 3.7m, macOS 10.14+ x86-64
  • fugashi-1.0.5a5-cp36-cp36m-win_amd64.whl (499.0 kB): CPython 3.6m, Windows x86-64
  • fugashi-1.0.5a5-cp36-cp36m-manylinux1_x86_64.whl (476.8 kB): CPython 3.6m
  • fugashi-1.0.5a5-cp36-cp36m-macosx_10_14_x86_64.whl (283.3 kB): CPython 3.6m, macOS 10.14+ x86-64
  • fugashi-1.0.5a5-cp35-cp35m-win_amd64.whl (497.5 kB): CPython 3.5m, Windows x86-64
  • fugashi-1.0.5a5-cp35-cp35m-manylinux1_x86_64.whl (473.2 kB): CPython 3.5m
  • fugashi-1.0.5a5-cp35-cp35m-macosx_10_14_x86_64.whl (280.8 kB): CPython 3.5m, macOS 10.14+ x86-64

File details

Details for the file fugashi-1.0.5a5-cp39-cp39-win_amd64.whl.

File metadata

  • Download URL: fugashi-1.0.5a5-cp39-cp39-win_amd64.whl
  • Upload date:
  • Size: 500.1 kB
  • Tags: CPython 3.9, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.9.0

File hashes

Hashes for fugashi-1.0.5a5-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 d902f831312ec9af8d3d51b83cec0bc5ab5bd4b664c0b26314cc3936a5154409
MD5 ccaaa50aab9f3fb5bd6bd0def31301f6
BLAKE2b-256 bfbbf1038d880bae959a0f920822bd30df0577f7dd5e9db0b278078e3076786b

See more details on using hashes here.
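A downloaded wheel can be checked against the published SHA256 digest with the standard library. The following is a minimal sketch. The function names sha256_of and matches_published_hash are illustrative, and the path you pass in is whatever location you saved the wheel to.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

To use it, call matches_published_hash with the local path of the wheel and the SHA256 value listed for that file. A False result means the file differs from the one that was uploaded.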

File details

Details for the file fugashi-1.0.5a5-cp39-cp39-manylinux1_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a5-cp39-cp39-manylinux1_x86_64.whl
  • Upload date:
  • Size: 477.0 kB
  • Tags: CPython 3.9
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.9.0

File hashes

Hashes for fugashi-1.0.5a5-cp39-cp39-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 4681c8ba34b2361e87dc2b95ba465f92a282363aa3fc776315b3b360f9f9b626
MD5 79b781ea63b8799b101a684551ceecd5
BLAKE2b-256 26708e96524275f7f6e02a11983978e2478bb1ad526b2805c362d40dff1477b3

File details

Details for the file fugashi-1.0.5a5-cp39-cp39-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a5-cp39-cp39-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 284.5 kB
  • Tags: CPython 3.9, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.9.0

File hashes

Hashes for fugashi-1.0.5a5-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 2f27e327a1605cf7fd4a25b0dc574328f76afa334885ad88f848a7bbea1ff864
MD5 e131488cc11c999c3dd01f64fa6618c0
BLAKE2b-256 916d7f65fe9dd7bb06cd7937ca8ea6519d9c92141e13333dcc0b8b92791099e9

File details

Details for the file fugashi-1.0.5a5-cp38-cp38-win_amd64.whl.

File metadata

  • Download URL: fugashi-1.0.5a5-cp38-cp38-win_amd64.whl
  • Upload date:
  • Size: 500.1 kB
  • Tags: CPython 3.8, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.6

File hashes

Hashes for fugashi-1.0.5a5-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 70d0ebf0f9c55993283eb427a12f85450381673eb5a62ac0a2b3da332bdd317c
MD5 6f7cb7efb83b29c9ec92b24227ee5b53
BLAKE2b-256 7b547031197831a5e62ba55c5f0dd23e6e98f4887e5f78fe98bb156a3039cf00

File details

Details for the file fugashi-1.0.5a5-cp38-cp38-manylinux1_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a5-cp38-cp38-manylinux1_x86_64.whl
  • Upload date:
  • Size: 487.9 kB
  • Tags: CPython 3.8
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.9.0

File hashes

Hashes for fugashi-1.0.5a5-cp38-cp38-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 5f06e64d9b59cd7461a07062c5a2d47d073dc6476e361c337262845c859705d3
MD5 31fbf9b69249bde035dd36199d84afaf
BLAKE2b-256 235cfff20a11b586c9dbabe3bbded3bf83799e86b0a6c2c9779d8314d39b808c

File details

Details for the file fugashi-1.0.5a5-cp38-cp38-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a5-cp38-cp38-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 283.1 kB
  • Tags: CPython 3.8, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.6

File hashes

Hashes for fugashi-1.0.5a5-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 1d161754bcaa5ee0e76bb67bc4250082bec6d74428826c651bd13506b9b31803
MD5 38284c219551a9f92e1b850fec74b695
BLAKE2b-256 a55ab742f66df3dce575745d4ceb7cf8aaffdf512b11b3817246d980ef5c06a2

File details

Details for the file fugashi-1.0.5a5-cp37-cp37m-win_amd64.whl.

File metadata

  • Download URL: fugashi-1.0.5a5-cp37-cp37m-win_amd64.whl
  • Upload date:
  • Size: 499.1 kB
  • Tags: CPython 3.7m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.7.9

File hashes

Hashes for fugashi-1.0.5a5-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 9584c69c844e9546d8d65fb14d255ea3193b9a33989c6c18f9bf394828a84d12
MD5 8e79b2e63c25b9907a4b92631ab840d9
BLAKE2b-256 ce102cbebcc4f97e9445f0736f6cbad215be8571b1894b7d2468392e1e6af6d1

File details

Details for the file fugashi-1.0.5a5-cp37-cp37m-manylinux1_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a5-cp37-cp37m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 477.3 kB
  • Tags: CPython 3.7m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.9.0

File hashes

Hashes for fugashi-1.0.5a5-cp37-cp37m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 1c6f9439c6b01f7b0b75dab003e495faf0ca9efd59067779b3b8048b1daab679
MD5 c881672837f23dd10b23f67843302e7f
BLAKE2b-256 3ae53c159a14e2555b4bdf375be01c959014a55284078b8db91bf33255209cd7

File details

Details for the file fugashi-1.0.5a5-cp37-cp37m-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a5-cp37-cp37m-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 282.4 kB
  • Tags: CPython 3.7m, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.7.9

File hashes

Hashes for fugashi-1.0.5a5-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 95a18f1d299d6353afe2fe5d9e3e3389c48331d47bc67d3fefd99d0eb2a37919
MD5 5da17013d9f09891e3b68c8cf10ac497
BLAKE2b-256 38c21f8c7e036bae6c1d66a1247a707c5399ce0fa9fbbda0cb7d3ea84b52cb16

File details

Details for the file fugashi-1.0.5a5-cp36-cp36m-win_amd64.whl.

File metadata

  • Download URL: fugashi-1.0.5a5-cp36-cp36m-win_amd64.whl
  • Upload date:
  • Size: 499.0 kB
  • Tags: CPython 3.6m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.6.8

File hashes

Hashes for fugashi-1.0.5a5-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 3b347199ff73a2714466849519e81b22caec5f7bf41656d53719ac4964378d22
MD5 03d1a5d554ec6a90c63fd569dd9f33c2
BLAKE2b-256 6733c85f55467960ee39fdd176f999cabc966d31751f52e2b902c47b0736f06b

File details

Details for the file fugashi-1.0.5a5-cp36-cp36m-manylinux1_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a5-cp36-cp36m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 476.8 kB
  • Tags: CPython 3.6m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.9.0

File hashes

Hashes for fugashi-1.0.5a5-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 332e3d61b899d623fd97c0dc8f709f7d2891130cb0fb9f95453b7dbe10a32c23
MD5 c67eef0f571e6be3144d829228d3c7ae
BLAKE2b-256 c300f3bf4ca498d7412c2261ec22873a77c0f8d73a3d029bc1fecfd66a8834da

File details

Details for the file fugashi-1.0.5a5-cp36-cp36m-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a5-cp36-cp36m-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 283.3 kB
  • Tags: CPython 3.6m, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.6.12

File hashes

Hashes for fugashi-1.0.5a5-cp36-cp36m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 04aa2d8e7b880bf966ae2ddabe5507fe8dd044885eda53e82e75f1637899d8e1
MD5 67ef0bfeca3fa7d988773d8c6565fbf5
BLAKE2b-256 cbfb2b8e7ae4e5e5779f25b209e1841be3837212ac0f9179ab3639f1646b87c4

File details

Details for the file fugashi-1.0.5a5-cp35-cp35m-win_amd64.whl.

File metadata

  • Download URL: fugashi-1.0.5a5-cp35-cp35m-win_amd64.whl
  • Upload date:
  • Size: 497.5 kB
  • Tags: CPython 3.5m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.15.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.5.4

File hashes

Hashes for fugashi-1.0.5a5-cp35-cp35m-win_amd64.whl
Algorithm Hash digest
SHA256 555bca8fb60f36cc927ba22bf6dfd4d5b45c0b989e7a841c4a84a5b6fd3e35b0
MD5 e1a2e16f923a7f4f13290aa83ec08016
BLAKE2b-256 608a417eb028878b812603aa1f5c98a7e49bfe782d650d86ccf1b6cf676e45b2

File details

Details for the file fugashi-1.0.5a5-cp35-cp35m-manylinux1_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a5-cp35-cp35m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 473.2 kB
  • Tags: CPython 3.5m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.9.0

File hashes

Hashes for fugashi-1.0.5a5-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 91c2d9a2c63a2808434a417ff8b07c772017332b8ace16de36fd25ed38ebcfce
MD5 e02bc6fa4fd124105f9a52c647ce510e
BLAKE2b-256 6c299e5424d1622a1213df3e6b47cc0b6017c4bce8afae1ba667845a2667546d

File details

Details for the file fugashi-1.0.5a5-cp35-cp35m-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a5-cp35-cp35m-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 280.8 kB
  • Tags: CPython 3.5m, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.15.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.5.10

File hashes

Hashes for fugashi-1.0.5a5-cp35-cp35m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 e16fbcdff6e9752a1e60766a30de30324d7e8e6d34304a13f2fb09f9c6e42187
MD5 d92cdd30c7e106c0486315fedf42989b
BLAKE2b-256 b875cea035b575ca47d3a3717d0945c34cbbd346e0a0b7e37e713b268efd5ab5
