
A Cython MeCab wrapper for fast, pythonic Japanese tokenization.


fugashi

Fugashi by Irasutoya

Fugashi is a Cython wrapper for MeCab, a Japanese tokenizer and morphological analysis tool. Wheels are provided for Linux, OSX, and Win64, and UniDic is easy to install.

Issues do not need to be written in English.

See the blog post for background on why Fugashi exists and some of the design decisions, or see this guide for a basic introduction to Japanese tokenization.

If you are on an unsupported platform (like PowerPC), you'll need to install MeCab first. It's recommended you install from source.

Usage

from fugashi import Tagger

tagger = Tagger('-Owakati')
text = "麩菓子は、麩を主材料とした日本の菓子。"
tagger.parse(text)
# => '麩 菓子 は 、 麩 を 主材 料 と し た 日本 の 菓子 。'
for word in tagger(text):
    print(word, word.feature.lemma, word.pos, sep='\t')
    # "feature" is the UniDic feature data as a named tuple
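
With `-Owakati`, `parse` returns a single space-separated string, so it often needs to be split before further processing. The following is a minimal sketch that reuses the output string shown above as a literal, so it runs without Fugashi or a dictionary installed; the variable names are illustrative only.

```python
# Output string copied from the usage example above (Tagger('-Owakati')).
wakati = "麩 菓子 は 、 麩 を 主材 料 と し た 日本 の 菓子 。"

# str.split() with no argument splits on any run of whitespace
# and ignores leading or trailing whitespace.
tokens = wakati.split()

print(len(tokens))       # number of tokens
print("".join(tokens))   # the text with the separators removed
```

Joining the tokens only reproduces the input when the original text contained no whitespace of its own, as is the case for this sentence.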

Installing a Dictionary

Fugashi requires a dictionary. UniDic is recommended, and two easy-to-install versions are provided.

  • unidic-lite, a 2013 version of UniDic that's relatively small
  • unidic, the latest UniDic 2.3.0, which is 1GB on disk and requires a separate download step

If you just want to make sure things work you can start with unidic-lite, but for more serious processing unidic is recommended. For production use you'll generally want to generate your own dictionary too; for details see the MeCab documentation.

To get either dictionary, install its package directly with pip, or install it together with Fugashi as an extra:

pip install fugashi[unidic-lite]

# The full version of UniDic requires a separate download step
pip install fugashi[unidic]
python -m unidic download
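
To confirm which dictionary package ended up in the current environment, a quick check from Python can help. This is a sketch using only the standard library; it assumes the pip packages `unidic-lite` and `unidic` expose the importable modules `unidic_lite` and `unidic`, and the helper name `installed_dictionaries` is made up for this example.

```python
from importlib.util import find_spec

# Assumption: pip package name -> importable module name.
CANDIDATES = {"unidic-lite": "unidic_lite", "unidic": "unidic"}


def installed_dictionaries():
    """Return the pip names of dictionary packages that can be imported."""
    # find_spec returns None when a top-level module is not installed.
    return [pkg for pkg, module in CANDIDATES.items() if find_spec(module) is not None]


print(installed_dictionaries())
```

An importable `unidic` module only shows that the package is installed; it does not show that the separate download step has been run.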

Dictionary Use

Fugashi is written with the assumption you'll use UniDic to process Japanese, but it supports arbitrary dictionaries.

If you're using a dictionary besides UniDic, you can use the GenericTagger like this:

from fugashi import GenericTagger
tagger = GenericTagger()

text = "麩菓子は、麩を主材料とした日本の菓子。"
# parse can be used as normal
tagger.parse('something')
# features from the dictionary can be accessed by field numbers
for word in tagger(text):
    print(word.surface, word.feature[0])

You can also create a dictionary wrapper to get feature information as a named tuple.

from fugashi import GenericTagger, create_feature_wrapper
# the field names are given as a space-separated string
CustomFeatures = create_feature_wrapper('CustomFeatures', 'alpha beta gamma')
tagger = GenericTagger(wrapper=CustomFeatures)
text = "麩菓子は、麩を主材料とした日本の菓子。"
for word in tagger.parseToNodeList(text):
    print(word.surface, word.feature.alpha)
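
To see how access by field number and access by name relate, the idea can be sketched with the standard library alone. This is a rough analogue built on `collections.namedtuple`, not Fugashi's implementation; the raw feature string and the `parse_features` helper are hypothetical and exist only for illustration.

```python
import csv
from collections import namedtuple

# Same field names as the wrapper example above.
CustomFeatures = namedtuple("CustomFeatures", "alpha beta gamma")


def parse_features(raw):
    """Split a comma-separated feature string into a named tuple.

    The csv module is used so that quoted fields containing commas
    stay in one piece.
    """
    fields = next(csv.reader([raw]))
    return CustomFeatures(*fields)


# Hypothetical raw feature string with a quoted comma in the second field.
feature = parse_features('a1,"b1,with comma",c1')

print(feature[1])     # access by field number
print(feature.beta)   # the same field, accessed by name
```

Named access keeps calling code readable when a dictionary has many feature columns, while index access still works on the same object.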

Alternatives

If you have a problem with Fugashi, feel free to open an issue. However, in some cases a different library may be a better fit.

  • If you don't want to deal with installing MeCab at all, try SudachiPy.
  • If you need to work with Korean, try KoNLPy.

License and Copyright Notice

Fugashi is released under the terms of the MIT license. Please copy it far and wide.

Fugashi is a wrapper for MeCab, and Fugashi wheels include MeCab binaries. MeCab is copyrighted free software by Taku Kudo <taku@chasen.org> and Nippon Telegraph and Telephone Corporation, and is redistributed under the BSD License.


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • fugashi-1.0.5a1.tar.gz (335.0 kB): Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

  • fugashi-1.0.5a1-cp39-cp39-win_amd64.whl (500.1 kB): CPython 3.9, Windows x86-64
  • fugashi-1.0.5a1-cp39-cp39-macosx_10_14_x86_64.whl (284.5 kB): CPython 3.9, macOS 10.14+ x86-64
  • fugashi-1.0.5a1-cp38-cp38-win_amd64.whl (500.1 kB): CPython 3.8, Windows x86-64
  • fugashi-1.0.5a1-cp38-cp38-manylinux1_x86_64.whl (487.9 kB): CPython 3.8
  • fugashi-1.0.5a1-cp38-cp38-macosx_10_14_x86_64.whl (283.1 kB): CPython 3.8, macOS 10.14+ x86-64
  • fugashi-1.0.5a1-cp37-cp37m-win_amd64.whl (499.1 kB): CPython 3.7m, Windows x86-64
  • fugashi-1.0.5a1-cp37-cp37m-manylinux1_x86_64.whl (477.3 kB): CPython 3.7m
  • fugashi-1.0.5a1-cp37-cp37m-macosx_10_14_x86_64.whl (282.4 kB): CPython 3.7m, macOS 10.14+ x86-64
  • fugashi-1.0.5a1-cp36-cp36m-win_amd64.whl (499.0 kB): CPython 3.6m, Windows x86-64
  • fugashi-1.0.5a1-cp36-cp36m-manylinux1_x86_64.whl (476.8 kB): CPython 3.6m
  • fugashi-1.0.5a1-cp36-cp36m-macosx_10_14_x86_64.whl (283.3 kB): CPython 3.6m, macOS 10.14+ x86-64
  • fugashi-1.0.5a1-cp35-cp35m-win_amd64.whl (497.5 kB): CPython 3.5m, Windows x86-64
  • fugashi-1.0.5a1-cp35-cp35m-manylinux1_x86_64.whl (473.2 kB): CPython 3.5m
  • fugashi-1.0.5a1-cp35-cp35m-macosx_10_14_x86_64.whl (280.8 kB): CPython 3.5m, macOS 10.14+ x86-64

File details

Details for the file fugashi-1.0.5a1.tar.gz.

File metadata

  • Download URL: fugashi-1.0.5a1.tar.gz
  • Upload date:
  • Size: 335.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.9.0

File hashes

Hashes for fugashi-1.0.5a1.tar.gz
Algorithm Hash digest
SHA256 f5b72baa5e5c3a4a290f62c09b7c82b049c73168874647009cb03ab955678a5c
MD5 9a67e607bf2545ed8bc3eff782c59377
BLAKE2b-256 df799682a7cc1e7491f0b64f70f068ed922ccca976d45df77021bbdf106eceb8

See more details on using hashes here.

File details

Details for the file fugashi-1.0.5a1-cp39-cp39-win_amd64.whl.

File metadata

  • Download URL: fugashi-1.0.5a1-cp39-cp39-win_amd64.whl
  • Upload date:
  • Size: 500.1 kB
  • Tags: CPython 3.9, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.9.0

File hashes

Hashes for fugashi-1.0.5a1-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 71d9dfcf0a762aad01dd042bfbe8a835a9fde6b4fc641aad7cc552c55a430c32
MD5 a7f89117a66ad89867e5b1d5805569b3
BLAKE2b-256 3f89cbce8f57c48484e23445b56ae7a5a3b4585bd1346e139b208cea7f3b9b2d


File details

Details for the file fugashi-1.0.5a1-cp39-cp39-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a1-cp39-cp39-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 284.5 kB
  • Tags: CPython 3.9, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.9.0

File hashes

Hashes for fugashi-1.0.5a1-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 b141285448274d8e811e571f1c95eb8aaf9fa44d8a5acabd6deb9240aaff88f3
MD5 3dec79fb55ac40045b41bdca19b16072
BLAKE2b-256 6ecd095392f908ceca9f6e21dd6344f82f38718c094b9567b0c447f6936eb3c1


File details

Details for the file fugashi-1.0.5a1-cp38-cp38-win_amd64.whl.

File metadata

  • Download URL: fugashi-1.0.5a1-cp38-cp38-win_amd64.whl
  • Upload date:
  • Size: 500.1 kB
  • Tags: CPython 3.8, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.6

File hashes

Hashes for fugashi-1.0.5a1-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 cc7a07b49cbee655b06029c339dd35c315d685c10b03fbb5b84e7b7629b84f36
MD5 7ba0430f8138d852f5db611724483301
BLAKE2b-256 e1a75d4c69d8ba312a368cea96434ded71a58b09e35c607d42e2ecfbadcb9c17


File details

Details for the file fugashi-1.0.5a1-cp38-cp38-manylinux1_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a1-cp38-cp38-manylinux1_x86_64.whl
  • Upload date:
  • Size: 487.9 kB
  • Tags: CPython 3.8
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.9.0

File hashes

Hashes for fugashi-1.0.5a1-cp38-cp38-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 35ad732c50cb5bec6b9d1ee26e85e5f83440eb6faad037a16b479bfe7d24fb4f
MD5 a96199f39157533b07e43569b6844033
BLAKE2b-256 b229b963906885a389c961462d1399d0e77d2fd64ae5744c4a365712d71ae66c


File details

Details for the file fugashi-1.0.5a1-cp38-cp38-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a1-cp38-cp38-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 283.1 kB
  • Tags: CPython 3.8, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.6

File hashes

Hashes for fugashi-1.0.5a1-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 f930e0011598266055beb88a47d420f93b84f43080cd7d5c444c7027dd103a4e
MD5 42cf11b9e0a4d788813db819d243de35
BLAKE2b-256 c272ea83c20f3edc329d1639603ab54831a171b3a14841749cd3fd214800e8e6


File details

Details for the file fugashi-1.0.5a1-cp37-cp37m-win_amd64.whl.

File metadata

  • Download URL: fugashi-1.0.5a1-cp37-cp37m-win_amd64.whl
  • Upload date:
  • Size: 499.1 kB
  • Tags: CPython 3.7m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.7.9

File hashes

Hashes for fugashi-1.0.5a1-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 48119aacd029a102851846994f533b1c8815b39230ce72b33b6bf1b4e0cc5c35
MD5 902097fde98827b8b3b82782f025a02f
BLAKE2b-256 29026427b1f27a73dd75accd8e1fbdeae1dd7d65b292ce56fd8f3855ae8a11c6


File details

Details for the file fugashi-1.0.5a1-cp37-cp37m-manylinux1_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a1-cp37-cp37m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 477.3 kB
  • Tags: CPython 3.7m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.9.0

File hashes

Hashes for fugashi-1.0.5a1-cp37-cp37m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 1549d4840d68bf75835494ff6b5ce4dab4115d0fa4abc43fe192dfb944453e88
MD5 dac863caaa392d11856db575eb41a74d
BLAKE2b-256 26f12a371458854b445ca902c43ef06a97921da9bacf08a75a5849b80de130ba


File details

Details for the file fugashi-1.0.5a1-cp37-cp37m-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a1-cp37-cp37m-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 282.4 kB
  • Tags: CPython 3.7m, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.7.9

File hashes

Hashes for fugashi-1.0.5a1-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 22de5dac8d14a129fcde491439f8725d30ae71e5bb98a6178af0f42790e8bce3
MD5 1b3cf4aaf975d1be0f5adca8bf13fb3b
BLAKE2b-256 ae40f629e38a3dd0a14845ee55417584a77fd04a1f2e8cc270748fd1076aa19e


File details

Details for the file fugashi-1.0.5a1-cp36-cp36m-win_amd64.whl.

File metadata

  • Download URL: fugashi-1.0.5a1-cp36-cp36m-win_amd64.whl
  • Upload date:
  • Size: 499.0 kB
  • Tags: CPython 3.6m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.6.8

File hashes

Hashes for fugashi-1.0.5a1-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 a1cb6c11684feb742c2cf27a875b7965b99b73711ce3364f637de1f5c9f1081e
MD5 5a3412221193ee40005b2170c2a8554c
BLAKE2b-256 8674669c582d2a206306f31c6d867b331c87908a22d9084f10cd6adc75ffd296


File details

Details for the file fugashi-1.0.5a1-cp36-cp36m-manylinux1_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a1-cp36-cp36m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 476.8 kB
  • Tags: CPython 3.6m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.9.0

File hashes

Hashes for fugashi-1.0.5a1-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 71568003c404e83e9f5270a65dcf3470f3061830a74761a1e15c6ffecf00fd85
MD5 51f02360e30483b9c4c866e9cb441763
BLAKE2b-256 da30d3ceb908f18573a9eaf2199c5905ff52a3e243ad33f10e060b9a45eac33c


File details

Details for the file fugashi-1.0.5a1-cp36-cp36m-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a1-cp36-cp36m-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 283.3 kB
  • Tags: CPython 3.6m, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.6.12

File hashes

Hashes for fugashi-1.0.5a1-cp36-cp36m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 81898b6399beaba340f5c50b8a1baa93194c3756d0fd9174ca357190cc2e7696
MD5 f7bafc34e90fc100a867eb067b87a3a3
BLAKE2b-256 d2c3ad499f2e5ba425ec4450c773cf8d55d664bd16443bf68a1f6ca61dbb43ea


File details

Details for the file fugashi-1.0.5a1-cp35-cp35m-win_amd64.whl.

File metadata

  • Download URL: fugashi-1.0.5a1-cp35-cp35m-win_amd64.whl
  • Upload date:
  • Size: 497.5 kB
  • Tags: CPython 3.5m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.15.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.5.4

File hashes

Hashes for fugashi-1.0.5a1-cp35-cp35m-win_amd64.whl
Algorithm Hash digest
SHA256 20313ae83a2ff76377b0f6828ee26c90a42b54267fd449f189ca50ee921bb5e2
MD5 40b8d188ab3d8d21c67238c6148550cd
BLAKE2b-256 e4ed1579732a238113815fb9b16c95288e0a96113619d0985733c10ac122761f


File details

Details for the file fugashi-1.0.5a1-cp35-cp35m-manylinux1_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a1-cp35-cp35m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 473.2 kB
  • Tags: CPython 3.5m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.9.0

File hashes

Hashes for fugashi-1.0.5a1-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 0d36ea0bff219ac445218a99ea5c8621e7ad66e0d9186dd64f2e06abd8406e1b
MD5 44e31776224560b037c9673a4159764e
BLAKE2b-256 81697dcc9cd4af7193d0f2e9f2b1b12f4fe430bb747ee33bf698e58e893310b9


File details

Details for the file fugashi-1.0.5a1-cp35-cp35m-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.5a1-cp35-cp35m-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 280.8 kB
  • Tags: CPython 3.5m, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.15.0 pkginfo/1.6.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.5.10

File hashes

Hashes for fugashi-1.0.5a1-cp35-cp35m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 297949e293f6a8e6fd01cb06ad2a1d6e89fe695f851b7453b0b0f308258e65e2
MD5 8b651f2717b0a997089f4e1cccaf727e
BLAKE2b-256 02312ab269a569993ade947a439002c4845b5119d9e98f6a861d16ceaaddd392

