A Cython wrapper for MeCab

fugashi

Fugashi by Irasutoya

Fugashi is a Cython wrapper for MeCab.

See the blog post for background on why Fugashi exists and some of the design decisions.

Any reasonable version of MeCab should work, but installing it from source is recommended.

Usage

from fugashi import Tagger

# '-Owakati' selects MeCab's wakati output: tokens joined by spaces
tagger = Tagger('-Owakati')
text = "麩菓子(ふがし)は、麩を主材料とした日本の菓子。"
tagger.parse(text)
# => '麩 菓子 ( ふ が し ) は 、 麩 を 主材 料 と し た 日本 の 菓子 。'

# iterate over the parsed words one at a time
for word in tagger.parseToNodeList(text):
    # "feature" is the Unidic feature data as a named tuple
    print(word, word.feature.lemma, word.pos, sep='\t')
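
Since the wakati output is an ordinary space-separated string, it can be turned into a token list with plain string methods. The following is a minimal sketch that reuses the output string shown above rather than calling fugashi, so it runs without MeCab installed; the variable names are only for illustration.

```python
# Output string copied from the usage example above.
wakati = '麩 菓子 ( ふ が し ) は 、 麩 を 主材 料 と し た 日本 の 菓子 。'

# Tokens are separated by single spaces, so split(' ') recovers them.
tokens = wakati.split(' ')

print(len(tokens))
print(tokens[:2])
```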

Dictionary Use

Fugashi is written with the assumption you'll use Unidic to process Japanese, but it supports arbitrary dictionaries.

If you're using a dictionary other than Unidic, you can use the GenericTagger like this:

from fugashi import GenericTagger
tagger = GenericTagger()
text = 'something'  # any input string

# parse can be used as normal
tagger.parse(text)
# features from the dictionary can be accessed by field numbers
for word in tagger.parseToNodeList(text):
    print(word.surface, word.feature[0])

You can also create a dictionary wrapper to get feature information as a named tuple.

from fugashi import GenericTagger, create_feature_wrapper
CustomFeatures = create_feature_wrapper('CustomFeatures', 'alpha beta gamma')
tagger = GenericTagger(wrapper=CustomFeatures)
for word in tagger.parseToNodeList(text):
    print(word.surface, word.feature.alpha)
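
To make the named tuple idea concrete, here is a rough sketch using only the standard library's collections.namedtuple. It is not how fugashi implements create_feature_wrapper; the helper wrap_features and the sample feature string are hypothetical, and the naive comma split ignores quoted fields that real dictionaries may contain.

```python
from collections import namedtuple

# Field names mirror the 'alpha beta gamma' example above.
CustomFeatures = namedtuple('CustomFeatures', 'alpha beta gamma')

def wrap_features(raw, wrapper=CustomFeatures):
    """Split a comma-separated feature string and wrap it in a named tuple.

    Hypothetical helper for illustration; it does not handle quoted commas.
    """
    return wrapper(*raw.split(','))

# Made-up feature string with three fields.
feat = wrap_features('noun,common,general')

# The same field is reachable by name or by position.
print(feat.alpha)
print(feat[0])
```

Access by name keeps calling code readable when a dictionary has many fields, while access by index still works as with GenericTagger's default behaviour.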

Alternatives

If you have a problem with Fugashi, feel free to open an issue. However, there are some cases where it might be better to use a different library.

  • If you want to use MeCab but don't have a C compiler, use natto-py.
  • If you don't want to deal with installing MeCab at all, try SudachiPy.

Note that these are both slower than Fugashi according to a benchmark I wrote.
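
If a project needs to run wherever any one of these libraries is available, one option is to check which one is installed before importing it. The sketch below only detects the module and does not wrap the differing APIs. The import names are assumptions: natto-py is assumed to be importable as "natto" and SudachiPy as "sudachipy"; choose_backend is a hypothetical helper.

```python
from importlib.util import find_spec

# Assumed import names, in order of preference.
CANDIDATES = ["fugashi", "natto", "sudachipy"]

def choose_backend(candidates=CANDIDATES):
    """Return the first importable module name, or None if none is installed."""
    for name in candidates:
        if find_spec(name) is not None:
            return name
    return None

print(choose_backend())
```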

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

fugashi-0.1.10rc1-cp38-cp38-win_amd64.whl (497.3 kB)

Uploaded: CPython 3.8, Windows x86-64

fugashi-0.1.10rc1-cp38-cp38-manylinux1_x86_64.whl (469.4 kB)

Uploaded: CPython 3.8

fugashi-0.1.10rc1-cp37-cp37m-win_amd64.whl (496.1 kB)

Uploaded: CPython 3.7m, Windows x86-64

fugashi-0.1.10rc1-cp37-cp37m-manylinux1_x86_64.whl (463.9 kB)

Uploaded: CPython 3.7m

File details

Details for the file fugashi-0.1.10rc1-cp38-cp38-win_amd64.whl.

File metadata

  • Download URL: fugashi-0.1.10rc1-cp38-cp38-win_amd64.whl
  • Upload date:
  • Size: 497.3 kB
  • Tags: CPython 3.8, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.2 requests-toolbelt/0.9.1 tqdm/4.41.0 CPython/3.8.1

File hashes

Hashes for fugashi-0.1.10rc1-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 960f047f72c0072c37adbbfe9125293638b9309d9a059666aba82f307a182a62
MD5 c911610fb8f90d307d5c937de9dfe662
BLAKE2b-256 5144e2c31d601a114006dc30270dc27b545f229ad70dc0e8b1215900bafedd9a
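
A downloaded wheel can be checked against the SHA256 digest listed above with the standard library's hashlib. This is a minimal sketch: sha256_of_file and matches are hypothetical helper names, and the local path passed in is whatever location the file was saved to.

```python
import hashlib

def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# SHA256 copied from the listing for fugashi-0.1.10rc1-cp38-cp38-win_amd64.whl.
EXPECTED = "960f047f72c0072c37adbbfe9125293638b9309d9a059666aba82f307a182a62"

def matches(path, expected=EXPECTED):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

Calling matches with the path of the downloaded wheel returns False if the file was corrupted or altered in transit.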

See more details on using hashes here.

File details

Details for the file fugashi-0.1.10rc1-cp38-cp38-manylinux1_x86_64.whl.

File metadata

  • Download URL: fugashi-0.1.10rc1-cp38-cp38-manylinux1_x86_64.whl
  • Upload date:
  • Size: 469.4 kB
  • Tags: CPython 3.8
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.2 requests-toolbelt/0.9.1 tqdm/4.41.0 CPython/3.8.1

File hashes

Hashes for fugashi-0.1.10rc1-cp38-cp38-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 9bbf84e6717cb5bee7d4e2aca9f3b9bd187e83f6df092afaf005a7c79f0dbbdb
MD5 167b151d962337166ccde92e3c544185
BLAKE2b-256 f58a8fc0f772b8337d1bf177577c317fcf434a05fa477150a135f2fb58b673a3

File details

Details for the file fugashi-0.1.10rc1-cp37-cp37m-win_amd64.whl.

File metadata

  • Download URL: fugashi-0.1.10rc1-cp37-cp37m-win_amd64.whl
  • Upload date:
  • Size: 496.1 kB
  • Tags: CPython 3.7m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.2 requests-toolbelt/0.9.1 tqdm/4.41.0 CPython/3.8.1

File hashes

Hashes for fugashi-0.1.10rc1-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 e6024efaf9512d13b34a1a32cdb1910d7217a10cdeaffcae454bfc60ad84c023
MD5 b27982b68caca462622047ea0d993b42
BLAKE2b-256 9a9b174020452695848386f7062dc3d0a9cfaae6ec6ddcfe3fa1058cc7f4d6c2

File details

Details for the file fugashi-0.1.10rc1-cp37-cp37m-manylinux1_x86_64.whl.

File metadata

  • Download URL: fugashi-0.1.10rc1-cp37-cp37m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 463.9 kB
  • Tags: CPython 3.7m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.2 requests-toolbelt/0.9.1 tqdm/4.41.0 CPython/3.8.1

File hashes

Hashes for fugashi-0.1.10rc1-cp37-cp37m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 2b088328b5e5380fda3834cbcf1ae4979cba1c45820769ffca1b791d7137179a
MD5 c886878888c0885d73b00ad9a47d5327
BLAKE2b-256 c34e9740795ae6bfe2f4e0c1f2957d4493b6ae3d3babfc43b931f3281e865653
