Official Python bindings for PocketSphinx

PocketSphinx 5.0.0

This is PocketSphinx, one of Carnegie Mellon University's open source large vocabulary, speaker-independent continuous speech recognition engines.

Although this was at one point a research system, active development has largely ceased and it is now very, very far from the state of the art. I am making a release because people are nonetheless using it, and there were a number of historical errors in the build system and API that needed to be corrected.

The version number is strangely large because there was a "release" that people are using called 5prealpha, and we will use proper semantic versioning from now on.

Please see the LICENSE file for terms of use.

Installation

You should be able to install this with pip for recent platforms and versions of Python:

pip3 install pocketsphinx

Alternatively, you can compile it from the source tree. I highly suggest doing this in a virtual environment (replace ~/ve_pocketsphinx with the virtual environment you wish to create). From the top-level directory, run:

python3 -m venv ~/ve_pocketsphinx
. ~/ve_pocketsphinx/bin/activate
pip3 install .

On GNU/Linux and maybe other platforms, you must have PortAudio installed for the LiveSpeech class to work (we may add a fall-back to sox in the near future). On Debian-like systems this can be achieved by installing the libportaudio2 package:

sudo apt-get install libportaudio2
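
To see whether a PortAudio shared library can be found before trying LiveSpeech, a quick check along these lines may help. This is only a sketch using the standard library: the helper name portaudio_available is made up for illustration, and it assumes that the library LiveSpeech needs is the one ctypes.util.find_library locates under the name "portaudio", which may not hold on every platform.

```python
import ctypes.util


def portaudio_available():
    """Return True if a shared library named 'portaudio' can be located."""
    # find_library returns a library name or path, or None if nothing is found.
    return ctypes.util.find_library("portaudio") is not None


if __name__ == "__main__":
    if portaudio_available():
        print("PortAudio found")
    else:
        print("PortAudio not found; LiveSpeech may not work")
```

A negative result does not prove that LiveSpeech will fail, only that the library was not found in the usual search locations.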

Usage

See the examples directory for a number of examples of using the library from Python. You can also read the documentation for the Python API or the C API.

It also mostly supports the same APIs as the previous pocketsphinx-python module, as described below.

LiveSpeech

An iterator class for continuous recognition or keyword search from a microphone. For example, to do speech-to-text with the default (some kind of US English) model:

from pocketsphinx import LiveSpeech
for phrase in LiveSpeech():
    print(phrase)

Or to do keyword search:

from pocketsphinx import LiveSpeech

speech = LiveSpeech(keyphrase='forward', kws_threshold=1e-20)
for phrase in speech:
    print(phrase.segments(detailed=True))

With your own model and dictionary:

from pocketsphinx import LiveSpeech, get_model_path

speech = LiveSpeech(
    sampling_rate=16000,  # optional
    hmm=get_model_path('en-us'),
    lm=get_model_path('en-us.lm.bin'),
    dic=get_model_path('cmudict-en-us.dict')
)

for phrase in speech:
    print(phrase)

AudioFile

This is an iterator class for continuous recognition or keyword search from a file. Currently it supports only raw, single-channel, 16-bit PCM data in native byte order.

from pocketsphinx import AudioFile
for phrase in AudioFile("goforward.raw"):
    print(phrase)  # => "go forward ten meters"
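
Since AudioFile expects headerless PCM, audio stored as WAV needs its header stripped first. The following is a minimal sketch of one way to do that with the standard library only. The function name wav_to_raw and the file names are hypothetical, and it assumes the input is already single-channel, 16-bit PCM (it does not resample or downmix).

```python
import array
import sys
import wave


def wav_to_raw(wav_path, raw_path):
    """Write the samples of a mono 16-bit WAV file as headerless PCM.

    Returns the sampling rate found in the WAV header.
    """
    with wave.open(wav_path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected single-channel, 16-bit audio")
        rate = wav.getframerate()
        samples = array.array("h", wav.readframes(wav.getnframes()))
    # WAV sample data is little-endian; swap on big-endian hosts so the
    # output is in native byte order.
    if sys.byteorder == "big":
        samples.byteswap()
    with open(raw_path, "wb") as raw:
        samples.tofile(raw)
    return rate
```

For instance, wav_to_raw("input.wav", "input.raw") would produce a file that could then be passed to AudioFile. The returned rate can be compared against the sampling rate the model expects.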

An example of a keyword search:

from pocketsphinx import AudioFile

audio = AudioFile("goforward.raw", keyphrase='forward', kws_threshold=1e-20)
for phrase in audio:
    print(phrase.segments(detailed=True)) # => "[('forward', -617, 63, 121)]"

With your own model and dictionary:

from pocketsphinx import AudioFile, get_model_path

config = {
    'verbose': False,
    'audio_file': 'goforward.raw',
    'hmm': get_model_path('en-us'),
    'lm': get_model_path('en-us.lm.bin'),
    'dict': get_model_path('cmudict-en-us.dict')
}

audio = AudioFile(**config)
for phrase in audio:
    print(phrase)
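
When supplying your own paths in a config dictionary like the one above, it can be useful to confirm that they exist before constructing AudioFile. The helper below is a hypothetical sketch, not part of the pocketsphinx API. It assumes the keys used in the example ('audio_file', 'hmm', 'lm', 'dict') and only tests for existence, since the acoustic model is typically a directory while the other entries are files.

```python
import os


def missing_paths(config, keys=("audio_file", "hmm", "lm", "dict")):
    """Return the config keys whose values do not point to an existing path."""
    missing = []
    for key in keys:
        path = config.get(key)
        # os.path.exists() accepts both files and directories.
        if path is None or not os.path.exists(path):
            missing.append(key)
    return missing
```

An empty list from missing_paths(config) suggests that every listed path is present. It does not check that the files are valid models.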

Convert frame indices into time coordinates:

from pocketsphinx import AudioFile

# Frames per Second
fps = 100

for phrase in AudioFile(frate=fps):  # frate (default=100)
    print('-' * 28)
    print('| %5s |  %3s  |   %4s   |' % ('start', 'end', 'word'))
    print('-' * 28)
    for s in phrase.seg():
        print('| %4ss | %4ss | %8s |' % (s.start_frame / fps, s.end_frame / fps, s.word))
    print('-' * 28)

# ----------------------------
# | start |  end  |   word   |
# ----------------------------
# |  0.0s | 0.24s | <s>      |
# | 0.25s | 0.45s | <sil>    |
# | 0.46s | 0.63s | go       |
# | 0.64s | 1.16s | forward  |
# | 1.17s | 1.52s | ten      |
# | 1.53s | 2.11s | meters   |
# | 2.12s |  2.6s | </s>     |
# ----------------------------
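
The conversion itself is just a division of the frame index by the frame rate. As a small sketch that runs without audio, the hypothetical helper below (segment_times is not part of the library) applies it to plain (word, start_frame, end_frame) tuples, using frame numbers taken from the table above and assuming the default rate of 100 frames per second.

```python
def segment_times(segments, frate=100):
    """Convert (word, start_frame, end_frame) tuples to times in seconds."""
    return [(word, start / frate, end / frate) for word, start, end in segments]


print(segment_times([("go", 46, 63), ("forward", 64, 116)]))
# => [('go', 0.46, 0.63), ('forward', 0.64, 1.16)]
```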

Authors

PocketSphinx is ultimately based on Sphinx-II, which in turn was based on some older systems at Carnegie Mellon University. These were released as free software under a BSD-like license thanks to the efforts of Kevin Lenzo. Much of the decoder in particular was written by Ravishankar Mosur (look for "rkm" in the comments), but various other people contributed as well; see the AUTHORS file for more details.

David Huggins-Daines (the author of this document) is guilty^H^H^H^H^Hresponsible for creating PocketSphinx, which added various speed and memory optimizations, fixed-point computation, JSGF support, portability to various platforms, and a somewhat coherent API. He then disappeared for a while.

Nickolay Shmyrev took over maintenance for quite a long time afterwards, and a lot of code was contributed by Alexander Solovets, Vyacheslav Klimkov, and others. The pocketsphinx-python module was originally written by Dmitry Prazdnichnov.

Currently this is maintained by David Huggins-Daines again.
