disvoice-prosody

A pip-installable version of the prosody function from jcvasquezc's DisVoice library

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

Prosody features

prosody.py

Compute prosody features from continuous speech based on duration, fundamental frequency and energy.

Static or dynamic features can be computed:

The static feature matrix contains 103 features:

Num	Feature	Description

Features based on F0
1-6	F0 contour	Avg., Std., Max., Min., Skewness, Kurtosis
7-12	Tilt of a linear estimation of F0 for each voiced segment	Avg., Std., Max., Min., Skewness, Kurtosis
13-18	MSE of a linear estimation of F0 for each voiced segment	Avg., Std., Max., Min., Skewness, Kurtosis
19-24	F0 on the first voiced segment	Avg., Std., Max., Min., Skewness, Kurtosis
25-30	F0 on the last voiced segment	Avg., Std., Max., Min., Skewness, Kurtosis

Features based on energy
31-34	Energy contour for voiced segments	Avg., Std., Skewness, Kurtosis
35-38	Tilt of a linear estimation of the energy contour for voiced segments	Avg., Std., Skewness, Kurtosis
39-42	MSE of a linear estimation of the energy contour for voiced segments	Avg., Std., Skewness, Kurtosis
43-48	Energy on the first voiced segment	Avg., Std., Max., Min., Skewness, Kurtosis
49-54	Energy on the last voiced segment	Avg., Std., Max., Min., Skewness, Kurtosis
55-58	Energy contour for unvoiced segments	Avg., Std., Skewness, Kurtosis
59-62	Tilt of a linear estimation of the energy contour for unvoiced segments	Avg., Std., Skewness, Kurtosis
63-66	MSE of a linear estimation of the energy contour for unvoiced segments	Avg., Std., Skewness, Kurtosis
67-72	Energy on the first unvoiced segment	Avg., Std., Max., Min., Skewness, Kurtosis
73-78	Energy on the last unvoiced segment	Avg., Std., Max., Min., Skewness, Kurtosis

Features based on duration
79	Voiced rate	Number of voiced segments per second
80-85	Duration of voiced segments	Avg., Std., Max., Min., Skewness, Kurtosis
86-91	Duration of unvoiced segments	Avg., Std., Max., Min., Skewness, Kurtosis
92-97	Duration of pauses	Avg., Std., Max., Min., Skewness, Kurtosis
98-103	Duration ratios	Pause/(Voiced+Unvoiced), Pause/Unvoiced, Unvoiced/(Voiced+Unvoiced), Voiced/(Voiced+Unvoiced), Voiced/Pause, Unvoiced/Pause
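
Most rows of the static table apply the same handful of statistics to a contour or to per-segment values, and the tilt and MSE rows come from a straight-line fit to each segment. The sketch below illustrates both ideas with the standard library only. The function names, the population (divide-by-n) convention, the use of excess kurtosis, and the toy F0 values are all assumptions made for illustration; the library's own implementation may differ.

```python
import math


def six_stats(values):
    """Return (avg, std, max, min, skewness, kurtosis) of a sequence."""
    n = len(values)
    mean = sum(values) / n
    # Population variance (divide by n); the library may use another convention.
    var = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(var)
    if std == 0.0:
        # Skewness and kurtosis are undefined for a constant sequence; use 0.0.
        return mean, 0.0, max(values), min(values), 0.0, 0.0
    skew = sum(((v - mean) / std) ** 3 for v in values) / n
    # Excess kurtosis (0.0 for a normal distribution).
    kurt = sum(((v - mean) / std) ** 4 for v in values) / n - 3.0
    return mean, std, max(values), min(values), skew, kurt


def tilt_and_mse(contour):
    """Fit y = tilt * x + intercept by least squares; return (tilt, mse).

    x is the frame index, so the contour needs at least two frames.
    """
    n = len(contour)
    xs = range(n)
    x_mean = (n - 1) / 2.0
    y_mean = sum(contour) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, contour))
    tilt = sxy / sxx
    intercept = y_mean - tilt * x_mean
    mse = sum((y - (tilt * x + intercept)) ** 2 for x, y in zip(xs, contour)) / n
    return tilt, mse


# Toy F0 values (Hz) for two voiced segments; not real data.
segments = [[100.0, 102.0, 104.0, 106.0], [120.0, 118.0, 119.0, 115.0]]

# Statistics of the whole F0 contour (analogous to features 1-6).
f0_stats = six_stats([f for seg in segments for f in seg])

# Statistics of the per-segment tilts (analogous to features 7-12).
tilts = [tilt_and_mse(seg)[0] for seg in segments]
tilt_stats = six_stats(tilts)
```

The energy rows that list only four statistics would simply drop the maximum and minimum from the same tuple.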

The dynamic feature matrix contains 13 features computed for each voiced segment:

1. Duration of the voiced segment
2-7. Coefficients of a 5-degree Lagrange polynomial that models the F0 contour
8-13. Coefficients of a 5-degree Lagrange polynomial that models the energy contour

Dynamic prosody features are based on Najim Dehak, "Modeling Prosodic Features With Joint Factor Analysis for Speaker Verification", 2007
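
To make the shape of one dynamic feature row concrete, the sketch below builds a 13-value vector for a single voiced segment: one duration plus six coefficients each for the F0 and energy contours. It uses an ordinary least-squares fit with `numpy.polyfit` as a stand-in for the polynomial model named above; whether the library uses this basis is not confirmed here. The function name, the normalized time axis, the 10 ms frame shift, and the contour values are all assumptions for illustration.

```python
import numpy as np


def contour_coefficients(contour, degree=5):
    """Least-squares polynomial coefficients for one voiced-segment contour."""
    contour = np.asarray(contour, dtype=float)
    # Normalized time axis in [-1, 1] so segments of different length are comparable.
    t = np.linspace(-1.0, 1.0, num=len(contour))
    # A degree-5 fit returns 6 coefficients, highest power first.
    return np.polyfit(t, contour, degree)


# Toy contours, not real measurements: 20 frames at a hypothetical 10 ms hop.
frame_shift_s = 0.01
f0 = 120.0 + 10.0 * np.sin(np.linspace(0.0, np.pi, 20))
energy = 60.0 + 5.0 * np.cos(np.linspace(0.0, np.pi, 20))

duration_s = len(f0) * frame_shift_s
dynamic_row = np.concatenate(
    ([duration_s], contour_coefficients(f0), contour_coefficients(energy))
)
```

Stacking one such row per voiced segment gives a matrix with 13 columns and as many rows as there are voiced segments in the recording.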

Notes:

The fundamental frequency is computed with the PRAAT algorithm. To use the RAPT method instead, change the "self.pitch_method" variable in the class constructor.
When the output format is set to "kaldi", two files are generated: the ".ark" file with the data in binary format and the ".scp" Kaldi script file.

Running

The script is called as follows:

python prosody.py <file_or_folder_audio> <file_features> <static (true or false)> <plots (true or false)> <format (csv, txt, npy, kaldi, torch)>

Examples:

Extract features in the command line

python prosody.py "../audios/001_ddk1_PCGITA.wav" "prosodyfeaturesAst.txt" "true" "true" "txt"
python prosody.py "../audios/001_ddk1_PCGITA.wav" "prosodyfeaturesUst.csv" "true" "true" "csv"
python prosody.py "../audios/001_ddk1_PCGITA.wav" "prosodyfeaturesUdyn.pt" "false" "true" "torch"

python prosody.py "../audios/" "prosodyfeaturesst.txt" "true" "false" "txt"
python prosody.py "../audios/" "prosodyfeaturesst.csv" "true" "false" "csv"
python prosody.py "../audios/" "prosodyfeaturesdyn.pt" "false" "false" "torch"
python prosody.py "../audios/" "prosodyfeaturesdyn.csv" "false" "false" "csv"

KALDI_ROOT=/home/camilo/Camilo/codes/kaldi-master2
export PATH=$PATH:$KALDI_ROOT/src/featbin/
python prosody.py "../audios/001_ddk1_PCGITA.wav" "prosodyfeaturesUdyn" "false" "false" "kaldi"

python prosody.py "../audios/" "prosodyfeaturesdyn" "false" "false" "kaldi"
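
When many such calls are issued from another Python program, it can help to assemble the five positional arguments in one place. The helper below is a hypothetical convenience wrapper, not part of the package: it only builds the argument list in the order documented above and rejects formats that the usage line does not mention.

```python
import sys

VALID_FORMATS = ("csv", "txt", "npy", "kaldi", "torch")


def build_prosody_command(audio, output, static, plots, fmt):
    """Return the argv list for one prosody.py call, in the documented order."""
    if fmt not in VALID_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    # The script expects the literal strings "true" / "false" for the two flags.
    flags = ["true" if static else "false", "true" if plots else "false"]
    return [sys.executable, "prosody.py", audio, output, *flags, fmt]


cmd = build_prosody_command(
    "../audios/", "prosodyfeaturesst.csv", static=True, plots=False, fmt="csv"
)
# To execute it: subprocess.run(cmd, check=True)
# (this requires prosody.py to be present in the working directory)
```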

Extract features directly in Python

from prosody import Prosody

prosody = Prosody()
file_audio = "../audios/001_ddk1_PCGITA.wav"

# Static features as a NumPy array
features1 = prosody.extract_features_file(file_audio, static=True, plots=True, fmt="npy")
# Static features as a DataFrame
features2 = prosody.extract_features_file(file_audio, static=True, plots=True, fmt="dataframe")
# Dynamic features as a torch tensor
features3 = prosody.extract_features_file(file_audio, static=False, plots=True, fmt="torch")
# Dynamic features written to Kaldi files with the prefix "./test"
prosody.extract_features_file(file_audio, static=False, plots=False, fmt="kaldi", kaldi_file="./test")
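
A plain array of 103 static values is easier to inspect with labels attached. The sketch below generates one label per static feature by walking the static feature table in order. The label strings are hypothetical, and the assumption that the returned values follow the table order has not been checked against the library.

```python
SIX = ("avg", "std", "max", "min", "skewness", "kurtosis")
FOUR = ("avg", "std", "skewness", "kurtosis")

# (group label, statistics) in the order of the static feature table.
GROUPS = [
    ("F0", SIX), ("F0 tilt", SIX), ("F0 MSE", SIX),
    ("F0 first voiced", SIX), ("F0 last voiced", SIX),
    ("energy voiced", FOUR), ("energy tilt voiced", FOUR),
    ("energy MSE voiced", FOUR),
    ("energy first voiced", SIX), ("energy last voiced", SIX),
    ("energy unvoiced", FOUR), ("energy tilt unvoiced", FOUR),
    ("energy MSE unvoiced", FOUR),
    ("energy first unvoiced", SIX), ("energy last unvoiced", SIX),
]

names = [f"{group} {stat}" for group, stats in GROUPS for stat in stats]

# Duration block: one rate, three groups of six statistics, six ratios.
names.append("voiced rate")
for group in ("duration voiced", "duration unvoiced", "duration pause"):
    names.extend(f"{group} {stat}" for stat in SIX)
names.extend([
    "pause/(voiced+unvoiced)", "pause/unvoiced",
    "unvoiced/(voiced+unvoiced)", "voiced/(voiced+unvoiced)",
    "voiced/pause", "unvoiced/pause",
])

# Table numbering is 1-based, so table feature k corresponds to names[k - 1].
```

Under those assumptions, `dict(zip(names, features1))` would pair each label with its value for a single recording.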

Jupyter notebook

Results:

[Figure: prosody analysis from continuous speech, static features]

References

[1] N. Dehak, P. Dumouchel, and P. Kenny. "Modeling prosodic features with joint factor analysis for speaker verification." IEEE Transactions on Audio, Speech, and Language Processing 15.7 (2007): 2095-2103.

[2] J. R. Orozco-Arroyave, J. C. Vásquez-Correa et al. "NeuroSpeech: An open-source software for Parkinson's speech analysis." Digital Signal Processing (2017).

Release history

- 0.0.5: Dec 5, 2020
- 0.0.4: Dec 5, 2020 (this version)
- 0.0.3: Dec 5, 2020
- 0.0.2: Dec 5, 2020
- 0.0.1: Dec 5, 2020

Download files

Source Distribution

disvoice-prosody-0.0.4.tar.gz (16.2 kB)

Uploaded Dec 5, 2020 Source

Built Distribution

disvoice_prosody-0.0.4-py3-none-any.whl (17.3 kB)

Uploaded Dec 5, 2020 Python 3

File details

Details for the file disvoice-prosody-0.0.4.tar.gz.

File metadata

Download URL: disvoice-prosody-0.0.4.tar.gz
Upload date: Dec 5, 2020
Size: 16.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.54.0 CPython/3.8.6

File hashes

Hashes for disvoice-prosody-0.0.4.tar.gz
Algorithm	Hash digest
SHA256	`6ed3a5eec4c2483874cd228c3c3c94c3abc8828e5868b33188e6a42afec6891f`
MD5	`5ce67a83de1d89b6431816966d3f939f`
BLAKE2b-256	`315cfbe0e1a9aebad775eeb388c5aa8d3f41dd73182de6cbaad4acd16f29a711`

File details

Details for the file disvoice_prosody-0.0.4-py3-none-any.whl.

File metadata

Download URL: disvoice_prosody-0.0.4-py3-none-any.whl
Upload date: Dec 5, 2020
Size: 17.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.54.0 CPython/3.8.6

File hashes

Hashes for disvoice_prosody-0.0.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f7f514cd8f431eb0f1dedf5851d49ff52828e54c03ebeaed8728a8305f7c6bf6`
MD5	`dd4fd1dcb714cae1408b12ec69392d1a`
BLAKE2b-256	`e6a4ffb4e603788a7da1c5f73bddddf3caedd8801e5a38184a2fe24f96e79354`
