Emotion expression capture from multiple modalities.

Multimodal Emotion Expression Capture Amsterdam

mexca is an open-source Python package which aims to capture human emotion expressions from videos in a single pipeline.

How To Use Mexca

mexca implements the customizable yet easy-to-use Multimodal Emotion eXpression Capture Amsterdam (MEXCA) pipeline for extracting emotion expression features from videos. It contains building blocks that can be used to extract features for individual modalities (i.e., facial expressions, voice, and dialogue/spoken text). The blocks can also be integrated into a single pipeline to extract the features from all modalities at once. Next to extracting features, mexca can also identify the speakers shown in the video by clustering speaker and face representations. This allows users to compare emotion expressions across speakers, time, and contexts.

Please cite mexca if you use it for scientific or commercial purposes.

Quick Installation

Here, we explain briefly how to install mexca on your system. Detailed instructions can be found in the Installation Details section. mexca can be installed on Windows, macOS and Linux. We recommend Windows 10, macOS 12.6.x, or Ubuntu.

The package contains five components that must be installed explicitly [^1]. By default, only the base package is installed, which requires only a few dependencies. Even without their dependencies installed, the components can still be used through Docker containers, which must be downloaded from Docker Hub. We recommend this setup for users who have little experience installing Python packages or who simply want to try out the package quickly. Using the containers also adds stability to your program.

Requirements

mexca requires Python version >= 3.7 and <= 3.9. It further depends on FFmpeg (for video and audio processing), which is usually automatically installed through the MoviePy package (i.e., its imageio dependency). In case the automatic install fails, it must be installed manually.

To download and run the components as Docker containers, Docker must be installed on your system. Instructions on how to install Docker Desktop can be found here.

All components but the VoiceExtractor depend on PyTorch (version 1.12). Usually, it should be automatically installed when specifying any of these components. In case the installation fails, see the installation instructions on the PyTorch web page.

For the SpeakerIdentifier component, the library libsndfile must also be installed on Linux systems.

The SentimentExtractor component depends on the sentencepiece library, which is automatically installed if Git is installed on the system.
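
The requirements above can be checked before installing with a short script. This is a minimal sketch, not part of mexca: the helper names `python_version_supported` and `find_tools` are hypothetical, and the check only looks for executables on the system `PATH`. An FFmpeg binary installed automatically through the imageio dependency may not appear on `PATH`, so a "not found" result for `ffmpeg` does not necessarily mean it is missing.

```python
import shutil
import sys

# Supported Python range as stated in the requirements (inclusive on both ends).
MIN_PYTHON = (3, 7)
MAX_PYTHON = (3, 9)


def python_version_supported(version_info=sys.version_info):
    """Return True if the major.minor version lies within the supported range."""
    major_minor = (version_info[0], version_info[1])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


def find_tools(names=("ffmpeg", "docker", "git")):
    """Map each executable name to its location on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    print("Python version supported:", python_version_supported())
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```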

Installation

We recommend installing mexca in a new virtual environment to avoid dependency conflicts. The base package can be installed from PyPI via pip:

pip install mexca

The dependencies for the additional components can be installed via:

pip install "mexca[vid,spe,voi,tra,sen]"

The abbreviations indicate:

  • vid: FaceExtractor
  • spe: SpeakerIdentifier
  • voi: VoiceExtractor
  • tra: AudioTranscriber
  • sen: SentimentExtractor
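
If only some components are needed, the same abbreviations can be combined into a shorter extras list. The sketch below builds the requirement string from component names; the helper `requirement_for` is hypothetical and not part of mexca, and the mapping simply restates the list above.

```python
# Component names and their extras abbreviations, as listed above.
EXTRAS = {
    "FaceExtractor": "vid",
    "SpeakerIdentifier": "spe",
    "VoiceExtractor": "voi",
    "AudioTranscriber": "tra",
    "SentimentExtractor": "sen",
}


def requirement_for(components):
    """Build a pip requirement string for the given component names."""
    unknown = [name for name in components if name not in EXTRAS]
    if unknown:
        raise ValueError(f"Unknown component(s): {', '.join(unknown)}")
    if not components:
        # No components requested: only the base package is needed.
        return "mexca"
    extras = ",".join(EXTRAS[name] for name in components)
    return f"mexca[{extras}]"


if __name__ == "__main__":
    requirement = requirement_for(["FaceExtractor", "VoiceExtractor"])
    # Quote the requirement so shells do not interpret the square brackets.
    print(f'pip install "{requirement}"')
```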

To run the demo and example notebooks, install the Jupyter requirements via:

pip install "mexca[demo]"

Getting Started

If you would like to learn how to use mexca, take a look at our example notebook.

Note: mexca builds on pretrained models from the pyannote.audio package. Since release 2.1.1, downloading the pretrained models requires the user to accept two user agreements on Hugging Face hub and generate an authentication token. Therefore, to run the mexca pipeline, please accept the user agreements here and here. Then, generate an authentication token here. Use this token to log in to Hugging Face hub by running notebook_login() (from a Jupyter notebook) or huggingface-cli login (from the command line). You only need to log in when running mexca for the first time. See this link for details. When running container components, you need to supply the token explicitly as the value of the use_auth_token argument. We recommend storing the token on your system and accessing it from Python.
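
One way to keep the token out of source code is to store it in an environment variable and read it at runtime. This is a minimal sketch under assumptions: the helper `load_hf_token` is hypothetical and not part of mexca, and the variable name `HF_TOKEN` is chosen here only for illustration.

```python
import os


def load_hf_token(env_var="HF_TOKEN"):
    """Return the token stored in the environment variable `env_var`.

    Raises RuntimeError if the variable is unset or empty, so that a missing
    token is reported before any model download is attempted.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Environment variable {env_var} is not set; "
            "store your Hugging Face token there first."
        )
    return token
```

The returned string can then be passed as the use_auth_token argument instead of a hard-coded value, for example use_auth_token=load_hf_token().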

To create the MEXCA pipeline with container components and apply it to a video file, run the following code in a Jupyter notebook or a Python script (requires the base package and Docker):

from mexca.container import (AudioTranscriberContainer, FaceExtractorContainer,
                             SentimentExtractorContainer, SpeakerIdentifierContainer, 
                             VoiceExtractorContainer)
from mexca.pipeline import Pipeline

# Set path to video file
filepath = 'path/to/video'

# Create standard pipeline with two faces and speakers
pipeline = Pipeline(
    face_extractor=FaceExtractorContainer(num_faces=2),
    speaker_identifier=SpeakerIdentifierContainer(
        num_speakers=2,
        use_auth_token="HF_TOKEN" # Replace this string with your token
    ),
    voice_extractor=VoiceExtractorContainer(),
    audio_transcriber=AudioTranscriberContainer(),
    sentiment_extractor=SentimentExtractorContainer()
)

# Apply pipeline to video file at `filepath`
result = pipeline.apply(
    filepath,
    frame_batch_size=5,
    skip_frames=5
)

# Print merged features
print(result.features)

The result should be a pandas data frame printed to the console or notebook output. Details on the output and extracted features can be found here.

Components

The pipeline components are described here.

Documentation

The documentation of mexca can be found on Read the Docs.

Contributing

If you want to contribute to the development of mexca, have a look at the contribution guidelines.

License

The code is licensed under the Apache 2.0 License. This means that mexca can be used, modified and redistributed for free, even for commercial purposes.

Credits

mexca is being developed by the Netherlands eScience Center in collaboration with the Hot Politics Lab at the University of Amsterdam.

This package was created with Cookiecutter and the NLeSC/python-template.

[^1]: We explain the rationale for this setup in the Docker section.
