webvtt-to-json

Convert WebVTT to JSON, optionally removing duplicate lines

Installation

Install this tool using pip:

pip install webvtt-to-json

Usage

To output JSON for a WebVTT file:

webvtt-to-json subtitles.vtt

This writes the JSON to standard output. Use -o filename to write it to a file instead.

Subtitles can often include duplicate lines. Add -d or --dedupe to attempt to remove those duplicates from the output:

webvtt-to-json --dedupe subtitles.vtt

Use -s or --single to output single "line" keys instead of a "lines" array.

You can also use:

python -m webvtt_to_json ...
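
If you want to call the tool from another Python program, one option is to run the module form in a subprocess and parse what it prints. The following is a minimal sketch, assuming the package is installed in the current environment. The helper names build_command and load_cues are hypothetical and not part of this tool; only the --dedupe and --single options described above are used.

```python
import json
import subprocess
import sys


def build_command(vtt_path, dedupe=False, single=False):
    """Build the argument list for `python -m webvtt_to_json`."""
    # sys.executable keeps the call inside the current interpreter/venv.
    cmd = [sys.executable, "-m", "webvtt_to_json"]
    if dedupe:
        cmd.append("--dedupe")
    if single:
        cmd.append("--single")
    cmd.append(vtt_path)
    return cmd


def load_cues(vtt_path, dedupe=False, single=False):
    """Run the tool and return the parsed JSON it writes to standard output."""
    result = subprocess.run(
        build_command(vtt_path, dedupe=dedupe, single=single),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the tool exits non-zero
    )
    return json.loads(result.stdout)
```

For example, load_cues("subtitles.vtt", dedupe=True) should return a list of dictionaries shaped like the output shown in the Output section.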

Output

Default output (no options):

[
    {
        "start": "00:00:00.000",
        "end": "00:00:01.829",
        "lines": [
            " ",
            "my<00:00:00.160><c> career</c><00:00:00.480><c> in</c><00:00:00.640><c> side</c><00:00:00.880><c> projects</c><00:00:01.280><c> and</c><00:00:01.520><c> open</c>"
        ]
    }
]
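
The default output keeps the inline WebVTT timing and <c> tags inside each line. If you need plain text from that output yourself, a regular expression is usually enough. This is a minimal sketch, assuming that anything between angle brackets is markup; strip_tags is a hypothetical helper and is not necessarily how this tool cleans lines internally.

```python
import re

# Matches inline WebVTT markup such as <00:00:00.160>, <c> and </c>.
TAG_RE = re.compile(r"<[^>]+>")


def strip_tags(line: str) -> str:
    """Remove <...> markup and collapse runs of whitespace."""
    return " ".join(TAG_RE.sub("", line).split())


raw = (
    "my<00:00:00.160><c> career</c><00:00:00.480><c> in</c>"
    "<00:00:00.640><c> side</c><00:00:00.880><c> projects</c>"
    "<00:00:01.280><c> and</c><00:00:01.520><c> open</c>"
)
print(strip_tags(raw))  # my career in side projects and open
```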

--dedupe output:

[
    {
        "start": "00:00:01.829",
        "end": "00:00:01.839",
        "lines": ["my career in side projects and open"]
    }
]

--dedupe --single output:

[
    {
        "start": "00:00:01.829",
        "end": "00:00:01.839",
        "line": "my career in side projects and open"
    }
]
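
The JSON shapes above can be consumed with the standard library alone. The sketch below is illustrative, with hypothetical helper names (to_seconds, cue_text, timed_text): it accepts either the "lines" array or the single "line" key and converts the "start" timestamp to seconds.

```python
import json


def to_seconds(timestamp: str) -> float:
    """Convert 'HH:MM:SS.mmm' (or 'MM:SS.mmm') to seconds."""
    seconds = 0.0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def cue_text(cue: dict) -> str:
    """Return the text of one cue, whichever key it uses."""
    if "line" in cue:
        return cue["line"]
    # Skip lines that are empty or contain only whitespace.
    return " ".join(l.strip() for l in cue.get("lines", []) if l.strip())


def timed_text(json_text: str) -> list:
    """Parse the tool's JSON and return (start_seconds, text) pairs."""
    return [(to_seconds(c["start"]), cue_text(c)) for c in json.loads(json_text)]


sample = """
[
    {
        "start": "00:00:01.829",
        "end": "00:00:01.839",
        "lines": ["my career in side projects and open"]
    }
]
"""
for start, text in timed_text(sample):
    print(f"{start:.3f}s  {text}")
```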

Development

To contribute to this tool, first check out the code. Then create a new virtual environment:

cd webvtt-to-json
python -m venv venv
source venv/bin/activate

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest
