webvtt-to-json
Convert WebVTT to JSON, optionally removing duplicate lines
Installation
Install this tool using pip:
pip install webvtt-to-json
Usage
To output JSON for a WebVTT file:
webvtt-to-json subtitles.vtt
This writes the JSON to standard output. Use -o filename to write it to a file instead.
Subtitles often include duplicate lines. Add -d or --dedupe to attempt to remove those duplicates from the output:
webvtt-to-json --dedupe subtitles.vtt
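As a rough illustration of what removing duplicates can mean, the sketch below drops any line that already appeared in the previous cue. This is an assumption about one plausible approach, not the tool's actual algorithm; dedupe_cues is a hypothetical helper and the sample cues are made up for the example.

```python
def dedupe_cues(cues):
    """Drop lines repeated from the previous cue (hypothetical helper)."""
    result = []
    previous_lines = set()
    for cue in cues:
        # Keep only non-blank lines that the previous cue did not already show.
        fresh = [
            line
            for line in cue["lines"]
            if line.strip() and line not in previous_lines
        ]
        previous_lines = set(cue["lines"])
        if fresh:
            result.append({"start": cue["start"], "end": cue["end"], "lines": fresh})
    return result


# Made-up sample cues: the second cue repeats the first cue's line.
cues = [
    {"start": "00:00:00.000", "end": "00:00:02.000", "lines": ["hello there"]},
    {
        "start": "00:00:02.000",
        "end": "00:00:04.000",
        "lines": ["hello there", "general remarks"],
    },
]
print(dedupe_cues(cues))
```

Comparing only against the previous cue keeps the sketch simple; lines that legitimately recur later in the subtitles are left alone.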
Use -s or --single to output a single "line" key for each entry instead of a "lines" array.
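The relationship between the two shapes can be sketched in Python. The snippet below assumes the entries of a "lines" array are joined with a single space to form the "line" string; the tool's actual joining rule is not stated here, and to_single is a hypothetical name.

```python
def to_single(cue, separator=" "):
    """Return a copy of cue with "lines" collapsed into one "line" string."""
    single = {key: value for key, value in cue.items() if key != "lines"}
    # Assumption: lines are joined with a single space.
    single["line"] = separator.join(cue["lines"])
    return single


cue = {
    "start": "00:00:01.829",
    "end": "00:00:01.839",
    "lines": ["my career in side projects and open"],
}
print(to_single(cue))
```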
You can also run the tool as a Python module:
python -m webvtt_to_json ...
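To call the tool from a Python script, one option is to run the module form through subprocess. This is a minimal sketch: build_command and convert are hypothetical helpers, only the -o, --dedupe and --single options described above are used, and convert assumes the package is installed for the current interpreter.

```python
import json
import subprocess
import sys


def build_command(vtt_path, dedupe=False, single=False, output=None):
    """Build the argument list for `python -m webvtt_to_json` (hypothetical helper)."""
    command = [sys.executable, "-m", "webvtt_to_json"]
    if dedupe:
        command.append("--dedupe")
    if single:
        command.append("--single")
    if output is not None:
        command.extend(["-o", output])
    command.append(vtt_path)
    return command


def convert(vtt_path, dedupe=False, single=False):
    """Run the tool and parse the JSON it writes to standard output."""
    completed = subprocess.run(
        build_command(vtt_path, dedupe=dedupe, single=single),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the tool exits with an error
    )
    return json.loads(completed.stdout)


# Show the arguments that would follow the interpreter path.
print(build_command("subtitles.vtt", dedupe=True)[1:])
```

Using sys.executable rather than a bare "python" keeps the call inside the same virtual environment as the calling script.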
Output
Default output:
[
  {
    "start": "00:00:00.000",
    "end": "00:00:01.829",
    "lines": [
      " ",
      "my<00:00:00.160><c> career</c><00:00:00.480><c> in</c><00:00:00.640><c> side</c><00:00:00.880><c> projects</c><00:00:01.280><c> and</c><00:00:01.520><c> open</c>"
    ]
  }
]
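The default output keeps WebVTT's inline markup, such as the <00:00:00.160> timestamps and <c>...</c> spans shown above. If you need plain text from it, a small regular expression may be enough. This is a sketch that assumes the markup is limited to angle-bracket tags like those in the sample; strip_tags is a hypothetical helper, not part of the tool.

```python
import re

# Matches any angle-bracket tag: inline timestamps, <c>, </c>, and similar.
TAG_RE = re.compile(r"<[^>]+>")


def strip_tags(line):
    """Remove inline WebVTT tags and trim surrounding whitespace."""
    return TAG_RE.sub("", line).strip()


raw = (
    "my<00:00:00.160><c> career</c><00:00:00.480><c> in</c>"
    "<00:00:00.640><c> side</c><00:00:00.880><c> projects</c>"
    "<00:00:01.280><c> and</c><00:00:01.520><c> open</c>"
)
print(strip_tags(raw))  # my career in side projects and open
```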
--dedupe output:
[
  {
    "start": "00:00:01.829",
    "end": "00:00:01.839",
    "lines": ["my career in side projects and open"]
  }
]
--dedupe --single output:
[
  {
    "start": "00:00:01.829",
    "end": "00:00:01.839",
    "line": "my career in side projects and open"
  }
]
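The "start" and "end" values are strings in HH:MM:SS.mmm form, so downstream code usually needs to convert them before doing arithmetic. A minimal sketch, assuming every timestamp has exactly the hours, minutes, seconds and milliseconds fields shown above; to_milliseconds is a hypothetical helper.

```python
import json


def to_milliseconds(timestamp):
    """Convert an "HH:MM:SS.mmm" string to integer milliseconds."""
    clock, millis = timestamp.split(".")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + int(millis)


document = """
[
  {
    "start": "00:00:01.829",
    "end": "00:00:01.839",
    "line": "my career in side projects and open"
  }
]
"""

for cue in json.loads(document):
    duration = to_milliseconds(cue["end"]) - to_milliseconds(cue["start"])
    print(cue["line"], duration)  # duration is 10 for this cue
```

Working in integer milliseconds avoids the rounding noise that floating-point seconds would introduce when subtracting nearby timestamps.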
Development
To contribute to this tool, first check out the code. Then create a new virtual environment:
cd webvtt-to-json
python -m venv venv
source venv/bin/activate
Now install the dependencies and test dependencies:
pip install -e '.[test]'
To run the tests:
pytest
Hashes for webvtt_to_json-0.2-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 28dc9b38318854850e237a28a5350e06a40711e5616b70e86bb37a42bdf0f276
MD5 | ef014aadc081f3b67b0d5f5a674bbddf
BLAKE2b-256 | 5ee7dd630a459f8bac81373a5fd0ba706f68be2eff12d6aff3e14cf3a14c9012