Download YouTube metadata for videos relating to a search query
This is a Python script that downloads metadata (including comments and likes) for YouTube videos matching a search query. It uses the YouTube Data API v3 and saves the metadata in a PostgreSQL database.
Metatube pauses retrieval once your daily quota is used up (the default as of this writing is 10,000 quota units per day) and waits until the quota is refilled. If interrupted, metatube will, upon restart, first fill gaps in the download history, then continue downloading “into the future”. Once it has caught up to within ten minutes of the current time, metatube exits.
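The wait-until-refill behaviour can be sketched as follows. This is a minimal illustration, not metatube's actual implementation: it assumes the daily quota resets at midnight Pacific Time, and the function names `seconds_until_next_midnight` and `wait_for_quota_refill` are hypothetical.

```python
import datetime
import time
from zoneinfo import ZoneInfo


def seconds_until_next_midnight(now, tz):
    """Seconds from the timezone-aware datetime `now` until the next midnight in `tz`."""
    local_now = now.astimezone(tz)
    next_day = local_now.date() + datetime.timedelta(days=1)
    next_midnight = datetime.datetime.combine(
        next_day, datetime.time.min, tzinfo=tz
    )
    return (next_midnight - now).total_seconds()


def wait_for_quota_refill(sleep=time.sleep):
    """Block until the assumed quota reset time."""
    # Assumption: the quota resets at midnight Pacific Time.
    pacific = ZoneInfo("America/Los_Angeles")
    now = datetime.datetime.now(datetime.timezone.utc)
    sleep(seconds_until_next_midnight(now, pacific))
```

Passing `sleep` as a parameter keeps the waiting logic testable without actually blocking for hours.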
If you use metatube for scientific research, please cite it in your publication:
Fink, C. (2020): metatube: Python script to download YouTube metadata. doi:10.5281/zenodo.3773302.
Dependencies
The script is written in Python 3 and depends on the Python modules dateparser, psycopg2, PyYAML and Requests.
To install dependencies on a Debian-based system, run:
apt-get update -y &&
apt-get install -y python3-dev python3-pip python3-virtualenv
(An Arch Linux AUR package pulls in all dependencies; see Installation below.)
Installation
- Using pip or similar:
  pip3 install metatube
- OR: manually:
  - Clone this repository:
    git clone https://gitlab.com/helics-lab/metatube.git
  - Change to the cloned directory
  - Use the Python setuptools to install the package:
    cd metatube
    python3 ./setup.py install
- OR: (Arch Linux only) from AUR:
  # e.g. using yay
  yay python-metatube
Configuration
Copy the example configuration file metatube.yml.example to a suitable location, depending on your operating system:
- on Linux systems:
  - system-wide configuration: /etc/metatube.yml
  - per-user configuration: ~/.config/metatube.yml OR ${XDG_CONFIG_HOME}/metatube.yml
- on macOS systems:
  - per-user configuration: ${XDG_CONFIG_HOME}/metatube.yml
- on Microsoft Windows systems:
  - per-user configuration: %APPDATA%\metatube.yml
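The locations listed above can be collected programmatically. The following is a rough sketch only, not the lookup code metatube itself uses: the function name `candidate_config_paths`, its parameters, and the order of the returned paths are all assumptions made for illustration.

```python
import os
import sys
from pathlib import Path


def candidate_config_paths(platform=sys.platform, environ=None, home=None):
    """List the documented config file locations for the given platform."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    paths = []

    if platform.startswith("win"):
        # Windows: %APPDATA%\metatube.yml
        if "APPDATA" in environ:
            paths.append(Path(environ["APPDATA"]) / "metatube.yml")
        return paths

    if platform.startswith("linux"):
        # Linux: system-wide file, then ~/.config
        paths.append(Path("/etc/metatube.yml"))
        paths.append(home / ".config" / "metatube.yml")

    # Linux and macOS: ${XDG_CONFIG_HOME}/metatube.yml, if the variable is set
    if "XDG_CONFIG_HOME" in environ:
        xdg_path = Path(environ["XDG_CONFIG_HOME"]) / "metatube.yml"
        if xdg_path not in paths:
            paths.append(xdg_path)
    return paths
```

Calling `candidate_config_paths()` without arguments uses the current platform and environment; the explicit parameters exist so the other platforms' results can be inspected from any machine.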
Adapt the configuration:
- Configure a PostgreSQL connection string (connection_string), pointing to an existing database.
- Configure an API access key to the YouTube Data API v3 (youtube_api_key).
- Define search terms (search_terms).
All of these configuration options can alternatively be supplied as command line arguments to metatube (see Usage) or as a config dict passed directly to the constructor of YoutubeVideoMetadataDownloader. Command line options (see metatube --help) and the config dict both override the config file.
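The precedence rule (command line options or a config dict win over the config file) can be illustrated with a small sketch. The helper `merge_config` is hypothetical and not part of metatube; skipping `None` values is an assumption that mirrors how unset command line options are commonly represented.

```python
def merge_config(file_config, overrides):
    """Return a new dict in which non-None `overrides` replace file values."""
    merged = dict(file_config)  # copy, so the file config stays untouched
    merged.update(
        {key: value for key, value in overrides.items() if value is not None}
    )
    return merged


# Values as they might be read from metatube.yml (example values only)
file_config = {
    "connection_string": "dbname=metatube",
    "youtube_api_key": "abcdefghijklmn",
    "search_terms": "how to build a tallbike",
}

# Only `search_terms` is overridden; the unset API key falls back to the file
overrides = {"search_terms": "Critical Mass Vladivostok", "youtube_api_key": None}

config = merge_config(file_config, overrides)
```

Here `config["search_terms"]` comes from the overrides, while the connection string and API key are still taken from the file.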
Usage
Command line executable
metatube \
--postgresql-connection-string "dbname=metatube" \
--youtube-api-key "abcdefghijklmn" \
"how to build a tallbike"
Python
Import the metatube module. Instantiate a YoutubeVideoMetadataDownloader, optionally supplying a config dictionary. Then run the instance’s download() method.
import metatube

# config from config file
downloader = metatube.YoutubeVideoMetadataDownloader()
downloader.download()

# config from config file,
# overriding `search_terms`
downloader = metatube.YoutubeVideoMetadataDownloader({
    "search_terms": "Critical Mass Vladivostok"
})
downloader.download()

# entire config from dictionary
downloader = metatube.YoutubeVideoMetadataDownloader({
    "youtube_api_key": "opqrstuvwxyz",
    "connection_string": "dbname=metatube host=server1 user=bicyclelover123",
    "search_terms": "dashcam bicycle commute albuquerque"
})
downloader.download()