
Download YouTube metadata for videos relating to a search query

This is a Python script that downloads metadata (including comments and likes) for YouTube videos matching a search query. It uses the YouTube Data API v3 and saves the metadata in a PostgreSQL database.

metatube is designed to pause retrieval once your daily API quota is used up (the default as of this writing is 10,000 units per day) and to resume once the quota has been refilled.
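The pause-and-resume behaviour described above can be pictured with a minimal sketch. This is not metatube's actual implementation: the QuotaExceeded exception, the fetch_with_quota_pause helper and the one-hour wait are all hypothetical names and values chosen for illustration.

```python
import time


class QuotaExceeded(Exception):
    """Hypothetical error a fetch function raises when the daily quota is used up."""


def fetch_with_quota_pause(fetch, wait_seconds=3600, sleep=time.sleep, max_waits=24):
    """Call fetch(); on QuotaExceeded, wait and retry instead of giving up.

    fetch:        zero-argument callable performing one API request (assumed)
    wait_seconds: pause between retries (assumed value, not metatube's)
    sleep:        injectable so the loop can be tested without real waiting
    max_waits:    upper bound on pauses before the error is re-raised
    """
    waits = 0
    while True:
        try:
            return fetch()
        except QuotaExceeded:
            if waits >= max_waits:
                raise
            waits += 1
            sleep(wait_seconds)
```

Passing sleep as a parameter keeps the retry loop testable; in real use the default time.sleep would block until the next attempt.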

If you use metatube for scientific research, please cite it in your publication:
Fink, C. (2020): metatube: Python script to download YouTube metadata. doi:10.5281/zenodo.3773302.

Dependencies

The script is written in Python 3 and depends on the Python modules dateparser, psycopg2, PyYAML and Requests.

To install dependencies on a Debian-based system, run:

apt-get update -y &&
apt-get install -y python3-dev python3-pip python3-virtualenv

(There is an Arch Linux AUR package that pulls in all dependencies; see Installation below.)

Installation

  • using pip or similar:
pip3 install metatube
  • OR: manually:

    • Clone this repository
    git clone https://gitlab.com/helics-lab/metatube.git
    
    • Change to the cloned directory and use Python setuptools to install the package:
    cd metatube
    python3 ./setup.py install
    
  • OR: (Arch Linux only) from AUR:

# e.g. using yay
yay python-metatube

Configuration

Copy the example configuration file metatube.yml.example to a suitable location, depending on your operating system:

  • on Linux systems:
    • system-wide configuration: /etc/metatube.yml
    • per-user configuration:
      • ~/.config/metatube.yml OR
      • ${XDG_CONFIG_HOME}/metatube.yml
  • on macOS systems:
    • per-user configuration:
      • ${XDG_CONFIG_HOME}/metatube.yml
  • on Microsoft Windows systems:
    • per-user configuration: %APPDATA%\metatube.yml
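The locations listed above can be expressed as a small lookup function. This is a sketch only: candidate_config_paths is a hypothetical helper, and both the search order and the fallback from XDG_CONFIG_HOME to ~/.config on Linux are assumptions, not documented metatube behaviour.

```python
import os
import sys


def candidate_config_paths(platform=None, env=None):
    """Return the metatube.yml locations from the list above for one platform.

    platform: a sys.platform style string ("linux", "darwin", "win32")
    env:      mapping of environment variables (defaults to os.environ)
    """
    platform = sys.platform if platform is None else platform
    env = os.environ if env is None else env
    xdg = env.get("XDG_CONFIG_HOME")
    paths = []

    if platform.startswith("linux"):
        paths.append("/etc/metatube.yml")  # system-wide configuration
        if xdg:
            paths.append(xdg.rstrip("/") + "/metatube.yml")
        else:
            # assumed fallback when XDG_CONFIG_HOME is not set
            paths.append(os.path.expanduser("~/.config/metatube.yml"))
    elif platform == "darwin":
        if xdg:
            paths.append(xdg.rstrip("/") + "/metatube.yml")
    elif platform.startswith("win"):
        appdata = env.get("APPDATA")
        if appdata:
            paths.append(appdata.rstrip("\\") + "\\metatube.yml")

    return paths
```

Calling candidate_config_paths() without arguments inspects the current platform and environment; passing both explicitly makes it easy to check what another system would use.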

Adapt the configuration:

  • Configure a PostgreSQL connection string (connection_string) pointing to an existing database.
  • Configure an API access key for the YouTube Data API v3 (youtube_api_key).
  • Define search terms (search_terms).

All of these configuration options can alternatively be supplied as command line arguments to metatube (see Usage) or as a config dict passed directly to the constructor of YoutubeVideoMetadataDownloader. Command line options (see metatube --help) and the config dict both override the config file.
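The override rule can be illustrated with a short sketch. The effective_config helper is hypothetical and not part of metatube; treating all three options as required, and ignoring overrides set to None, are assumptions made for the example.

```python
REQUIRED_KEYS = ("connection_string", "youtube_api_key", "search_terms")


def effective_config(file_config, overrides=None):
    """Merge a config-file dict with overrides (command line or config dict).

    Values in overrides win over values from the config file; overrides
    that are None are treated as "not supplied" (an assumption).
    """
    merged = dict(file_config)
    for key, value in (overrides or {}).items():
        if value is not None:
            merged[key] = value

    missing = [key for key in REQUIRED_KEYS if key not in merged]
    if missing:
        raise ValueError("missing configuration options: " + ", ".join(missing))
    return merged
```

For example, a config file that defines all three options combined with an override of only search_terms yields the file's connection string and API key together with the new search terms.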

Usage

Command line executable

metatube \
    --postgresql-connection-string "dbname=metatube" \
    --youtube-api-key "abcdefghijklmn" \
    "how to build a tallbike"

Python

Import the metatube module, instantiate a YoutubeVideoMetadataDownloader (optionally supplying a config dictionary), then run the instance’s download() method.

import metatube

# config from config file
downloader = metatube.YoutubeVideoMetadataDownloader()
downloader.download()

# config from config file,
# overriding `search_terms`
downloader = metatube.YoutubeVideoMetadataDownloader({
    "search_terms": "Critical Mass Vladivostok"
})
downloader.download()

# entire config from dictionary
downloader = metatube.YoutubeVideoMetadataDownloader({
    "youtube_api_key": "opqrstuvwxyz",
    "connection_string": "dbname=metatube host=server1 user=bicyclelover123",
    "search_terms": "dashcam bicycle commute albuquerque"
})
downloader.download()

