Webcomic downloader

webcomix

Description

webcomix is a webcomic downloader that can also package a downloaded comic into a .cbz (Comic Book ZIP) file.
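
A .cbz file is an ordinary ZIP archive of page images that comic readers display in file-name order. The sketch below shows that idea using only the standard library. It is not webcomix's own code, and the function name make_cbz and the example paths are assumptions made for illustration.

```python
import zipfile
from pathlib import Path


def make_cbz(image_dir, cbz_path):
    """Pack every file in image_dir into a .cbz archive, sorted by file name.

    Hypothetical helper for illustration; returns the archived names in order.
    """
    image_dir = Path(image_dir)
    # List the pages before creating the archive, so the archive itself is
    # never picked up even if cbz_path lives inside image_dir.
    pages = sorted(p for p in image_dir.iterdir() if p.is_file())
    # ZIP_STORED skips recompression: image formats are already compressed.
    with zipfile.ZipFile(cbz_path, "w", compression=zipfile.ZIP_STORED) as archive:
        for page in pages:
            archive.write(page, arcname=page.name)
    return [page.name for page in pages]
```

For example, make_cbz("xkcd", "xkcd.cbz") would archive every file found in a folder named xkcd.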

Notice

This program is for personal use only. Please be aware that by making the downloaded comics publicly available without the permission of the author, you may be infringing upon various copyrights.

Installation

Dependencies

  • Python (3.5 or newer)
  • click
  • scrapy (some additional installation steps might be required for this package; see the Scrapy installation guide)
  • scrapy-splash
  • scrapy-fake-useragent
  • tqdm

Process

End user

  1. Install Python 3
  2. Install the command line interface tool with pip install webcomix

Developer

  1. Install Python 3
  2. Clone this repository and open a terminal in its directory
  3. Install poetry with pip install poetry
  4. Download the dependencies by running poetry install
  5. Install pre-commit hooks with pre-commit install

Usage

webcomix [OPTIONS] COMMAND [ARGS]

Global Flags

help

Show the help message and exit.

version

Show the version number and exit.

Commands

comics

Shows all predefined comics which can be used with the download command.

download

Downloads a predefined comic. Supports the --cbz flag, which creates a .cbz archive of the downloaded comic.

search

Searches for an XPath that can download the whole comic. Supports the --cbz flag, which creates a .cbz archive of the downloaded comic; -s, which verifies only the provided page of the comic; and -y, which skips the verification prompt.

custom

Downloads a user-defined comic. To download a specific comic, you'll need a link to the first page, an XPath expression that selects the link to the next page, and an XPath expression that selects the link to the image. See "Making an XPath selector" for more information. Supports the --cbz flag, which creates a .cbz archive of the downloaded comic; -s, which verifies only the provided page of the comic; and -y, which skips the verification prompt.
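
To see how the three inputs fit together, here is a rough sketch of a crawl loop: start at the first page, collect the images the image path selects, then follow the link the next-page path selects until there is none. This is not how webcomix is implemented (it uses scrapy). The PAGES dictionary is a made-up stand-in for HTTP responses, and because the standard library's ElementTree supports only a subset of XPath and cannot select attributes, the paths select elements and the code reads href and src from them.

```python
import xml.etree.ElementTree as ET

# Hypothetical three-page comic keyed by URL (stands in for real HTTP responses).
PAGES = {
    "/1": ("<html><body><div id='comic'><img src='/img/1.png'/></div>"
           "<a rel='next' href='/2'>Next</a></body></html>"),
    "/2": ("<html><body><div id='comic'><img src='/img/2.png'/></div>"
           "<a rel='next' href='/3'>Next</a></body></html>"),
    "/3": "<html><body><div id='comic'><img src='/img/3.png'/></div></body></html>",
}


def crawl(pages, start_url, next_link_path, image_path):
    """Follow next-page links from start_url and collect image URLs in order."""
    image_urls = []
    seen = set()  # guards against a comic whose last page links back to itself
    url = start_url
    while url in pages and url not in seen:
        seen.add(url)
        root = ET.fromstring(pages[url])
        image_urls.extend(img.get("src") for img in root.findall(image_path))
        next_link = root.find(next_link_path)
        url = next_link.get("href") if next_link is not None else None
    return image_urls


print(crawl(PAGES, "/1", ".//a[@rel='next']", ".//div[@id='comic']/img"))
```

The loop stops when a page has no next link, when the link points to an unknown page, or when it points to a page that was already visited.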

Examples

  • webcomix download xkcd
  • webcomix search xkcd --start-url=http://xkcd.com/1/
  • webcomix custom --cbz (You will be prompted about other needed arguments)
  • webcomix custom xkcd --start-url=http://xkcd.com/1/ --next-page-xpath="//a[@rel='next']/@href" --image-xpath="//div[@id='comic']//img/@src" --cbz (Same as before, but with all arguments declared beforehand)
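
If you want to script the last example, one option is to assemble the argument list in Python and hand it to subprocess. The sketch below uses only the flags shown above; build_custom_command is a hypothetical helper, not part of webcomix, and it needs Python 3.8 or newer for shlex.join.

```python
import shlex


def build_custom_command(name, start_url, next_page_xpath, image_xpath, cbz=False):
    """Return the argument list for a `webcomix custom` invocation."""
    command = [
        "webcomix", "custom", name,
        "--start-url=" + start_url,
        "--next-page-xpath=" + next_page_xpath,
        "--image-xpath=" + image_xpath,
    ]
    if cbz:
        command.append("--cbz")
    return command


command = build_custom_command(
    "xkcd",
    "http://xkcd.com/1/",
    "//a[@rel='next']/@href",
    "//div[@id='comic']//img/@src",
    cbz=True,
)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(command))
# To run it (requires webcomix to be installed):
#   import subprocess; subprocess.run(command, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems with the quotes and brackets inside the XPath expressions.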

Making an XPath selector

Using an HTML inspector, find an XPath to the next-page link's href attribute and another to the comic image's src attribute.

e.g.: //div[@class='foo']/img/@src selects the src attribute of every img element that is a direct child of a div whose class attribute is exactly foo.
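
The selection can be tried out on a small, made-up page. The sketch below uses the standard library's ElementTree as a stand-in for scrapy's selectors. ElementTree supports only a subset of XPath and cannot select attributes with /@src, so the code selects the img elements and then reads their src attribute. The PAGE markup is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page with one comic image and one unrelated image.
PAGE = """
<html><body>
  <div class="foo"><img src="/comics/page1.png"/></div>
  <div class="bar"><img src="/ads/banner.png"/></div>
</body></html>
"""


def image_sources(page_source):
    """Return the src of each img that is a direct child of div class='foo'."""
    root = ET.fromstring(page_source.strip())
    return [img.get("src") for img in root.findall(".//div[@class='foo']/img")]


print(image_sources(PAGE))  # ['/comics/page1.png']
```

Only the image inside the div with class foo is returned; the image in the other div is ignored.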

Note: webcomix works best on static websites, since scrapy (the framework we use to crawl web pages) doesn't process JavaScript.

To check that your XPath is correct, test it in the scrapy shell, which is installed along with webcomix.

scrapy shell <website> --> Opens the shell on the website's URL.
> response.body --> Returns the HTML of the page.
> response.xpath("<your XPath>") --> Tests an XPath selection. If you get [], your XPath expression didn't match anything on the page.

Downloading comics on Javascript-heavy websites

If the webcomic's website uses JavaScript to render its images, you won't be able to download it using the default configuration. The custom and search commands accept an optional -j flag that executes the JavaScript using scrapy-splash. To use it, you'll need to have Docker installed and to run the following command before trying to download the comic:

docker run -p 8050:8050 scrapinghub/splash
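
Before passing -j, it can help to confirm that the container is accepting connections. The sketch below only checks that a TCP connection can be opened. It assumes Splash is listening on localhost port 8050, as in the port mapping above, and splash_is_reachable is a hypothetical helper, not part of webcomix or scrapy-splash.

```python
import socket


def splash_is_reachable(host="localhost", port=8050, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    print("Splash reachable:", splash_is_reachable())
```

A True result means only that something is listening on that port, not that Splash has finished starting up.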

Contribution

The procedure depends on the type of contribution:

  • If you simply want to request the addition of a comic to the list of supported comics, open an issue with the label "Enhancement".
  • If you want to request a new feature or a bug fix, open an issue with the appropriate label.

Running the tests

To run the tests, run the pytest command in the webcomix folder.

Download files

Source Distribution

webcomix-3.3.0.tar.gz (331.7 kB)

Uploaded Source

Built Distribution

webcomix-3.3.0-py3-none-any.whl (340.9 kB)

Uploaded Python 3

File details

Details for the file webcomix-3.3.0.tar.gz.

File metadata

  • Download URL: webcomix-3.3.0.tar.gz
  • Upload date:
  • Size: 331.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.0.9 CPython/3.5.6 Linux/4.15.0-1028-gcp

File hashes

Hashes for webcomix-3.3.0.tar.gz
Algorithm Hash digest
SHA256 986cde4642dc1da30409f86c14071a34467123e7be7520bdbd05f47ecf1960d5
MD5 ddd2d50b1f68e1b066b09dcc9736c47a
BLAKE2b-256 a2586d987d09a074a2a9568831e8d0a03dfc558b6cbd96da11ff69a1f8e722da

File details

Details for the file webcomix-3.3.0-py3-none-any.whl.

File metadata

  • Download URL: webcomix-3.3.0-py3-none-any.whl
  • Upload date:
  • Size: 340.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.0.9 CPython/3.5.6 Linux/4.15.0-1028-gcp

File hashes

Hashes for webcomix-3.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 23c9c7e64702d90571f336536906ecac8f1e953a486076189cc174ab33e82ba0
MD5 464fa0dd19265b24349ccb2b1d720bfc
BLAKE2b-256 496e74765f2a8ff58c137cb8cc60d595b99486a95a9a1bd6e5ffa94010bb5915
