Pablo Hoffman

Username: pablohoffman

18 projects

parsel

Parsel is a library to extract data from HTML and XML using XPath and CSS selectors

Scrapy

A high-level Web Crawling and Web Scraping framework

shub

Scrapinghub Command Line Client

dateparser

Date parsing library designed to parse dates from HTML pages

scrapyd

A service for running Scrapy spiders, with an HTTP API

w3lib

Library of web-related functions

queuelib

Collection of persistent (disk-based) and non-persistent (memory-based) queues

scrapy-crawlera

Crawlera middleware for Scrapy

splash

A JavaScript renderer with an HTTP API

scrapely

A pure-Python HTML screen-scraping library

slybot

Slybot crawler

frontera

A scalable frontier for web crawlers

webstruct

A library for creating statistical NER systems that work on HTML data

hubstorage

Client interface for Scrapinghub HubStorage

scrapylib

Scrapy helper functions and processors

adblockparser

Parser for Adblock Plus rules

scrapy-dotpersistence

Scrapy extension to sync `.scrapy` folder to an S3 bucket

scrapyjs

JavaScript support for Scrapy using Splash
