
An Easy-to-use and Fast Python Spider Framework


Distributed🌍 - Asynchronous🏃 - Light☁️ - Fast⚡️ - Easy😄

AirSpider🕷️, a Light and Fast Python Web Crawler Framework Based on Redis🕷️


Overview👀

  • AirSpider is a high-performance asynchronous crawler framework for developers 🚀
  • Built on Redis for task distribution, task deduplication, and distributed crawling ☁️
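
To illustrate how Redis-backed deduplication and task distribution can work in general, here is a minimal sketch. It is not AirSpider's implementation: `InMemoryStore`, `fingerprint`, and `schedule` are hypothetical names, and the in-memory class only mimics the behaviour of the Redis commands SADD, RPUSH, and LPOP so the example runs without a Redis server.

```python
import hashlib
from collections import deque


class InMemoryStore:
    """Hypothetical stand-in for a Redis client; mimics SADD, RPUSH and LPOP only."""

    def __init__(self):
        self._sets = {}
        self._lists = {}

    def sadd(self, key, member):
        # Like Redis SADD: returns 1 if the member was new, 0 if already present.
        members = self._sets.setdefault(key, set())
        if member in members:
            return 0
        members.add(member)
        return 1

    def rpush(self, key, value):
        # Like Redis RPUSH: appends to the tail and returns the new length.
        queue = self._lists.setdefault(key, deque())
        queue.append(value)
        return len(queue)

    def lpop(self, key):
        # Like Redis LPOP: removes and returns the head, or None if empty.
        queue = self._lists.get(key)
        return queue.popleft() if queue else None


def fingerprint(url):
    """Fixed-size fingerprint of a URL, used as the deduplication key."""
    return hashlib.sha1(url.encode("utf-8")).hexdigest()


def schedule(store, url, seen_key="seen", queue_key="queue"):
    """Enqueue url unless its fingerprint has already been recorded."""
    if store.sadd(seen_key, fingerprint(url)) == 0:
        return False  # duplicate: skip
    store.rpush(queue_key, url)
    return True


store = InMemoryStore()
urls = ("https://example.com/a", "https://example.com/b", "https://example.com/a")
results = [schedule(store, u) for u in urls]
print(results)              # [True, True, False]
print(store.lpop("queue"))  # https://example.com/a
```

With a real Redis server, several worker processes could share the same set and list, which is what would make the queue distributed: each worker pops the next URL, and the shared set keeps any of them from scheduling the same URL twice.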

Requirements☁️

  • Python 3.6➕
  • Works on Linux, Windows, macOS🍎

Features🌲

  • Quick to Start ☑️
  • Low Coupling ☑️
  • High Cohesion ☑️
  • Easy Expansion ☑️
  • Orderly Workflow ☑️

Installation🔨

---------------------------

# For Linux and macOS🔥
pip3 install airspider

---------------------------

# For Windows🔥
pip3 install airspider

---------------------------

  • Documentation🔥

    Topics

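    • Item: defines the target fields the spider should collect
    • Selector: extracts the target fields from HTML
    • Request: requests and fetches resources from the target website
    • Response: wraps the response content for further processing
    • Middleware: lets the spider support third-party extensions
    • Spider: the entry point of the crawler program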
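
The relationship between an Item and its Selectors can be sketched with the standard library alone. This is an illustration of the idea, not AirSpider's actual API: `RegexField`, `Item.from_html`, and `PageItem` are hypothetical names, and a regular expression stands in for whatever selector types the framework provides.

```python
import re


class RegexField:
    """Hypothetical selector: extracts the first regex group from HTML."""

    def __init__(self, pattern, default=None):
        self.pattern = re.compile(pattern, re.S)
        self.default = default

    def extract(self, html):
        match = self.pattern.search(html)
        return match.group(1).strip() if match else self.default


class Item:
    """Hypothetical base class: fills in every RegexField declared on a subclass."""

    @classmethod
    def from_html(cls, html):
        item = cls()
        for name, field in vars(cls).items():
            if isinstance(field, RegexField):
                # The extracted value shadows the field on the instance.
                setattr(item, name, field.extract(html))
        return item


class PageItem(Item):
    # Target fields are declared once, next to the rule that extracts them.
    title = RegexField(r"<title>(.*?)</title>")
    heading = RegexField(r"<h1[^>]*>(.*?)</h1>", default="")


html = "<html><head><title> Demo page </title></head><body><p>no heading</p></body></html>"
page = PageItem.from_html(html)
print(page.title)          # Demo page
print(repr(page.heading))  # ''
```

Declaring fields on the class keeps the extraction rules separate from the fetching logic, so a Request/Response layer only has to hand the HTML to the Item.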

TODO✈️

  • Complete the Redis plugins🔥
  • Complete the distributed architecture☁️

Contributing👬

AirSpider🕷️ is still under development🔨

Feel free to open issues💬 and pull requests💗

  • Report or fix bugs🌈
  • Build powerful plugins🔥
  • Improve the documentation📖
  • Add example spiders 🕷️
