seesaw

ArchiveTeam seesaw kit

Seesaw toolkit
==============

Seesaw is an asynchronous toolkit for distributed web processing. It is written in Python, named after its behavior, and supports concurrent downloads, uploads, and similar tasks.

The toolkit is best known for its use in [Archive Team projects](http://archiveteam.org). It also powers the [Archive Team warrior](http://archiveteam.org/index.php?title=Warrior).

[![Build Status](https://secure.travis-ci.org/ArchiveTeam/seesaw-kit.png)](http://travis-ci.org/ArchiveTeam/seesaw-kit)

Installation
------------

Requires Python 2 or 3.

Needs the Tornado library for event-driven I/O. The complete list of required Python modules is in `requirements.txt`.

How to try it out
-----------------

To run the example pipeline:

    sudo pip install -r requirements.txt
    ./run-pipeline --help
    ./run-pipeline examples/example-pipeline.py someone

Point your browser to `http://127.0.0.1:8001/`.

You can also use `run-pipeline2` or `run-pipeline3` to select the Python version explicitly.

Overview
--------

General idea: a set of `Task`s that can be combined into a `Pipeline` that processes `Item`s:

* An `Item` is a thing that needs to be downloaded (a user, for example). It has properties that are filled by the `Task`s.
* A `Task` is a step in the download process: it takes an item, does something with it and passes it on. Example Tasks: getting an item name from the tracker, running a download script, rsyncing the result, notifying the tracker that it's done.
* A `Pipeline` represents a sequence of `Task`s. To make a seesaw script for a new project you'd specify a new `Pipeline`.
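
The three concepts above can be illustrated with a toy model in plain Python. This is only a sketch of the idea: the classes below are hypothetical stand-ins written for this example, not the seesaw API, and the tracker and download steps are replaced by hard-coded functions.

```python
# Toy model of the Item / Task / Pipeline idea. These classes are
# illustrative stand-ins and do not reproduce the seesaw API.


class Item(dict):
    """A unit of work; tasks fill in its properties as it moves along."""


class Task:
    """One step: take an item, do something with it, pass it on."""

    def __init__(self, name, action):
        self.name = name
        self.action = action  # callable that mutates the item in place

    def process(self, item):
        self.action(item)
        item.setdefault("log", []).append(self.name)
        return item


class Pipeline:
    """An ordered sequence of tasks applied to each item."""

    def __init__(self, *tasks):
        self.tasks = tasks

    def run(self, item):
        for task in self.tasks:
            item = task.process(item)
        return item


def get_item_name(item):
    # A real project would ask the tracker; the name is hard-coded here.
    item["item_name"] = "someone"


def download(item):
    # Stand-in for running a download script.
    item["data_path"] = "data/%s.warc.gz" % item["item_name"]


def mark_done(item):
    # Stand-in for notifying the tracker that the item is done.
    item["done"] = True


pipeline = Pipeline(
    Task("GetItemName", get_item_name),
    Task("Download", download),
    Task("MarkDone", mark_done),
)

result = pipeline.run(Item())
print(result["log"])
```

The item starts empty and each task adds a property, which mirrors the description of an `Item` as something whose properties are filled by the `Task`s.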

A `Task` can work on multiple `Item`s at a time (e.g., multiple Wget downloads). The concurrency can be limited by wrapping the task in a `LimitConcurrency` `Task`: this will queue the items and run them one-by-one (e.g., a single Rsync upload).
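
The queueing behavior can be sketched with a semaphore. The example below uses `asyncio` (Python 3.7 or later) in place of Tornado to stay self-contained, and the `LimitConcurrency` class here is a hypothetical re-implementation for illustration, not the class shipped with seesaw.

```python
import asyncio


async def upload(item):
    # Stand-in for a single rsync upload.
    await asyncio.sleep(0.01)
    return item + ":uploaded"


class LimitConcurrency:
    """Toy wrapper: at most `limit` items are inside `inner` at once."""

    def __init__(self, limit, inner):
        self.limit = limit
        self.inner = inner
        self.max_active = 0  # highest number of items seen inside `inner`
        self._active = 0
        self._semaphore = None

    async def process(self, item):
        # Create the semaphore lazily so it belongs to the running loop.
        if self._semaphore is None:
            self._semaphore = asyncio.Semaphore(self.limit)
        async with self._semaphore:  # other items wait here
            self._active += 1
            self.max_active = max(self.max_active, self._active)
            try:
                return await self.inner(item)
            finally:
                self._active -= 1


async def main():
    limited = LimitConcurrency(1, upload)
    # Three items arrive together; the wrapper lets them through one by one.
    results = await asyncio.gather(
        *(limited.process(name) for name in ["a", "b", "c"])
    )
    return results, limited.max_active


results, max_active = asyncio.run(main())
print(results, max_active)
```

With a limit of 1 the items are handled strictly one after another; a larger limit would allow that many items into the wrapped task at the same time.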

The `Pipeline` needs to be fed empty `Item` objects; by controlling how many `Item`s you feed in, you limit how many are processed at the same time. (For example, add a new item each time an item leaves the pipeline.)
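
The "add a new item each time an item leaves" strategy can be sketched as a small feeding loop. This is an illustration under assumptions: it uses `asyncio` rather than Tornado, and `run_item` and `feed` are hypothetical names, with `run_item` standing in for a whole pipeline run.

```python
import asyncio


async def run_item(item_id):
    # Stand-in for one item travelling through the whole pipeline.
    await asyncio.sleep(0.01)
    return item_id


async def feed(total_items, max_active):
    active = set()   # items currently inside the pipeline
    finished = []
    peak = 0         # highest number of simultaneously active items
    next_id = 0
    while next_id < total_items or active:
        # Top up: start new items until the limit is reached.
        while next_id < total_items and len(active) < max_active:
            active.add(asyncio.ensure_future(run_item(next_id)))
            next_id += 1
        peak = max(peak, len(active))
        # Wait until at least one item leaves, then loop to replace it.
        done, active = await asyncio.wait(
            active, return_when=asyncio.FIRST_COMPLETED
        )
        finished.extend(task.result() for task in done)
    return finished, peak


finished, peak = asyncio.run(feed(total_items=5, max_active=2))
print(sorted(finished), peak)
```

All five items are processed, but no more than two are ever active at once.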

With the `ItemValue`, `ItemInterpolation` and `ConfigValue` classes it is possible to pass item-specific arguments to the `Task` objects. The value of these objects will be re-evaluated for each item. Examples: a path name that depends on the item name, a configurable bandwidth limit, the number of concurrent downloads.
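
The re-evaluation idea can be sketched as placeholder objects that are resolved against each item just before use. The classes below borrow the names from the text but are toy re-implementations; their constructors and the `realize` method are assumptions made for this example and may differ from the seesaw classes.

```python
class ItemValue:
    """Placeholder for a single property of the current item."""

    def __init__(self, key):
        self.key = key

    def realize(self, item):
        return item[self.key]


class ItemInterpolation:
    """Placeholder for a string template filled from the current item."""

    def __init__(self, template):
        self.template = template

    def realize(self, item):
        return self.template % item  # e.g. "%(item_name)s"


class ConfigValue:
    """Placeholder for a setting that may change between items."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def realize(self, item):
        return self.value


def realize(value, item):
    # Plain values pass through; placeholders are resolved per item.
    return value.realize(item) if hasattr(value, "realize") else value


bandwidth = ConfigValue("bandwidth_kbps", 500)
args = [
    "--limit-rate", bandwidth,
    "--output", ItemInterpolation("data/%(item_name)s.html"),
    ItemValue("url"),
]

first = [realize(a, {"item_name": "alice", "url": "http://example.com/alice"})
         for a in args]
bandwidth.value = 250  # the setting changes before the next item starts
second = [realize(a, {"item_name": "bob", "url": "http://example.com/bob"})
          for a in args]
print(first)
print(second)
```

The argument list is defined once, yet each item sees its own path and URL, and the second item picks up the changed bandwidth setting.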

Consult [the wiki](https://github.com/ArchiveTeam/seesaw-kit/wiki) for more information.

Release history
---------------

* 0.10.3: Mar 24, 2019
* 0.10.2: Mar 20, 2019
* 0.10.0: May 26, 2018
* 0.9.5: Apr 16, 2018
* 0.9.4: Oct 14, 2017
* 0.9.2: Apr 12, 2016
* 0.9.1: Mar 28, 2016
* 0.9: Apr 12, 2015
* 0.8.5: Jan 6, 2015
* 0.8.4: Jan 4, 2015
* 0.8.3: Dec 13, 2014
* 0.8.2: Dec 10, 2014
* 0.8.1: Dec 5, 2014
* 0.8: Nov 12, 2014
* 0.7: Oct 3, 2014 (this version)
* 0.6.1: Sep 19, 2014
* 0.6: Sep 13, 2014
* 0.5: Sep 8, 2014
* 0.4: Aug 24, 2014
* 0.3.1: Aug 17, 2014
* 0.3: Aug 17, 2014
* 0.2.2: Aug 13, 2014
* 0.2.1: Jul 28, 2014
* 0.2: Jun 24, 2014
* 0.1.7: Apr 17, 2014
* 0.1.6: Feb 26, 2014
* 0.1.5: Jan 9, 2014
* 0.1.4: Dec 10, 2013
* 0.1.2: Nov 21, 2013
* 0.0.16: Nov 15, 2013
* 0.0.15: Apr 21, 2013
* 0.0.14: Apr 3, 2013
* 0.0.13: Mar 25, 2013
* 0.0.12: Jan 26, 2013
* 0.0.10: Oct 19, 2012
* 0.0.9: Oct 16, 2012
* 0.0.7: Oct 9, 2012
* 0.0.5: Oct 8, 2012
* 0.0.4: Oct 5, 2012

Download files
--------------

Source distribution: `seesaw-0.7.tar.gz`

* Upload date: Oct 3, 2014
* Size: 112.4 kB
* Tags: Source
* Uploaded using Trusted Publishing? No

| Algorithm | Hash digest |
|-------------|--------------------------------------------------------------------|
| SHA256 | `556813cbc0145cc32a61f99e1601407db1c25a15d2d29788a566a19a3885e650` |
| MD5 | `242f408d301a62817a96cfcd49a86ded` |
| BLAKE2b-256 | `fb767fe7315e09ae32356ddf948f2931a5a4499549a485ab3cbd5c5052f65e97` |

Built distribution: `seesaw-0.7-py2.7.egg`

* Upload date: Oct 3, 2014
* Size: 173.1 kB
* Tags: Egg
* Uploaded using Trusted Publishing? No

| Algorithm | Hash digest |
|-------------|--------------------------------------------------------------------|
| SHA256 | `178d388cfeb566b48ebd72aaeb08476fc3f29e669a3f003c100db5ddd5c2264e` |
| MD5 | `3dee1926d7988424c860f704567479c7` |
| BLAKE2b-256 | `99c8baa3ee7aef7c4103a2e899411f885074cb2d42785620933bbcc06be567d4` |
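
A downloaded file can be checked against its published digest with the standard `hashlib` module. The sketch below is one way to do this; the helper names are hypothetical, and the demonstration runs on a temporary file because the real archive is not available here. `EXPECTED_SHA256` is the SHA256 digest listed for `seesaw-0.7.tar.gz`.

```python
import hashlib
import os
import tempfile

# SHA256 digest listed for seesaw-0.7.tar.gz.
EXPECTED_SHA256 = (
    "556813cbc0145cc32a61f99e1601407db1c25a15d2d29788a566a19a3885e650"
)


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True when the file's SHA256 digest equals `expected_hex`."""
    return sha256_of_file(path) == expected_hex.lower()


# Demonstration on a temporary file standing in for the real download.
payload = b"not the real archive"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    demo_path = tmp.name

demo_digest = hashlib.sha256(payload).hexdigest()
demo_ok = matches(demo_path, demo_digest)         # same bytes, same digest
demo_is_seesaw = matches(demo_path, EXPECTED_SHA256)  # different content
os.remove(demo_path)
print(demo_ok, demo_is_seesaw)
```

For the real archive, `matches("seesaw-0.7.tar.gz", EXPECTED_SHA256)` would be the corresponding check, assuming the file sits in the current directory.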
