Skip to main content

Mirroring tool that implements the client (mirror) side of PEP 381

Project description

Code style: black Build Status codecov.io Documentation Status Updates Downloads


This is a PyPI mirror client according to PEP 381 http://www.python.org/dev/peps/pep-0381/.

bandersnatch maintainers are looking for more help! Please refer to our MAINTAINER documentation to see the roles and responsibilities. We would also ask you read our Mission Statement to ensure it aligns with your thoughts for this project.

  • If interested contact @cooperlees

Installation

The following instructions will place the bandersnatch executable in a virtualenv under bandersnatch/bin/bandersnatch.

  • bandersnatch requires >= Python 3.6.1

pip

This installs the latest stable, released version.

  $ python3.6 -m venv bandersnatch
  $ bandersnatch/bin/pip install bandersnatch
  $ bandersnatch/bin/bandersnatch --help

Quickstart

  • Run bandersnatch mirror - it will create an empty configuration file for you in /etc/bandersnatch.conf.
  • Review /etc/bandersnatch.conf and adapt to your needs.
  • Run bandersnatch mirror again. It will populate your mirror with the current status of all PyPI packages. Current mirror package size can be seen here: https://pypi.org/stats/
  • A blacklist or whitelist can be created to cut down your mirror size. You might want to Analyze PyPI downloads to determine which packages to add to your list.
  • Run bandersnatch mirror regularly to update your mirror with any intermediate changes.

Webserver

Configure your webserver to serve the web/ sub-directory of the mirror. For nginx it should look something like this:

    server {
        listen 127.0.0.1:80;
        server_name <mymirrorname>;
        root <path-to-mirror>/web;
        autoindex on;
        charset utf-8;
    }
  • Note that it is a good idea to have your webserver publish the HTML index files correctly with UTF-8 as the charset. The index pages will work without it but if humans look at the pages the characters will end up looking funny.

  • Make sure that the webserver uses UTF-8 to look up unicode path names. nginx gets this right by default - not sure about others.

Cron jobs

You need to set up one cron job to run the mirror itself.

Here's a sample that you could place in /etc/cron.d/bandersnatch:

    LC_ALL=en_US.utf8
    */2 * * * * root bandersnatch mirror |& logger -t bandersnatch[mirror]

This assumes that you have a logger utility installed that will convert the output of the commands to syslog entries.

Maintenance

bandersnatch does not keep much local state in addition to the mirrored data. In general you can just keep rerunning bandersnatch mirror to make it fix errors.

If you want to force bandersnatch to check everything against the master PyPI:

  • run bandersnatch mirror --force-check to move status files if they exist in your mirror directory in order get a full sync.

Be aware that full syncs likely take hours depending on PyPI's performance and your network latency and bandwidth.

Operational notes

Case-sensitive filesystem needed

You need to run bandersnatch on a case-sensitive filesystem.

OS X natively does this OK even though the filesystem is not strictly case-sensitive and bandersnatch will work fine when running on OS X. However, tarring a bandersnatch data directory and moving it to, e.g. Linux with a case-sensitive filesystem will lead to inconsistencies. You can fix those by deleting the status files and have bandersnatch run a full check on your data.

Many sub-directories needed

The PyPI has a quite extensive list of packages that we need to maintain in a flat directory. Filesystems with small limits on the number of sub-directories per directory can run into a problem like this:

2013-07-09 16:11:33,331 ERROR: Error syncing package: zweb@802449
OSError: [Errno 31] Too many links: '../pypi/web/simple/zweb'

Specifically we recommend to avoid using ext3. Ext4 and newer does not have the limitation of 32k sub-directories.

Client Compatibility

A bandersnatch static mirror is compatible only to the "static", cacheable parts of PyPI that are needed to support package installation. It does not support more dynamic APIs of PyPI that maybe be used by various clients for other purposes.

An example of an unsupported API is PyPI's XML-RPC interface, which is used when running pip search.

Bandersnatch Mission

The bandersnatch project strives to:

  • Mirror all static objects of the Python Package Index (https://pypi.org/)
  • bandersnatch's main goal is to support the main global index to local syncing only
  • This will allow organizations to have lower latency access to PyPI and save bandwidth on their WAN connections and more importantly the PyPI CDN
  • Custom features and requests may be accepted if they can be of a plugin form
    • e.g. refer to the blacklist and whitelist plugins

Contact

If you have questions or comments, please submit a bug report to https://github.com/pypa/bandersnatch/issues/new

  • IRC: #bandersnatch on Freenode (You can use webchat if you don't have an IRC client)

Code of Conduct

Everyone interacting in the bandersnatch project's codebases, issue trackers, chat rooms, and mailing lists is expected to follow the PyPA Code of Conduct.

Kudos

This client is based on the original pep381client by Martin v. Loewis.

Richard Jones was very patient answering questions at PyCon 2013 and made the protocol more reliable by implementing some PyPI enhancements.

Christian Theune for creating and maintaining bandersnatch for many years!

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bandersnatch-3.5.0.tar.gz (40.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

bandersnatch-3.5.0-py3-none-any.whl (43.0 kB view details)

Uploaded Python 3

File details

Details for the file bandersnatch-3.5.0.tar.gz.

File metadata

  • Download URL: bandersnatch-3.5.0.tar.gz
  • Upload date:
  • Size: 40.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.14.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.35.0 CPython/3.7.3

File hashes

Hashes for bandersnatch-3.5.0.tar.gz
Algorithm Hash digest
SHA256 6daf29672a5a70306a6513e8e98b4b3e9105fd321f8a51c8513f3c5349ceef50
MD5 400f062f2252f1b5c8ed2c8a68a3398f
BLAKE2b-256 dbe2079dfd56ece802606707a587394d09dee0428a587baae30da5dfd102ebf7

See more details on using hashes here.

File details

Details for the file bandersnatch-3.5.0-py3-none-any.whl.

File metadata

  • Download URL: bandersnatch-3.5.0-py3-none-any.whl
  • Upload date:
  • Size: 43.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.14.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.35.0 CPython/3.7.3

File hashes

Hashes for bandersnatch-3.5.0-py3-none-any.whl
Algorithm Hash digest
SHA256 a446aec4bccbb21605b162cdc25005807cffa9a4a4e003a7c7f1e0527f4c4800
MD5 83e6e7e1ace04b3a53fd56cb97924669
BLAKE2b-256 85d9c846b8e0d4b3502b67d93906dcc8a186e707c7ea13661e16e14ee585d946

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page