A Django/Django-Storages threaded S3 chunk uploader

Project description

A Django file upload handler that pipes uploaded files directly to S3 without passing them through the server's file system. The uploader uses multiple threads to speed up the upload of larger files. The package relies on Django and Django-Storages: you keep using the S3 storages FileField, but the upload behaviour changes to bypass the local file system.

Quick start

  1. Install the package:

    pip install s3chunkuploader

  2. Set the Django FILE_UPLOAD_HANDLERS setting:

    FILE_UPLOAD_HANDLERS = ('s3chunkuploader.file_handler.S3FileUploadHandler',)

How it works

The file handler intercepts the multipart upload request as it arrives. As chunks of the file are received from the browser, they are collected into an internal queue within a custom ThreadPoolWorker. When the queued data surpasses a configurable size (5MB by default, which is the minimum part size for an S3 multipart upload), it is submitted to the thread pool as a Future. Once all the chunks have been uploaded and all the futures have resolved, the upload is complete. By default 10 threads are used, so a 100MB file can be sent to S3 as 20 parts of 5MB each, with up to 10 parts in flight at a time.
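The buffering and dispatch flow described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the package's actual code: the PartBuffer class, the fake_upload_part function and the in-memory uploaded dict are hypothetical stand-ins, and a real handler would call the S3 multipart upload API where the sketch only records part sizes. The sketch flushes as soon as the buffer reaches the part size.

```python
from concurrent.futures import ThreadPoolExecutor

MIN_PART_SIZE = 5 * 1024 * 1024  # 5MB, the S3 minimum part size cited above
MAX_WORKERS = 10                 # the default thread count cited above


class PartBuffer:
    """Hypothetical sketch: collect incoming chunks and dispatch full parts."""

    def __init__(self, upload_part, part_size=MIN_PART_SIZE, max_workers=MAX_WORKERS):
        self._upload_part = upload_part
        self._part_size = part_size
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._chunks = []
        self._buffered = 0
        self._futures = []

    def receive(self, chunk):
        """Queue a chunk; submit a part once the buffer reaches part_size."""
        self._chunks.append(chunk)
        self._buffered += len(chunk)
        if self._buffered >= self._part_size:
            self._flush()

    def _flush(self):
        body = b"".join(self._chunks)
        part_number = len(self._futures) + 1  # S3 part numbers start at 1
        self._futures.append(self._pool.submit(self._upload_part, part_number, body))
        self._chunks, self._buffered = [], 0

    def complete(self):
        """Send any remainder as the final part, then wait for every future."""
        if self._buffered:
            self._flush()
        results = [future.result() for future in self._futures]
        self._pool.shutdown()
        return results


uploaded = {}


def fake_upload_part(part_number, body):
    # Stand-in for a real S3 upload-part call; it only records the part size.
    uploaded[part_number] = len(body)
    return part_number
```

Feeding 100MB through `PartBuffer(fake_upload_part)` in equal chunks that divide 5MB evenly yields the 20 parts mentioned above; only the final part of an upload may be smaller than the part size.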

The file handler ultimately returns a 'dummy' django-storages S3Boto3StorageFile, which is compatible with the storages S3 FileField but was not itself used to upload the file. The returned file also carries two additional attributes:

  • original_name: The original name of the uploaded file

  • file_size: The actual full size of the uploaded file
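As a small illustration of reading these two attributes, the helper below formats them for logging or display. The function name describe_upload is hypothetical; in a Django view the object would typically come from request.FILES under whatever form field name your upload form uses.

```python
def describe_upload(uploaded_file):
    """Format the two extra attributes the handler adds to the returned file."""
    size_mb = uploaded_file.file_size / (1024 * 1024)
    return f"{uploaded_file.original_name} ({size_mb:.1f} MB)"
```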

It is recommended to bypass CSRF checks on the file upload view, because the CSRF check reads the POST parameters before the handler is used. A replacement file field, S3FileField, is provided in fields.py and is satisfied with the S3 object key.

By default the S3 key is generated from the settings provided. However, you can define a custom function to derive the S3 object key by giving its full dotted path in the CHUNK_UPLOADER_S3_GENERATE_OBJECT_KEY_FUNCTION setting.
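A custom key function might look like the sketch below. The settings table states that the function receives the request object and the filename; everything else here is an assumption for illustration, including the function name, the key layout, and the idea that the function returns the key as a string.

```python
import posixpath
from datetime import datetime, timezone


def generate_object_key(request, filename):
    """Hypothetical key layout: <user id>/<YYYY>/<MM>/<DD>/<file name>."""
    user = getattr(request, "user", None)
    user_id = getattr(user, "pk", None) or "anonymous"
    day = datetime.now(timezone.utc).strftime("%Y/%m/%d")
    # Keep only the base name so client-supplied paths cannot alter the prefix.
    return posixpath.join(str(user_id), day, posixpath.basename(filename))
```

The setting would then point at the function by dotted path, for example 'myapp.uploads.generate_object_key', where myapp.uploads is a placeholder module name.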

Settings

The following settings are read from your Django settings (only those marked Required must be set):

  • CHUNK_UPLOADER_AWS_ACCESS_KEY_ID: Required. Your AWS access key.

  • CHUNK_UPLOADER_AWS_SECRET_ACCESS_KEY: Required. The AWS secret.

  • CHUNK_UPLOADER_AWS_AWS_STORAGE_BUCKET_NAME: Required. The S3 bucket to use.

  • CHUNK_UPLOADER_AWS_REGION: Optional. Region of the S3 bucket.

  • CHUNK_UPLOADER_S3_DOCUMENT_ROOT_DIRECTORY: Optional. Document root for all uploads (prefix).

  • CHUNK_UPLOADER_S3_APPEND_DATETIME_ON_UPLOAD: Optional [True]. Append the current datetime string to the uploaded file name.

  • CHUNK_UPLOADER_S3_PREFIX_QUERY_PARAM_NAME: Optional [__prefix]. A query param key name which provides an additional prefix for the object key on S3.

  • CHUNK_UPLOADER_S3_MIN_PART_SIZE: Optional [5MB]. The part size in bytes to upload to S3.

  • CHUNK_UPLOADER_MAX_UPLOAD_SIZE: Optional [None]. The maximum file size in bytes for an individual file.

  • CHUNK_UPLOADER_AWS_S3_REGION_NAME: Optional [None]. The S3 endpoint URL which overrides the default.

  • CHUNK_UPLOADER_CLEAN_FILE_NAME: Optional [False]. When True, runs the filename through Django's slugify function to sanitise it.

  • CHUNK_UPLOADER_S3_GENERATE_OBJECT_KEY_FUNCTION: Optional [None]. A function to generate the S3 key, receiving the request object and filename as arguments.

  • CHUNK_UPLOADER_AWS_S3_ENDPOINT_URL: Optional [None]. A full custom S3 endpoint URL (was S3_ENDPOINT_URL in previous versions).
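A settings.py fragment using some of these settings could look like the sketch below. The setting names are copied verbatim from the list above; the bucket name, region, size limit and environment variable names are placeholder assumptions, and reading credentials from the environment is one common choice rather than something this package requires.

```python
import os

# Route uploads through the S3 chunk upload handler.
FILE_UPLOAD_HANDLERS = ("s3chunkuploader.file_handler.S3FileUploadHandler",)

# Settings marked Required. The environment variable names are assumptions.
CHUNK_UPLOADER_AWS_ACCESS_KEY_ID = os.environ.get("AWS_ACCESS_KEY_ID", "")
CHUNK_UPLOADER_AWS_SECRET_ACCESS_KEY = os.environ.get("AWS_SECRET_ACCESS_KEY", "")
CHUNK_UPLOADER_AWS_AWS_STORAGE_BUCKET_NAME = "my-upload-bucket"  # placeholder

# Optional settings. The values below are examples only.
CHUNK_UPLOADER_AWS_REGION = "eu-west-1"             # placeholder region
CHUNK_UPLOADER_S3_MIN_PART_SIZE = 5 * 1024 * 1024   # 5MB, expressed in bytes
CHUNK_UPLOADER_MAX_UPLOAD_SIZE = 500 * 1024 * 1024  # example limit of 500MB
CHUNK_UPLOADER_CLEAN_FILE_NAME = True
```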

Unit Tests

Unit tests can be executed by running python -m unittest from the project's root.

Change Log

  • 0.9: The optional setting S3_ENDPOINT_URL was renamed to AWS_S3_ENDPOINT_URL to align with django-storages.

  • 0.10: If content_length is not provided, MAX_UPLOAD_SIZE cannot be evaluated against it.

  • 0.11: Prefixed settings keys
