
A CLI to configure pyspark for use with S3 on localstack

Project description

localstack-s3-pyspark


This package provides a CLI for configuring pyspark to use localstack for the S3 file system. It is intended for locally testing packages (or testing them in your CI/CD pipeline) which you intend to deploy on an Amazon EMR cluster.

Installation

Execute the following command, replacing pip3 with the executable appropriate for the environment where you want to configure pyspark to use localstack:

pip3 install localstack-s3-pyspark

Configure Spark's Defaults

If you've installed localstack-s3-pyspark in a Dockerfile or virtual environment, just run the following command:

localstack-s3-pyspark configure-defaults

If you've installed localstack-s3-pyspark in an environment with multiple Python 3.x versions, you may instead want to run an appropriate variation of the following command (replacing python3 with the command for the Python executable under which you want to configure pyspark):

python3 -m localstack_s3_pyspark configure-defaults
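For orientation, the configure-defaults command updates Spark's spark-defaults.conf. The exact properties it writes may vary between versions, but a Spark-to-localstack S3 configuration generally looks something like the following (the endpoint, dummy credentials, and property names below are common localstack/s3a conventions shown for illustration, not necessarily the CLI's exact output):

```
# Illustrative spark-defaults.conf entries for localstack S3
# (4566 is localstack's default edge port; "test" credentials are
# localstack's conventional dummy values)
spark.hadoop.fs.s3a.endpoint                 http://localhost:4566
spark.hadoop.fs.s3a.path.style.access        true
spark.hadoop.fs.s3a.access.key               test
spark.hadoop.fs.s3a.secret.key               test
spark.hadoop.fs.s3a.connection.ssl.enabled   false
```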

Tox

Please note that if you are testing your packages with tox (highly recommended), you will need to:

  • Include "localstack-s3-pyspark" in your tox deps
  • Include localstack-s3-pyspark configure-defaults in your tox commands_pre (or otherwise execute this command before your tests run)

Here is an example tox.ini which starts up localstack using the localstack CLI (you could also use docker-compose or just docker run if you need greater control or fewer python dependencies; see the localstack documentation's "Getting Started" page for details):

[tox]
envlist = pytest

[testenv:pytest]
deps =
    localstack-s3-pyspark
    localstack
    pytest
allowlist_externals =
    sleep
commands_pre =
    localstack-s3-pyspark configure-defaults
    localstack start -d
    sleep 20
commands =
    pytest
commands_post =
    localstack stop

Patch boto3

If your tests interact with S3 using boto3, you can patch boto3 from within your unit tests as follows:

from localstack_s3_pyspark.boto3 import use_localstack
use_localstack()
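For intuition, the general idea behind such a patch is to wrap boto3's client factory so every client is pointed at localstack's edge endpoint. The sketch below illustrates this with a stand-in factory; it is not the package's actual implementation, and the names use_localstack_sketch and fake_client are illustrative only:

```python
# Sketch of the idea behind use_localstack(): wrap a client factory so
# every client it produces targets localstack. NOT the package's actual
# implementation; fake_client stands in for boto3.client.
import functools

LOCALSTACK_ENDPOINT = "http://localhost:4566"  # localstack's default edge port


def fake_client(service_name, **kwargs):
    """Stand-in for boto3.client; returns the arguments it received."""
    return {"service": service_name, **kwargs}


def use_localstack_sketch(factory):
    """Return a wrapper injecting endpoint_url unless the caller set one."""
    @functools.wraps(factory)
    def wrapper(service_name, **kwargs):
        kwargs.setdefault("endpoint_url", LOCALSTACK_ENDPOINT)
        return factory(service_name, **kwargs)
    return wrapper


patched = use_localstack_sketch(fake_client)
print(patched("s3")["endpoint_url"])  # -> http://localhost:4566
```

Calling use_localstack() before your test's first boto3 client is created has an equivalent effect: subsequent S3 requests go to localstack rather than AWS.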

Download files

Source Distribution

localstack-s3-pyspark-0.11.1.tar.gz (6.1 kB)

Built Distribution

localstack_s3_pyspark-0.11.1-py3-none-any.whl (7.4 kB)

File details

Details for the file localstack-s3-pyspark-0.11.1.tar.gz.

File metadata

  • Size: 6.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.0

File hashes

Hashes for localstack-s3-pyspark-0.11.1.tar.gz

Algorithm    Hash digest
SHA256       0ddf003061f6492b4e79634734d6ccbb58e3ad93c7355736ae1ffa099e0b0bd0
MD5          b75411f0811fa497501f4d5a598bfa28
BLAKE2b-256  0dd58784e8ef2e09556e983d584573bb778d73353894d129ac89009b3f9fe922

File details

Details for the file localstack_s3_pyspark-0.11.1-py3-none-any.whl.

File hashes

Hashes for localstack_s3_pyspark-0.11.1-py3-none-any.whl

Algorithm    Hash digest
SHA256       45b45fa161c26bf25392fd3e9df42f450f8d286dfb8afee6949236301082cb17
MD5          5d3b4166e59d24d1b7b0aa21a0ff712e
BLAKE2b-256  95ca47bf680ff18d76fa1f42de4424e3dbf200ec052576a0d291d77de3e9f834
