Unofficial AWS Glue Ray.io packaging tool. Pure python or cross platform.

raypack

Raypack creates a package for AWS Glue Ray.io tasks. It automates the steps on the AWS documentation page, which gives only hand-wavy descriptions of shell commands. AWS Lambda calls for a similar kind of packaging, but as far as I know there is no standard for it.

Use this if you have private dependencies, native dependencies, or you want to package your own code as a python package. AWS Glue can't install anything that lacks a binary wheel, and it can't use private package repositories. gcc and other build tools are not in the Glue runtime images.

See below for build options.

raypack is not supported by Amazon, AWS, or Anyscale, Inc., the makers of ray.io. Some code was generated with ChatGPT (OpenAI).

Installation

You are encouraged to install with pipx so that the CLI tool's dependencies do not conflict with your project's dependencies.

pipx install raypack

Usage

raypack [--verbose]
python -m raypack  [--verbose]

Configuration goes in pyproject.toml. If none is specified, the defaults are as below.

[tool.raypack]
exclude_packaging_cruft = true
outer_folder_name = "venv"
source_venv = ".venv"
venv_tool = "poetry"
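
As a rough illustration of how the defaults above could combine with a project's own settings, here is a minimal sketch that reads the [tool.raypack] table and merges it over the defaults. The function name load_raypack_config is hypothetical and this is not raypack's actual code; it only assumes the table name and keys shown above.

```python
try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for older Pythons

# Defaults copied from the configuration shown above.
DEFAULTS = {
    "exclude_packaging_cruft": True,
    "outer_folder_name": "venv",
    "source_venv": ".venv",
    "venv_tool": "poetry",
}


def load_raypack_config(pyproject_text: str) -> dict:
    """Return the [tool.raypack] table merged over the defaults.

    Hypothetical helper for illustration only.
    """
    data = tomllib.loads(pyproject_text)
    overrides = data.get("tool", {}).get("raypack", {})
    return {**DEFAULTS, **overrides}


# A project that overrides a single key keeps the other defaults.
example = """
[tool.raypack]
outer_folder_name = "deps"
"""
config = load_raypack_config(example)
print(config["outer_folder_name"], config["venv_tool"])
```

A pyproject.toml with no [tool.raypack] table falls through to the defaults unchanged.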

Build Options

If your dependencies are all pure python, the packaging will work on any machine. However, if your dependencies have any native code, these are the options, best first:

Arm64 built on an Arm64 machine

  • An actual arm64 build runner with gcc - Best option, allows compiling native code correctly.
  • Docker, e.g. FROM public.ecr.aws/lambda/python:3.9-arm64 - Second best option. I haven't tried it and am not sure it works on build runners that are not actually arm64 CPUs.

Mac Binaries

  • An arm64 machine e.g. mac - Next best option, not sure if it will work.

Precompiled Binaries

  • Any machine, or an arm64 machine like a mac - Works in limited situations, namely when there are precompiled binaries (wheels) or all packages are pure python.

The last option works by telling pip to just download and unzip the arm64 wheels. If there are no wheels, or the wheels weren't compiled for arm64, you have to consider finding a different machine or convincing the package maintainers to publish wheels for more platforms.
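
Whether a wheel is usable this way can be read off its filename, which ends in a python tag, an ABI tag and a platform tag. The sketch below classifies a wheel by platform tag. It assumes the target is Linux on arm64 (tags ending in aarch64); the function name classify_wheel and the "somepkg" filenames are made up for illustration.

```python
def classify_wheel(filename: str) -> str:
    """Classify a wheel by the platform tag in its filename.

    Returns "pure" for platform-independent wheels, "linux-arm64" for
    wheels built for Linux on arm64, and "other" for everything else.
    Hypothetical helper, not part of raypack.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    # Wheel names end in {python tag}-{abi tag}-{platform tag}.whl
    platform_tag = filename[: -len(".whl")].split("-")[-1]
    # A compressed tag set joins several platform tags with dots.
    platforms = platform_tag.split(".")
    if platforms == ["any"]:
        return "pure"
    if any("linux" in p and p.endswith("aarch64") for p in platforms):
        return "linux-arm64"
    return "other"


for name in (
    "raypack-0.4.0-py3-none-any.whl",
    "somepkg-1.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl",
    "somepkg-1.0-cp39-cp39-win_amd64.whl",
):
    print(name, "->", classify_wheel(name))
```

A dependency whose only wheels land in the "other" bucket is one that needs a real arm64 build machine.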

Capabilities

  • TODO: Warn if not python 3.9 or other glue compatible version
  • TODO: create a venv for 3.9 if current venv is not 3.9
  • Calls poetry to create a virtualenv without dev dependencies
  • TODO: support pip, pipenv to create virtualenv.
  • Finds site-packages
  • Zips virtualenv and zips own package
  • TODO: support single file modules, e.g. mymodule.py
  • Skips cruft
  • Run as few subprocesses as possible
  • config using pyproject.toml or CLI args
  • TODO: Uploads to s3
  • pipx installable
  • works on any OS as well as is possible (can't handle linux binaries on windows for example)
  • Remove packages AWS includes
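
The "Finds site-packages" step can be sketched as follows. This is a guess at one way to do it, not raypack's code: it assumes the usual virtualenv layouts (lib/pythonX.Y/site-packages on POSIX, Lib/site-packages on Windows), and find_site_packages is a hypothetical name.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_site_packages(venv: Path) -> Optional[Path]:
    """Return the site-packages directory inside a virtualenv, if any."""
    candidates = [
        *sorted(venv.glob("lib/python*/site-packages")),  # POSIX layout
        venv / "Lib" / "site-packages",                   # Windows layout
    ]
    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    return None


# Demonstrate against a fake POSIX-style virtualenv in a temp directory.
with tempfile.TemporaryDirectory() as tmp:
    venv = Path(tmp) / ".venv"
    (venv / "lib" / "python3.9" / "site-packages").mkdir(parents=True)
    found = find_site_packages(venv)
    found_relative = found.relative_to(venv).as_posix()
print(found_relative)
```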

How it works

On native arm64 machine

  1. Gather info from pyproject.toml or CLI args, but not both.
  2. Create a local .venv and .whl using poetry.
  3. Create a new zip file with an extra top level folder.
  4. Find the site-packages folder and copy to a new zip
  5. Find the module contents in .whl and copy to a new zip
  6. Upload to s3
  7. Point the Glue job at the zip with the s3 py modules argument: "--s3-py-modules", "s3://s3bucket/pythonPackage.zip"
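
Steps 3 and 4 above can be sketched with the standard zipfile module. This is a minimal illustration under stated assumptions, not the tool's implementation: zip_site_packages is a hypothetical name, the outer folder defaults to "venv" as in the configuration, and "cruft" is assumed here to mean bytecode caches only.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_site_packages(site_packages: Path, zip_path: Path,
                      outer_folder_name: str = "venv") -> list:
    """Zip site_packages so every entry sits under outer_folder_name."""
    written = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(site_packages.rglob("*")):
            relative = path.relative_to(site_packages)
            # Assumed notion of "cruft": bytecode caches.
            if "__pycache__" in relative.parts or path.suffix == ".pyc":
                continue
            if path.is_file():
                arcname = (Path(outer_folder_name) / relative).as_posix()
                archive.write(path, arcname)
                written.append(arcname)
    return written


# Demonstrate on a tiny fake site-packages tree.
with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp) / "site-packages"
    (site / "mypkg" / "__pycache__").mkdir(parents=True)
    (site / "mypkg" / "__init__.py").write_text("VALUE = 1\n")
    (site / "mypkg" / "__pycache__" / "__init__.cpython-39.pyc").write_bytes(b"")
    zip_path = Path(tmp) / "pythonPackage.zip"
    zip_site_packages(site, zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
print(names)
```

The project's own module, unpacked from the .whl in step 5, could be added to the same archive in the same way.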

On non-arm64 machine

  1. Use poetry lock file to generate requirements.txt
  2. Use pip's download target with specified platform (arm64) to simulate creating a venv
  3. Combine with own code as above
  4. Upload to s3 as above
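
Step 2 can be read as a pip invocation that unpacks wheels for another platform into a target folder. The sketch below only builds such a command. The flags are real pip options, but the platform tag manylinux2014_aarch64, the folder names and the function name pip_cross_install_command are assumptions, and raypack may call pip differently.

```python
import sys


def pip_cross_install_command(requirements: str, target_dir: str,
                              platform: str = "manylinux2014_aarch64",
                              python_version: str = "3.9") -> list:
    """Build a pip command that unpacks wheels for another platform.

    pip refuses to build from source when --platform is given, so
    --only-binary=:all: restricts it to prebuilt wheels.
    """
    return [
        sys.executable, "-m", "pip", "install",
        "--requirement", requirements,
        "--target", target_dir,
        "--platform", platform,          # assumed tag for Linux on arm64
        "--python-version", python_version,
        "--implementation", "cp",        # CPython wheels only
        "--only-binary=:all:",
    ]


command = pip_cross_install_command("requirements.txt", "fake_venv")
print(" ".join(command[1:]))
```

Passing the list to subprocess.run(command, check=True) would run it. pip exits with an error if any dependency has no matching wheel, which is the signal to fall back to a real arm64 build machine.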

Contributing

To install the project and run the tests and linting tools:

poetry install --with dev
make check

To see if the app can package up other apps:

poetry build
# exit poetry shell so that pipx can install with the right base python
exit 
pipx install /e/github/raypack/dist/raypack-0.1.0-py3-none-any.whl

And then in a different project with a pyproject.toml file, run

raypack

Prior Art

Random scripts in comments

A makefile script

Similar to PEX or other venv zip tools, which as far as I know are not AWS aware, or they don't include all the dependencies, or they are more interested in making the archive file executable or self-extracting.

AWS Lambdas also have to go through a similar ad hoc zip process.

Documentation

Change Log

  • 0.1.0 - Idea and reserve package name.
