A standard API and diverse set of reference environments for reinforcement learning and planning in Partially Observable Stochastic Games (POSGs).

POSGGym

POSGGym is an open-source Python library providing implementations of Partially Observable Stochastic Game (POSG) environments coupled with dynamic models of each environment, all under a unified API. While there are a number of open-source implementations of POSG environments, very few support dynamic models that can be used for planning. The first aim of this library is to fill this gap. The second aim is to provide open-source implementations of many of the environments commonly used in the partially observable multi-agent planning literature. While open-source implementations exist for some of these environments, we hope to provide a central repository of implementations that are easy to understand and use, in order to make results easier to reproduce and research faster.

POSGGym is directly inspired by and adapted from the Gymnasium (formerly OpenAI Gym) and PettingZoo libraries for reinforcement learning. The key addition in POSGGym is support for environment models. POSGGym's API aims to mimic the Gymnasium API as closely as possible while incorporating multiple agents.

Documentation

The documentation for the project is available at posggym.readthedocs.io/.

Installation

The latest version of POSGGym can be installed by running:

pip install posggym

This will install the base dependencies for running the main environments, but may not include every dependency needed by all environments or for rendering some environments. You can install all dependencies for a family of environments with, for example, pip install posggym[grid_world], or the dependencies for all environments with pip install posggym[all].

We support and test for Python>=3.8.

Environments

POSGGym includes the following families of environments. The code for the implemented environments is located in the posggym/envs/ subdirectory.

  • Classic - These are classic POSG problems from the literature.
  • Grid-World - These environments are all based in a 2D Gridworld.

You can see a list of all environments by running:

import posggym
posggym.pprint_registry()

Environment API

POSGGym models each environment as a Python env class. Creating environment instances and interacting with them is simple, and follows almost the same flow as Gymnasium. Here's an example using the PredatorPrey-v0 environment:

import posggym
env = posggym.make("PredatorPrey-v0")

observations, info = env.reset(seed=42)

for t in range(50):
    actions = {i: env.action_spaces[i].sample() for i in env.agents}
    observations, rewards, terminated, truncated, done, info = env.step(actions)

    if done:
        observations, info = env.reset()

env.close()
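
Because actions are passed as a dictionary keyed by agent ID, per-agent bookkeeping is naturally dictionary-based as well. The sketch below assumes that the rewards value returned at each step is a similar mapping from agent ID to a float (an assumption, not something stated above) and uses a hypothetical helper, accumulate_returns, to track each agent's episode return. It runs on hand-written reward dictionaries, so it does not need POSGGym installed.

```python
from typing import Dict, Hashable, List


def accumulate_returns(
    totals: Dict[Hashable, float], rewards: Dict[Hashable, float]
) -> Dict[Hashable, float]:
    """Return new per-agent totals after adding one step of rewards.

    Hypothetical helper for illustration; not part of POSGGym.
    """
    updated = dict(totals)
    for agent_id, reward in rewards.items():
        updated[agent_id] = updated.get(agent_id, 0.0) + reward
    return updated


# Stand-ins for the per-step `rewards` value (assumed shape: agent ID -> float).
step_rewards: List[Dict[Hashable, float]] = [
    {"0": 0.0, "1": 0.0},
    {"0": 1.0, "1": 0.5},
    {"0": 0.0, "1": 0.5},
]

returns: Dict[Hashable, float] = {}
for rewards in step_rewards:
    returns = accumulate_returns(returns, rewards)
```

In the loop above, the same helper could be called once per step with the rewards value, provided that value has the assumed shape.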

Model API

Every environment provides access to a model of the environment in the form of a Python model class. Each model implements a generative model, which can be used for planning, along with functions for sampling initial states. Some environments also implement a full POSG model, including the transition, joint observation, and joint reward functions.

The following is an example of accessing and using the environment model:

import posggym
env = posggym.make("PredatorPrey-v0")
model = env.model

model.seed(seed=42)

state = model.sample_initial_state()
observations = model.sample_initial_obs(state)

for t in range(50):
    actions = {i: env.action_spaces[i].sample() for i in model.get_agents(state)}
    state, observations, rewards, terminated, truncated, all_done, info = model.step(state, actions)

    if all_done:
        state = model.sample_initial_state()
        observations = model.sample_initial_obs(state)

The base model API is very similar to the environment API. The key difference is that all model methods are stateless: the state is passed in and returned explicitly, so the same state can be simulated from repeatedly during planning. Indeed, the env classes for the built-in environments are mainly just wrappers over the underlying model class that manage the state and add support for rendering.
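
To make the stateless/stateful distinction concrete, here is a minimal, self-contained sketch. ToyModel, ToyEnv, and estimate_success are hypothetical names invented for this illustration, and the toy dynamics and signatures are assumptions; none of it is POSGGym's actual implementation. The model takes the state as an argument, the env wrapper stores it, and a simple Monte Carlo planner simulates from the env's current state many times without disturbing the env.

```python
import random
from typing import Dict, Tuple

State = int                # position on a short line: 0..GOAL
Actions = Dict[str, int]   # agent ID -> move of -1, 0 or +1


class ToyModel:
    """Hypothetical stateless model: the state is always passed in explicitly."""

    GOAL = 3

    def sample_initial_state(self) -> State:
        return 0

    def step(
        self, state: State, actions: Actions, rng: random.Random
    ) -> Tuple[State, Dict[str, float], bool]:
        # The agents' moves add up; with probability 0.2 the joint move fails.
        move = 0 if rng.random() < 0.2 else sum(actions.values())
        next_state = max(0, min(self.GOAL, state + move))
        done = next_state == self.GOAL
        rewards = {i: (1.0 if done else 0.0) for i in actions}
        return next_state, rewards, done


class ToyEnv:
    """Hypothetical stateful wrapper that stores the current state between calls."""

    def __init__(self, model: ToyModel, seed: int = 0) -> None:
        self.model = model
        self.rng = random.Random(seed)
        self.state = model.sample_initial_state()

    def step(self, actions: Actions) -> Tuple[Dict[str, float], bool]:
        self.state, rewards, done = self.model.step(self.state, actions, self.rng)
        return rewards, done


def estimate_success(
    model: ToyModel, state: State, actions: Actions,
    n_rollouts: int, horizon: int, seed: int,
) -> float:
    """Fraction of rollouts from `state` that reach the goal within `horizon` steps."""
    rng = random.Random(seed)
    successes = 0
    for _rollout in range(n_rollouts):
        s = state
        for _t in range(horizon):
            s, _rewards, done = model.step(s, actions, rng)
            if done:
                successes += 1
                break
    return successes / n_rollouts


model = ToyModel()
env = ToyEnv(model, seed=0)
forward = {"a": 1, "b": 0}

state_before = env.state
p_reach = estimate_success(model, env.state, forward, n_rollouts=200, horizon=5, seed=1)
state_after = env.state  # unchanged: planning only used the stateless model
```

The planner never touches env.state; it only reads it once and then works on its own copies, which is what makes repeated simulation from the same state possible.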

Note that, unlike for the env class, the output of the model.step() method is a dataclass instance, so for convenience its components can also be accessed as attributes. For example:

result = model.step(state, actions)
observations = result.observations
info = result.info
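
The example loop above unpacks the output of model.step() like a tuple, while this snippet reads attributes from it. A dataclass can support both styles if it defines __iter__. The following sketch shows the general pattern with a hypothetical StepResult class; it is not POSGGym's actual class, and the field names and types are assumptions that mirror the variable names used in the loop above.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Iterator


@dataclass
class StepResult:
    """Hypothetical step output; field names and types are assumed."""

    state: Any
    observations: Dict[str, Any]
    rewards: Dict[str, float]
    terminated: Dict[str, bool]
    truncated: Dict[str, bool]
    all_done: bool
    info: Dict[str, Any]

    def __iter__(self) -> Iterator[Any]:
        # Yield fields in declaration order so an instance unpacks like a tuple.
        for f in fields(self):
            yield getattr(self, f.name)


result = StepResult(
    state=1,
    observations={"0": "obs"},
    rewards={"0": 0.0},
    terminated={"0": False},
    truncated={"0": False},
    all_done=False,
    info={"0": {}},
)

# Tuple-style unpacking and attribute access give the same objects.
state, observations, rewards, terminated, truncated, all_done, info = result
```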

Both the env and model classes support a number of other methods; please see the documentation for details.

Authors

Jonathon Schwartz - Jonathon.schwartz@anu.edu.au

License

MIT © 2022, Jonathon Schwartz

Versioning

The POSGGym library uses semantic versioning.
