Parallel API#
In addition to the main API, we have a secondary parallel API for environments where all agents have simultaneous actions and observations. An environment with parallel API support can be created via <game>.parallel_env(). This API is based around the paradigm of Partially Observable Stochastic Games (POSGs), and the details are similar to RLlib’s MultiAgent environment specification, except we allow for different observation and action spaces between the agents.
For a comparison with the AEC API, see About AEC. For more information, see PettingZoo: A Standard API for Multi-Agent Reinforcement Learning.
PettingZoo Wrappers can be used to convert between Parallel and AEC environments, with some restrictions (e.g., an AEC env must only update once at the end of each cycle).
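As a quick illustration, the sketch below converts in both directions; it assumes the aec_to_parallel and parallel_to_aec helpers from pettingzoo.utils.conversions and uses Pistonball only as an example environment (check your installed version for the exact import path):

```python
from pettingzoo.butterfly import pistonball_v6
from pettingzoo.utils.conversions import aec_to_parallel, parallel_to_aec

# View an AEC environment through the Parallel API
# (only valid for AEC envs that update once at the end of each cycle).
aec_env = pistonball_v6.env()
par_view = aec_to_parallel(aec_env)

# Or go the other way: wrap a Parallel environment as an AEC environment.
par_env = pistonball_v6.parallel_env()
aec_view = parallel_to_aec(par_env)
```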
Examples#
PettingZoo Butterfly provides standard examples of Parallel environments, such as Pistonball.
We provide tutorials for creating two custom Parallel environments: Rock-Paper-Scissors (Parallel) and a simple gridworld environment.
Usage#
Parallel environments can be interacted with as follows:
```python
from pettingzoo.butterfly import pistonball_v6

parallel_env = pistonball_v6.parallel_env(render_mode="human")
observations, infos = parallel_env.reset(seed=42)

while parallel_env.agents:
    # this is where you would insert your policy
    actions = {agent: parallel_env.action_space(agent).sample() for agent in parallel_env.agents}

    observations, rewards, terminations, truncations, infos = parallel_env.step(actions)
parallel_env.close()
```
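Because step() returns per-agent dictionaries, bookkeeping such as accumulating each agent's return is a simple dictionary update. The sketch below extends the loop above with random actions; the total_rewards dictionary is illustrative bookkeeping, not part of the API:

```python
from pettingzoo.butterfly import pistonball_v6

parallel_env = pistonball_v6.parallel_env()
observations, infos = parallel_env.reset(seed=42)

# Illustrative bookkeeping: accumulate each agent's return over one episode.
total_rewards = {agent: 0.0 for agent in parallel_env.possible_agents}

while parallel_env.agents:
    actions = {agent: parallel_env.action_space(agent).sample() for agent in parallel_env.agents}
    observations, rewards, terminations, truncations, infos = parallel_env.step(actions)
    for agent, reward in rewards.items():
        total_rewards[agent] += reward

parallel_env.close()
print(total_rewards)
```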
ParallelEnv#
- class pettingzoo.utils.env.ParallelEnv[source]#
Parallel environment class.
It steps every live agent at once. If you are unsure if you have implemented a ParallelEnv correctly, try running the parallel_api_test in the Developer documentation on the website.
- agents#
A list of the names of all current agents, typically integers. These may be changed as an environment progresses (i.e. agents can be added or removed).
- Type:
list[AgentID]
- num_agents#
The length of the agents list.
- Type:
int
- possible_agents#
A list of all agents the environment could possibly generate. Equivalent to the list of agent names in the observation and action spaces. This cannot be changed through play or resetting.
- Type:
list[AgentID]
- max_num_agents#
The length of the possible_agents list.
- Type:
int
- observation_spaces#
A dict of the observation spaces of every agent, keyed by name. This cannot be changed through play or resetting.
- Type:
Dict[AgentID, gym.spaces.Space]
- action_spaces#
A dict of the action spaces of every agent, keyed by name. This cannot be changed through play or resetting.
- Type:
Dict[AgentID, gym.spaces.Space]
- step(actions: dict[AgentID, ActionType]) tuple[dict[AgentID, ObsType], dict[AgentID, float], dict[AgentID, bool], dict[AgentID, bool], dict[AgentID, dict]] [source]#
Receives a dictionary of actions keyed by the agent name.
Returns the observation dictionary, reward dictionary, terminated dictionary, truncated dictionary and info dictionary, where each dictionary is keyed by the agent.
- reset(seed: int | None = None, options: dict | None = None) tuple[dict[AgentID, ObsType], dict[AgentID, dict]] [source]#
Resets the environment and returns a dictionary of observations and a dictionary of infos, each keyed by the agent name.
- render() None | np.ndarray | str | list [source]#
Displays a rendered frame from the environment, if supported.
Alternate render modes in the default environments are ‘rgb_array’, which returns a numpy array and is supported by all environments outside of Classic, and ‘ansi’, which returns the printed strings (specific to Classic environments).
- state() ndarray [source]#
Returns the state.
The state is a global view of the environment, appropriate for centralized-training decentralized-execution methods such as QMIX.
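To make the interface concrete, here is a minimal, hypothetical ParallelEnv implementation. MinimalParallelEnv, its two agents, its spaces, and its dummy reward logic are invented purely for illustration and are not part of PettingZoo; only the attribute names and method signatures follow the class description above.

```python
import functools

from gymnasium.spaces import Discrete

from pettingzoo.utils.env import ParallelEnv


class MinimalParallelEnv(ParallelEnv):
    """Toy two-agent environment used only to illustrate the required interface."""

    metadata = {"name": "minimal_parallel_v0", "render_modes": ["ansi"]}

    def __init__(self, max_cycles=10):
        self.possible_agents = ["player_0", "player_1"]
        self.max_cycles = max_cycles
        self.agents = []
        self.num_moves = 0

    # Spaces are fixed per agent, so cache them to return the same object each call.
    @functools.lru_cache(maxsize=None)
    def observation_space(self, agent):
        return Discrete(3)

    @functools.lru_cache(maxsize=None)
    def action_space(self, agent):
        return Discrete(3)

    def reset(self, seed=None, options=None):
        self.agents = self.possible_agents[:]
        self.num_moves = 0
        observations = {agent: 0 for agent in self.agents}
        infos = {agent: {} for agent in self.agents}
        return observations, infos

    def step(self, actions):
        self.num_moves += 1
        truncate = self.num_moves >= self.max_cycles

        # Every agent observes the other agent's last action (dummy dynamics).
        observations = {
            "player_0": actions["player_1"],
            "player_1": actions["player_0"],
        }
        rewards = {agent: 0.0 for agent in self.agents}
        terminations = {agent: False for agent in self.agents}
        truncations = {agent: truncate for agent in self.agents}
        infos = {agent: {} for agent in self.agents}

        # Per the Parallel API, agents are removed from the agents list once done.
        if truncate:
            self.agents = []
        return observations, rewards, terminations, truncations, infos

    def render(self):
        return f"cycle {self.num_moves}"
```

Such an implementation can then be checked with the parallel_api_test mentioned above (assuming the pettingzoo.test helper available in current releases):

```python
from pettingzoo.test import parallel_api_test

parallel_api_test(MinimalParallelEnv(), num_cycles=100)
```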