Simple World Comm#
 
This environment is part of the MPE environments. Please read that page first for general information.
| Import | from pettingzoo.mpe import simple_world_comm_v3 |
|---|---|
| Actions | Discrete/Continuous |
| Parallel API | Yes |
| Manual Control | No |
| Agents | 6 |
| Action Shape | (5), (20) |
| Action Values | Discrete(5), Discrete(20) / Box(0.0, 1.0, (5)), Box(0.0, 1.0, (9)) |
| Observation Shape | (28), (34) |
| Observation Values | (-inf, inf) |
| State Shape | (192,) |
| State Values | (-inf, inf) |
This environment is similar to simple_tag, except there is food (small blue balls) that the good agents are rewarded for being near, there are ‘forests’ that hide the agents inside them from being seen, and there is a ‘leader adversary’ that can see the agents at all times and can communicate with the other adversaries to help coordinate the chase. By default, there are 2 good agents, 3 adversaries, 1 obstacle, 2 food landmarks, and 2 forests.
In particular, the good agents' reward is -5 for every collision with an adversary, -2 x bound (the bound function described in simple_tag), +2 for every collision with a food landmark, and -0.05 x the minimum distance to any food. The adversarial agents are rewarded +5 for collisions with good agents and -0.1 x the minimum distance to a good agent.
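As a rough illustration of how these terms combine, the sketch below assembles a good agent's per-step reward. This is not the library's internal code; the inputs (collision counts, food distances, and the bound penalty) are hypothetical values you would compute from the world state.
def good_agent_reward(adversary_collisions, food_collisions, food_distances, bound_penalty):
    # -5 for every collision with an adversary
    reward = -5.0 * adversary_collisions
    # -2 x bound, penalizing straying outside the play area (see simple_tag)
    reward += -2.0 * bound_penalty
    # +2 for every collision with a food landmark
    reward += 2.0 * food_collisions
    # -0.05 x distance to the nearest food
    reward += -0.05 * min(food_distances)
    return reward

# example: one adversary collision, no food reached, nearest food 1.2 units away
print(good_agent_reward(1, 0, [1.2, 2.5], 0.0))  # -5.06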
Good agent observations: [self_vel, self_pos, landmark_rel_positions, other_agent_rel_positions, other_agent_velocities, self_in_forest]
Normal adversary observations:[self_vel, self_pos, landmark_rel_positions, other_agent_rel_positions, other_agent_velocities, self_in_forest, leader_comm]
Adversary leader observations: [self_vel, self_pos, landmark_rel_positions, other_agent_rel_positions, other_agent_velocities, leader_comm]
Note that when the forests prevent an agent from being seen, the observation of that agent's relative position is set to (0,0).
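One quick way to check these layouts is to print each agent's observation space after a reset; with the default arguments the shapes should match the (28) and (34) entries in the table above (a minimal sketch using only the standard PettingZoo API).
from pettingzoo.mpe import simple_world_comm_v3

env = simple_world_comm_v3.env()
env.reset(seed=42)
for agent in env.possible_agents:
    # good agents and adversaries have different observation sizes (see table above)
    print(agent, env.observation_space(agent).shape)
env.close()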
Good agent action space: [no_action, move_left, move_right, move_down, move_up]
Normal adversary action space: [no_action, move_left, move_right, move_down, move_up]
Adversary leader discrete action space: [say_0, say_1, say_2, say_3] X [no_action, move_left, move_right, move_down, move_up]
Where X is the Cartesian product, giving a total of 4 x 5 = 20 discrete actions.
Adversary leader continuous action space: [no_action, move_left, move_right, move_down, move_up, say_0, say_1, say_2, say_3]
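To see how these action spaces surface through the API, the snippet below prints each agent's action space for both modes; with discrete actions the leader gets Discrete(20) (the 4 x 5 combinations), and with continuous_actions=True it becomes a length-9 Box (a minimal sketch using only the standard PettingZoo calls).
from pettingzoo.mpe import simple_world_comm_v3

# discrete actions (default): Discrete(5) for good agents and normal adversaries,
# Discrete(20) for the leader adversary
env = simple_world_comm_v3.env()
env.reset(seed=42)
for agent in env.possible_agents:
    print(agent, env.action_space(agent))
env.close()

# continuous actions: Box(0.0, 1.0, (5,)) for most agents, Box(0.0, 1.0, (9,)) for the leader
env = simple_world_comm_v3.env(continuous_actions=True)
env.reset(seed=42)
for agent in env.possible_agents:
    print(agent, env.action_space(agent))
env.close()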
Arguments#
simple_world_comm_v3.env(num_good=2, num_adversaries=4, num_obstacles=1,
                num_food=2, max_cycles=25, num_forests=2, continuous_actions=False)
num_good:  number of good agents
num_adversaries:  number of adversaries
num_obstacles:  number of obstacles
num_food:  number of food locations that good agents are rewarded at
max_cycles:  number of frames (a step for each agent) until game terminates
num_forests: number of forests that can hide agents inside from being seen
continuous_actions: whether agent action spaces are discrete (default) or continuous
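For example, a larger instance can be configured by passing any combination of the arguments above (the values below are arbitrary; judging from the defaults, num_adversaries appears to include the leader adversary).
from pettingzoo.mpe import simple_world_comm_v3

# 3 good agents, 5 adversaries in total, extra food and forests,
# longer episodes, and continuous actions
env = simple_world_comm_v3.parallel_env(
    num_good=3,
    num_adversaries=5,
    num_obstacles=1,
    num_food=3,
    num_forests=3,
    max_cycles=50,
    continuous_actions=True,
)
observations, infos = env.reset(seed=42)
print(env.agents)
env.close()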
Usage#
AEC#
from pettingzoo.mpe import simple_world_comm_v3
env = simple_world_comm_v3.env(render_mode="human")
env.reset(seed=42)
for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()
    if termination or truncation:
        action = None
    else:
        # this is where you would insert your policy
        action = env.action_space(agent).sample()
    env.step(action)
env.close()
Parallel#
from pettingzoo.mpe import simple_world_comm_v3
env = simple_world_comm_v3.parallel_env(render_mode="human")
observations, infos = env.reset()
while env.agents:
    # this is where you would insert your policy
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)
env.close()
API#
- class pettingzoo.mpe.simple_world_comm.simple_world_comm.raw_env(num_good=2, num_adversaries=4, num_obstacles=1, num_food=2, max_cycles=25, num_forests=2, continuous_actions=False, render_mode=None)#
- action_spaces: dict[AgentID, gymnasium.spaces.Space]#
 - agent_selection: AgentID#
 - agents: list[AgentID]#
 - infos: dict[AgentID, dict[str, Any]]#
 - observation_spaces: dict[AgentID, gymnasium.spaces.Space]#
 - possible_agents: list[AgentID]#
 - rewards: dict[AgentID, float]#
 - terminations: dict[AgentID, bool]#
 - truncations: dict[AgentID, bool]#
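These attributes can be read directly from an environment instance, e.g. for per-agent bookkeeping (a minimal sketch; the dicts are populated once the environment has been reset and stepped).
from pettingzoo.mpe import simple_world_comm_v3

env = simple_world_comm_v3.env()
env.reset(seed=42)
# possible_agents is the full roster; agents is the currently active subset
print(env.possible_agents)
print(env.agents)
# per-agent bookkeeping dicts maintained by the AEC environment
print(env.rewards)
print(env.terminations)
print(env.truncations)
print(env.infos)
env.close()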
 
