Simple World Comm#
 
This environment is part of the MPE environments. Please read that page first for general information.
| Import | from pettingzoo.mpe import simple_world_comm_v3 |
|---|---|
| Actions | Discrete/Continuous |
| Parallel API | Yes |
| Manual Control | No |
| Agents | 6 |
| Action Shape | (5), (20) |
| Action Values | Discrete(5), Discrete(20) / Box(0.0, 1.0, (5)), Box(0.0, 1.0, (9)) |
| Observation Shape | (28), (34) |
| Observation Values | (-inf, inf) |
| State Shape | (192,) |
| State Values | (-inf, inf) |
This environment is similar to simple_tag, except there is food (small blue balls) that the good agents are rewarded for being near, there are ‘forests’ that hide the agents inside them from being seen, and there is a ‘leader adversary’ that can see the agents at all times and can communicate with the other adversaries to help coordinate the chase. By default, there are 2 good agents, 3 adversaries, 1 obstacle, 2 food landmarks, and 2 forests.
In particular, the good agents' reward is -5 for every collision with an adversary, -2 x bound (the bound function described in simple_tag), +2 for every collision with a food landmark, and -0.05 x the minimum distance to any food. The adversarial agents are rewarded +5 for collisions with good agents and -0.1 x the minimum distance to a good agent.
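As a rough illustration of how these terms combine, the sketch below assembles a good agent's per-step reward. This is not the library's internal code; the inputs (collision counts, food distances, and the bound penalty) are hypothetical values you would compute from the world state.
def good_agent_reward(adversary_collisions, food_collisions, food_distances, bound_penalty):
    # -5 for every collision with an adversary
    reward = -5.0 * adversary_collisions
    # -2 x bound, penalizing straying outside the play area (see simple_tag)
    reward += -2.0 * bound_penalty
    # +2 for every collision with a food landmark
    reward += 2.0 * food_collisions
    # -0.05 x distance to the nearest food
    reward += -0.05 * min(food_distances)
    return reward

# example: one adversary collision, no food reached, nearest food 1.2 units away
print(good_agent_reward(1, 0, [1.2, 2.5], 0.0))  # -5.06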
Good agent observations: [self_vel, self_pos, landmark_rel_positions, other_agent_rel_positions, other_agent_velocities, self_in_forest]
Normal adversary observations:[self_vel, self_pos, landmark_rel_positions, other_agent_rel_positions, other_agent_velocities, self_in_forest, leader_comm]
Adversary leader observations: [self_vel, self_pos, landmark_rel_positions, other_agent_rel_positions, other_agent_velocities, leader_comm]
Note that when the forests prevent an agent from being seen, the observation of that agent's relative position is set to (0,0).
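One quick way to check these layouts is to print each agent's observation space after a reset; with the default arguments the shapes should match the (28) and (34) entries in the table above (a minimal sketch using only the standard PettingZoo API).
from pettingzoo.mpe import simple_world_comm_v3

env = simple_world_comm_v3.env()
env.reset(seed=42)
for agent in env.possible_agents:
    # good agents and adversaries have different observation sizes (see table above)
    print(agent, env.observation_space(agent).shape)
env.close()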
Good agent action space: [no_action, move_left, move_right, move_down, move_up]
Normal adversary action space: [no_action, move_left, move_right, move_down, move_up]
Adversary leader discrete action space: [say_0, say_1, say_2, say_3] X [no_action, move_left, move_right, move_down, move_up]
Where X is the Cartesian product, giving a total of 4 x 5 = 20 discrete actions.
Adversary leader continuous action space: [no_action, move_left, move_right, move_down, move_up, say_0, say_1, say_2, say_3]
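To see how these action spaces surface through the API, the snippet below prints each agent's action space for both modes; with discrete actions the leader gets Discrete(20) (the 4 x 5 combinations), and with continuous_actions=True it becomes a length-9 Box (a minimal sketch using only the standard PettingZoo calls).
from pettingzoo.mpe import simple_world_comm_v3

# discrete actions (default): Discrete(5) for good agents and normal adversaries,
# Discrete(20) for the leader adversary
env = simple_world_comm_v3.env()
env.reset(seed=42)
for agent in env.possible_agents:
    print(agent, env.action_space(agent))
env.close()

# continuous actions: Box(0.0, 1.0, (5,)) for most agents, Box(0.0, 1.0, (9,)) for the leader
env = simple_world_comm_v3.env(continuous_actions=True)
env.reset(seed=42)
for agent in env.possible_agents:
    print(agent, env.action_space(agent))
env.close()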
Arguments#
simple_world_comm_v3.env(num_good=2, num_adversaries=4, num_obstacles=1,
                num_food=2, max_cycles=25, num_forests=2, continuous_actions=False)
num_good:  number of good agents
num_adversaries:  number of adversaries
num_obstacles:  number of obstacles
num_food:  number of food locations that good agents are rewarded at
max_cycles:  number of frames (a step for each agent) until game terminates
num_forests: number of forests that can hide agents inside from being seen
continuous_actions: whether agent action spaces are discrete (default) or continuous
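For example, a larger instance can be configured by passing any combination of the arguments above (the values below are arbitrary; judging from the defaults, num_adversaries appears to include the leader adversary).
from pettingzoo.mpe import simple_world_comm_v3

# 3 good agents, 5 adversaries in total, extra food and forests,
# longer episodes, and continuous actions
env = simple_world_comm_v3.parallel_env(
    num_good=3,
    num_adversaries=5,
    num_obstacles=1,
    num_food=3,
    num_forests=3,
    max_cycles=50,
    continuous_actions=True,
)
observations, infos = env.reset(seed=42)
print(env.agents)
env.close()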
Usage#
AEC#
from pettingzoo.mpe import simple_world_comm_v3
env = simple_world_comm_v3.env(render_mode="human")
env.reset(seed=42)
for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()
    if termination or truncation:
        action = None
    else:
        # this is where you would insert your policy
        action = env.action_space(agent).sample()
    env.step(action)
env.close()
Parallel#
from pettingzoo.mpe import simple_world_comm_v3
env = simple_world_comm_v3.parallel_env(render_mode="human")
observations, infos = env.reset()
while env.agents:
    # this is where you would insert your policy
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)
env.close()
API#
- class pettingzoo.mpe.simple_world_comm.simple_world_comm.raw_env(num_good=2, num_adversaries=4, num_obstacles=1, num_food=2, max_cycles=25, num_forests=2, continuous_actions=False, render_mode=None)#
- action_spaces: dict[AgentID, gymnasium.spaces.Space]#
 - agent_selection: AgentID#
 - agents: list[AgentID]#
 - infos: dict[AgentID, dict[str, Any]]#
 - observation_spaces: dict[AgentID, gymnasium.spaces.Space]#
 - possible_agents: list[AgentID]#
 - rewards: dict[AgentID, float]#
 - terminations: dict[AgentID, bool]#
 - truncations: dict[AgentID, bool]#
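These attributes can be read directly from an environment instance, e.g. for per-agent bookkeeping (a minimal sketch; the dicts are populated once the environment has been reset and stepped).
from pettingzoo.mpe import simple_world_comm_v3

env = simple_world_comm_v3.env()
env.reset(seed=42)
# possible_agents is the full roster; agents is the currently active subset
print(env.possible_agents)
print(env.agents)
# per-agent bookkeeping dicts maintained by the AEC environment
print(env.rewards)
print(env.terminations)
print(env.truncations)
print(env.infos)
env.close()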
 
