Simple Reference

This environment is part of the MPE environments. Please read that page first for general information.

Import

from pettingzoo.mpe import simple_reference_v3

Actions

Discrete/Continuous

Parallel API

Yes

Manual Control

No

Agents

agents= [agent_0, agent_1]

Number of Agents

2

Action Shape

(50)/(15)

Action Values

Discrete(50)/Box(0.0, 1.0, (15))

Observation Shape

(8),(10)

Observation Values

(-inf,inf)

State Shape

(28,)

State Values

(-inf,inf)

This environment has 2 agents and 3 landmarks of different colors. Each agent wants to get closer to its target landmark, which is known only to the other agent. Both agents are simultaneously speakers and listeners.

Locally, each agent is rewarded based on its distance to its target landmark (the closer, the higher the reward). Globally, all agents are rewarded based on the average distance of all agents to their respective landmarks. The relative weight of these rewards is controlled by the local_ratio parameter.
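The weighting described above can be sketched as follows. This is a minimal illustration rather than the library's implementation: `mix_reward`, `distances`, and the example values are hypothetical, and the sketch assumes each reward is the negative of the corresponding distance so that being closer scores higher.

```python
def mix_reward(local_reward: float, global_reward: float, local_ratio: float = 0.5) -> float:
    """Blend an agent's local reward with the shared global reward.

    Hypothetical helper: local_ratio weights the local term and
    (1 - local_ratio) weights the global term.
    """
    if not 0.0 <= local_ratio <= 1.0:
        raise ValueError("local_ratio must lie in [0, 1]")
    return local_ratio * local_reward + (1.0 - local_ratio) * global_reward


# Made-up distances from each agent to its own target landmark.
distances = {"agent_0": 0.2, "agent_1": 0.6}

# Assumption: rewards are negative distances (closer is better).
local_rewards = {name: -d for name, d in distances.items()}
global_reward = -sum(distances.values()) / len(distances)

rewards = {
    name: mix_reward(local, global_reward, local_ratio=0.5)
    for name, local in local_rewards.items()
}
```

With local_ratio=1.0 each agent would see only its own distance term, and with local_ratio=0.0 both agents would receive the same shared value.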

Agent observation space: [self_vel, all_landmark_rel_positions, landmark_ids, goal_id, communication]

Agent discrete action space: [say_0, say_1, say_2, say_3, say_4, say_5, say_6, say_7, say_8, say_9] X [no_action, move_left, move_right, move_down, move_up]

Here X denotes the Cartesian product, giving 10 x 5 = 50 discrete actions in total.
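To make the product concrete, the sketch below converts between a flat action index and a (say, move) pair. The helper names `decode_action` and `encode_action` are hypothetical, and the index layout (movement component varying fastest) is an assumption made for illustration; check the environment source before relying on a particular ordering.

```python
from itertools import product
from typing import Tuple

SAY = [f"say_{i}" for i in range(10)]
MOVE = ["no_action", "move_left", "move_right", "move_down", "move_up"]


def decode_action(index: int) -> Tuple[str, str]:
    """Split a flat index in [0, 50) into (say, move) labels.

    Assumed layout: index = say_index * 5 + move_index.
    """
    if not 0 <= index < len(SAY) * len(MOVE):
        raise ValueError("index out of range for the 50-action space")
    say_idx, move_idx = divmod(index, len(MOVE))
    return SAY[say_idx], MOVE[move_idx]


def encode_action(say: str, move: str) -> int:
    """Inverse of decode_action under the same assumed layout."""
    return SAY.index(say) * len(MOVE) + MOVE.index(move)


# Enumerating the Cartesian product yields the same 50 combinations.
ALL_ACTIONS = list(product(SAY, MOVE))
```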

Agent continuous action space: [no_action, move_left, move_right, move_down, move_up, say_0, say_1, say_2, say_3, say_4, say_5, say_6, say_7, say_8, say_9]
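The continuous layout listed above can be read as a 15-element vector whose first five entries are movement components and whose last ten are communication components. The sketch below splits such a vector into labeled parts. `split_continuous_action` is a hypothetical helper, and treating each entry as a per-channel intensity in [0, 1] is an assumption based on the Box(0.0, 1.0, ...) bounds.

```python
from typing import Dict, Sequence

MOVE = ["no_action", "move_left", "move_right", "move_down", "move_up"]
SAY = [f"say_{i}" for i in range(10)]


def split_continuous_action(action: Sequence[float]) -> Dict[str, Dict[str, float]]:
    """Label the movement and communication parts of a continuous action."""
    if len(action) != len(MOVE) + len(SAY):
        raise ValueError("expected a 15-element action vector")
    move = dict(zip(MOVE, action[: len(MOVE)]))
    say = dict(zip(SAY, action[len(MOVE):]))
    return {"move": move, "say": say}


# Example vector: push right while emitting on communication channel 3.
action = [0.0] * 15
action[MOVE.index("move_right")] = 0.8
action[len(MOVE) + SAY.index("say_3")] = 1.0

parts = split_continuous_action(action)
```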

Arguments

simple_reference_v3.env(local_ratio=0.5, max_cycles=25, continuous_actions=False, dynamic_rescaling=False)

local_ratio: Weight applied to the local reward. The global reward weight is always 1 - local_ratio.

max_cycles: Number of frames (one step for each agent) until the game terminates

continuous_actions: Whether agent action spaces are discrete (default) or continuous

dynamic_rescaling: Whether to rescale the size of agents and landmarks based on the screen size

Usage

AEC

from pettingzoo.mpe import simple_reference_v3

env = simple_reference_v3.env(render_mode="human")
env.reset(seed=42)

for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()

    if termination or truncation:
        action = None
    else:
        # this is where you would insert your policy
        action = env.action_space(agent).sample()

    env.step(action)
env.close()

Parallel

from pettingzoo.mpe import simple_reference_v3

env = simple_reference_v3.parallel_env(render_mode="human")
observations, infos = env.reset()

while env.agents:
    # this is where you would insert your policy
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}

    observations, rewards, terminations, truncations, infos = env.step(actions)
env.close()

API

class pettingzoo.mpe.simple_reference.simple_reference.raw_env(local_ratio=0.5, max_cycles=25, continuous_actions=False, render_mode=None, dynamic_rescaling=False)