Cooperative Pong

This environment is part of the butterfly environments. Please read that page first for general information.

Import

from pettingzoo.butterfly import cooperative_pong_v6

Actions

Discrete

Parallel API

Yes

Manual Control

Yes

Agents

agents= ['paddle_0', 'paddle_1']

Number of Agents

2

Action Shape

Discrete(3)

Action Values

[0, 1, 2]

Observation Shape

(280, 480, 3)

Observation Values

[0, 255]

State Shape

(280, 480, 3)

State Values

[0, 255]

Cooperative Pong is a simple game of pong in which the objective is to keep the ball in play for as long as possible. The game ends when the ball goes out of bounds past either the left or right edge of the screen. There are two agents (paddles): one moves along the left edge of the screen and the other along the right edge. All collisions of the ball are elastic. On each reset, the ball starts at the center of the screen and moves in a random direction. To make learning a little more challenging, the right paddle is shaped like a tiered cake by default.

The observation space of each agent is the entire screen. Each agent has three possible actions: move up, move down, or do nothing. While the ball stays within bounds, each agent receives a reward of max_reward / max_cycles (approximately 0.11 with the defaults) at each timestep. Otherwise, each agent receives a reward of off_screen_penalty (default -10) and the game ends.
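The reward scheme described above can be worked through numerically. The following is a minimal sketch rather than the environment's own code: step_reward and episode_return are hypothetical helper names, and the sketch assumes that the step on which the ball leaves the screen earns only the penalty.

```python
def step_reward(max_reward: float = 100, max_cycles: int = 900) -> float:
    """Per-timestep reward while the ball stays in bounds."""
    return max_reward / max_cycles


def episode_return(steps_in_bounds: int, ball_lost: bool,
                   max_reward: float = 100, max_cycles: int = 900,
                   off_screen_penalty: float = -10) -> float:
    """Total reward collected by one agent over an episode.

    Assumption: the step on which the ball goes off screen contributes
    only off_screen_penalty, not the usual per-step reward.
    """
    total = steps_in_bounds * step_reward(max_reward, max_cycles)
    if ball_lost:
        total += off_screen_penalty
    return total


print(round(step_reward(), 2))               # 0.11
print(round(episode_return(900, False), 2))  # 100.0
print(round(episode_return(450, True), 2))   # 40.0
```

With the defaults, an agent that survives all 900 steps collects about max_reward in total, while losing the ball halfway through leaves it with roughly 50 - 10 = 40.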

Manual Control

Move the left paddle using the ‘W’ and ‘S’ keys. Move the right paddle using ‘UP’ and ‘DOWN’ arrow keys.

Arguments

cooperative_pong_v6.env(
    ball_speed = 9,
    left_paddle_speed = 12,
    right_paddle_speed = 12,
    cake_paddle = True,
    max_cycles = 900,
    bounce_randomness = False,
    max_reward = 100,
    off_screen_penalty = -10,
    render_mode = None,
    render_ratio = 2,
    render_fps = 15,
)

ball_speed: Speed of ball (in pixels). Note that if the ball speed is set too high, it is possible for it to move through the paddle and out of bounds.
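The note on ball_speed can be illustrated with a simplified one-dimensional sketch. This is not the environment's collision code: the point-sized ball, the paddle extent of 470 to 480 pixels, and the function name hits_paddle are all assumptions made for illustration. The idea is that when positions are only checked once per step, a ball that travels farther per step than the paddle is thick can jump from one side of the paddle to the other without ever overlapping it.

```python
def hits_paddle(ball_x: float, ball_speed: float, paddle_left: float,
                paddle_right: float, max_steps: int = 1000) -> bool:
    """Move a point-sized ball rightwards in discrete steps and report
    whether any sampled position lands inside [paddle_left, paddle_right]."""
    x = ball_x
    for _ in range(max_steps):
        x += ball_speed
        if paddle_left <= x <= paddle_right:
            return True   # overlap detected on this step
        if x > paddle_right:
            return False  # ball skipped past the paddle undetected
    return False


# Hypothetical paddle occupying x in [470, 480] (10 pixels thick).
print(hits_paddle(240, 9, 470, 480))   # True
print(hits_paddle(240, 25, 470, 480))  # False
```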

left_paddle_speed: Speed of left paddle (in pixels)

right_paddle_speed: Speed of right paddle (in pixels)

cake_paddle: If True, the right paddle takes the shape of a 4-tiered wedding cake

max_cycles: After max_cycles steps all agents will return done

bounce_randomness: If True, each collision of the ball with the paddles adds a small random angle to the direction of the ball, with the speed of the ball remaining unchanged.
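One way to picture the bounce_randomness behaviour is as a small rotation of the ball's velocity vector, which changes its direction but not its length. The sketch below is an illustration under assumptions, not the environment's implementation: the function name perturb_direction, the uniform distribution, and the max_angle value of 0.1 radians are all hypothetical.

```python
import math
import random
from typing import Optional, Tuple


def perturb_direction(vx: float, vy: float, max_angle: float = 0.1,
                      rng: Optional[random.Random] = None) -> Tuple[float, float]:
    """Rotate (vx, vy) by a random angle in [-max_angle, max_angle] radians.

    A pure rotation leaves the vector's length (the speed) unchanged.
    max_angle is an illustrative value, not the environment's setting.
    """
    rng = rng or random.Random()
    theta = rng.uniform(-max_angle, max_angle)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return (vx * cos_t - vy * sin_t, vx * sin_t + vy * cos_t)


vx, vy = perturb_direction(9.0, 0.0, rng=random.Random(0))
print(math.hypot(vx, vy))  # speed stays at 9.0 up to floating-point error
```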

max_reward: Total reward given to each agent over max_cycles timesteps

off_screen_penalty: Negative reward penalty for each agent if the ball goes off the screen

render_mode: Render mode for the env (either None, “human”, or “rgb_array”)

render_ratio: Scaling ratio for rendering the screen (controls display size, larger value gives smaller screen)

render_fps: Speed that the game is run (in frames per second, higher values give faster game)
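When overriding only a few of these arguments, it can help to catch misspelled names before the environment is constructed. The following is a minimal sketch: DEFAULTS is copied from the signature above, while build_kwargs and make_env are hypothetical helpers that are not part of PettingZoo. The PettingZoo import is deferred so that the argument check can be used on its own.

```python
# Defaults copied from the documented cooperative_pong_v6.env signature.
DEFAULTS = {
    "ball_speed": 9,
    "left_paddle_speed": 12,
    "right_paddle_speed": 12,
    "cake_paddle": True,
    "max_cycles": 900,
    "bounce_randomness": False,
    "max_reward": 100,
    "off_screen_penalty": -10,
    "render_mode": None,
    "render_ratio": 2,
    "render_fps": 15,
}


def build_kwargs(**overrides):
    """Merge overrides into the documented defaults, rejecting unknown names."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise TypeError(f"unknown argument(s): {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


def make_env(**overrides):
    """Create the AEC environment (hypothetical helper; needs PettingZoo installed)."""
    from pettingzoo.butterfly import cooperative_pong_v6
    return cooperative_pong_v6.env(**build_kwargs(**overrides))


kwargs = build_kwargs(ball_speed=6, cake_paddle=False, max_cycles=300)
# Per-step reward implied by these settings: max_reward / max_cycles.
print(kwargs["max_reward"] / kwargs["max_cycles"])
```

With PettingZoo installed, make_env(ball_speed=6, cake_paddle=False) would then return an environment with a slower ball and a plain right paddle.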

Version History

  • v6: Fixed incorrect termination condition and random bounce behaviour (1.25.5)

  • v5: Fixed ball teleporting bugs

  • v4: Added max_reward and off_screen_penalty arguments and changed default, fixed glitch where ball would occasionally teleport, reward redesign (1.14.0)

  • v3: Change observation space to include entire screen (1.10.0)

  • v2: Misc fixes (1.4.0)

  • v1: Fixed bug in how dones were computed (1.3.1)

  • v0: Initial version release (1.0.0)

Usage

AEC

from pettingzoo.butterfly import cooperative_pong_v6

env = cooperative_pong_v6.env(render_mode="human")
env.reset(seed=42)

for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()

    if termination or truncation:
        action = None
    else:
        # this is where you would insert your policy
        action = env.action_space(agent).sample()

    env.step(action)
env.close()

Parallel

from pettingzoo.butterfly import cooperative_pong_v6

env = cooperative_pong_v6.parallel_env(render_mode="human")
observations, infos = env.reset()

while env.agents:
    # this is where you would insert your policy
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}

    observations, rewards, terminations, truncations, infos = env.step(actions)
env.close()
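The Parallel loop above discards the rewards it receives. A small sketch of how per-agent returns might be accumulated follows. run_parallel_episode and the policy(agent, observation) callable are hypothetical names, and the function only relies on the reset, step, and agents members already used in the example above.

```python
from collections import defaultdict


def run_parallel_episode(env, policy, seed=None):
    """Run one episode on a Parallel API environment and return per-agent totals.

    `policy(agent, observation)` is a hypothetical callable returning an action.
    The caller remains responsible for calling env.close().
    """
    totals = defaultdict(float)
    observations, infos = env.reset(seed=seed)
    while env.agents:
        # Query the policy once per live agent, using its latest observation.
        actions = {agent: policy(agent, observations[agent]) for agent in env.agents}
        observations, rewards, terminations, truncations, infos = env.step(actions)
        for agent, reward in rewards.items():
            totals[agent] += reward
    return dict(totals)
```

With PettingZoo installed, this could be called as run_parallel_episode(env, lambda agent, obs: env.action_space(agent).sample(), seed=42), where env comes from cooperative_pong_v6.parallel_env().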

API

class pettingzoo.butterfly.cooperative_pong.cooperative_pong.env(**kwargs: Any)

Creates the wrapped environment.

class pettingzoo.butterfly.cooperative_pong.cooperative_pong.raw_env(**kwargs: Any)

The CooperativePong AEC environment.

Initialize the environment.

action_space(agent: AgentID) -> Space[Any]

Return the action space for the given agent.

close() -> None

Close the renderer.

observation_space(agent: AgentID) -> Space[Any]

Return the observation space for the given agent.

observe(agent: AgentID) -> ObsType

Return the observation for the given agent.

render() -> ndarray[tuple[int, ...], dtype[integer]] | None

Render the current state of the environment.

reset(seed: int | None = None, options: dict[str, Any] | None = None) -> None

Reset the environment.

state() -> StateType

Return the state for the environment.

step(action: Literal[0, 1, 2]) -> None

Take a step of the environment.