Connect Four#


This environment is part of the classic environments. Please read that page first for general information.


from pettingzoo.classic import connect_four_v3



Parallel API


Manual Control



agents= ['player_0', 'player_1']



Action Shape


Action Values


Observation Shape: (6, 7, 2)

Observation Values


Connect Four is a two-player, turn-based game in which each player tries to connect four of their own tokens vertically, horizontally, or diagonally. Players take turns dropping a token into a column of a standing grid, where it falls until it reaches the bottom of the column or lands on an existing token. A token cannot be placed in a full column. The game ends when a player has made a sequence of 4 tokens or when all 7 columns have been filled.
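The drop and win rules above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the environment's own implementation: the helper names `drop_token` and `has_four`, the nested-list board, the convention that row 0 is the top of the grid, and the use of 1 and 2 as player ids are all hypothetical.

```python
ROWS, COLS = 6, 7  # grid size, matching the (6, 7, 2) observation shape


def drop_token(board, column, player):
    """Drop `player`'s token into `column` and return the row it lands in.

    Assumption: board[row][col] is 0 for empty or a player id, and row 0 is the top.
    """
    for row in range(ROWS - 1, -1, -1):  # scan from the bottom row upward
        if board[row][column] == 0:
            board[row][column] = player
            return row
    raise ValueError(f"column {column} is full")


def has_four(board, player):
    """Return True if `player` has four tokens in a line in any direction."""
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]  # right, down, two diagonals
    for row in range(ROWS):
        for col in range(COLS):
            for d_row, d_col in directions:
                cells = [(row + i * d_row, col + i * d_col) for i in range(4)]
                if all(
                    0 <= r < ROWS and 0 <= c < COLS and board[r][c] == player
                    for r, c in cells
                ):
                    return True
    return False


board = [[0] * COLS for _ in range(ROWS)]
for _ in range(4):
    drop_token(board, 3, 1)  # player 1 stacks four tokens in column 3
```

After the four drops, `has_four(board, 1)` is true because column 3 holds a vertical run of four, while `has_four(board, 2)` is false.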

Observation Space#

The observation is a dictionary containing an 'observation' element, which is the usual RL observation described below, and an 'action_mask' element, which holds the legal moves, described in the Legal Actions Mask section.

The main observation space consists of 2 planes of a 6x7 grid. Each plane represents one agent’s tokens. Within a plane, 1 indicates that the agent has a token in that cell, and 0 indicates that it does not, which means the cell is either empty or occupied by the other agent’s token.
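To make the two-plane encoding concrete, here is a minimal sketch that converts a board into a 6x7x2 nested list. The function name `to_planes`, the board representation (0 for empty, 1 and 2 as player ids), and the ordering of the planes are assumptions made for illustration; the environment itself may order the planes differently.

```python
ROWS, COLS = 6, 7


def to_planes(board, agents=(1, 2)):
    """Return obs[row][col][plane], where plane k marks the tokens of agents[k].

    Assumption: board[row][col] is 0 for empty or a player id from `agents`.
    """
    return [
        [[1 if board[r][c] == agent else 0 for agent in agents] for c in range(COLS)]
        for r in range(ROWS)
    ]


board = [[0] * COLS for _ in range(ROWS)]
board[5][3] = 1  # a token belonging to player 1
board[5][4] = 2  # a token belonging to player 2
obs = to_planes(board)
```

In this sketch `obs[5][3]` is `[1, 0]`, `obs[5][4]` is `[0, 1]`, and every empty cell is `[0, 0]`, so a 0 in one plane alone does not tell you whether the cell is empty.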

Action Space#

The action space is the set of integers from 0 to 6 (inclusive), where the action represents which column a token should be dropped in.
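Because a token cannot be placed in a full column, the set of legal actions can be derived from the top row of the grid. The sketch below shows one way to build such a mask; `legal_action_mask` is a hypothetical helper, and the board layout (row 0 at the top, 0 for an empty cell) is an assumption rather than the environment's internal representation.

```python
ROWS, COLS = 6, 7


def legal_action_mask(board):
    """Return a list of 7 ints: 1 if the column can still accept a token, else 0.

    Assumption: board[0] is the top row, and 0 marks an empty cell.
    """
    return [1 if cell == 0 else 0 for cell in board[0]]


board = [[0] * COLS for _ in range(ROWS)]
for row in range(ROWS):
    board[row][2] = 1 if row % 2 == 0 else 2  # fill column 2 completely

mask = legal_action_mask(board)
legal_columns = [col for col, ok in enumerate(mask) if ok]
```

Here column 2 is full, so action 2 is masked out and the remaining six columns stay legal.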


Rewards#

If an agent successfully connects four of their tokens, they receive a reward of 1 and the opponent receives a reward of -1. If the game ends in a draw, both players receive a reward of 0.
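The reward scheme is zero-sum and can be written as a small function. This is a sketch only: `final_rewards` is a hypothetical helper, not part of the PettingZoo API, and it assumes the winner is passed as an agent name or `None` for a draw.

```python
def final_rewards(agents, winner=None):
    """Map each agent name to its end-of-game reward.

    Assumption: `winner` is one of `agents`, or None when the game is a draw.
    """
    if winner is None:
        return {agent: 0 for agent in agents}
    if winner not in agents:
        raise ValueError(f"unknown agent: {winner}")
    return {agent: 1 if agent == winner else -1 for agent in agents}


agents = ["player_0", "player_1"]
win_rewards = final_rewards(agents, winner="player_0")
draw_rewards = final_rewards(agents)
```

With these inputs, `win_rewards` gives 1 to `player_0` and -1 to `player_1`, and `draw_rewards` gives 0 to both.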

Version History#

  • v3: Fixed bug in arbitrary calls to observe() (1.8.0)

  • v2: Legal action mask in observation replaced illegal move list in infos (1.5.0)

  • v1: Bumped version of all environments due to adoption of new agent iteration scheme where all agents are iterated over after they are done (1.4.0)

  • v0: Initial versions release (1.0.0)



Usage#

from pettingzoo.classic import connect_four_v3

env = connect_four_v3.env(render_mode="human")
env.reset(seed=42)

for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()

    if termination or truncation:
        # a finished agent must step with None
        action = None
    else:
        mask = observation["action_mask"]
        # this is where you would insert your policy
        action = env.action_space(agent).sample(mask)

    env.step(action)
env.close()
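The policy placeholder in the loop above can be filled with any function that maps the action mask to a legal column. As a minimal sketch, the hypothetical helper below picks uniformly among the legal columns; it assumes only that the mask is a length-7 sequence of 0/1 values, and it is not part of PettingZoo.

```python
import random


def random_legal_policy(action_mask, rng=random):
    """Pick a uniformly random column among those the mask marks as legal.

    Assumption: `action_mask` is a sequence of 7 values, truthy where a move is legal.
    """
    legal = [col for col, ok in enumerate(action_mask) if ok]
    if not legal:
        raise ValueError("no legal actions available")
    return rng.choice(legal)


only_center = random_legal_policy([0, 0, 0, 1, 0, 0, 0])
samples = {random_legal_policy([1, 0, 1, 0, 1, 0, 1]) for _ in range(200)}
```

Inside the loop, this would be called as `action = random_legal_policy(observation["action_mask"])` in place of the `sample(mask)` line.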



API#

class pettingzoo.classic.connect_four.connect_four.env(**kwargs)

class pettingzoo.classic.connect_four.connect_four.raw_env(render_mode: str | None = None, screen_scaling: int = 9)

action_space(agent)

Takes in agent and returns the action space for that agent.

MUST return the same value for the same agent name.

Default implementation is to return the action_spaces dict.


close()

Closes any resources that should be released.

Closes the rendering window, subprocesses, network connections, or any other resources that should be released.


observation_space(agent)

Takes in agent and returns the observation space for that agent.

MUST return the same value for the same agent name.

Default implementation is to return the observation_spaces dict.


observe(agent)

Returns the observation that an agent can currently make.

last() calls this function.


render()

Renders the environment as specified by self.render_mode.

Render mode can be ‘human’ to display a window. Other render modes in the default environments are ‘rgb_array’, which returns a numpy array and is supported by all environments outside of classic, and ‘ansi’, which returns the strings printed (specific to classic environments).

reset(seed=None, options=None)

Resets the environment to a starting state.


step(action)

Accepts and executes the action of the current agent_selection in the environment.

Automatically switches control to the next agent.