Leduc Hold’em#

[Animation: classic_leduc_holdem.gif, two agents playing Leduc Hold’em]

This environment is part of the classic environments. Please read that page first for general information.

| Import             | `from pettingzoo.classic import leduc_holdem_v4` |
|--------------------|---------------------------------------------------|
| Actions            | Discrete                                          |
| Parallel API       | Yes                                               |
| Manual Control     | No                                                |
| Agents             | `agents= ['player_0', 'player_1']`                |
| Agents             | 2                                                 |
| Action Shape       | Discrete(4)                                       |
| Action Values      | Discrete(4)                                       |
| Observation Shape  | (36,)                                             |
| Observation Values | [0, 1]                                            |

Leduc Hold’em is a variation of Limit Texas Hold’em with a fixed number of two players, two betting rounds, and a deck of six cards (Jack, Queen, and King in two suits). At the beginning of the game, each player receives one private card and, after a round of betting, one public card is revealed; a second betting round follows. At showdown, the best hand wins: a private card that pairs the public card beats any unpaired hand, and otherwise the higher private card wins. The winner receives a positive reward and the loser an equal negative reward, both scaled by the chips raised (see the Rewards section below). At any time, either player can fold.

Our implementation wraps RLCard and you can refer to its documentation for additional details. Please cite their work if you use this game in research.

Observation Space#

The observation is a dictionary containing two elements: 'observation', the usual RL observation described below, and 'action_mask', which encodes the legal moves as described in the Legal Actions Mask section.
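
A minimal sketch of unpacking this dictionary (the variable names are illustrative; only the two dictionary keys come from the API):

from pettingzoo.classic import leduc_holdem_v4

env = leduc_holdem_v4.env()
env.reset(seed=42)

observation, reward, termination, truncation, info = env.last()
obs_vector = observation["observation"]    # the 36-entry binary vector described below
action_mask = observation["action_mask"]   # length-4 mask; 1 marks a legal action ID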

As described by RLCard, the first 3 entries of the main observation vector correspond to the player’s hand (J, Q, and K) and the next 3 represent the public card. Indexes 6 to 20 and 21 to 35 encode the number of chips raised by the current player and the opponent, respectively.

| Index   | Description                                                       |
|---------|-------------------------------------------------------------------|
| 0 - 2   | Current Player’s Hand<br>0: J, 1: Q, 2: K                         |
| 3 - 5   | Community Cards<br>3: J, 4: Q, 5: K                               |
| 6 - 20  | Current Player’s Chips<br>6: 0 chips, 7: 1 chip, …, 20: 14 chips  |
| 21 - 35 | Opponent’s Chips<br>21: 0 chips, 22: 1 chip, …, 35: 14 chips      |
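
As an illustration, the card and chip information can be decoded from the vector with plain slicing. This is a sketch under the layout above; decode_observation and CARDS are our own names, not part of the environment’s API:

import numpy as np

CARDS = ["J", "Q", "K"]

def decode_observation(obs_vector):
    """Decode the 36-entry observation vector described in the table above."""
    hand = CARDS[int(np.argmax(obs_vector[0:3]))]
    # Entries 3-5 are all zero before the public card is revealed.
    public = CARDS[int(np.argmax(obs_vector[3:6]))] if obs_vector[3:6].any() else None
    my_chips = int(np.argmax(obs_vector[6:21]))         # one-hot over 0-14 chips
    opponent_chips = int(np.argmax(obs_vector[21:36]))  # one-hot over 0-14 chips
    return hand, public, my_chips, opponent_chips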

Action Space#

| Action ID | Action |
|-----------|--------|
| 0         | Call   |
| 1         | Raise  |
| 2         | Fold   |
| 3         | Check  |
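
For example, the action mask can be combined with this table to list the moves that are currently legal (ACTION_NAMES is an illustrative helper, not part of the environment):

from pettingzoo.classic import leduc_holdem_v4

# Mapping from the table above; defined here for illustration only.
ACTION_NAMES = {0: "Call", 1: "Raise", 2: "Fold", 3: "Check"}

env = leduc_holdem_v4.env()
env.reset(seed=42)
observation, reward, termination, truncation, info = env.last()
legal_moves = [ACTION_NAMES[i] for i, ok in enumerate(observation["action_mask"]) if ok]
print(legal_moves)  # the actions the current player may take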

Rewards#

| Winner            | Loser             |
|-------------------|-------------------|
| +raised chips / 2 | -raised chips / 2 |
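
For example, if a total of 4 chips was raised during a hand, the winner receives a reward of +2 and the loser receives -2.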

Version History#

  • v4: Upgrade to RLCard 1.0.3 (1.11.0)

  • v3: Fixed bug in arbitrary calls to observe() (1.8.0)

  • v2: Bumped RLCard version, bug fixes, legal action mask in observation replaced illegal move list in infos (1.5.0)

  • v1: Bumped RLCard version, fixed observation space, adopted new agent iteration scheme where all agents are iterated over after they are done (1.4.0)

  • v0: Initial versions release (1.0.0)

Usage#

AEC#

from pettingzoo.classic import leduc_holdem_v4

env = leduc_holdem_v4.env(render_mode="human")
env.reset(seed=42)

for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()

    if termination or truncation:
        action = None
    else:
        mask = observation["action_mask"]
        # this is where you would insert your policy
        action = env.action_space(agent).sample(mask)

    env.step(action)
env.close()
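
Note that env.action_space(agent).sample(mask) draws a uniformly random action from those the mask marks as legal; a trained agent would instead choose its action with its policy while still honoring the mask.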

API#

class pettingzoo.classic.rlcard_envs.leduc_holdem.env(**kwargs)[source]#
class pettingzoo.classic.rlcard_envs.leduc_holdem.raw_env(render_mode: str | None = None, screen_height: int | None = 1000)[source]#
render()[source]#

Renders the environment as specified by self.render_mode.

The render mode can be ‘human’ to display a window. Other render modes in the default environments are ‘rgb_array’, which returns a numpy array and is supported by all environments outside of classic, and ‘ansi’, which returns the strings printed (specific to classic environments).
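
A minimal sketch of calling render() explicitly, assuming the ‘human’ render mode shown in the Usage section:

from pettingzoo.classic import leduc_holdem_v4

env = leduc_holdem_v4.env(render_mode="human")
env.reset(seed=42)
env.render()  # draws the current table state in a window
env.close()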

step(action)[source]#

Accepts and executes the action of the current agent_selection in the environment.

Automatically switches control to the next agent.