Texas Hold’em¶
This environment is part of the classic environments. Please read that page first for general information.
| Import              | `from pettingzoo.classic import texas_holdem_v4` |
|----------------------|---------------------------------------------------|
| Actions              | Discrete                                          |
| Parallel API         | Yes                                                |
| Manual Control       | No                                                 |
| Agents               | `agents= ['player_0', 'player_1']`                 |
| Agents               | 2                                                  |
| Action Shape         | Discrete(4)                                        |
| Action Values        | Discrete(4)                                        |
| Observation Shape    | (72,)                                              |
| Observation Values   | [0, 1]                                             |
Arguments¶
```python
texas_holdem_v4.env(num_players=2)
```

`num_players`: Sets the number of players in the game. Minimum is 2.
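For example, a larger table can be created by passing a higher `num_players` (a minimal sketch; the printed agent names assume the default `player_N` naming used by the RLCard-based environments):

```python
from pettingzoo.classic import texas_holdem_v4

# Create a three-player table; num_players defaults to 2.
env = texas_holdem_v4.env(num_players=3)
env.reset(seed=42)
print(env.agents)  # expected to be ['player_0', 'player_1', 'player_2']
env.close()
```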
Observation Space¶
The observation is a dictionary which contains an `'observation'` element, the usual RL observation described below, and an `'action_mask'` element, which holds the legal moves described in the Legal Actions Mask section.
The main observation space is a vector of 72 boolean integers. The first 52 entries depict the current player’s hand plus any community cards, as follows:
| Index   | Description              |
|---------|--------------------------|
| 0 - 12  | Spades                   |
| 13 - 25 | Hearts                   |
| 26 - 38 | Diamonds                 |
| 39 - 51 | Clubs                    |
| 52 - 56 | Chips raised in Round 1  |
| 57 - 61 | Chips raised in Round 2  |
| 62 - 66 | Chips raised in Round 3  |
| 67 - 71 | Chips raised in Round 4  |
Legal Actions Mask¶
The legal moves available to the current agent are found in the `action_mask` element of the dictionary observation. The `action_mask` is a binary vector in which each index indicates whether the corresponding action is legal. The `action_mask` is all zeros for every agent except the one whose turn it is. Taking an illegal move ends the game with a reward of -1 for the illegally moving agent and a reward of 0 for all other agents.
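For instance, the currently legal action IDs can be recovered from the mask with numpy (a sketch that assumes an `env` and agent loop set up as in the Usage section below):

```python
import numpy as np

# Inside an agent_iter() loop such as the one shown in the Usage section:
observation, reward, termination, truncation, info = env.last()
mask = observation["action_mask"]      # length-4 binary vector
legal_actions = np.flatnonzero(mask)   # IDs of the actions that are currently legal
```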
Action Space¶
| Action ID | Action |
|-----------|--------|
| 0         | Call   |
| 1         | Raise  |
| 2         | Fold   |
| 3         | Check  |
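When inspecting trajectories it can help to map IDs back to names (a trivial lookup mirroring the table above, not part of the environment API):

```python
# Action-ID-to-name lookup, mirroring the table above.
ACTION_NAMES = {0: "Call", 1: "Raise", 2: "Fold", 3: "Check"}

print(ACTION_NAMES[1])  # Raise
```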
Rewards¶
| Winner           | Loser            |
|------------------|------------------|
| +raised chips/2  | -raised chips/2  |
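For example, if 10 chips have been raised in total by the time the hand ends, the winner receives a reward of +5 and the loser receives -5.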
Version History¶
v4: Upgrade to RLCard 1.0.3 (1.11.0)
v3: Fixed bug in arbitrary calls to observe() (1.8.0)
v2: Bumped RLCard version, bug fixes, legal action mask in observation replaced illegal move list in infos (1.5.0)
v1: Bumped RLCard version, fixed observation space, adopted new agent iteration scheme where all agents are iterated over after they are done (1.4.0)
v0: Initial versions release (1.0.0)
Usage¶
AEC¶
```python
from pettingzoo.classic import texas_holdem_v4

env = texas_holdem_v4.env(render_mode="human")
env.reset(seed=42)

for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()

    if termination or truncation:
        action = None
    else:
        mask = observation["action_mask"]
        # this is where you would insert your policy
        action = env.action_space(agent).sample(mask)

    env.step(action)
env.close()
```
API¶
- class `pettingzoo.classic.rlcard_envs.texas_holdem.raw_env(num_players: int = 2, render_mode: str | None = None, screen_height: int | None = 1000)`¶
  - `render()`¶
    Renders the environment as specified by self.render_mode.
    Render mode can be 'human' to display a window. Other render modes in the default environments are 'rgb_array', which returns a numpy array and is supported by all environments outside of classic, and 'ansi', which returns the strings printed (specific to classic environments).
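As a brief sketch, a frame can be captured with the 'rgb_array' mode (the exact frame dimensions depend on the configured screen size):

```python
from pettingzoo.classic import texas_holdem_v4

env = texas_holdem_v4.env(render_mode="rgb_array")
env.reset(seed=42)
frame = env.render()  # numpy array of shape (height, width, 3)
env.close()
```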