Texas Hold’em#

This environment is part of the classic environments. Please read that page first for general information.

| Import             | from pettingzoo.classic import texas_holdem_v4 |
| Actions             | Discrete                                       |
| Parallel API         | Yes                                            |
| Manual Control       | No                                             |
| Agents               | agents= ['player_0', 'player_1']               |
| Agents               | 2                                              |
| Action Shape         | Discrete(4)                                    |
| Action Values        | Discrete(4)                                    |
| Observation Shape    | (72,)                                          |
| Observation Values   | [0, 1]                                         |

Arguments#

texas_holdem_v4.env(num_players=2)

num_players: Sets the number of players in the game. Minimum is 2.
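
For example, a three-player table can be created by passing a larger num_players (a minimal sketch; the agent names are assumed to follow the player_N pattern shown above):

from pettingzoo.classic import texas_holdem_v4

# num_players defaults to 2 and must be at least 2
env = texas_holdem_v4.env(num_players=3)
env.reset(seed=42)
print(env.agents)  # expected: ['player_0', 'player_1', 'player_2']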

Observation Space#

The observation is a dictionary which contains an 'observation' element which is the usual RL observation described below, and an 'action_mask' which holds the legal moves, described in the Legal Actions Mask section.

The main observation space is a vector of 72 boolean integers. The first 52 entries depict the current player's hand plus any community cards, and the remaining 20 entries encode the number of chips raised in each of the four betting rounds, as follows:

| Index   | Description                                      |
| 0 - 12  | Spades (0: A, 1: 2, …, 12: K)                    |
| 13 - 25 | Hearts (13: A, 14: 2, …, 25: K)                  |
| 26 - 38 | Diamonds (26: A, 27: 2, …, 38: K)                |
| 39 - 51 | Clubs (39: A, 40: 2, …, 51: K)                   |
| 52 - 56 | Chips raised in Round 1 (52: 0, 53: 1, …, 56: 4) |
| 57 - 61 | Chips raised in Round 2 (57: 0, 58: 1, …, 61: 4) |
| 62 - 66 | Chips raised in Round 3 (62: 0, 63: 1, …, 66: 4) |
| 67 - 71 | Chips raised in Round 4 (67: 0, 68: 1, …, 71: 4) |
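
As an illustrative sketch (the index layout comes from the table above; the helper name and rank/suit labels are only for this example), the card portion of the observation can be decoded like this:

import numpy as np

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["Spades", "Hearts", "Diamonds", "Clubs"]

def decode_cards(obs_vector):
    # Indices 0-51 are one card each; a 1 means the card is visible
    # to the current player (in hand or among the community cards).
    cards = []
    for idx in np.flatnonzero(obs_vector[:52]):
        cards.append(f"{RANKS[idx % 13]} of {SUITS[idx // 13]}")
    return cards

# obs is the dictionary returned for the current agent, e.g. by env.last()
# print(decode_cards(obs["observation"]))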

Action Space#

| Action ID | Action |
| 0         | Call   |
| 1         | Raise  |
| 2         | Fold   |
| 3         | Check  |
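
A small sketch of restricting choices to legal actions with the action mask (the name mapping below is only for illustration; observation is the dictionary returned by env.last() in the usage example further down):

ACTION_NAMES = {0: "Call", 1: "Raise", 2: "Fold", 3: "Check"}

mask = observation["action_mask"]          # 1 = legal, 0 = illegal
legal = [a for a in range(4) if mask[a]]
print([ACTION_NAMES[a] for a in legal])    # e.g. ['Call', 'Raise', 'Fold']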

Rewards#

| Winner          | Loser           |
| +raised chips/2 | -raised chips/2 |
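
For example, if a total of 8 chips were raised during the hand, the winner's reward is +4 and the loser's is -4.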

Version History#

  • v4: Upgrade to RLCard 1.0.3 (1.11.0)

  • v3: Fixed bug in arbitrary calls to observe() (1.8.0)

  • v2: Bumped RLCard version, bug fixes, legal action mask in observation replaced illegal move list in infos (1.5.0)

  • v1: Bumped RLCard version, fixed observation space, adopted new agent iteration scheme where all agents are iterated over after they are done (1.4.0)

  • v0: Initial versions release (1.0.0)

Usage#

AEC#

from pettingzoo.classic import texas_holdem_v4

env = texas_holdem_v4.env(render_mode="human")
env.reset(seed=42)

for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()

    if termination or truncation:
        action = None
    else:
        mask = observation["action_mask"]
        # this is where you would insert your policy
        action = env.action_space(agent).sample(mask)

    env.step(action)
env.close()
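
Note that env.action_space(agent).sample(mask) only draws from actions whose mask entry is 1; a trained policy would replace that line while still respecting the mask.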

API#

class pettingzoo.classic.rlcard_envs.texas_holdem.env(**kwargs)[source]#
class pettingzoo.classic.rlcard_envs.texas_holdem.raw_env(num_players: int = 2, render_mode: str | None = None, screen_height: int | None = 1000)[source]#
render()[source]#

Renders the environment as specified by self.render_mode.

Render mode can be 'human' to display a window. Other render modes in the default environments are 'rgb_array', which returns a numpy array and is supported by all environments outside of classic, and 'ansi', which returns the strings printed (specific to classic environments).
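
A minimal sketch of off-screen rendering, assuming this environment supports the 'rgb_array' mode described above:

from pettingzoo.classic import texas_holdem_v4

env = texas_holdem_v4.env(render_mode="rgb_array")
env.reset(seed=42)
frame = env.render()   # assumption: a numpy array of shape (height, width, 3)
print(frame.shape)
env.close()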

step(action)[source]#

Accepts and executes the action of the current agent_selection in the environment.

Automatically switches control to the next agent.