Texas Hold’em#
![../../../_images/classic_texas_holdem.gif](../../../_images/classic_texas_holdem.gif)
This environment is part of the classic environments. Please read that page first for general information.
| Import | |
|---|---|
| Actions | Discrete |
| Parallel API | Yes |
| Manual Control | No |
| Agents | |
| Agents | 2 |
| Action Shape | Discrete(4) |
| Action Values | Discrete(4) |
| Observation Shape | (72,) |
| Observation Values | [0, 1] |
Arguments#
`texas_holdem_v4.env(num_players=2)`

`num_players`: Sets the number of players in the game. Minimum is 2.
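
As an illustration of how the constructor above might be used, the following sketch plays one episode with uniformly random legal moves. It assumes a recent PettingZoo release in which the environment lives in `pettingzoo.classic` and `env.last()` returns five values; the helper name `run_random_episode` and its `seed` parameter are hypothetical, and older releases may use a different loop signature.

```python
import random


def run_random_episode(num_players=2, seed=0):
    """Play one episode with random legal actions and return the last reward seen per agent."""
    # Imported lazily so this sketch can be loaded without PettingZoo installed.
    from pettingzoo.classic import texas_holdem_v4

    rng = random.Random(seed)
    env = texas_holdem_v4.env(num_players=num_players)
    env.reset(seed=seed)

    final_rewards = {}
    for agent in env.agent_iter():
        # Assumption: env.last() returns five values, as in recent releases.
        observation, reward, termination, truncation, info = env.last()
        final_rewards[agent] = reward
        if termination or truncation:
            action = None  # finished agents step with None
        else:
            # Only indices whose mask entry is non-zero are legal.
            mask = observation["action_mask"]
            legal = [i for i, ok in enumerate(mask) if ok]
            action = rng.choice(legal)
        env.step(action)
    env.close()
    return final_rewards
```

Calling `run_random_episode()` requires PettingZoo with the classic extras installed; the function is only a starting point for replacing the random choice with a real policy.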
Observation Space#
The observation is a dictionary containing an `'observation'` element, which is the usual RL observation described below, and an `'action_mask'` element, which holds the legal moves and is described in the Legal Actions Mask section.

The main observation space is a vector of 72 binary integers (each 0 or 1). The first 52 entries encode the current player’s hand plus any community cards, and the remaining 20 entries encode the chips raised in each of the four betting rounds, as follows:
| Index | Description |
|---|---|
| 0 - 12 | Spades |
| 13 - 25 | Hearts |
| 26 - 38 | Diamonds |
| 39 - 51 | Clubs |
| 52 - 56 | Chips raised in Round 1 |
| 57 - 61 | Chips raised in Round 2 |
| 62 - 66 | Chips raised in Round 3 |
| 67 - 71 | Chips raised in Round 4 |
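
The index layout in the table can be unpacked with a small helper. This is a minimal sketch: the name `decode_observation` is hypothetical, the card position within a suit is returned as a raw offset (0 to 12) because the rank order inside each suit is not stated here, and the five chip entries per round are returned unchanged rather than interpreted.

```python
SUITS = ("Spades", "Hearts", "Diamonds", "Clubs")
CARDS_PER_SUIT = 13
NUM_CARD_ENTRIES = 52
CHIP_ENTRIES_PER_ROUND = 5
NUM_ROUNDS = 4
OBSERVATION_LENGTH = NUM_CARD_ENTRIES + CHIP_ENTRIES_PER_ROUND * NUM_ROUNDS  # 72


def decode_observation(obs):
    """Split a 72-entry observation into visible cards and per-round chip segments."""
    if len(obs) != OBSERVATION_LENGTH:
        raise ValueError(f"expected {OBSERVATION_LENGTH} entries, got {len(obs)}")

    # Entries 0-51: one flag per card, grouped by suit in blocks of 13.
    cards = []
    for index in range(NUM_CARD_ENTRIES):
        if obs[index]:
            suit = SUITS[index // CARDS_PER_SUIT]
            offset_in_suit = index % CARDS_PER_SUIT
            cards.append((suit, offset_in_suit))

    # Entries 52-71: four consecutive blocks of five flags, one block per round.
    chips = {}
    for round_number in range(1, NUM_ROUNDS + 1):
        start = NUM_CARD_ENTRIES + (round_number - 1) * CHIP_ENTRIES_PER_ROUND
        segment = obs[start:start + CHIP_ENTRIES_PER_ROUND]
        chips[round_number] = [int(value) for value in segment]

    return cards, chips
```

The helper accepts any indexable sequence of length 72, so it should work on both plain lists and the array stored under `'observation'`.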
Legal Actions Mask#
The legal moves available to the current agent are found in the `action_mask` element of the dictionary observation. The `action_mask` is a binary vector in which each index indicates whether the corresponding action is legal (1) or not (0). The `action_mask` is all zeros for every agent except the one whose turn it is. Making an illegal move ends the game with a reward of -1 for the agent that made the illegal move and a reward of 0 for all other agents.
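
The mask check and the illegal-move penalty described above can be sketched in a few lines. Both function names are hypothetical, and the agent names used in the usage note are placeholders rather than names taken from this page; the environment applies the penalty itself, so this only mirrors the rule for illustration.

```python
def is_legal(action_mask, action):
    """Return True if `action` is a valid index whose mask entry is non-zero."""
    return 0 <= action < len(action_mask) and bool(action_mask[action])


def illegal_move_rewards(agents, offender):
    """Rewards after an illegal move: -1 for the offender, 0 for every other agent."""
    return {agent: (-1 if agent == offender else 0) for agent in agents}
```

For example, with the mask `[1, 0, 1, 0]`, `is_legal(mask, 1)` is false, and `illegal_move_rewards(["player_0", "player_1"], "player_1")` assigns -1 to `player_1` and 0 to `player_0`.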
Action Space#
| Action ID | Action |
|---|---|
| 0 | Call |
| 1 | Raise |
| 2 | Fold |
| 3 | Check |
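
To avoid scattering bare integers through agent code, the action IDs in the table can be given names. The `Action` enum and `legal_action_names` helper below are illustrative assumptions, not part of the environment's API.

```python
from enum import IntEnum


class Action(IntEnum):
    """Action IDs as listed in the action table."""
    CALL = 0
    RAISE = 1
    FOLD = 2
    CHECK = 3


def legal_action_names(action_mask):
    """Return the names of all actions whose mask entry is non-zero."""
    return [Action(index).name for index, ok in enumerate(action_mask) if ok]
```

Because `IntEnum` members are integers, a value such as `Action.RAISE` can be used anywhere a plain action ID is expected.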
Rewards#
| Winner | Loser |
|---|---|
| +raised chips/2 | -raised chips/2 |
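
The reward rule in the table amounts to a symmetric split. A minimal sketch, assuming a two-player game and taking "raised chips" as a single number supplied by the caller (how that number is counted is not specified here); the function name `hand_rewards` is hypothetical.

```python
def hand_rewards(raised_chips):
    """Return (winner_reward, loser_reward) as +raised_chips/2 and -raised_chips/2."""
    half = raised_chips / 2
    return half, -half
```

With 8 raised chips, the winner receives 4.0 and the loser receives -4.0, so the two rewards always sum to zero.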
Version History#
- v4: Upgrade to RLCard 1.0.3 (1.11.0)
- v3: Fixed bug in arbitrary calls to observe() (1.8.0)
- v2: Bumped RLCard version, bug fixes, legal action mask in observation replaced illegal move list in infos (1.5.0)
- v1: Bumped RLCard version, fixed observation space, adopted new agent iteration scheme where all agents are iterated over after they are done (1.4.0)
- v0: Initial versions release (1.0.0)