Rock Paper Scissors¶
This environment is part of the classic environments. Please read that page first for general information.
| Import             | `from pettingzoo.classic import rps_v2` |
|--------------------|------------------------------------------|
| Actions            | Discrete                                 |
| Parallel API       | Yes                                      |
| Manual Control     | No                                       |
| Agents             | `agents= ['player_0', 'player_1']`       |
| Agents             | 2                                        |
| Action Shape       | Discrete(3)                              |
| Action Values      | Discrete(3)                              |
| Observation Shape  | Discrete(4)                              |
| Observation Values | Discrete(4)                              |
Rock, Paper, Scissors is a 2-player hand game where each player chooses either rock, paper or scissors and reveals their choices simultaneously. If both players make the same choice, then it is a draw. However, if their choices are different, the winner is determined as follows: rock beats scissors, scissors beat paper, and paper beats rock.
The game can be expanded to have extra actions by adding new action pairs. Adding the new actions in pairs allows for a more balanced game. This means that the final game will have an odd number of actions and each action wins over exactly half of the other actions while being defeated by the other half. The most common expansion of this game is Rock, Paper, Scissors, Lizard, Spock, in which only one extra action pair is added.
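The balance claim can be made concrete with a small helper. The sketch below assumes one common convention for generalized RPS, namely that action a beats action b whenever (a - b) mod n is odd; the environment's internal pairing may be encoded differently, so treat this as an illustration of the balance argument rather than the library's exact rule.

```python
def beats(a: int, b: int, n: int) -> bool:
    """Illustrative odd-offset win rule for an n-action game (n odd).

    Each action beats the (n - 1) / 2 actions at odd cyclic distances
    below it, so every action wins against exactly half of the others.
    Not necessarily the rule used inside rps_v2.
    """
    return (a - b) % n % 2 == 1

# Standard 3-action game: 0 = rock, 1 = paper, 2 = scissors.
assert beats(0, 2, 3)  # rock beats scissors
assert beats(1, 0, 3)  # paper beats rock
assert beats(2, 1, 3)  # scissors beats paper

# With n = 5, every action still beats exactly (5 - 1) / 2 = 2 others.
n = 5
assert all(sum(beats(a, b, n) for b in range(n) if b != a) == 2 for a in range(n))
```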
Arguments¶
```python
rps_v2.env(num_actions=3, max_cycles=15)
```
num_actions
: number of actions applicable in the game. The default value is 3, for the standard game of Rock, Paper, Scissors. This argument must be an odd integer greater than or equal to 3. If the value given is 5, the game is expanded to Rock, Paper, Scissors, Lizard, Spock.
max_cycles
: after `max_cycles` steps, all agents will return done.
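For instance, the five-action variant described above can be created as follows (the `max_cycles` value here is arbitrary):

```python
from pettingzoo.classic import rps_v2

# Rock, Paper, Scissors, Lizard, Spock: 5 actions per player.
env = rps_v2.env(num_actions=5, max_cycles=25)
```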
Observation Space¶
Rock, Paper, Scissors¶
If 3 actions are required, the game played is the standard Rock, Paper, Scissors. The observation is the last opponent action, and its space is a scalar value with 4 possible values. Since both players reveal their choices at the same time, the observation is None until both players have acted; the value 3 encodes this no-action-yet state. Rock is represented by 0, paper by 1 and scissors by 2.
| Value | Observation |
|-------|-------------|
| 0     | Rock        |
| 1     | Paper       |
| 2     | Scissors    |
| 3     | None        |
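The mapping in this table can be spelled out directly in code; the helper below is purely illustrative and not part of the rps_v2 API:

```python
# Value -> meaning for the standard 3-action game, mirroring the table above.
OBS_NAMES = ["Rock", "Paper", "Scissors", "None"]

def describe_observation(obs: int) -> str:
    return OBS_NAMES[obs]

assert describe_observation(3) == "None"  # opponent has not acted yet
```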
Expanded Game¶
If the number of actions required in the game is greater than 3, the observation is still the last opponent action, and its space is a scalar with n + 1 possible values, where n is the number of actions. The observation is likewise None until both players have acted, and the largest value in the space, n, represents no action taken yet. The additional actions are encoded in increasing order, starting from the Rock action at 0. If 5 actions are required, the game is expanded to Rock, Paper, Scissors, Lizard, Spock. The following table shows an example of an observation space with 7 possible actions.
| Value | Observation |
|-------|-------------|
| 0     | Rock        |
| 1     | Paper       |
| 2     | Scissors    |
| 3     | Lizard      |
| 4     | Spock       |
| 5     | Action_6    |
| 6     | Action_7    |
| 7     | None        |
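For an arbitrary number of actions, the same table can be generated programmatically. The sketch below follows the naming pattern used above, labeling actions past Spock as Action_k; the names are documentation labels only, not values exposed by the environment:

```python
def observation_names(num_actions: int) -> list[str]:
    """Build the value -> meaning table for an n-action game.

    Values 0 .. num_actions - 1 are the opponent's possible last actions;
    the largest value, num_actions, means no action has been taken yet.
    """
    base = ["Rock", "Paper", "Scissors", "Lizard", "Spock"]
    names = [base[i] if i < len(base) else f"Action_{i + 1}" for i in range(num_actions)]
    return names + ["None"]

assert observation_names(7) == [
    "Rock", "Paper", "Scissors", "Lizard", "Spock", "Action_6", "Action_7", "None"
]
```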
Action Space¶
Rock, Paper, Scissors¶
The action space is a scalar value with 3 possible values. The values are encoded as follows: Rock is 0, paper is 1 and scissors is 2.
| Value | Action   |
|-------|----------|
| 0     | Rock     |
| 1     | Paper    |
| 2     | Scissors |
Expanded Game¶
The action space is a scalar value with n possible values, where n is the number of actions. The values for 7 possible actions are encoded as in the following table.
| Value | Action   |
|-------|----------|
| 0     | Rock     |
| 1     | Paper    |
| 2     | Scissors |
| 3     | Lizard   |
| 4     | Spock    |
| 5     | Action_6 |
| 6     | Action_7 |
Rewards¶
| Winner | Loser |
|--------|-------|
| +1     | -1    |
If the game ends in a draw, both players will receive a reward of 0.
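Combining the reward table with a win predicate gives the per-step payoffs. The sketch below reuses the illustrative odd-offset rule from the expansion section; PettingZoo's internal arithmetic may be written differently, but the payoffs match the table above:

```python
def step_rewards(action_0: int, action_1: int, n: int) -> tuple[int, int]:
    """Rewards for (first player, second player): +1 winner, -1 loser, 0 on a draw."""
    if action_0 == action_1:
        return (0, 0)  # draw
    if (action_0 - action_1) % n % 2 == 1:  # illustrative odd-offset win rule
        return (1, -1)
    return (-1, 1)

assert step_rewards(0, 2, 3) == (1, -1)  # rock beats scissors
assert step_rewards(1, 1, 3) == (0, 0)   # draw
```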
Version History¶
v2: Merge RPS and rock paper lizard scissors spock environments, add num_actions and max_cycles arguments (1.9.0)
v1: Bumped version of all environments due to adoption of new agent iteration scheme where all agents are iterated over after they are done (1.4.0)
v0: Initial versions release (1.0.0)
Usage¶
AEC¶
```python
from pettingzoo.classic import rps_v2

env = rps_v2.env(render_mode="human")
env.reset(seed=42)

for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()

    if termination or truncation:
        action = None
    else:
        # this is where you would insert your policy
        action = env.action_space(agent).sample()

    env.step(action)
env.close()
```
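Since the observation is the opponent's last action, the random sampling above can be replaced with a hand-written policy. The counter-policy below plays the action that beats the opponent's previous move under the standard 0 = rock, 1 = paper, 2 = scissors encoding, and samples randomly while the observation still reads 3 (no move seen yet). It is a usage sketch, not a recommended strategy:

```python
from pettingzoo.classic import rps_v2

num_actions = 3
env = rps_v2.env()
env.reset(seed=42)

for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()

    if termination or truncation:
        action = None
    elif observation == num_actions:
        # No opponent action observed yet: sample uniformly.
        action = env.action_space(agent).sample()
    else:
        # Play the action one step above the opponent's last move,
        # which beats it under the rock < paper < scissors < rock cycle.
        action = (observation + 1) % num_actions

    env.step(action)
env.close()
```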
Parallel¶
```python
from pettingzoo.classic import rps_v2

env = rps_v2.parallel_env(render_mode="human")
observations, infos = env.reset()

while env.agents:
    # this is where you would insert your policy
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}

    observations, rewards, terminations, truncations, infos = env.step(actions)
env.close()
```
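Because every step pays out +1/-1 or 0/0, episode scores can be tracked by summing the rewards dict inside the parallel loop. This is a usage sketch building on the loop above:

```python
from pettingzoo.classic import rps_v2

env = rps_v2.parallel_env()
observations, infos = env.reset(seed=42)
totals = {agent: 0 for agent in env.agents}

while env.agents:
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)
    for agent, reward in rewards.items():
        totals[agent] += reward  # accumulate the +1/-1/0 step rewards

print(totals)  # per-agent episode totals
env.close()
```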
API¶
- class pettingzoo.classic.rps.rps.raw_env(num_actions: int | None = 3, max_cycles: int | None = 15, render_mode: str | None = None, screen_height: int | None = 800)[source]¶
Two-player environment for rock paper scissors.
Expandable environment to rock paper scissors lizard spock action_6 action_7 … The observation is simply the last opponent action.
- action_space(agent)[source]¶
Takes in an agent and returns the action space for that agent.
MUST return the same value for the same agent name.
The default implementation returns the action_spaces dict.
- close()[source]¶
Closes any resources that should be released.
Closes the rendering window, subprocesses, network connections, or any other resources that should be released.
- observation_space(agent)[source]¶
Takes in an agent and returns the observation space for that agent.
MUST return the same value for the same agent name.
The default implementation returns the observation_spaces dict.
- observe(agent)[source]¶
Returns the observation an agent currently can make.
last() calls this function.
- render()[source]¶
Renders the environment as specified by self.render_mode.
Render mode can be 'human' to display a window. Other render modes in the default environments are 'rgb_array', which returns a numpy array and is supported by all environments outside of classic, and 'ansi', which returns the strings printed (specific to classic environments).