Simple Crypto
This environment is part of the MPE environments. Please read that page first for general information.
Import | from pettingzoo.mpe import simple_crypto_v3 |
---|---|
Actions | Discrete/Continuous |
Parallel API | Yes |
Manual Control | No |
Agents | |
Agents | 3 |
Action Shape | (4) |
Action Values | Discrete(4)/Box(0.0, 1.0, (4)) |
Observation Shape | (4),(8) |
Observation Values | (-inf,inf) |
State Shape | (20,) |
State Values | (-inf,inf) |
In this environment, there are 2 good agents (Alice and Bob) and 1 adversary (Eve). Alice must send a private 1-bit message to Bob over a public channel. Alice and Bob are rewarded +2 if Bob reconstructs the message, but are rewarded -2 if Eve reconstructs the message (the two cancel to 0 if both Bob and Eve reconstruct the bit). Eve is rewarded -2 if it cannot reconstruct the message and 0 if it can. Alice and Bob share a private key (randomly generated at the beginning of each episode), which they must learn to use to encrypt the message.
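The payoff rules above can be sketched as a small function. This is a minimal illustration that follows the prose description literally; it is not the environment's own reward code, and the function name `crypto_rewards` and the keys `"alice"`, `"bob"`, and `"eve"` are hypothetical labels chosen for the example.

```python
def crypto_rewards(message: int, bob_guess: int, eve_guess: int) -> dict[str, float]:
    """Payoffs following the prose description (hypothetical helper, not library code)."""
    bob_correct = bob_guess == message
    eve_correct = eve_guess == message

    # Alice and Bob share one team reward: +2 if Bob is right, -2 if Eve is right.
    team = (2.0 if bob_correct else 0.0) - (2.0 if eve_correct else 0.0)
    # Eve is penalized only when she fails to recover the bit.
    eve = 0.0 if eve_correct else -2.0
    return {"alice": team, "bob": team, "eve": eve}


# Enumerate every combination of guesses for a message bit of 1.
for bob_guess in (0, 1):
    for eve_guess in (0, 1):
        print(bob_guess, eve_guess, crypto_rewards(1, bob_guess, eve_guess))
```

The only outcome in which the good team scores +2 is the one where Bob is right and Eve is wrong, which is what pushes Alice and Bob toward using the private key.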
Alice observation space: [message, private_key]
Bob observation space: [private_key, alices_comm]
Eve observation space: [alices_comm]
Alice action space: [say_0, say_1, say_2, say_3]
Bob action space: [say_0, say_1, say_2, say_3]
Eve action space: [say_0, say_1, say_2, say_3]
Bob's and Eve's communication actions are each checked against the 1 bit of information that Alice is trying to convey.
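To see why the private key matters, here is a minimal sketch of one scheme the good agents could in principle converge to: XOR the message bit with the key bit. This is an assumption made for illustration. It works on plain bits rather than the environment's observation vectors, and the helpers `alice_encrypt` and `bob_decrypt` are hypothetical, not part of the environment.

```python
from itertools import product


def alice_encrypt(message: int, private_key: int) -> int:
    """Alice publishes the message bit XORed with the shared key bit."""
    return message ^ private_key


def bob_decrypt(alices_comm: int, private_key: int) -> int:
    """Bob sees the key and Alice's communication, so he can undo the XOR."""
    return alices_comm ^ private_key


# For each public value Eve might see, collect the messages that could have produced it.
consistent_messages = {0: set(), 1: set()}
for message, private_key in product((0, 1), repeat=2):
    comm = alice_encrypt(message, private_key)
    assert bob_decrypt(comm, private_key) == message  # Bob always recovers the bit
    consistent_messages[comm].add(message)

print(consistent_messages)
```

Under this scheme, every value on the public channel is consistent with both possible messages, so an observer who sees only `alices_comm` cannot do better than guessing.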
Arguments
simple_crypto_v3.env(max_cycles=25, continuous_actions=False)
max_cycles
: number of frames (a step for each agent) until the game terminates
continuous_actions
: whether agent action spaces are discrete (default) or continuous
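As a rough illustration of the two arguments, a small wrapper might check them before building the environment. The helper names `crypto_env_kwargs` and `build_crypto_env` are hypothetical, and the positivity check on `max_cycles` is an assumption of this sketch rather than documented behavior. The PettingZoo import is deferred so the argument handling can be tried on its own.

```python
def crypto_env_kwargs(max_cycles: int = 25, continuous_actions: bool = False) -> dict:
    """Check the two documented arguments and return them as keyword arguments."""
    if max_cycles < 1:
        # Assumption of this sketch: an episode needs at least one frame.
        raise ValueError("max_cycles must be a positive number of frames")
    return {"max_cycles": max_cycles, "continuous_actions": bool(continuous_actions)}


def build_crypto_env(**overrides):
    """Create the AEC environment; requires PettingZoo's MPE environments."""
    from pettingzoo.mpe import simple_crypto_v3  # deferred import

    return simple_crypto_v3.env(**crypto_env_kwargs(**overrides))


# Longer episodes with continuous actions instead of the discrete default.
print(crypto_env_kwargs(max_cycles=50, continuous_actions=True))
```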
Usage
AEC
from pettingzoo.mpe import simple_crypto_v3

env = simple_crypto_v3.env(render_mode="human")
env.reset(seed=42)

for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()

    if termination or truncation:
        action = None
    else:
        # this is where you would insert your policy
        action = env.action_space(agent).sample()

    env.step(action)
env.close()
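The loop above can be wrapped into a reusable function that also tracks each agent's return. This is a sketch under the assumption that the policy is a callable taking the agent name, its observation, and its action space; `run_aec_episode` and `random_policy` are hypothetical names, not part of PettingZoo.

```python
from collections import defaultdict


def run_aec_episode(env, policy, seed=None):
    """Play one episode on an AEC environment and return the summed reward per agent."""
    returns = defaultdict(float)
    env.reset(seed=seed)
    for agent in env.agent_iter():
        observation, reward, termination, truncation, info = env.last()
        # env.last() reports the reward this agent accumulated since it last acted.
        returns[agent] += reward
        if termination or truncation:
            action = None  # finished agents must step with None
        else:
            action = policy(agent, observation, env.action_space(agent))
        env.step(action)
    return dict(returns)


def random_policy(agent, observation, action_space):
    """Placeholder policy: ignore the observation and sample a random action."""
    return action_space.sample()
```

With PettingZoo installed, calling `run_aec_episode(simple_crypto_v3.env(), random_policy, seed=42)` should give one total per agent; replacing `random_policy` with a learned policy leaves the loop unchanged.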
Parallel
from pettingzoo.mpe import simple_crypto_v3

env = simple_crypto_v3.parallel_env(render_mode="human")
observations, infos = env.reset()

while env.agents:
    # this is where you would insert your policy
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}

    observations, rewards, terminations, truncations, infos = env.step(actions)
env.close()
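The same idea carries over to the parallel loop, where all agents act at once and rewards arrive as a dictionary. This is a sketch assuming the same policy signature (agent name, observation, action space); `run_parallel_episode` and `random_policy` are hypothetical names, not part of PettingZoo.

```python
def run_parallel_episode(env, policy, seed=None):
    """Play one episode on a parallel environment and return the summed reward per agent."""
    observations, infos = env.reset(seed=seed)
    returns = {agent: 0.0 for agent in env.agents}
    while env.agents:
        # Every live agent picks its action from its own observation.
        actions = {
            agent: policy(agent, observations[agent], env.action_space(agent))
            for agent in env.agents
        }
        observations, rewards, terminations, truncations, infos = env.step(actions)
        for agent, reward in rewards.items():
            returns[agent] = returns.get(agent, 0.0) + reward
    return returns


def random_policy(agent, observation, action_space):
    """Placeholder policy: ignore the observation and sample a random action."""
    return action_space.sample()
```

With PettingZoo installed, `run_parallel_episode(simple_crypto_v3.parallel_env(), random_policy)` would be the parallel counterpart of the loop above.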
API
- class pettingzoo.mpe.simple_crypto.simple_crypto.raw_env(max_cycles=25, continuous_actions=False, render_mode=None)
  - action_spaces: dict[AgentID, gymnasium.spaces.Space]
  - agent_selection: AgentID
  - agents: list[AgentID]
  - infos: dict[AgentID, dict[str, Any]]
  - observation_spaces: dict[AgentID, gymnasium.spaces.Space]
  - possible_agents: list[AgentID]
  - rewards: dict[AgentID, float]
  - terminations: dict[AgentID, bool]
  - truncations: dict[AgentID, bool]