Knights Archers Zombies (‘KAZ’)¶
This environment is part of the butterfly environments. Please read that page first for general information.
| Import | from pettingzoo.butterfly import knights_archers_zombies_v11 |
|---|---|
| Actions | Discrete |
| Parallel API | Yes |
| Manual Control | Yes |
| Agents | 4 |
| Action Shape | (1,) |
| Action Values | [0, 5] |
| Observation Shape | (512, 512, 3) |
| Observation Values | (0, 255) |
| State Shape | (720, 1280, 3) |
| State Values | (0, 255) |
Zombies walk from the top border of the screen down to the bottom border in unpredictable paths. The agents you control are knights and archers (default 2 knights and 2 archers) that are initially positioned at the bottom border of the screen. Each agent can rotate clockwise or counter-clockwise and move forward or backward. An agent that moves off the top or bottom of the screen may be killed (if this setting is enabled; otherwise it stops at the edge).
Each agent can also attack to kill zombies. When a knight attacks, it swings a mace in an arc in front of its current heading direction. During this attack, which takes several steps, the knight cannot take any other action. When an archer attacks, it fires an arrow in a straight line in the direction of the archer’s heading. There is a slight delay after firing an arrow before an archer can fire another arrow. However, the archer can move and turn during this time.
A zombie that is hit by either a mace or an arrow is removed. A knight is rewarded 1 point when its mace hits and kills a zombie. An archer is rewarded 1 point when one of its arrows hits and kills a zombie. A knight or archer that is touched by a zombie is removed (unless it is set to be unkillable). The game ends when all agents die (collide with a zombie) or a zombie reaches the bottom screen border.
Actions¶
Each agent acts independently. The action space is discrete [0,5] with the following meanings:
0 - move forward
1 - move backward
2 - turn counter clockwise
3 - turn clockwise
4 - use weapon
5 - do nothing
Movement and turning are done at a fixed rate.
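The action codes listed above can be given readable names in user code. The following is a minimal sketch: the enum KAZAction, its member names, and the helper describe are hypothetical and are not part of PettingZoo. Only the integer values come from the list above.

```python
from enum import IntEnum


class KAZAction(IntEnum):
    """Hypothetical names for the six discrete actions listed above."""

    FORWARD = 0    # move forward
    BACKWARD = 1   # move backward
    TURN_CCW = 2   # turn counter clockwise
    TURN_CW = 3    # turn clockwise
    ATTACK = 4     # use weapon
    NOOP = 5       # do nothing


def describe(action: int) -> str:
    """Return the readable name for a raw action value."""
    return KAZAction(action).name
```

Because IntEnum members compare equal to plain integers, a value such as KAZAction.ATTACK should be usable wherever the integer 4 is expected.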
Observations¶
There are four possible observation types for this environment: three vectorized (a base vectorized, a sequence vectorized, and a typemasked vectorized) and one image-based.
The selection is made when the environment is created by passing a value to the obs_method argument.
Vectorized (the default)¶
Pass the argument obs_method='vector' to the environment.
The observation is an (N+1)x5 array for each agent, where N = num_archers + num_knights + num_swords + max_arrows + max_zombies.
Note that num_swords = num_knights.
The ordering of the rows of the observation is:
[
[current agent],
[archer 1],
...,
[archer num_archers],
[knight 1],
...,
[knight num_knights],
[sword 1],
...,
[sword num_knights],
[arrow 1],
...,
[arrow max_arrows],
[zombie 1],
...,
[zombie max_zombies]
]
In total, there will be N+1 rows. Rows with no entities will be all 0, but the ordering of the entities will not change.
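The row count and the row range occupied by each entity group follow directly from the ordering above. The sketch below works this out from the constructor arguments. The function name vector_obs_layout and the group labels are hypothetical, chosen only for illustration.

```python
def vector_obs_layout(num_archers=2, num_knights=2, max_arrows=10, max_zombies=10):
    """Return (shape, slices) for the 'vector' observation layout.

    `slices` maps each entity group to its (start, stop) row range,
    with `stop` exclusive, following the row ordering described above.
    """
    num_swords = num_knights  # one sword slot per knight
    groups = [
        ("current_agent", 1),
        ("archers", num_archers),
        ("knights", num_knights),
        ("swords", num_swords),
        ("arrows", max_arrows),
        ("zombies", max_zombies),
    ]
    slices = {}
    start = 0
    for name, count in groups:
        slices[name] = (start, start + count)
        start += count
    # `start` now equals N + 1, the total number of rows.
    return (start, 5), slices


shape, slices = vector_obs_layout()
```

With the default arguments this gives N + 1 = 1 + 2 + 2 + 2 + 10 + 10 = 27 rows, so the shape is (27, 5) and the zombie rows occupy indices 17 through 26.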
Vector Breakdown
This breaks down what a row in the observation means. All distances are normalized to [0, 1]. Note that for positions, [0, 0] is the top left corner of the image. Down is positive y, Right is positive x. The bottom right corner is [1, 1] in normalized coordinates.
For the vector of current agent (the first row):
The first value means nothing and will always be 0.
The next four values are the position and angle of the current agent.
The first two values are position values, normalized to the width and height of the image respectively.
The final two values are heading of the agent represented as a unit vector.
For the remaining rows, each row is a 5 wide vector with the following values:
The first value is the magnitude of the distance between an entity and the current agent.
The next four values are the position and angle.
The first two values are position values relative to the current agent, normalized to the width and height of the image respectively.
The final two values are the heading of the entity (relative to the world) represented as a unit vector.
Dead agents or missing objects (i.e. zombies, swords, or arrows) have a vector of all zeros.
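As an illustration of the row format above, the following sketch unpacks one non-agent row into named fields and treats an all-zero row as empty. EntityRow and decode_row are hypothetical helpers, not part of the environment. The first row (the current agent) uses the same width but holds an absolute position and a leading 0, so its fields would need different names.

```python
from typing import NamedTuple, Optional, Sequence


class EntityRow(NamedTuple):
    distance: float   # magnitude of the distance to the current agent
    rel_x: float      # x position relative to the current agent
    rel_y: float      # y position relative to the current agent
    heading_x: float  # heading unit vector, x component (world frame)
    heading_y: float  # heading unit vector, y component (world frame)


def decode_row(row: Sequence[float]) -> Optional[EntityRow]:
    """Decode one non-agent row; return None for an all-zero (empty) row."""
    if len(row) != 5:
        raise ValueError("expected a 5-wide row")
    if all(value == 0 for value in row):
        return None  # dead agent or missing zombie, sword, or arrow
    return EntityRow(*(float(value) for value in row))
```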
Vectorized with typemasks¶
There is an option to prepend a typemask to each row vector. This can be enabled by passing obs_method='vector-masked' as an argument.
The typemask is a 6-wide vector that indicates the type of entry. A dead agent or a non-existent object (zombie, sword, arrow) is indicated by a row of all zeros (both the typemask and the state vector). Otherwise, a single value of 1 indicates, by its position, which type of entry the row corresponds to. The possible options are:
an active zombie: [1., 0., 0., 0., 0., 0.]
a living archer: [0., 1., 0., 0., 0., 0.]
a living knight: [0., 0., 1., 0., 0., 0.]
an active sword: [0., 0., 0., 1., 0., 0.]
an active arrow: [0., 0., 0., 0., 1., 0.]
the current agent: [0., 0., 0., 0., 0., 1.]
The remainder of the row will be as defined by the vector observation above.
As a result, setting obs_method='vector-masked' results in the observation being an (N+1)x11 array.
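A masked row can be split back into its type and its 5-wide state vector. The sketch below assumes the typemask positions listed above. TYPE_NAMES and split_masked_row are hypothetical names used only for this example.

```python
# Entity type for each typemask position, in the order listed above.
TYPE_NAMES = ("zombie", "archer", "knight", "sword", "arrow", "current_agent")


def split_masked_row(row):
    """Split an 11-wide row into (entity type or None, 5-wide state vector)."""
    if len(row) != 11:
        raise ValueError("expected an 11-wide row")
    mask, state = list(row[:6]), list(row[6:])
    if not any(mask):
        return None, state  # all-zero typemask: dead agent or missing object
    return TYPE_NAMES[mask.index(1)], state
```

For example, a row that starts with [0., 0., 1., 0., 0., 0.] would be reported as a knight, and the remaining five values are the state vector described in the vector observation section.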
Sequence Vectorized space with typemasks¶
With this option, the observations are the same as in the typemask section, except that all empty rows are removed. The size of the observation therefore changes during a run: it is an (X+1)x11 array, where ‘X’ varies within a run.
Note that this is still experimental. It can be used by passing obs_method='vector-sequence' as an argument during environment creation.
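The relationship between the masked and the sequence observation can be pictured as a filter over rows. This is only an illustration of the idea under that reading, not the library's implementation, and drop_empty_rows is a hypothetical helper.

```python
def drop_empty_rows(masked_obs):
    """Keep only the rows that contain at least one non-zero value."""
    return [list(row) for row in masked_obs if any(value != 0 for value in row)]


# Three 11-wide rows: the current agent, an empty slot, and a zombie.
example = [
    [0, 0, 0, 0, 0, 1, 0.0, 0.5, 0.9, 0.0, -1.0],
    [0] * 11,
    [1, 0, 0, 0, 0, 0, 0.4, 0.0, -0.4, 0.0, 1.0],
]
compact = drop_empty_rows(example)
```

Here compact keeps two of the three rows, which is why the first dimension of a 'vector-sequence' observation is not fixed.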
Image-based¶
To use an image-based observation, pass obs_method='image' to the environment during creation.
Each agent observes the environment as a square region around itself, with its own body in the center of the square. The observation is represented as a 512x512 pixel image around the agent, or in other words, a 16x16 agent-sized space around the agent. Each pixel is defined as RGB values in range [0, 255]. Areas outside of the game box are returned as black pixels: (0,0,0). Dead agents return all pixels as black.
Arguments¶
knights_archers_zombies_v11.env(
spawn_delay=20,
num_archers=2,
num_knights=2,
max_zombies=10,
max_arrows=10,
killable_knights=True,
killable_archers=True,
line_death=False,
max_cycles=900,
obs_method="vector",
)
spawn_delay: how many cycles before a new zombie is spawned. A higher number means zombies are spawned at a slower rate.
num_archers: how many archer agents initially spawn.
num_knights: how many knight agents initially spawn.
max_zombies: maximum number of zombies that can exist at a time.
max_arrows: maximum number of arrows that can exist at a time.
killable_knights: if set to False, knight agents cannot be killed by zombies.
killable_archers: if set to False, archer agents cannot be killed by zombies.
line_death: if set to False, agents do not die when they touch the top or bottom border. If True, agents die as soon as they touch the top or bottom border.
max_cycles: The maximum number of game cycles to run. One cycle is complete when one step has been made by all players and zombies.
obs_method: method of observations to use. Options are ‘vector’ (default), ‘image’, ‘vector-sequence’, or ‘vector-masked’. See the Observations section for details.
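As a sketch of how these arguments might be combined, the example below collects a non-default configuration in a dictionary and passes it to the constructor. The specific values are arbitrary choices for illustration, and KAZ_KWARGS and make_kaz_env are hypothetical names. The import is done inside the function so that the configuration can be inspected without PettingZoo installed.

```python
# Illustrative, non-default settings; every key is an argument documented above.
KAZ_KWARGS = {
    "spawn_delay": 40,             # spawn zombies at a slower rate than the default 20
    "num_archers": 1,
    "num_knights": 3,
    "max_zombies": 5,
    "max_arrows": 5,
    "killable_knights": True,
    "killable_archers": False,     # archers cannot be killed by zombies
    "line_death": True,            # touching the top or bottom border kills an agent
    "max_cycles": 500,
    "obs_method": "vector-masked",
}


def make_kaz_env(**overrides):
    """Create the AEC environment from KAZ_KWARGS, with optional overrides."""
    from pettingzoo.butterfly import knights_archers_zombies_v11

    return knights_archers_zombies_v11.env(**{**KAZ_KWARGS, **overrides})
```

Calling make_kaz_env(obs_method="image") would then keep the other settings and switch only the observation type.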
Version History¶
v11: Code rewrite and numerous fixes (1.25.1)
v10: Add vectorizable state space (1.17.0)
v9: Code rewrite and numerous fixes (1.16.0)
v8: Code cleanup and several bug fixes (1.14.0)
v7: Minor bug fix relating to end of episode crash (1.6.0)
v6: Fixed reward structure (1.5.2)
v5: Removed black death argument (1.5.0)
v4: Fixed observation and rendering issues (1.4.2)
v3: Misc bug fixes, bumped PyGame and PyMunk version (1.4.0)
v2: Fixed bug in how dones were computed (1.3.1)
v1: Fixes to how all environments handle premature death (1.3.0)
v0: Initial versions release (1.0.0)
Usage¶
AEC¶
from pettingzoo.butterfly import knights_archers_zombies_v11

env = knights_archers_zombies_v11.env(render_mode="human")
env.reset(seed=42)

for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()

    if termination or truncation:
        action = None
    else:
        # this is where you would insert your policy
        action = env.action_space(agent).sample()

    env.step(action)
env.close()
Parallel¶
from pettingzoo.butterfly import knights_archers_zombies_v11

env = knights_archers_zombies_v11.parallel_env(render_mode="human")
observations, infos = env.reset()

while env.agents:
    # this is where you would insert your policy
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}

    observations, rewards, terminations, truncations, infos = env.step(actions)
env.close()
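Building on the parallel loop above, the following sketch accumulates the total reward of each agent over one episode of random actions. The helper run_random_episode is hypothetical. It relies only on the parallel API calls already shown (reset, agents, action_space, step) and leaves closing the environment to the caller.

```python
from collections import defaultdict


def run_random_episode(env, seed=None):
    """Run one episode with random actions and return total reward per agent."""
    totals = defaultdict(float)
    observations, infos = env.reset(seed=seed)

    while env.agents:
        # Random policy: sample an action for every agent that is still active.
        actions = {agent: env.action_space(agent).sample() for agent in env.agents}
        observations, rewards, terminations, truncations, infos = env.step(actions)

        for agent, reward in rewards.items():
            totals[agent] += reward

    return dict(totals)
```

It could be called as run_random_episode(knights_archers_zombies_v11.parallel_env(), seed=42), followed by env.close() on the environment that was passed in.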
API¶
- class pettingzoo.butterfly.knights_archers_zombies.knights_archers_zombies.env(**kwargs: Any)¶
Create the wrapped KAZ environment.
- class pettingzoo.butterfly.knights_archers_zombies.knights_archers_zombies.raw_env(spawn_delay: int = 20, num_archers: int = 2, num_knights: int = 2, max_zombies: int = 10, max_arrows: int = 10, killable_knights: bool = True, killable_archers: bool = True, line_death: bool = False, max_cycles: int = 900, obs_method: str = 'vector', render_mode: str | None = None)¶
The Knights Archers Zombies environment.
Initialize the environment object.
- Parameters:
spawn_delay – how many cycles before a new zombie is spawned. A higher number means zombies are spawned at a slower rate.
num_archers – how many archer agents initially spawn.
num_knights – how many knight agents initially spawn.
max_zombies – maximum number of zombies that can exist at a time
max_arrows – maximum number of arrows that can exist at a time
killable_knights – if False, knight agents cannot be killed by zombies.
killable_archers – if False, archer agents cannot be killed by zombies.
line_death – if True, agents die when they touch the top or bottom wall.
max_cycles – The maximum number of game cycles to run. One cycle is complete when one step has been made by all players and zombies.
obs_method – method of observations to use. Options are ‘vector’ (default), ‘image’, ‘vector-sequence’, or ‘vector-masked’. See docs for details.
render_mode – the render mode to use.
- apply_weapons() → None¶
Move the weapons and remove any zombies that were hit.
The weapons are moved along their path. If an arrow hits a zombie, both the arrow and the zombie are removed. The archer that fired the arrow is awarded 1 point.
If a sword hits a zombie, the zombie is removed but the sword is not. The knight wielding the sword is awarded 1 point.
- get_vector_state() → ndarray[tuple[Any, ...], dtype[float64]]¶
Returns the vector state of the global environment.
- property num_active_arrows: int¶
Return the number of arrows on the screen.
- property num_active_swords: int¶
Return the number of swords on the screen.
- observe(agent: str) → ndarray[tuple[Any, ...], dtype[uint8]] | ndarray[tuple[Any, ...], dtype[float64]]¶
Return the observation for the given agent.
- render() → ndarray[tuple[Any, ...], dtype[uint8]] | None¶
Render the state based on the render mode.
- reset(seed: int | None = None, options: dict[str, Any] | None = None) → None¶
Reset the env for another run.
- Parameters:
seed – the seed to use for the new run
options – these are ignored for this env