gym_nethack.envs package

Submodules

gym_nethack.envs.base module

class gym_nethack.envs.base.Goals[source]

Bases: object

CONN_ERROR = 3
LOSS = 1
SUCCESS = 0
TIME_EXCEEDED = 2

class gym_nethack.envs.base.NetHackEnv(nhinfo)[source]

Bases: gym.core.Env, gym.utils.ezpickle.EzPickle

Basic NetHack environment. Must be subclassed. Contains statistics saving/loading methods and NetHack process management.

close()[source]

Save records.

end_episode()[source]

End the current episode, incrementing the game counter by one and calling save_records() every 100 games.
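The counting behaviour described above can be sketched roughly as follows. This is a stand-alone illustration, not the package's code: the class name `EpisodeCounter` and the attribute names `total_num_games` and `saves` are hypothetical, and `save_records()` is a stub that only counts calls.

```python
class EpisodeCounter:
    """Hypothetical stand-in showing a save-every-100-games counter."""

    SAVE_EVERY = 100  # interval stated in the end_episode() description

    def __init__(self):
        self.total_num_games = 0  # hypothetical attribute name
        self.saves = 0            # number of times save_records() ran

    def save_records(self):
        # Stub: the real method writes records to disk.
        self.saves += 1

    def end_episode(self):
        # Increment the game counter, then save on every 100th game.
        self.total_num_games += 1
        if self.total_num_games % self.SAVE_EVERY == 0:
            self.save_records()
```

With this sketch, 250 calls to end_episode() trigger two saves, after games 100 and 200.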

end_turn()[source]

get_savedir_info_list()[source]

Get the strings that should form the save directory name.

load_records()[source]

Load the saved records found at self.savedir/*_records.dll

reset()[source]

Prepare the environment for a new map. Kills the current NetHack process and launches a new one.

save_records()[source]

Save records to self.savedir/*_records.dll, creating directories if necessary.

set_config(proc_id, num_procs, name, parse_items, **args)[source]

Set config and connect to the NetHack launcher daemon.

Parameters:
  • proc_id – process ID of this environment, to be matched with the argument passed to the daemon launching script.
  • num_procs – number of processes to run in parallel (used when a grid search is running)
  • name – to be used for the record folder name
  • parse_items – whether to handle items in the environment or not

start_episode()[source]

start_turn()[source]

class gym_nethack.envs.base.NetHackInfo(parse_items=True)[source]

Bases: object

Stores NetHack game state, and contains methods for processing/parsing screen output and for item & map information.

at_intersection()[source]

Return true if the player is at the intersection of two or more corridors.

at_room_opening(pos=None)[source]

Return true if the player (or the given position) is at a room opening.

basemap_char(x, y)[source]

Returns the basemap character at the given map coords.

char_under_player()[source]

Returns the character under the player.

count_char_on_map(char)[source]

Return the number of appearances of the given char on the map.

explored_current_room()[source]

Return true if the player has already explored the current room.

find_char_on_base_map(char)[source]

Return the map coordinate at which the first instance of the given char appears, or None if it does not.
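A minimal sketch of these two lookups, assuming the map is held as a list of equal-length strings and that coordinates are (x, y) with x as the column and y as the row. Both assumptions are for illustration only; the actual internal representation is not documented here.

```python
def count_char_on_map(rows, char):
    """Count every appearance of `char` across all map rows."""
    return sum(row.count(char) for row in rows)


def find_char_on_base_map(rows, char):
    """Return (x, y) of the first appearance of `char`, scanning row by row, or None."""
    for y, row in enumerate(rows):
        x = row.find(char)
        if x != -1:
            return (x, y)
    return None


# Tiny hand-made map used only for this example.
demo_map = [
    "-----",
    "|.>.|",
    "|.@.|",
    "-----",
]
```

On demo_map, the '>' staircase is found at (2, 1) and a character that never appears yields None.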

get_chars_adjacent_to(x, y, diag=False)[source]

Returns the list of basemap tiles adjacent to the given map coordinate.

get_corridor_exits(pos=None, diag=True)[source]

Return the corridors adjacent to the given position.

Parameters:
  • pos – the position around which to look for corridors. If none, the player’s current position is used.
  • diag – whether to consider diagonal tiles

get_cur_weapon()[source]

Returns the current weapon object wielded by the player, and whether it is cursed or not.

get_inven_char_for_item(item)[source]

Returns the inventory character mapped to a particular item, assuming the player has it in the inventory.

get_neighboring_positions(x, y, diag=True)[source]

Returns the list of in-range coordinates adjacent to the given map coordinate.
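The neighbour lookup, together with the bounds check performed by in_range(x, y), might look roughly like this. The map size of 80 columns by 21 rows is an assumption made for the sketch, as is the order in which neighbours are returned.

```python
MAP_WIDTH, MAP_HEIGHT = 80, 21  # assumed map bounds, not taken from the package


def in_range(x, y):
    """True if (x, y) lies inside the assumed map bounds."""
    return 0 <= x < MAP_WIDTH and 0 <= y < MAP_HEIGHT


def get_neighboring_positions(x, y, diag=True):
    """List the in-range coordinates adjacent to (x, y)."""
    offsets = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # orthogonal neighbours
    if diag:
        offsets += [(-1, -1), (1, -1), (1, 1), (-1, 1)]  # diagonal neighbours
    return [(x + dx, y + dy) for dx, dy in offsets if in_range(x + dx, y + dy)]
```

A corner tile such as (0, 0) has three neighbours with diag=True and two with diag=False, since out-of-range coordinates are filtered out.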

get_room()[source]

Returns the list index for the current room object, creating one if necessary.

get_uncovered_doors()[source]

Return the coordinates which in the last turn were revealed to be doors.

in_corridor()[source]

Return true if the player is in a corridor.

in_range(x, y)[source]

Returns true if the given x,y coordinate is within the map bounds.

in_room()[source]

Return true if the player is in a room.

is_player_invisible()[source]

Return true if the player is currently invisible.

mark_all_explored()[source]

Mark all traversable positions in the map observed so far as explored, then update the pathfinding grid.

mark_explored(pos)[source]

Add the given position to the explored positions list.

Parameters:
  • pos – position that we want to mark as explored

next_to_dead_end()[source]

Return true if the player is at a dead-end in a corridor.

on_stairs()[source]

Return true if the player is standing on top of the staircase.

pathfind_to(target, initial=None, full_path=True, explored_set=None, override_target_traversability=False, override_targets=[])[source]

A* pathfinding from initial to target, where A* can visit any position that has been explored.

Parameters:
  • target – target position to pathfind to.
  • initial – position to start pathfinding from. If None, use current player position.
  • full_path – return entire trajectory if True, else return first position from initial.
  • explored_set – if not None, increase A* heuristic score of non-explored tiles over explored tiles (e.g., if walking through a diagonal corridor, prefer to visit each square instead of moving diagonally, so we don’t miss any branching corridor).
  • override_target_traversability – pathfind to target even if it is not traversable by the player (e.g., solid wall).
  • override_targets – override traversability of all targets in this list (see above parameter)
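For orientation, here is a simplified A* sketch over a grid in which 0 marks a traversable cell and 1 a blocked one. It only illustrates the target, initial and full_path parameters. The explored_set weighting and the traversability overrides are left out, and the 8-directional unit-cost movement and Chebyshev heuristic are assumptions of the sketch rather than documented behaviour. The function name `pathfind` is hypothetical.

```python
import heapq


def pathfind(grid, initial, target, full_path=True):
    """A* over grid[y][x] (0 = traversable, 1 = blocked); positions are (x, y)."""
    height, width = len(grid), len(grid[0])

    def heuristic(pos):
        # Chebyshev distance: admissible when every step, diagonal or not, costs 1.
        return max(abs(pos[0] - target[0]), abs(pos[1] - target[1]))

    open_heap = [(heuristic(initial), 0, initial)]
    came_from = {initial: None}
    best_cost = {initial: 0}
    while open_heap:
        _, cost, current = heapq.heappop(open_heap)
        if current == target:
            # Walk the parent links back to the start (start itself excluded).
            path = []
            while current != initial:
                path.append(current)
                current = came_from[current]
            path.reverse()
            if full_path:
                return path
            return path[0] if path else None
        if cost > best_cost[current]:
            continue  # stale heap entry, a cheaper route was already expanded
        x, y = current
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nxt = (x + dx, y + dy)
                if nxt == current:
                    continue
                if not (0 <= nxt[0] < width and 0 <= nxt[1] < height):
                    continue
                if grid[nxt[1]][nxt[0]] != 0:
                    continue
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = current
                    heapq.heappush(open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None  # target unreachable
```

With full_path=False the sketch returns only the first position to step to, which is convenient when the agent re-plans every turn.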
process_msg(socket, message, update_base=True, parse_monsters=True, parse_ammo=False)[source]

Processes the map screen outputted by NetHack.

Parameters:
  • socket – the socket connected to the NetHack process (needed to send/receive the inventory message)
  • message – the message outputted by NetHack
  • update_base – whether to update our record of the map with new information gleaned or not
  • parse_monsters – whether to keep or discard monsters in the parsed NetHack map
  • parse_ammo – whether to keep or discard ammo in the parsed NetHack map

reset()[source]

Reset all map- and level-dependent variables.

update_pathfinding_grid()[source]

Update the pathfinding grid, setting a 0 if the position is traversable and 1 otherwise.
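As an illustration of the 0/1 convention, a grid could be derived from a character map as below. Which characters count as traversable is an assumption of this sketch (floor '.', corridor '#', and the two staircases); the package may use a different set and also takes exploration state into account.

```python
TRAVERSABLE_CHARS = set(".#<>")  # assumed set of walkable map characters


def build_pathfinding_grid(rows):
    """Return a grid with 0 for traversable cells and 1 for everything else."""
    return [[0 if ch in TRAVERSABLE_CHARS else 1 for ch in row] for row in rows]
```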

class gym_nethack.envs.base.NetHackRLEnv(nhinfo=None)[source]

Bases: gym_nethack.envs.base.NetHackEnv

Basic NetHack RL env with core step() and take_action() methods. Must be subclassed.

end_episode()[source]

End the current episode.

get_game_params()[source]

Parameters to pass to NetHack on the creation of a new game. (Will be saved in the options file.)

get_reward(status)[source]

Return reward for the given status. Should be implemented in subclass.

get_state()[source]

Return state passed to RL agent. Should be implemented in subclass.

get_status(msg)[source]

Process the message returned by NetHack to check if it is a terminal state. Must be implemented in subclass.

get_valid_action_indices()[source]

Get the indices of valid actions (according to the abilities list/action space). Should be implemented in subclass, if there are illegal actions in the action space. Currently returns all actions as valid.

process_action(action)[source]

Do any preprocessing required on the action selected, e.g., get the CMD object from the abilities list.

process_msg(msg, update_base=True, parse_monsters=True, parse_ammo=False)[source]

Processes the map screen outputted by NetHack.

reset()[source]

Prepare the environment for a new episode.

set_config(proc_id, action_size=1, state_size=1, max_num_actions=-1, max_num_episodes=-1, max_num_actions_per_episode=200, policy=None, **args)[source]

Set config.

Parameters:
  • proc_id – process ID of this environment, to be matched with the argument passed to the daemon launching script.
  • action_size – number of discrete actions that can be taken
  • state_size – size of state vector
  • max_num_actions – max number of actions that can be taken before exiting (*TODO*)
  • max_num_episodes – max number of episodes to take before exiting (for level env.) (TODO – but used for keras-rl)
  • max_num_actions_per_episode – max number of (legal) actions that can be taken in an episode

set_test()[source]

Change environment from training to test mode, if required.

should_end_episode()[source]

Check if we should end the current episode.

step(action)[source]

Take the given action, receive the message output from NetHack and return the new state.

take_action(action)[source]

Send the action to NetHack.
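The way these methods could fit together is sketched below with a stand-in base class and a toy subclass. Nothing here talks to NetHack: the replies are canned strings, the order of calls inside step() is an assumption, and the return value follows the classic Gym four-tuple (state, reward, done, info), which is also assumed. The names `SketchRLEnv` and `ToyEnv` and the string status values are hypothetical.

```python
class SketchRLEnv:
    """Stand-in base class; the order of calls in step() is an assumption."""

    def __init__(self, messages):
        self.messages = list(messages)  # canned strings standing in for NetHack output
        self.last_msg = None
        self.done = False

    def process_action(self, action):
        return action  # a real env might look up a CMD object here

    def take_action(self, action):
        return self.messages.pop(0)  # pretend NetHack replied to the action

    def process_msg(self, msg):
        self.last_msg = msg

    def get_status(self, msg):
        raise NotImplementedError  # must be provided by the subclass

    def get_reward(self, status):
        raise NotImplementedError  # should be provided by the subclass

    def get_state(self):
        raise NotImplementedError  # should be provided by the subclass

    def step(self, action):
        msg = self.take_action(self.process_action(action))
        self.process_msg(msg)
        status = self.get_status(msg)
        self.done = status != "ok"
        return self.get_state(), self.get_reward(status), self.done, {}


class ToyEnv(SketchRLEnv):
    """Toy subclass filling in the three methods left to subclasses."""

    def get_status(self, msg):
        return "dead" if "You die" in msg else "ok"

    def get_reward(self, status):
        return -1.0 if status == "dead" else 0.0

    def get_state(self):
        return [len(self.last_msg)]
```

The point of the split is that a subclass only decides what a message means (status), what it is worth (reward) and how it is encoded (state), while the base class owns the control flow.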

class gym_nethack.envs.base.Terminals[source]

Bases: object

CONN_ERROR = 5
IMPOSSIBLE_ACTION = 3
MONSTER_DIED = 2
OK = 0
PLAYER_DIED = 1
SUCCESS = 6
TIME_EXCEEDED = 4
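To show how a get_reward(status) implementation might consume these codes, here is a small sketch. The class below is a local mirror of the documented constants rather than an import, and every reward magnitude is an illustrative assumption, not a value used by the package.

```python
class Terminals:
    """Local mirror of the documented status codes."""
    OK = 0
    PLAYER_DIED = 1
    MONSTER_DIED = 2
    IMPOSSIBLE_ACTION = 3
    TIME_EXCEEDED = 4
    CONN_ERROR = 5
    SUCCESS = 6


def sketch_reward(status):
    """Illustrative reward shaping; every magnitude here is an assumption."""
    if status in (Terminals.MONSTER_DIED, Terminals.SUCCESS):
        return 1.0
    if status == Terminals.PLAYER_DIED:
        return -1.0
    return 0.0  # all other codes are treated as neutral in this sketch
```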

gym_nethack.envs.combat module

class gym_nethack.envs.combat.Combat(monster, base_map, map, player_pos, monster_positions, start_state, start_attributes, start_stats, start_items, start_stateffs, action_list, goal_reached, end_attributes, end_stats, end_items)

Bases: tuple

action_list

Alias for field number 10

base_map

Alias for field number 1

end_attributes

Alias for field number 12

end_items

Alias for field number 14

end_stats

Alias for field number 13

goal_reached

Alias for field number 11

map

Alias for field number 2

monster

Alias for field number 0

monster_positions

Alias for field number 4

player_pos

Alias for field number 3

start_attributes

Alias for field number 6

start_items

Alias for field number 8

start_state

Alias for field number 5

start_stateffs

Alias for field number 9

start_stats

Alias for field number 7
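The "Alias for field number N" entries follow from Combat being a named tuple: each attribute is a name for one positional slot. A local reconstruction from the documented signature (not an import from the package) makes this concrete:

```python
from collections import namedtuple

# Field names copied from the documented constructor signature, in order.
Combat = namedtuple("Combat", [
    "monster", "base_map", "map", "player_pos", "monster_positions",
    "start_state", "start_attributes", "start_stats", "start_items",
    "start_stateffs", "action_list", "goal_reached",
    "end_attributes", "end_stats", "end_items",
])

# Placeholder values 0..14 so that each field holds its own index.
record = Combat(*range(15))
```

Here record.action_list and record[10] refer to the same slot, which is what "Alias for field number 10" means.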

class gym_nethack.envs.combat.NetHackCombatEnv(nhinfo=None)[source]

Bases: gym_nethack.envs.base.NetHackRLEnv

Arena-style player-on-monster NetHack combat environment, with specifiable monsters, items, and attributes/stats.

action_took_effect()[source]

Check if the last action taken has actually taken effect. Helps to diagnose errors that can propagate in state.

end_episode()[source]

End episode by recording episode data.

get_alignment_vector()[source]

Get the player alignment info for the state.

get_all_item_abilities()[source]

Split the potion items into ones of type ‘throw’ and ‘use’ - other items are unchanged.

get_character_vector()[source]

Get the player role info for the state.

get_command_for_action(action)[source]

Translate the given action (of type integer – an index into the self.abilities list) into a command that can be passed to NetHack, of type CMD.

get_current_equipment()[source]

Get the current equipment vector (weapons/armor/rings) for the state.

get_distance_info(discrete)[source]

Get the distance info for the state.

get_game_params()[source]

Parameters to pass to NetHack on the creation of a new game. (Will be saved in the NetHack options file.)

get_initial_inventory()[source]

Get player’s starting inventory for each episode.

get_initial_monsters()[source]

Get the possible monsters for each episode.

get_initial_role()[source]

Get player’s initial role for each episode.

get_inventory_vector()[source]

Get the player inventory vector for the state.

get_monster_pos()[source]

Get the position of the current monster, assuming there is only one monster.

get_monster_vector()[source]

Get the monster vector for the state.

get_norm_stats(discrete)[source]

Get the player attributes/dungeon level vector for the state.

get_num_monsters()[source]

Get the number of monsters for the state.

get_ranged_info()[source]

Get the projectile/ranged weapon info for the state.

get_reward(status)[source]

Return the reward for the given status.

get_savedir_info_list()[source]

Get the strings that should form the save directory name.

get_state()[source]

Create and return the current state vector.

get_status(msg)[source]

Check if we died or the monster died.

get_status_effects()[source]

Get the player status effects info for the state.

get_valid_action_indices()[source]

Return the list of valid action indices (according to the self.abilities list).

is_monster_in_line_of_fire()[source]

Check if the monster is present in the directions we can fire towards.
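A geometric sketch of such a check, assuming projectiles travel along the eight compass directions, so the monster must share a row, a column, or a diagonal with the player. Obstacles and maximum range are ignored here, and the free-function form with explicit positions is hypothetical.

```python
def is_in_line_of_fire(player_pos, monster_pos):
    """True if monster_pos lies on a row, column or diagonal through player_pos."""
    dx = monster_pos[0] - player_pos[0]
    dy = monster_pos[1] - player_pos[1]
    if dx == 0 and dy == 0:
        return False  # same square: no direction to fire in
    return dx == 0 or dy == 0 or abs(dx) == abs(dy)
```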

load_and_sample_combats()[source]

Optionally load in a list of objects of type Combat from self.savedir+/combat_records*.dll. This list of combats will then be the ones trained against in this environment. This method is called if load_combats=True is passed to set_config.

process_msg(msg)[source]

Processes the map screen outputted by NetHack.

reset()[source]

Prepare the environment for a new episode.

save_encounter_info()[source]

Called during full level combat to save combat encounters for training later on.

set_config(proc_id, num_actions=-1, num_episodes=-1, clvl_to_mlvl_diff=-3, monsters='none', initial_equipment=[], items=None, item_sampling='uniform', num_start_items=5, action_list='all', fixed_ac=999, dlvl=None, tabular=False, test_policy=None, lr=0, units_d1=0, units_d2=0, skip_training=False, load_combats=False, **args)[source]

Set config.

Parameters:
  • proc_id – process ID of this environment, to be matched with the argument passed to the daemon launching script.
  • num_actions – number of total actions to train for.
  • num_episodes – number of total episodes to train for (only used if load_combats is False).
  • clvl_to_mlvl_diff – the number of levels above the monster level that the player level will be set to
  • monsters – tuple of (idname, [‘mon1’, ‘mon2’, …]) of monsters to be faced
  • initial_equipment – list of items that the player will always start each episode with
  • items – tuple of (idname, [‘item1’, ‘item2’, …]) of items to be used in sampling
  • item_sampling – how to determine which of the above ‘items’ will be given to the player at each episode start. Can be ‘all’ (all items); ‘uniform’ (a uniform sample whose size is given by num_start_items); or ‘type’ (see get_initial_inventory() for details)
  • num_start_items – number of items to randomly sample from the ‘items’ parameter if item_sampling == ‘uniform’
  • action_list – determines what actions the player can use. can be: ‘weapons_only’ (only weapons allowed); otherwise any action is allowed
  • fixed_ac – the player’s starting armor class (AC); if < 999, will be set to this value; otherwise default NH initial value will be used
  • dlvl – dungeon level for the episode. affects monster attributes (thus difficulty).
  • tabular – whether we are using a tabular representation for the Q-values (deprecated)
  • test_policy – used for record folder name (also used in ngym.py)
  • lr – used for record folder name (also used in ngym.py)
  • units_d1 – used for record folder name (also used in ngym.py)
  • units_d2 – used for record folder name (also used in ngym.py)
  • skip_training – if True, will not add above info to folder name
  • load_combats – whether to load combats from file to use for training; if True, many of the above parameters do not need to be specified.
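The interaction between item_sampling and num_start_items can be sketched as below. The function name `sample_start_items` is hypothetical, it takes only the list part of the documented (idname, [...]) tuple, and sampling without replacement is an assumption. The ‘type’ mode is deliberately not implemented because its rules are described elsewhere, in get_initial_inventory().

```python
import random


def sample_start_items(items, item_sampling="uniform", num_start_items=5, rng=None):
    """Pick starting items from `items` according to the sampling mode."""
    rng = rng or random.Random()
    if item_sampling == "all":
        return list(items)  # every candidate item is handed out
    if item_sampling == "uniform":
        # Assumed: sample without replacement, capped at the pool size.
        return rng.sample(list(items), min(num_start_items, len(items)))
    if item_sampling == "type":
        raise NotImplementedError("see get_initial_inventory() for the 'type' rules")
    raise ValueError("unknown item_sampling mode: %r" % (item_sampling,))
```

Passing a seeded random.Random instance as rng keeps the sampled inventory reproducible between runs.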
set_test()[source]

Change environment from training to test mode.

start_episode()[source]

Start a new episode by preparing the episode record and checking if setup completed successfully.

switch_encounter()[source]

Called at the start of a new episode. If training on combats from file, switches to the next combat in the list.

gym_nethack.envs.exploration module

class gym_nethack.envs.exploration.ExplRec(actions_this_game, all_rooms_explored, actions_until_all_rooms_explored, num_rooms_explored, total_num_rooms, num_secret_rooms_explored, total_num_secret_rooms, num_secret_spots_explored, total_num_secret_spots, turn_records, opt_actions)

Bases: tuple

actions_this_game

Alias for field number 0

actions_until_all_rooms_explored

Alias for field number 2

all_rooms_explored

Alias for field number 1

num_rooms_explored

Alias for field number 3

num_secret_rooms_explored

Alias for field number 5

num_secret_spots_explored

Alias for field number 7

opt_actions

Alias for field number 10

total_num_rooms

Alias for field number 4

total_num_secret_rooms

Alias for field number 6

total_num_secret_spots

Alias for field number 8

turn_records

Alias for field number 9

class gym_nethack.envs.exploration.NetHackExplEnv(nhinfo=None)[source]

Bases: gym_nethack.envs.base.NetHackRLEnv

Environment for NetHack exploration.

end_episode()[source]

End the current episode, storing a record about the episode.

end_turn()[source]

End the current turn, observe map and store a Turn Record. (Turn = observe state & take action.)

get_command_for_action(action)[source]

Return the direction CMD for the given action index.
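As a rough picture of what an index-to-direction lookup could look like, the sketch below maps action indices to NetHack's default vi-style movement keys (h, j, k, l for west, south, north, east and y, u, b, n for the diagonals). The ordering of the list is an assumption, and the real method returns a CMD object rather than a bare key.

```python
# Assumed index order: N, E, S, W, NE, SE, SW, NW.
DIRECTION_KEYS = ["k", "l", "j", "h", "u", "n", "b", "y"]


def get_key_for_action(action):
    """Return the movement key for an action index (hypothetical helper)."""
    if not 0 <= action < len(DIRECTION_KEYS):
        raise IndexError("action index out of range: %d" % action)
    return DIRECTION_KEYS[action]
```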

get_game_params()[source]

Parameters to pass to NetHack on the creation of a new game. (Will be saved in the NH options file.)

get_savedir_info_list()[source]

Get the strings that should form the save directory name.

get_status(msg)[source]

Check if we are done exploring or not.

mark_room_explored()[source]

Mark the current room as explored by adding its top left corner position to the explored rooms list.

pathfind_through_unexplored_to(target, initial)[source]

A* pathfinding from initial to target, where A* can visit any position that has NOT been explored.

Parameters:
  • target – target position to pathfind to.
  • initial – position to start pathfinding from. If None, use current player position.

process_msg(msg, slim_charset=False)[source]

Processes the map screen outputted by NetHack.

reset()[source]

Prepare the environment for a new episode.

set_config(proc_id, test_policy=None, num_episodes=200, num_episodes_per_combo=200, max_num_actions_per_episode=5000, dataset='fixed', secret_rooms=False, name='exploration', **args)[source]

Set config.

Parameters:
  • proc_id – process ID of this environment, to be matched with the argument passed to the daemon launching script.
  • num_episodes – number of total episodes to run for.
  • max_num_actions_per_episode – max number of (legal) actions that can be taken in an episode
  • dataset – whether the maps are ‘fixed’ (same set of maps, i.e., same starting RNG seed) or ‘random’ (always different)
  • secret_rooms – whether to enable generation of secret doors & corridors in NetHack maps
  • name – used for record folder name

gym_nethack.envs.exploration.TurnRec

alias of gym_nethack.envs.exploration.ExplFoodRec

gym_nethack.envs.level module

class gym_nethack.envs.level.Game(goal_reached, actions, game_number, final_clvl, final_ac, final_dlvl, final_score, final_inventory, num_combat_acts, num_expl_acts, num_combat_encounters)

Bases: tuple

actions

Alias for field number 1

final_ac

Alias for field number 4

final_clvl

Alias for field number 3

final_dlvl

Alias for field number 5

final_inventory

Alias for field number 7

final_score

Alias for field number 6

game_number

Alias for field number 2

goal_reached

Alias for field number 0

num_combat_acts

Alias for field number 8

num_combat_encounters

Alias for field number 10

num_expl_acts

Alias for field number 9

class gym_nethack.envs.level.NetHackLevelEnv[source]

Bases: gym_nethack.envs.base.NetHackRLEnv

NetHack level environment (exploration + combat).

end_episode()[source]

End the current episode, updating the record.

end_turn()[source]

End the current turn, calling the appropriate env. method.

get_command_for_action(action)[source]

Translate the given action (of type integer – an index into the self.abilities list) into a command that can be passed to NetHack, of type CMD.

get_game_params()[source]

Parameters to pass to NetHack on the creation of a new game. (Will be saved in the NH options file.)

get_reward(status)[source]

Return reward for the given status.

get_savedir_info_list()[source]

Get the strings that should form the save directory name.

get_state()[source]

Return state passed to RL agent.

get_status(msg)[source]

Check for a terminal state (death), or terminal state for one of the combat or exploration environments (monster died, or level finished).

get_valid_action_indices()[source]

Get the indices of valid actions (according to the abilities list/action space).

monster_present()[source]

Check if a monster is present within 6 squares of us.
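A sketch of such a proximity test follows. The distance metric is not stated above; Chebyshev distance (where a diagonal step counts as one square) is assumed here, and the explicit position arguments are hypothetical since the documented method takes none.

```python
def monster_present(player_pos, monster_positions, radius=6):
    """True if any monster is within `radius` squares (Chebyshev distance, assumed)."""
    px, py = player_pos
    return any(
        max(abs(mx - px), abs(my - py)) <= radius
        for mx, my in monster_positions
    )
```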

process_msg(msg)[source]

Process the message returned by NetHack.

reset()[source]

Prepare the environment for a new episode. (Call reset() on combat and exploration envs.)

set_config(proc_id, dataset='fixed', secret_rooms=False, num_episodes=100, **args)[source]

Set config.

Parameters:
  • proc_id – process ID of this environment, to be matched with the argument passed to the daemon launching script.
  • dataset – whether the maps are ‘fixed’ (same set of maps, i.e., same starting RNG seed) or ‘random’ (always different)
  • secret_rooms – whether or not to enable secret door/corridor generation
  • num_episodes – number of total episodes to run for.

Other arguments are passed to the base, combat, and exploration env set_config() methods.

start_episode()[source]

Start a new episode (level), creating a record for it.

start_turn()[source]

Start the current turn, calling the appropriate env. method.

Module contents