gym_nethack.envs package¶
Submodules¶
gym_nethack.envs.base module¶
-
class
gym_nethack.envs.base.
Goals
[source]¶ Bases:
object
-
CONN_ERROR
= 3¶
-
LOSS
= 1¶
-
SUCCESS
= 0¶
-
TIME_EXCEEDED
= 2¶
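The four outcome codes can be mirrored in a few lines. This is a minimal sketch rather than the package's source: the constant values are taken from the listing above, while `describe` is a hypothetical helper added only for illustration.

```python
class Goals(object):
    """Episode outcome codes, with the values listed above."""
    SUCCESS = 0
    LOSS = 1
    TIME_EXCEEDED = 2
    CONN_ERROR = 3


def describe(code):
    """Hypothetical helper: map an outcome code back to its name."""
    for name, value in vars(Goals).items():
        if not name.startswith("_") and value == code:
            return name
    raise ValueError("unknown goal code: %r" % (code,))


print(describe(Goals.TIME_EXCEEDED))
```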
-
-
class
gym_nethack.envs.base.
NetHackEnv
(nhinfo)[source]¶ Bases:
gym.core.Env
,gym.utils.ezpickle.EzPickle
Basic NetHack environment. Must be subclassed. Contains statistics saving/loading methods and NetHack process management.
-
end_episode
()[source]¶ End the current episode, incrementing the game counter by one and calling save_records() every 100 games.
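The bookkeeping described for end_episode() might look roughly like the sketch below. `EpisodeCounter`, `total_num_games`, and `num_saves` are hypothetical names, and triggering the save on exact multiples of 100 is an assumption; only the "every 100 games" interval comes from the description above.

```python
class EpisodeCounter(object):
    """Hypothetical stand-in for the episode bookkeeping described above."""

    SAVE_EVERY = 100  # interval taken from the end_episode() description

    def __init__(self):
        self.total_num_games = 0
        self.num_saves = 0

    def save_records(self):
        # Placeholder: the real method writes records to disk.
        self.num_saves += 1

    def end_episode(self):
        self.total_num_games += 1
        # Assumption: a save is triggered on every multiple of SAVE_EVERY.
        if self.total_num_games % self.SAVE_EVERY == 0:
            self.save_records()


counter = EpisodeCounter()
for _ in range(250):
    counter.end_episode()
print(counter.total_num_games, counter.num_saves)
```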
-
reset
()[source]¶ Prepare the environment for a new map. Kills the current NetHack process and launches a new one.
-
save_records
()[source]¶ Save records to self.savedir/*_records.dll, creating directories if necessary.
-
set_config
(proc_id, num_procs, name, parse_items, **args)[source]¶ Set config and connect to the NetHack launcher daemon.
Parameters: - proc_id – process ID of this environment, to be matched with the argument passed to the daemon launching script.
- num_procs – number of processes to run in parallel - used if grid search is running
- name – to be used for the record folder name
- parse_items – whether to handle items in the environment or not
-
-
class
gym_nethack.envs.base.
NetHackInfo
(parse_items=True)[source]¶ Bases:
object
Stores NetHack game state, and contains methods for processing/parsing screen output and for item & map information.
-
at_intersection
()[source]¶ Return true if the player is at the intersection of two or more corridors.
-
at_room_opening
(pos=None)[source]¶ Return true if the player (or the given position) is at a room opening.
-
find_char_on_base_map
(char)[source]¶ Return the map coordinate at which the first instance of the given char appears, or None if it does not.
-
get_chars_adjacent_to
(x, y, diag=False)[source]¶ Returns the list of basemap tiles adjacent to the given map coordinate.
-
get_corridor_exits
(pos=None, diag=True)[source]¶ Return the corridors adjacent to the given position.
Parameters: - pos – the position around which to look for corridors. If none, the player’s current position is used.
- diag – whether to consider diagonal tiles
-
get_cur_weapon
()[source]¶ Returns the current weapon object wielded by the player, and whether it is cursed or not.
-
get_inven_char_for_item
(item)[source]¶ Returns the inventory character mapped to a particular item, assuming the player has it in the inventory.
-
get_neighboring_positions
(x, y, diag=True)[source]¶ Returns the list of in-range coordinates adjacent to the given map coordinate.
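As an illustration of what "in-range coordinates adjacent to the given map coordinate" can mean, here is a small standalone sketch. The 80 by 21 map size and the (x, y) ordering are assumptions, and `neighboring_positions` is a hypothetical free function, not the method itself.

```python
# Assumed map size: 80 columns by 21 rows. The package may use other bounds
# or a different coordinate order.
MAP_WIDTH, MAP_HEIGHT = 80, 21


def neighboring_positions(x, y, diag=True):
    """Return the in-range (x, y) coordinates adjacent to the given one."""
    offsets = [(0, -1), (-1, 0), (1, 0), (0, 1)]
    if diag:
        offsets += [(-1, -1), (1, -1), (-1, 1), (1, 1)]
    return [
        (x + dx, y + dy)
        for dx, dy in offsets
        if 0 <= x + dx < MAP_WIDTH and 0 <= y + dy < MAP_HEIGHT
    ]


print(len(neighboring_positions(5, 5)))              # interior tile
print(len(neighboring_positions(0, 0, diag=False)))  # corner tile
```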
-
get_uncovered_doors
()[source]¶ Return the coordinates which in the last turn were revealed to be doors.
-
mark_all_explored
()[source]¶ Mark all traversable positions in the map observed so far as explored, then update the pathfinding grid.
-
mark_explored
(pos)[source]¶ Add the given position to the explored positions list.
Parameters: pos – position that we want to mark as explored
-
next_to_dead_end
()[source]¶ Return true if the player is at a dead-end in a corridor.
-
pathfind_to
(target, initial=None, full_path=True, explored_set=None, override_target_traversability=False, override_targets=[])[source]¶ A* pathfinding from initial to target, where A* can visit any position that has been explored.
Parameters: - target – target position to pathfind to.
- initial – position to start pathfinding from. If None, use current player position.
- full_path – return entire trajectory if True, else return first position from initial.
- explored_set – if not None, increase A* heuristic score of non-explored tiles over explored tiles (e.g., if walking through a diagonal corridor, prefer to visit each square instead of moving diagonally, so we don’t miss any branching corridor).
- override_target_traversability – pathfind to target even if it is not traversable by the player (e.g., solid wall).
- override_targets – override traversability of all targets in this list (see above parameter)
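The core idea, A* restricted to explored positions, can be sketched independently of the package. Everything below is an assumption made for illustration: the function name `astar`, the representation of explored positions as a set of (x, y) tuples, 8-connected movement with unit step cost, the Chebyshev heuristic, and returning a path that excludes the starting position. The explored_set weighting and the traversability overrides are not modeled.

```python
import heapq


def astar(target, initial, explored, full_path=True):
    """A* over a set of explored (x, y) positions, 8-connected, unit step cost."""
    def heuristic(pos):
        # Chebyshev distance: admissible when diagonal steps cost 1.
        return max(abs(pos[0] - target[0]), abs(pos[1] - target[1]))

    frontier = [(heuristic(initial), 0, initial)]
    came_from = {initial: None}
    cost_so_far = {initial: 0}
    while frontier:
        _, cost, current = heapq.heappop(frontier)
        if current == target:
            break
        if cost > cost_so_far[current]:
            continue  # stale queue entry, a cheaper route was already found
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nxt = (current[0] + dx, current[1] + dy)
                if nxt == current or nxt not in explored:
                    continue
                new_cost = cost + 1
                if new_cost < cost_so_far.get(nxt, float("inf")):
                    cost_so_far[nxt] = new_cost
                    came_from[nxt] = current
                    heapq.heappush(
                        frontier, (new_cost + heuristic(nxt), new_cost, nxt))
    if target not in came_from:
        return None  # target unreachable through explored positions
    path = []
    node = target
    while node != initial:
        path.append(node)
        node = came_from[node]
    path.reverse()
    if full_path:
        return path
    return path[0] if path else None


# An L-shaped corridor of explored tiles.
explored = {(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)}
print(astar((2, 2), (0, 0), explored))
print(astar((2, 2), (0, 0), explored, full_path=False))
```

In this sketch a target outside the explored set is simply unreachable, which loosely corresponds to leaving override_target_traversability at False.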
-
process_msg
(socket, message, update_base=True, parse_monsters=True, parse_ammo=False)[source]¶ Processes the map screen outputted by NetHack.
Parameters: - socket – the socket connected to the NetHack process (needed to send/receive the inventory message)
- message – the message outputted by NetHack
- update_base – whether to update our record of the map with new information gleaned or not
- parse_monsters – whether to keep or discard monsters in the parsed NetHack map
- parse_ammo – whether to keep or discard ammo in the parsed NetHack map
-
-
class
gym_nethack.envs.base.
NetHackRLEnv
(nhinfo=None)[source]¶ Bases:
gym_nethack.envs.base.NetHackEnv
Basic NetHack RL env with core step() and take_action() methods. Must be subclassed.
-
get_game_params
()[source]¶ Parameters to pass to NetHack on the creation of a new game. (Will be saved in the options file.)
-
get_status
(msg)[source]¶ Process the message returned by NetHack to check if it is a terminal state. Must be implemented in subclass.
-
get_valid_action_indices
()[source]¶ Get the indices of valid actions (according to the abilities list/action space). Should be overridden in a subclass if the action space contains illegal actions. Currently returns all actions as valid.
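A subclass that needs to mask illegal actions could follow a pattern like the one below. This is a sketch under assumptions: `valid_action_indices` is a free function standing in for the method, `is_legal` is a hypothetical predicate, and the ability names are illustrative only.

```python
def valid_action_indices(abilities, is_legal=None):
    """Return the indices into `abilities` that are currently legal.

    With no filter, every index is returned, matching the base-class
    behaviour described above. `is_legal` is a hypothetical predicate.
    """
    if is_legal is None:
        return list(range(len(abilities)))
    return [i for i, ability in enumerate(abilities) if is_legal(ability)]


abilities = ["north", "south", "fire_arrow"]  # illustrative names only
print(valid_action_indices(abilities))
print(valid_action_indices(abilities, lambda a: a != "fire_arrow"))
```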
-
process_action
(action)[source]¶ Do any preprocessing required on the action selected, e.g., get the CMD object from the abilities list.
-
process_msg
(msg, update_base=True, parse_monsters=True, parse_ammo=False)[source]¶ Processes the map screen outputted by NetHack.
-
set_config
(proc_id, action_size=1, state_size=1, max_num_actions=-1, max_num_episodes=-1, max_num_actions_per_episode=200, policy=None, **args)[source]¶ Set config.
Parameters: - proc_id – process ID of this environment, to be matched with the argument passed to the daemon launching script.
- action_size – number of discrete actions that can be taken
- state_size – size of state vector
- max_num_actions – max number of actions that can be taken before exiting (*TODO*)
- max_num_episodes – max number of episodes to take before exiting (for level env.) (TODO – but used for keras-rl)
- max_num_actions_per_episode – max number of (legal) actions that can be taken in an episode
-
gym_nethack.envs.combat module¶
-
class
gym_nethack.envs.combat.
Combat
(monster, base_map, map, player_pos, monster_positions, start_state, start_attributes, start_stats, start_items, start_stateffs, action_list, goal_reached, end_attributes, end_stats, end_items)¶ Bases:
tuple
-
action_list
¶ Alias for field number 10
-
base_map
¶ Alias for field number 1
-
end_attributes
¶ Alias for field number 12
-
end_items
¶ Alias for field number 14
-
end_stats
¶ Alias for field number 13
-
goal_reached
¶ Alias for field number 11
-
map
¶ Alias for field number 2
-
monster
¶ Alias for field number 0
-
monster_positions
¶ Alias for field number 4
-
player_pos
¶ Alias for field number 3
-
start_attributes
¶ Alias for field number 6
-
start_items
¶ Alias for field number 8
-
start_state
¶ Alias for field number 5
-
start_stateffs
¶ Alias for field number 9
-
start_stats
¶ Alias for field number 7
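Because Combat is a tuple with named fields, each field can be read either by name or by the field number listed above. The sketch below rebuilds an equivalent type with `collections.namedtuple` using the field order from the signature; the values placed in `record` are placeholders, not real recorded data, and the actual type of each field is not specified here.

```python
from collections import namedtuple

# Field names and order are copied from the signature above.
Combat = namedtuple("Combat", [
    "monster", "base_map", "map", "player_pos", "monster_positions",
    "start_state", "start_attributes", "start_stats", "start_items",
    "start_stateffs", "action_list", "goal_reached", "end_attributes",
    "end_stats", "end_items",
])

# Placeholder values: one None per field, then two filled in for illustration.
record = Combat(*[None] * len(Combat._fields))._replace(
    monster="jackal", goal_reached=True)

print(record[0] == record.monster)          # positional and named access agree
print(Combat._fields.index("action_list"))  # field number of action_list
```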
-
-
class
gym_nethack.envs.combat.
NetHackCombatEnv
(nhinfo=None)[source]¶ Bases:
gym_nethack.envs.base.NetHackRLEnv
Arena-style player-on-monster NetHack combat environment, with specifiable monsters, items, and attributes/stats.
-
action_took_effect
()[source]¶ Check if the last action taken has actually taken effect. Helps to diagnose errors that can propagate in state.
-
get_all_item_abilities
()[source]¶ Split the potion items into ones of type ‘throw’ and ‘use’ - other items are unchanged.
-
get_command_for_action
(action)[source]¶ Translate the given action (of type integer – an index into the self.abilities list) into a command that can be passed to NetHack, of type CMD.
-
get_current_equipment
()[source]¶ Get the current equipment vector (weapons/armor/rings) for the state.
-
get_game_params
()[source]¶ Parameters to pass to NetHack on the creation of a new game. (Will be saved in the NetHack options file.)
-
get_valid_action_indices
()[source]¶ Return the list of valid action indices (according to the self.abilities list).
-
is_monster_in_line_of_fire
()[source]¶ Check if the monster is present in the directions we can fire towards.
-
load_and_sample_combats
()[source]¶ Optionally load in a list of objects of type Combat from self.savedir+/combat_records*.dll. This list of combats will then be the ones trained against in this environment. This method is called if load_combats=True is passed to set_config.
-
save_encounter_info
()[source]¶ Called during full level combat to save combat encounters for training later on.
-
set_config
(proc_id, num_actions=-1, num_episodes=-1, clvl_to_mlvl_diff=-3, monsters='none', initial_equipment=[], items=None, item_sampling='uniform', num_start_items=5, action_list='all', fixed_ac=999, dlvl=None, tabular=False, test_policy=None, lr=0, units_d1=0, units_d2=0, skip_training=False, load_combats=False, **args)[source]¶ Set config.
Parameters: - proc_id – process ID of this environment, to be matched with the argument passed to the daemon launching script.
- num_actions – number of total actions to train for.
- num_episodes – number of total episodes to train for (only used if load_combats is False).
- clvl_to_mlvl_diff – the number of levels by which the player level will be set above the monster level (a negative value sets the player level below the monster level)
- monsters – tuple of (idname, [‘mon1’, ‘mon2’, …]) of monsters to be faced
- initial_equipment – list of items that the player will always start each episode with
- items – tuple of (idname, [‘item1’, ‘item2’, …]) of items to be used in sampling
- item_sampling – how to choose which of the above ‘items’ the player receives at the start of each episode: ‘all’ (all items); ‘uniform’ (a uniform sample whose size is given by num_start_items); or ‘type’ (see get_initial_inventory() for details)
- num_start_items – number of items to randomly sample from the ‘items’ parameter if item_sampling == ‘uniform’
- action_list – determines which actions the player can use. ‘weapons_only’ allows only weapons; any other value allows every action
- fixed_ac – the player’s starting armor class (AC); if less than 999, the AC is set to this value; otherwise NetHack’s default initial value is used
- dlvl – dungeon level for the episode. affects monster attributes (thus difficulty).
- tabular – whether we are using a tabular representation for the Q-values (deprecated)
- test_policy – used for record folder name (also used in ngym.py)
- lr – used for record folder name (also used in ngym.py)
- units_d1 – used for record folder name (also used in ngym.py)
- units_d2 – used for record folder name (also used in ngym.py)
- skip_training – if True, will not add above info to folder name
- load_combats – whether to load combats from file to use for training; if True, many of the above parameters do not need to be specified.
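To make the item_sampling modes concrete, here is a hedged sketch of how ‘all’ and ‘uniform’ could select starting items. `sample_start_items` and its `rng` argument are hypothetical, sampling without replacement is an assumption, and the ‘type’ mode is left out because its behaviour is defined by get_initial_inventory(), which is not described on this page.

```python
import random


def sample_start_items(items, item_sampling="uniform", num_start_items=5, rng=None):
    """Pick the starting items for one episode (illustrative only)."""
    rng = rng or random.Random()
    if item_sampling == "all":
        return list(items)
    if item_sampling == "uniform":
        # Assumption: sample without replacement, capped at the pool size.
        return rng.sample(list(items), min(num_start_items, len(items)))
    raise NotImplementedError("'type' sampling is defined by get_initial_inventory()")


pool = ["dagger", "long sword", "wand of sleep", "potion of healing"]
print(sample_start_items(pool, "all"))
print(len(sample_start_items(pool, "uniform", num_start_items=2)))
```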
-
gym_nethack.envs.exploration module¶
-
class
gym_nethack.envs.exploration.
ExplRec
(actions_this_game, all_rooms_explored, actions_until_all_rooms_explored, num_rooms_explored, total_num_rooms, num_secret_rooms_explored, total_num_secret_rooms, num_secret_spots_explored, total_num_secret_spots, turn_records, opt_actions)¶ Bases:
tuple
-
actions_this_game
¶ Alias for field number 0
-
actions_until_all_rooms_explored
¶ Alias for field number 2
-
all_rooms_explored
¶ Alias for field number 1
-
num_rooms_explored
¶ Alias for field number 3
-
num_secret_rooms_explored
¶ Alias for field number 5
-
num_secret_spots_explored
¶ Alias for field number 7
-
opt_actions
¶ Alias for field number 10
-
total_num_rooms
¶ Alias for field number 4
-
total_num_secret_rooms
¶ Alias for field number 6
-
total_num_secret_spots
¶ Alias for field number 8
-
turn_records
¶ Alias for field number 9
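The fields of an exploration record lend themselves to simple summary statistics. The sketch below rebuilds an equivalent named tuple from the signature above and computes the share of rooms explored; `fraction_rooms_explored` is a hypothetical helper, and the numbers in `rec` are made up for illustration.

```python
from collections import namedtuple

# Field names and order are copied from the signature above.
ExplRec = namedtuple("ExplRec", [
    "actions_this_game", "all_rooms_explored",
    "actions_until_all_rooms_explored", "num_rooms_explored",
    "total_num_rooms", "num_secret_rooms_explored", "total_num_secret_rooms",
    "num_secret_spots_explored", "total_num_secret_spots", "turn_records",
    "opt_actions",
])


def fraction_rooms_explored(rec):
    """Hypothetical helper: share of rooms explored in one game."""
    if rec.total_num_rooms == 0:
        return 0.0
    return rec.num_rooms_explored / float(rec.total_num_rooms)


# Made-up numbers, for illustration only.
rec = ExplRec(120, False, -1, 6, 8, 0, 1, 0, 2, [], None)
print(fraction_rooms_explored(rec))
```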
-
-
class
gym_nethack.envs.exploration.
NetHackExplEnv
(nhinfo=None)[source]¶ Bases:
gym_nethack.envs.base.NetHackRLEnv
Environment for NetHack exploration.
-
end_turn
()[source]¶ End the current turn, observe map and store a Turn Record. (Turn = observe state & take action.)
-
get_game_params
()[source]¶ Parameters to pass to NetHack on the creation of a new game. (Will be saved in the NetHack options file.)
-
mark_room_explored
()[source]¶ Mark the current room as explored by adding its top left corner position to the explored rooms list.
-
pathfind_through_unexplored_to
(target, initial)[source]¶ A* pathfinding from initial to target, where A* can visit any position that has NOT been explored.
Parameters: - target – target position to pathfind to.
- initial – position to start pathfinding from. If None, use current player position.
-
set_config
(proc_id, test_policy=None, num_episodes=200, num_episodes_per_combo=200, max_num_actions_per_episode=5000, dataset='fixed', secret_rooms=False, name='exploration', **args)[source]¶ Set config.
Parameters: - proc_id – process ID of this environment, to be matched with the argument passed to the daemon launching script.
- num_episodes – number of total episodes to run for.
- max_num_actions_per_episode – max number of (legal) actions that can be taken in an episode
- dataset – whether the maps are ‘fixed’ (same set of maps, i.e., same starting RNG seed) or ‘random’ (always different)
- secret_rooms – whether to enable generation of secret doors & corridors in NetHack maps
- name – used for record folder name
-
-
gym_nethack.envs.exploration.
TurnRec
¶ alias of
gym_nethack.envs.exploration.ExplFoodRec
gym_nethack.envs.level module¶
-
class
gym_nethack.envs.level.
Game
(goal_reached, actions, game_number, final_clvl, final_ac, final_dlvl, final_score, final_inventory, num_combat_acts, num_expl_acts, num_combat_encounters)¶ Bases:
tuple
-
actions
¶ Alias for field number 1
-
final_ac
¶ Alias for field number 4
-
final_clvl
¶ Alias for field number 3
-
final_dlvl
¶ Alias for field number 5
-
final_inventory
¶ Alias for field number 7
-
final_score
¶ Alias for field number 6
-
game_number
¶ Alias for field number 2
-
goal_reached
¶ Alias for field number 0
-
num_combat_acts
¶ Alias for field number 8
-
num_combat_encounters
¶ Alias for field number 10
-
num_expl_acts
¶ Alias for field number 9
-
-
class
gym_nethack.envs.level.
NetHackLevelEnv
[source]¶ Bases:
gym_nethack.envs.base.NetHackRLEnv
NetHack level environment (exploration + combat).
-
get_command_for_action
(action)[source]¶ Translate the given action (of type integer – an index into the self.abilities list) into a command that can be passed to NetHack, of type CMD.
-
get_game_params
()[source]¶ Parameters to pass to NetHack on the creation of a new game. (Will be saved in the NetHack options file.)
-
get_status
(msg)[source]¶ Check for a terminal state (death), or terminal state for one of the combat or exploration environments (monster died, or level finished).
-
get_valid_action_indices
()[source]¶ Get the indices of valid actions (according to the abilities list/action space).
-
reset
()[source]¶ Prepare the environment for a new episode. (Call reset() on combat and exploration envs.)
-
set_config
(proc_id, dataset='fixed', secret_rooms=False, num_episodes=100, **args)[source]¶ Set config.
Parameters: - proc_id – process ID of this environment, to be matched with the argument passed to the daemon launching script.
- dataset – whether the maps are ‘fixed’ (same set of maps, i.e., same starting RNG seed) or ‘random’ (always different)
- secret_rooms – whether or not to enable secret door/corridor generation
- num_episodes – number of total episodes to run for.
Other arguments are passed to the base, combat, and exploration env set_config() methods.
-