Masked Grid World

Experience Replay Experiment

MaskedGridWorldERExperiment — Module

MaskedGridWorldERExperiment

Module for running a standard experiment in the masked grid world. The experiment compares different RNN cells using the ActionRNNs.MaskedGridWorld environment.

Usage is detailed in the docs for:

MaskedGridWorldERExperiment.default_config
MaskedGridWorldERExperiment.main_experiment
MaskedGridWorldERExperiment.working_experiment
MaskedGridWorldERExperiment.construct_env
MaskedGridWorldERExperiment.construct_agent

MaskedGridWorldERExperiment.main_experiment — Function

main_experiment

Run an experiment from config. See MaskedGridWorldERExperiment.working_experiment for details on running on the command line and MaskedGridWorldERExperiment.default_config for info about the default configuration.

MaskedGridWorldERExperiment.working_experiment — Function

working_experiment

Creates a wrapper experiment that calls the main experiment with progress=true and testing=true. The config is the default_config with the supplied keyword arguments added.
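The way keyword arguments are layered over a default config can be sketched as follows. This is a minimal illustration, not the module's code: `toy_default_config` and `toy_working_config` are hypothetical names, the values are made up, and the use of string keys is an assumption based on the quoted keys listed for construct_env.

```julia
# Hypothetical stand-in for the experiment's default configuration.
# The keys come from the documented options; the values are made up.
toy_default_config() = Dict{String,Any}(
    "seed" => 1,
    "steps" => 1000,
    "width" => 5,
    "height" => 5,
)

# Hypothetical helper: overlay keyword arguments on the defaults.
# Keyword names are symbols, so they are converted to the string keys
# assumed for the config dictionary.
function toy_working_config(; kwargs...)
    config = toy_default_config()
    for (k, v) in kwargs
        config[string(k)] = v
    end
    return config
end

# Override two entries; the remaining defaults are kept.
config = toy_working_config(steps = 200, seed = 7)
```

Entries that are not overridden keep their default values, so a quick test run only needs to name the settings it changes.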

MaskedGridWorldERExperiment.default_config — Function

Automatically generated docs for MaskedGridWorldERExperiment config.

Experiment details

seed::Int: seed of RNG
steps::Int: Number of steps taken in the experiment

Environment details

This experiment uses the ActionRNNs.MaskedGridWorld environment. The usable arguments are:

width::Int, height::Int: Width and height of the grid world
num_anchors::Int: number of states with active observations
num_goals::Int: number of goals

Agent details

RNN

The RNN used for this experiment and its total hidden size, as well as a flag for whether or not to use Zhu et al.'s deep action network. See

cell::String: The type of cell. Many types are possible.
deepaction::Bool: Whether to use Zhu et al.'s deep action for RNNs idea.
- internal_a::Int: the size of the action representation layer when deepaction=true
numhidden::Int: Size of hidden state in RNNs.

Optimizer details

Flux optimizers are used. See the Flux documentation and ExpUtils.Flux.get_optimizer for details.

opt::String: The name of the optimizer used
Parameters defined by the particular optimizer.

Learning update and replay details including:

Replay:
- replay_size::Int: How many transitions are stored in the replay.
- warm_up::Int: How many steps for warm-up (i.e. before learning begins).
Update details:
- lupdate_agg::String: the aggregation function for the QLearning update.
- gamma::Float: the discount for learning update.
- batch_size::Int: size of batch
- truncation::Int: Length of sequences used for training.
- update_wait::Int: Time between updates (counted in agent interactions)
- target_update_wait::Int: Time between target network updates (counted in agent interactions)
- hs_strategy::String: Strategy for dealing with the hidden state in the buffer.
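One plausible reading of how warm_up, update_wait and target_update_wait interact is sketched below. This is an assumption for illustration only, not the package's implementation: the function names are hypothetical, and the exact step on which the first update fires may differ in the real agent.

```julia
# Hypothetical schedule, NOT the package's implementation.
# Assumption: no learning happens during the first `warm_up` steps. After
# that, a learning update is triggered every `update_wait` steps and a
# target network update every `target_update_wait` steps.
learning_update_due(step, warm_up, update_wait) =
    step > warm_up && step % update_wait == 0

target_update_due(step, warm_up, target_update_wait) =
    step > warm_up && step % target_update_wait == 0

# Illustrative values only.
warm_up, update_wait, target_update_wait = 10, 4, 8

# Steps (out of the first 30) on which each kind of update would occur.
update_steps = [s for s in 1:30 if learning_update_due(s, warm_up, update_wait)]
target_steps = [s for s in 1:30 if target_update_due(s, warm_up, target_update_wait)]
```

Under these assumptions the target network is refreshed less often than the online network is trained, which is the usual reason for keeping the two waits separate.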

MaskedGridWorldERExperiment.construct_agent — Function

construct_agent

Construct the agent for MaskedGridWorldERExperiment. See

MaskedGridWorldERExperiment.construct_env — Function

construct_env

Construct MaskedGridWorld. Settings:

"width": width of gridworld
"height": height of gridworld
"num_anchors": number of anchors
"num_goals": number of goals
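A settings dictionary with the four keys above might look like the sketch below. The values are illustrative, and `check_env_settings` is a hypothetical helper that is not part of the package. Its bounds checks rest on the assumption that anchors and goals each occupy grid cells.

```julia
# Hypothetical settings dictionary. The keys are the four documented above;
# the values are illustrative only.
env_settings = Dict{String,Any}(
    "width" => 10,
    "height" => 10,
    "num_anchors" => 10,
    "num_goals" => 1,
)

# Hypothetical sanity check, not part of the package.
function check_env_settings(settings)
    required = ("width", "height", "num_anchors", "num_goals")
    missing_keys = [k for k in required if !haskey(settings, k)]
    isempty(missing_keys) || error("missing settings: $(join(missing_keys, ", "))")
    num_states = settings["width"] * settings["height"]
    # Assumption: anchors and goals occupy grid cells, so neither count
    # should exceed the number of cells.
    settings["num_anchors"] <= num_states || error("more anchors than grid cells")
    settings["num_goals"] <= num_states || error("more goals than grid cells")
    return true
end

check_env_settings(env_settings)
```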

MaskedGridWorldERExperiment.get_ann_size — Function

get_ann_size

Helper function that constructs the environment and agent using the default config and kwargs, then returns the number of parameters in the model.
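Counting the parameters of a model amounts to summing the lengths of its parameter arrays. The sketch below shows this on a hypothetical stand-in for a small recurrent network. The layer layout, sizes and the name `count_params` are assumptions for illustration and do not reflect how get_ann_size is implemented.

```julia
# Hypothetical stand-in for a model's parameters: a weight matrix and bias
# for a recurrent layer and for an output layer. Sizes are illustrative.
numhidden, num_inputs, num_actions = 6, 4, 3
toy_params = [
    zeros(numhidden, num_inputs + numhidden),  # recurrent layer weights
    zeros(numhidden),                          # recurrent layer bias
    zeros(num_actions, numhidden),             # output layer weights
    zeros(num_actions),                        # output layer bias
]

# Total number of scalar parameters across all arrays.
count_params(ps) = sum(length, ps)

n = count_params(toy_params)
```

A count like this is useful when comparing cells, since numhidden can be adjusted per cell type so that the models have a similar number of parameters.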
