Masked Grid World
Experience Replay Experiment
MaskedGridWorldERExperiment — Module

Module for running a standard experiment in the masked grid world: an experiment to compare different RNN cells using the ActionRNNs.MaskedGridWorld environment.
Usage is detailed in the docs for MaskedGridWorldERExperiment.main_experiment.

MaskedGridWorldERExperiment.main_experiment — Function

Run an experiment from a config. See MaskedGridWorldERExperiment.working_experiment for details on running from the command line and MaskedGridWorldERExperiment.default_config for info about the default configuration.
MaskedGridWorldERExperiment.working_experiment — Function

Creates a wrapper experiment where main_experiment is called with progress=true and testing=true, and the config is default_config with the keyword arguments merged in.
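As a sketch of how these entry points compose (assuming the MaskedGridWorldERExperiment module is loaded; the overridden values below are illustrative, not the actual defaults):

```julia
# Sketch: running the experiment from the REPL. The keys used here
# ("cell", "numhidden") are documented config options; the values are
# placeholders.
config = MaskedGridWorldERExperiment.default_config()
config["cell"] = "GRU"       # choose the RNN cell
config["numhidden"] = 64     # total hidden size
MaskedGridWorldERExperiment.main_experiment(config)

# Or use the testing wrapper, which merges keyword overrides
# into default_config and enables progress/testing flags:
MaskedGridWorldERExperiment.working_experiment(cell="GRU", numhidden=64)
```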
MaskedGridWorldERExperiment.default_config — Function

Automatically generated docs for the MaskedGridWorldERExperiment config.

Experiment details
seed::Int
: Seed of the RNG.
steps::Int
: Number of steps taken in the experiment.
Environment details
This experiment uses the ActionRNNs.MaskedGridWorld environment. The usable args are:
width::Int, height::Int
: Width and height of the grid world.
num_anchors::Int
: Number of states with active observations.
num_goals::Int
: Number of goals.
Agent details
RNN
The RNN used for this experiment and its total hidden size, as well as a flag to use (or not use) Zhu et al.'s deep action network.
cell::String
: The type of cell. Many types are possible.
deepaction::Bool
: Whether to use Zhu et al.'s deep action network idea for RNNs.
internal_a::Int
: The size of the action representation layer when deepaction=true.
numhidden::Int
: Size of the hidden state in the RNNs.
Optimizer details
Flux optimizers are used. See the Flux documentation and ExpUtils.Flux.get_optimizer for details.
opt::String
: The name of the optimizer used.
- Parameters defined by the particular optimizer.
Learning update and replay details, including:
Replay:
replay_size::Int
: How many transitions are stored in the replay buffer.
warm_up::Int
: How many steps of warm-up (i.e. before learning begins).
Update details:
lupdate_agg::String
: The aggregation function for the QLearning update.
gamma::Float
: The discount used in the learning update.
batch_size::Int
: Size of the training batch.
truncation::Int
: Length of the sequences used for training.
update_wait::Int
: Time between updates (counted in agent interactions).
target_update_wait::Int
: Time between target network updates (counted in agent interactions).
hs_strategy::String
: Strategy for dealing with hidden state in the buffer.
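Putting the documented keys together, a config might look like the following Dict (a sketch: the keys are the documented options, but the values shown are placeholders, not the generated defaults):

```julia
# Illustrative config covering the documented keys; values are
# placeholders, not the package's actual defaults.
config = Dict(
    # Experiment details
    "seed" => 1,
    "steps" => 100_000,
    # Environment details
    "width" => 10,
    "height" => 10,
    "num_anchors" => 10,
    "num_goals" => 1,
    # Agent / RNN details
    "cell" => "GRU",
    "deepaction" => false,
    "internal_a" => 6,
    "numhidden" => 64,
    # Optimizer details (further keys depend on the optimizer chosen)
    "opt" => "ADAM",
    # Replay details
    "replay_size" => 10_000,
    "warm_up" => 1_000,
    # Update details
    "lupdate_agg" => "SUM",
    "gamma" => 0.99,
    "batch_size" => 32,
    "truncation" => 12,
    "update_wait" => 4,
    "target_update_wait" => 1_000,
    "hs_strategy" => "minimize",
)
```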
MaskedGridWorldERExperiment.construct_agent — Function

Construct the agent for MaskedGridWorldERExperiment.
MaskedGridWorldERExperiment.construct_env — Function

Construct MaskedGridWorld. Settings:
- "width": width of the grid world
- "height": height of the grid world
- "num_anchors": number of anchors
- "num_goals": number of goals
MaskedGridWorldERExperiment.get_ann_size — Function

Helper function that constructs the environment and agent using the default config plus any keyword overrides, then returns the number of parameters in the model.