Directional TMaze
Experience Replay Experiment
DirectionalTMazeERExperiment
— Module
DirectionalTMazeERExperiment
An experiment to compare different RNN cells using the ActionRNNs.DirectionalTMaze
environment.
Usage is detailed in the docs for the functions below.
DirectionalTMazeERExperiment.main_experiment
— Function
main_experiment
Run an experiment from config. See DirectionalTMazeERExperiment.working_experiment
for details on running on the command line and DirectionalTMazeERExperiment.default_config
for info about the default configuration.
DirectionalTMazeERExperiment.working_experiment
— Function
working_experiment
Creates a wrapper experiment that calls the main experiment with progress=true and testing=true, using the default_config updated with the supplied keyword arguments.
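The way keyword arguments are layered over the default configuration can be sketched as follows. This is a minimal illustration only: sketch_default_config and sketch_working_config are hypothetical names, the config is assumed to be a Dict with String keys, and the values are placeholders rather than the package defaults.

```julia
# Hypothetical stand-in for default_config(). The keys mirror the documented
# arguments; the values are illustrative, not the package defaults.
function sketch_default_config()
    Dict{String, Any}(
        "seed" => 1,
        "steps" => 1000,
        "size" => 6,
        "cell" => "RNN",
        "numhidden" => 10,
    )
end

# Overlay keyword arguments on top of the defaults (assumed behaviour).
function sketch_working_config(; kwargs...)
    config = sketch_default_config()
    for (k, v) in kwargs
        config[string(k)] = v   # keyword names become String keys
    end
    config
end

config = sketch_working_config(size = 10, cell = "GRU")
```

Under these assumptions, keys that are not overridden (for example "seed") keep their default values.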
DirectionalTMazeERExperiment.default_config
— Function
Automatically generated docs for DirectionalTMazeERExperiment config.
Experiment details.
seed::Int
: Seed of the RNG.
steps::Int
: Number of steps taken in the experiment.
Logging Extras
By default the experiment will log and save (depending on the synopsis flag) the logging group :EXP. You can add extra logging groups and [group, name] pairs using the arguments below. Everything added to save_extras is logged automatically and passed to the save operation. The groups and names added to log_extras are omitted from save_results but are still passed back to the user through the data dict.
log_extras::Vector{Union{String, Vector{String}}}
: Which groups and [group, name] pairs to log to the data dict. These will not be passed to save.
save_extras::Vector{Union{String, Vector{String}}}
: Which groups and [group, name] pairs to log to the data dict. These will be passed to save.
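As an illustration of the element type described above, each entry can be either a whole group name or a [group, name] pair. The group and name strings here ("EXPExtra", "hidden_state", "Diagnostics") are hypothetical and do not refer to logging groups known to exist in the package; describe_extra is a helper written only for this sketch.

```julia
# Each entry is either a group name or a [group, name] pair.
const ExtraSpec = Union{String, Vector{String}}

# Hypothetical entries: one [group, name] pair and one whole group.
log_extras = ExtraSpec[["EXPExtra", "hidden_state"]]
save_extras = ExtraSpec["Diagnostics"]

# Helper for this sketch: report what an entry selects.
describe_extra(e::String) = "group $(e)"
describe_extra(e::Vector{String}) = "group $(e[1]), name $(e[2])"

descriptions = [describe_extra(e) for e in vcat(log_extras, save_extras)]
```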
Environment details
This experiment uses the DirectionalTMaze environment. The usable args are:
size::Int
: Size of the hallway in directional tmaze.
Agent details
RNN
The RNN used for this experiment and its total hidden size, as well as a flag to use (or not use) Zhu et al.'s deep action network.
cell::String
: The type of cell. Many types are possible.
deepaction::Bool
: Whether to use Zhu et al.'s deep action for RNNs idea.
internal_a::Int
: The size of the action representation layer when deepaction=true.
numhidden::Int
: Size of the hidden state in RNNs.
Optimizer details
Flux optimizers are used. See the Flux documentation and ExpUtils.Flux.get_optimizer for details.
opt::String
: The name of the optimizer used.
Additional parameters are defined by the particular optimizer.
Learning update and replay details including:
Replay:
replay_size::Int
: How many transitions are stored in the replay.
warm_up::Int
: How many steps for warm-up (i.e. before learning begins).
Update details:
lupdate::String
: Learning update name.
gamma::Float
: The discount for the learning update.
batch_size::Int
: Size of the batch.
truncation::Int
: Length of sequences used for training.
update_wait::Int
: Time between updates (counted in agent interactions).
target_update_wait::Int
: Time between target network updates (counted in agent interactions).
hs_strategy::String
: Strategy for dealing with the hidden state in the buffer.
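One plausible reading of how warm_up, update_wait, and target_update_wait interact is sketched below. The exact counting used by the package is not stated here, so the semantics (no learning during the first warm_up steps, then an update every update_wait interactions and a target sync every target_update_wait interactions) are an assumption, and schedule_sketch is a hypothetical helper.

```julia
# Count how many learning updates and target-network syncs would occur,
# assuming updates are skipped entirely during warm-up (an assumption).
function schedule_sketch(; steps, warm_up, update_wait, target_update_wait)
    num_updates = 0
    num_target_syncs = 0
    for t in 1:steps
        t <= warm_up && continue            # still filling the replay buffer
        if t % update_wait == 0
            num_updates += 1                # a learning update would run here
        end
        if t % target_update_wait == 0
            num_target_syncs += 1           # the target network would be copied here
        end
    end
    (updates = num_updates, target_syncs = num_target_syncs)
end

counts = schedule_sketch(steps = 100, warm_up = 20,
                         update_wait = 4, target_update_wait = 25)
```

With these illustrative values the sketch counts 20 updates and 4 target syncs.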
Default performance:
Time: 0:02:28
episode: 5385
successes: 0.8351648351648352
loss: 1.0
l1: 0.0
action: 2
preds: Float32[0.369189, 0.48326853, 0.993273]
DirectionalTMazeERExperiment.get_ann_size
— Function
get_ann_size
Helper function which constructs the environment and agent using default config and kwargs then returns the number of parameters in the model.
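Counting parameters amounts to summing the element counts of every trainable array. The sketch below uses plain arrays with hypothetical shapes for a single recurrent layer; it does not build the actual agent, and the real model will contain other layers.

```julia
# Stand-in parameter arrays for one recurrent layer (hypothetical shapes).
numhidden, num_inputs = 10, 3
layer_params = [
    zeros(Float32, numhidden, num_inputs),  # input weights
    zeros(Float32, numhidden, numhidden),   # recurrent weights
    zeros(Float32, numhidden),              # bias
]

# Total number of parameters: 30 + 100 + 10.
num_params = sum(length, layer_params)
```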
DirectionalTMazeERExperiment.construct_agent
— Function
construct_agent
Construct the agent for DirectionalTMazeERExperiment.
DirectionalTMazeERExperiment.construct_env
— Function
construct_env
Construct the directional TMaze using:
size::Int
: Size of the hallway.
Intervention Experiment (Section 6)
DirectionalTMazeInterventionExperiment
— Module
DirectionalTMazeInterventionExperiment
An experiment to compare different RNN cells using the ActionRNNs.DirectionalTMaze
environment.
Usage is detailed in the docs for the functions below.
DirectionalTMazeInterventionExperiment.main_experiment
— Function
main_experiment
Run an experiment from config. See DirectionalTMazeInterventionExperiment.working_experiment
for details on running on the command line and DirectionalTMazeInterventionExperiment.default_config
for info about the default configuration.
DirectionalTMazeInterventionExperiment.working_experiment
— Function
working_experiment
Creates a wrapper experiment that calls the main experiment with progress=true and testing=true, using the default_config updated with the supplied keyword arguments.
DirectionalTMazeInterventionExperiment.default_config
— Function
Automatically generated docs for DirectionalTMazeInterventionExperiment config.
Experiment details.
seed::Int
: Seed of the RNG.
steps::Int
: Number of steps taken in the experiment.
Logging Extras
By default the experiment will log and save (depending on the synopsis flag) the logging group :EXP. You can add extra logging groups and [group, name] pairs using the arguments below. Everything added to save_extras is logged automatically and passed to the save operation. The groups and names added to log_extras are omitted from save_results but are still passed back to the user through the data dict.
log_extras::Vector{Union{String, Vector{String}}}
: Which groups and [group, name] pairs to log to the data dict. These will not be passed to save.
save_extras::Vector{Union{String, Vector{String}}}
: Which groups and [group, name] pairs to log to the data dict. These will be passed to save.
Environment details
This experiment uses the DirectionalTMaze environment. The usable args are:
size::Int
: Size of the hallway in directional tmaze.
Agent details
RNN
The RNN used for this experiment and its total hidden size, as well as a flag to use (or not use) Zhu et al.'s deep action network.
cell::String
: The type of cell. Many types are possible.
deepaction::Bool
: Whether to use Zhu et al.'s deep action for RNNs idea.
internal_a::Int
: The size of the action representation layer when deepaction=true.
numhidden::Int
: Size of the hidden state in RNNs.
Optimizer details
Flux optimizers are used. See the Flux documentation and ExpUtils.Flux.get_optimizer for details.
opt::String
: The name of the optimizer used.
Additional parameters are defined by the particular optimizer.
Learning update and replay details including:
Replay:
replay_size::Int
: How many transitions are stored in the replay.
warm_up::Int
: How many steps for warm-up (i.e. before learning begins).
Update details:
lupdate::String
: Learning update name.
gamma::Float
: The discount for the learning update.
batch_size::Int
: Size of the batch.
truncation::Int
: Length of sequences used for training.
update_wait::Int
: Time between updates (counted in agent interactions).
target_update_wait::Int
: Time between target network updates (counted in agent interactions).
hs_strategy::String
: Strategy for dealing with the hidden state in the buffer.
Intervention details:
inter_list::String
: Points to the constructor for the list of interventions to run on the agent after training.
"DTMazeV1"
: Testing the start intervention and middle intervention.
inter_freeze_training::Bool
: Whether to pause training when performing interventions.
inter_num_episodes::Int
: Number of episodes to perform interventions (these come after the training episodes).
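The intervention arguments could be layered onto a base configuration as in the fragment below. The Dict-with-String-keys layout and the numeric values are assumptions made for illustration; only the key names and the "DTMazeV1" option come from the descriptions above.

```julia
# Illustrative base settings (placeholder values, not the package defaults).
base_config = Dict{String, Any}(
    "seed" => 1,
    "steps" => 1000,
    "size" => 6,
)

# Intervention-specific settings using the documented key names.
intervention_settings = Dict{String, Any}(
    "inter_list" => "DTMazeV1",        # start and middle interventions
    "inter_freeze_training" => true,   # pause training during interventions
    "inter_num_episodes" => 100,       # illustrative episode count
)

# merge returns a new Dict; base_config is left unchanged.
intervention_config = merge(base_config, intervention_settings)
```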
DirectionalTMazeInterventionExperiment.get_ann_size
— Function
get_ann_size
Helper function which constructs the environment and agent using default config and kwargs then returns the number of parameters in the model.
DirectionalTMazeInterventionExperiment.construct_agent
— Function
construct_agent
Construct the agent for DirectionalTMazeInterventionExperiment.
DirectionalTMazeInterventionExperiment.construct_env
— Function
construct_env
Construct the directional TMaze using:
size::Int
: Size of the hallway.