Directional TMaze

Experience Replay Experiment

DirectionalTMazeERExperiment.default_config (Function)

Automatically generated docs for DirectionalTMazeERExperiment config.

Experiment details

  • seed::Int: Seed of the RNG.
  • steps::Int: Number of steps taken in the experiment.
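
For example, a run can be configured by overriding these entries (a minimal sketch; it assumes default_config() takes no arguments and returns a Dict keyed by the argument names listed here, and the values shown are illustrative rather than defaults):

  # Minimal sketch: fetch the default config and override the experiment-level settings.
  config = DirectionalTMazeERExperiment.default_config()
  config["seed"] = 2          # RNG seed
  config["steps"] = 300_000   # total environment steps for the run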

Logging Extras

By default the experiment will log and save (depending on the synopsis flag) the logging group :EXP. You can add extra logging groups and [group, name] pairs using the arguments below. Everything added to save_extras will be passed to the save operation and will also be logged automatically. The groups and names added to log_extras will be omitted from save_results but still passed back to the user through the data dict.

  • log_extras::Vector{Union{String, Vector{String}}}: which groups and names to log to the data dict. These will not be passed to save.
  • save_extras::Vector{Union{String, Vector{String}}}: which groups and names to log to the data dict. These will also be passed to save.
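
Continuing the sketch above, extra logging could be requested like this (the group and name strings are illustrative, not defaults):

  # Log an extra group and a [group, name] pair; only save_extras reaches the save operation.
  config["log_extras"]  = ["EXPExtra", ["Agent", "hidden_state"]]
  config["save_extras"] = ["EXP"]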

Environment details

This experiment uses the DirectionalTMaze environment. The usable args are:

  • size::Int: Size of the hallway in the directional T-maze.
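
For example, continuing the sketch above (the value is illustrative):

  config["size"] = 10   # length of the hallway in the directional T-maze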

Agent details

RNN

These arguments select the RNN cell used for this experiment and its total hidden size, as well as a flag to use (or not use) Zhu et al.'s deep action network; a configuration sketch follows the list.

  • cell::String: The type of RNN cell. Many cell types are available.
  • deepaction::Bool: Whether to use Zhu et al.'s deep action network for RNNs.
  • internal_a::Int: The size of the action representation layer when deepaction=true.
  • numhidden::Int: Size of hidden state in RNNs.
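
Continuing the sketch above, the agent could be configured as follows (the cell name and values are illustrative; valid cell names depend on which cells the package defines):

  config["cell"] = "GRU"         # assumed cell name; check the package for the available cell types
  config["numhidden"] = 64       # size of the RNN hidden state
  config["deepaction"] = true    # use Zhu et al.'s deep action network
  config["internal_a"] = 16      # action-representation size, only used when deepaction = true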

Optimizer details

Flux optimizers are used. See the Flux documentation and ExpUtils.Flux.get_optimizer for details.

  • opt::String: The name of the optimizer used.
  • Additional parameters defined by the particular optimizer.
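
Continuing the sketch above (the optimizer name and the extra parameter key are assumptions for illustration; the actual keys depend on how ExpUtils.Flux.get_optimizer maps them to Flux optimisers):

  config["opt"] = "ADAM"    # optimizer name passed to ExpUtils.Flux.get_optimizer
  config["eta"] = 0.001     # learning rate; key name assumed for this sketch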

Learning update and replay details including:

  • Replay:

    • replay_size::Int: How many transitions are stored in the replay buffer.
    • warm_up::Int: How many steps for warm-up (i.e. before learning begins).
  • Update details:

    • lupdate::String: Name of the learning update.
    • gamma::Float: Discount factor for the learning update.
    • batch_size::Int: Size of the training batch.
    • truncation::Int: Length of sequences used for training.
    • update_wait::Int: Time between updates (counted in agent interactions).
    • target_update_wait::Int: Time between target network updates (counted in agent interactions).
    • hs_strategy::String: Strategy for dealing with the hidden state in the replay buffer.
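
Continuing the sketch above, the replay and update settings could look like this (all values are illustrative; the lupdate and hs_strategy names are assumptions, so check the package for the valid options):

  config["replay_size"] = 10_000        # transitions stored in the replay buffer
  config["warm_up"] = 1_000             # steps before learning begins
  config["lupdate"] = "QLearning"       # learning-update name (assumed)
  config["gamma"] = 0.99                # discount factor
  config["batch_size"] = 8              # batch size
  config["truncation"] = 12             # training sequence length
  config["update_wait"] = 4             # agent interactions between updates
  config["target_update_wait"] = 1_000  # agent interactions between target-network updates
  config["hs_strategy"] = "minimize"    # hidden-state strategy (assumed)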

Default performance:

Time: 0:02:28
  episode:    5385
  successes:  0.8351648351648352
  loss:       1.0
  l1:         0.0
  action:     2
  preds:      Float32[0.369189, 0.48326853, 0.993273]

Intervention Experiment (Section 6)

DirectionalTMazeInterventionExperiment.default_config (Function)

Automatically generated docs for DirectionalTMazeInterventionExperiment config.

Experiment details

  • seed::Int: Seed of the RNG.
  • steps::Int: Number of steps taken in the experiment.

Logging Extras

By default the experiment will log and save (depending on the synopsis flag) the logging group :EXP. You can add extra logging groups and [group, name] pairs using the arguments below. Everything added to save_extras will be passed to the save operation and will also be logged automatically. The groups and names added to log_extras will be omitted from save_results but still passed back to the user through the data dict.

  • log_extras::Vector{Union{String, Vector{String}}}: which groups and names to log to the data dict. These will not be passed to save.
  • save_extras::Vector{Union{String, Vector{String}}}: which groups and names to log to the data dict. These will also be passed to save.

Environment details

This experiment uses the DirectionalTMaze environment. The usable args are:

  • size::Int: Size of the hallway in the directional T-maze.

Agent details

RNN

These arguments select the RNN cell used for this experiment and its total hidden size, as well as a flag to use (or not use) Zhu et al.'s deep action network.

  • cell::String: The type of RNN cell. Many cell types are available.
  • deepaction::Bool: Whether to use Zhu et al.'s deep action network for RNNs.
  • internal_a::Int: The size of the action representation layer when deepaction=true.
  • numhidden::Int: Size of hidden state in RNNs.

Optimizer details

Flux optimizers are used. See the Flux documentation and ExpUtils.Flux.get_optimizer for details.

  • opt::String: The name of the optimizer used.
  • Additional parameters defined by the particular optimizer.

Learning update and replay details including:

  • Replay:

    • replay_size::Int: How many transitions are stored in the replay buffer.
    • warm_up::Int: How many steps for warm-up (i.e. before learning begins).
  • Update details:

    • lupdate::String: Name of the learning update.
    • gamma::Float: Discount factor for the learning update.
    • batch_size::Int: Size of the training batch.
    • truncation::Int: Length of sequences used for training.
    • update_wait::Int: Time between updates (counted in agent interactions).
    • target_update_wait::Int: Time between target network updates (counted in agent interactions).
    • hs_strategy::String: Strategy for dealing with the hidden state in the replay buffer.
  • Intervention details:

    • inter_list::String: Points to the constructor for the list of interventions to run on the agent after training.

      • "DTMazeV1": Tests the start intervention and the middle intervention.
    • inter_freeze_training::Bool: Whether to pause training while performing interventions.
    • inter_num_episodes::Int: Number of episodes over which interventions are performed (these run after the training episodes).
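
As a sketch, the intervention-specific settings could be set like this (it assumes default_config() takes no arguments and returns a Dict keyed by the argument names; only "DTMazeV1" is documented above as an intervention list, and the other values are illustrative):

  config = DirectionalTMazeInterventionExperiment.default_config()
  config["inter_list"] = "DTMazeV1"       # run the start and middle interventions
  config["inter_freeze_training"] = true  # pause training while interventions run
  config["inter_num_episodes"] = 100      # episodes of interventions after training (illustrative value)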
