General Documentation
This page hosts the general documentation for the ActionRNNs.jl library, including all research code used in this project.
Contents
Index
ActionRNNs.MaskedGridWorldHelpers
ActionRNNs.AbstractActionRNN
ActionRNNs.AbstractERAgent
ActionRNNs.ActionDense
ActionRNNs.CircularBuffer
ActionRNNs.DRQNAgent
ActionRNNs.DRTDNAgent
ActionRNNs.DirectionalTMaze
ActionRNNs.ExpUtils.FluxUtils.RMSPropTF
ActionRNNs.ExpUtils.FluxUtils.RMSPropTFCentered
ActionRNNs.HSMinimize
ActionRNNs.HSRefil
ActionRNNs.HSStale
ActionRNNs.LinkedChainsV2
ActionRNNs.LunarLander
ActionRNNs.MaskedGridWorld
ActionRNNs.QLearning
ActionRNNs.RingWorld
ActionRNNs.StateBuffer
ActionRNNs.TMaze
ActionRNNs.UpdateTimer
ActionRNNs.ϵGreedy
ActionRNNs.ϵGreedyDecay
ActionRNNs.AAGRU
ActionRNNs.AALSTM
ActionRNNs.AARNN
ActionRNNs.ActionGatedRNN
ActionRNNs.CaddAAGRU
ActionRNNs.CaddElGRU
ActionRNNs.CaddElRNN
ActionRNNs.CaddGRU
ActionRNNs.CaddMAGRU
ActionRNNs.CaddRNN
ActionRNNs.CcatGRU
ActionRNNs.CcatRNN
ActionRNNs.CsoftmaxElGRU
ActionRNNs.CsoftmaxElRNN
ActionRNNs.ExpUtils.FluxUtils.get_optimizer
ActionRNNs.FacMAGRU
ActionRNNs.FacMARNN
ActionRNNs.FacTucMAGRU
ActionRNNs.FacTucMARNN
ActionRNNs.GAIAGRU
ActionRNNs.GAIALSTM
ActionRNNs.GAIARNN
ActionRNNs.GAIGRU
ActionRNNs.GAUGRU
ActionRNNs.MAGRU
ActionRNNs.MALSTM
ActionRNNs.MARNN
ActionRNNs.MixElGRU
ActionRNNs.MixElRNN
ActionRNNs.MixGRU
ActionRNNs.MixRNN
ActionRNNs._needs_action_input
ActionRNNs.build_gating_network
ActionRNNs.build_new_feat
ActionRNNs.build_rnn_layer
ActionRNNs.capacity
ActionRNNs.contains_comp
ActionRNNs.contains_layer_type
ActionRNNs.contains_rnn_type
ActionRNNs.find_layers_with_eq
ActionRNNs.find_layers_with_recur
ActionRNNs.get_action_and_prob
ActionRNNs.get_device
ActionRNNs.get_hs_details_for_er
ActionRNNs.get_hs_from_experience!
ActionRNNs.get_hs_replay_strategy
ActionRNNs.get_hs_symbol_list
ActionRNNs.get_information_from_experience
ActionRNNs.get_learning_update
ActionRNNs.get_model
ActionRNNs.get_prob
ActionRNNs.get_replay
ActionRNNs.get_state_from_experience
ActionRNNs.hs_symbol_layer
ActionRNNs.make_obs_list
ActionRNNs.make_replay
ActionRNNs.modify_hs_in_er!
ActionRNNs.modify_hs_in_er_by_grad!
ActionRNNs.needs_action_input
ActionRNNs.reset!
ActionRNNs.sample
ActionRNNs.set_training_mode!
ActionRNNs.training_mode
ActionRNNs.update!
ActionRNNs.update!
ActionRNNs.update!
ActionRNNs.update_target_network!
Base.length
Base.push!
HelpfulKernelFuncs.contract_WA
HelpfulKernelFuncs.get_waa
MinimalRLCore.start!
MinimalRLCore.step!
MinimalRLCore.step!
Cells
ActionRNNs.AbstractActionRNN
— Type
AbstractActionRNN
An abstract struct which will take the current hidden state and a tuple of observations and actions and returns the next hidden state.
ActionRNNs._needs_action_input
— Function
_needs_action_input
If true, this means the cell or layer needs a tuple as input.
Basic Cells
ActionRNNs.AARNN
— Function
AARNN(in::Integer, actions::Integer, out::Integer, σ = tanh)
Like an RNN cell, except it takes a tuple (action, observation) as input. The action is used with get_waa, with the result added to the usual update.
The update is: σ.(Wi*o .+ get_waa(Wa, a) .+ Wh*h .+ b)
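The update above can be sketched in plain Julia. The shapes and the integer-action method of get_waa are illustrative assumptions for this example, not the library's internals:

```julia
# A minimal sketch of the AARNN update, assuming get_waa(Wa, a::Int)
# selects the a-th column of Wa.
σ = tanh
n_obs, n_act, n_out = 4, 3, 5
Wi = randn(n_out, n_obs)   # observation weights
Wa = randn(n_out, n_act)   # per-action additive weights
Wh = randn(n_out, n_out)   # recurrent weights
b  = zeros(n_out)

get_waa(Wa, a::Int) = Wa[:, a]

h = zeros(n_out)           # previous hidden state
o, a = randn(n_obs), 2     # (observation, action) input tuple

h′ = σ.(Wi * o .+ get_waa(Wa, a) .+ Wh * h .+ b)
```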
ActionRNNs.AAGRU
— Function
AAGRU(in, actions, out)
Additive Action Gated Recurrent Unit layer. Behaves like an AARNN but uses a GRU internal structure.
ActionRNNs.AALSTM
— Function
AALSTM(in::Integer, na::Integer, out::Integer)
Additive Action Long Short Term Memory recurrent layer. Behaves like an RNN but generally exhibits a longer memory span over sequences. See this article for a good overview of the internals.
ActionRNNs.MARNN
— Function
MARNN(in::Integer, actions::Integer, out::Integer, σ = tanh)
This cell incorporates the action as a multiplicative operation, using contract_WA and get_waa.
The update is as follows:
new_h = σ.(contract_WA(m.Wx, a, o) .+ contract_WA(m.Wh, a, h) .+ get_waa(m.b, a))
ActionRNNs.MAGRU
— Function
MAGRU(in, actions, out)
Multiplicative Action Gated Recurrent Unit layer. Behaves like an MARNN but uses a GRU internal structure.
ActionRNNs.MALSTM
— Function
MALSTM(in::Integer, na::Integer, out::Integer)
Multiplicative Action Long Short Term Memory recurrent layer. Behaves like an RNN but generally exhibits a longer memory span over sequences. See this article for a good overview of the internals.
ActionRNNs.FacMARNN
— Function
FacMARNN(in::Integer, actions::Integer, out::Integer, factors, σ = tanh; init_style="ignore")
This cell incorporates the action as a multiplicative operation, but as a factored approximation of the multiplicative version, using get_waa. It uses a CP decomposition.
The update is as follows:
new_h = m.σ.(W*((Wx*o .+ Wh*h) .* get_waa(Wa, a)) .+ get_waa(m.b, a))
Three init_styles:
- standard: uses init and initb without any keywords.
- ignore: W = init(out, factors, ignore_dims=2)
- tensor: decomposes W_t = init(actions, out, in+out; ignore_dims=1) into W_o, W_a, W_hi using TensorToolbox.cp_als.
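The factored update can be sketched as follows. Dimensions and the integer-action get_waa are assumptions for the example only, not the library's constructors:

```julia
# Illustrative sketch of the factored (CP) update:
# new_h = σ.(W * ((Wx*o .+ Wh*h) .* get_waa(Wa, a)) .+ get_waa(b, a))
σ = tanh
n_obs, n_act, n_out, n_fac = 4, 3, 5, 6
W  = randn(n_out, n_fac)   # maps factor space back to the hidden state
Wx = randn(n_fac, n_obs)
Wh = randn(n_fac, n_out)
Wa = randn(n_fac, n_act)
b  = randn(n_out, n_act)

get_waa(M, a::Int) = M[:, a]

h, o, a = zeros(n_out), randn(n_obs), 1
new_h = σ.(W * ((Wx * o .+ Wh * h) .* get_waa(Wa, a)) .+ get_waa(b, a))
```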
ActionRNNs.FacMAGRU
— Function
FacMAGRU(in, actions, out, factors)
Factored Multiplicative Action Gated Recurrent Unit layer. Behaves like a FacMARNN but uses a GRU internal structure.
Three init_styles:
- standard: uses init and initb without any keywords.
- ignore: W = init(out, factors, ignore_dims=2)
- tensor: decomposes W_t = init(actions, out, in+out; ignore_dims=1) into W_o, W_a, W_hi using TensorToolbox.cp_als.
ActionRNNs.FacTucMARNN
— Function
FacTucMARNN(in::Integer, actions::Integer, out::Integer, action_factors, out_factors, in_factors, σ = tanh; init_style="ignore")
This cell incorporates the action as a multiplicative operation, but as a factored approximation of the multiplicative version, using get_waa. It uses a Tucker decomposition.
Three init_styles:
- standard: uses init and initb without any keywords.
- ignore: Wa = init(action_factors, actions; ignore_dims=2)
ActionRNNs.FacTucMAGRU
— Function
FacTucMAGRU(in, actions, out, factors)
Factored Multiplicative Action Gated Recurrent Unit layer. Behaves like a FacTucMARNN but uses a GRU internal structure.
Combo Cells
ActionRNNs.CaddRNN
— Function
CaddRNN(in, actions, out, σ = tanh)
Mixes an AARNN and an MARNN through the weighting
h′ = (w[1]*new_hAA + w[2]*new_hMA) ./ sum(w)
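A worked example of this scalar mixing; the weights and candidate hidden states below are made up for illustration:

```julia
# w holds one learned weight per candidate hidden state: the additive
# (AA) and multiplicative (MA) candidates are blended, then normalized.
w = [0.25, 0.75]
new_hAA = [1.0, 0.0]   # additive-cell candidate
new_hMA = [0.0, 1.0]   # multiplicative-cell candidate
h′ = (w[1] .* new_hAA .+ w[2] .* new_hMA) ./ sum(w)
# h′ == [0.25, 0.75]
```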
ActionRNNs.CaddGRU
— Function
CaddGRU(in, actions, out, σ = tanh)
Mixes an AAGRU and an MAGRU through the weighting
h′ = (w[1]*new_hAA + w[2]*new_hMA) ./ sum(w)
ActionRNNs.CaddAAGRU
— Function
CaddAAGRU(in, actions, out)
Mixes two AAGRU cells through the weighting
h′ = (w[1]*new_hAA1 + w[2]*new_hAA2) ./ sum(w)
ActionRNNs.CaddMAGRU
— Function
CaddMAGRU(in, actions, out)
Mixes two MAGRU cells through the weighting
h′ = (w[1]*new_hMA1 + w[2]*new_hMA2) ./ sum(w)
ActionRNNs.CaddElRNN
— Function
CaddElRNN(in, actions, out, σ = tanh)
Mixes an AARNN and an MARNN through the element-wise weighting
h′ = (AA_θ .* AA_h′ .+ MA_θ .* MA_h′) ./ (AA_θ .+ MA_θ)
ActionRNNs.CaddElGRU
— Function
CaddElGRU(in, actions, out)
Mixes an AAGRU and an MAGRU through the element-wise weighting
h′ = (AA_θ .* AA_h′ .+ MA_θ .* MA_h′) ./ (AA_θ .+ MA_θ)
ActionRNNs.CcatRNN
— Function
ActionRNNs.CcatGRU
— Function
ActionRNNs.CsoftmaxElRNN
— Function
CaddElRNN(in, actions, out, σ = tanh)
Mixes an AARNN and an MARNN through the element-wise weighting
h′ = (AA_θ .* AA_h′ .+ MA_θ .* MA_h′) ./ (AA_θ .+ MA_θ)
ActionRNNs.CsoftmaxElGRU
— Function
CaddElGRU(in, actions, out)
Mixes an AAGRU and an MAGRU through the element-wise weighting
h′ = (AA_θ .* AA_h′ .+ MA_θ .* MA_h′) ./ (AA_θ .+ MA_θ)
Mixed Cells
ActionRNNs.MixRNN
— Function
MixRNN(in, actions, out, num_experts, σ = tanh)
Mixes num_experts AARNN cells using the weighting
h′ = sum(θ[i] .* expert_h′[i] for i in 1:length(θ)) ./ sum(θ)
ActionRNNs.MixElRNN
— Function
MixElRNN(in, actions, out, num_experts, σ = tanh)
Mixes num_experts AARNN cells using the weighting
h′ = sum(θ[i] .* expert_h′[i] for i in 1:length(θ)) ./ sum(θ)
(here θ[i] is a vector).
ActionRNNs.MixGRU
— Function
MixGRU(in, actions, out, num_experts)
Mixes num_experts AAGRU cells using the weighting
h′ = sum(θ[i] .* expert_h′[i] for i in 1:length(θ)) ./ sum(θ)
ActionRNNs.MixElGRU
— Function
MixElGRU(in, actions, out, num_experts)
Mixes num_experts AAGRU cells using the weighting
h′ = sum(θ[i] .* expert_h′[i] for i in 1:length(θ)) ./ sum(θ)
(here θ[i] is a vector).
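A worked example of the expert weighting used by the Mix* cells; the weights and expert candidates below are made up for illustration:

```julia
# θ holds one weight per expert cell; the candidate hidden states are
# blended and then normalized by the total weight.
θ = [1.0, 3.0]
expert_h′ = [[1.0, 0.0], [0.0, 1.0]]
h′ = sum(θ[i] .* expert_h′[i] for i in 1:length(θ)) ./ sum(θ)
# h′ == [0.25, 0.75]
```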
ActionRNNs.ActionGatedRNN
— Function
ActionGatedRNN(in::Integer, na, internal, out::Integer, σ = tanh)
The most basic recurrent layer; essentially acts as a Dense
layer, but with the output fed back into the input each time step.
Old/Defunct Cells
ActionRNNs.GAUGRU
— Function
GAUGRU(in::Integer, na::Integer, internal::Integer, out::Integer)
Gated Action Input Gated Recurrent Unit layer. Behaves like an RNN but generally exhibits a longer memory span over sequences. See this article for a good overview of the internals.
ActionRNNs.GAIGRU
— Function
GAIGRU(in::Integer, na::Integer, internal::Integer, out::Integer)
Gated Action Input Gated Recurrent Unit layer. Behaves like an RNN but generally exhibits a longer memory span over sequences. See this article for a good overview of the internals.
ActionRNNs.GAIARNN
— Function
GAIARNN(in::Integer, na, internal, out::Integer, σ = tanh)
The most basic recurrent layer; essentially acts as a Dense
layer, but with the output fed back into the input each time step.
ActionRNNs.GAIAGRU
— Function
GAIAGRU(in::Integer, na::Integer, internal::Integer, out::Integer)
Gated Action Input Gated Recurrent Unit layer. Behaves like an RNN but generally exhibits a longer memory span over sequences. See this article for a good overview of the internals.
ActionRNNs.GAIALSTM
— Function
GAIALSTM(in::Integer, na::Integer, out::Integer)
Gated Action Input by Action Long Short Term Memory recurrent layer. Behaves like an RNN but generally exhibits a longer memory span over sequences. See this article for a good overview of the internals.
Shared operations for cells
HelpfulKernelFuncs.contract_WA
— Function
contract_WA(W, a::Int, x)
contract_WA(W, a::AbstractVector{Int}, x)
contract_WA(W, a::AbstractVector{<:AbstractFloat}, x)
contract_WA(W::CuArray, a::AbstractVector{Int}, x)
This contraction operator takes the weights W, an action (or action vector for batches) a, and the features x. The weight matrix is assumed to have shape nactions × out × in.
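For a single integer action, the contraction reduces to an ordinary matrix-vector product over the action's slice. This plain-Julia sketch is illustrative, not the library's (GPU-aware) implementation:

```julia
# With W of size nactions × out × in, select the action's out × in slice
# and multiply by the feature vector.
contract_WA_sketch(W, a::Int, x) = W[a, :, :] * x

W = zeros(3, 2, 2)               # nactions × out × in
W[2, :, :] = [1.0 0.0; 0.0 1.0]  # identity slice for action 2
contract_WA_sketch(W, 2, [5.0, 7.0])
# == [5.0, 7.0]
```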
HelpfulKernelFuncs.get_waa
— Function
get_waa(Wa, a)
Different ways of handling getting an action value from a set of weights. This operation can be seen as Wa*a, where Wa is the weight matrix and a is the action representation. It is used by the various cells to incorporate this operation more reliably.
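The two common cases can be sketched as follows; the sketch (and the integer-indexing convention) is an assumption of this example, not the library's exact code:

```julia
# An integer action indexes a column of Wa, while a one-hot (or
# probability) vector multiplies through as Wa * a.
get_waa_sketch(Wa, a::Int) = Wa[:, a]
get_waa_sketch(Wa, a::AbstractVector) = Wa * a

Wa = [1.0 2.0; 3.0 4.0]
get_waa_sketch(Wa, 2)           # == [2.0, 4.0]
get_waa_sketch(Wa, [0.0, 1.0])  # == [2.0, 4.0]
```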
Other Layers
ActionRNNs.ActionDense
— Type
ActionDense(in, na, out, σ; init, bias)
Create an action-conditioned Dense layer. This layer takes a tuple (action, observation) and applies the dense layer using an additive approach. It can be used with previous or current actions.
Learning Updates
ActionRNNs.QLearning
— Type
QLearning
QLearningMSE(γ)
QLearningSUM(γ)
QLearningHUBER(γ)
Watkins Q-learning with various loss functions.
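The variants share the Watkins Q-learning target and differ only in the loss applied to (target - estimate). A sketch of that target; function and argument names here are illustrative, not the library's:

```julia
# r: reward, γ: discount, q_tp1: next-state action values,
# terminal: 1.0 if the episode ended, else 0.0.
q_target(r, γ, q_tp1, terminal) =
    r + γ * (1 - terminal) * maximum(q_tp1)

q_target(1.0, 0.5, [0.5, 2.0], 0.0)  # == 2.0
q_target(1.0, 0.5, [0.5, 2.0], 1.0)  # == 1.0 (bootstrapping cut off)
```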
Constructors
ActionRNNs.build_rnn_layer
— Function
build_rnn_layer(in, actions, out, parsed, rng)
Build an RNN layer according to parsed. This assumes the "cell" key is in the parsed dict. in, actions, and out are integers; an RNG must be passed in explicitly.
The layer constructor is looked up in either the ActionRNNs or Flux namespace.
Types of build types
build_rnn_layer(::BuildActionRNN, args...; kwargs...)
Standard Additive and Multiplicative cells. No extra parameters.
build_rnn_layer(::BuildFactored, args...; kwargs...)
Factored (not Tucker) cells. Extra config options:
- init_style::String: the style of init. Check your cell for possible options.
- factors::Int: number of factors in the factorization.
build_rnn_layer(::BuildTucFactored, args...; kwargs...)
Tucker-factored cells. Extra config options:
- in_factors::Int: number of factors in the input matrix.
- action_factors::Int: number of factors in the action matrix.
- out_factors::Int: number of factors in the output matrix.
build_rnn_layer(::BuildComboCat, args...; kwargs...)
Combo cat AA/MA cells. No Extra Params.
build_rnn_layer(::BuildComboAdd, args...; kwargs...)
Combo add AA/MA cells. No Extra Params.
build_rnn_layer(::BuildMixed, args...; kwargs...)
Mixed layers. Extra config options:
- num_experts::Int: number of parallel cells in the mixture.
build_rnn_layer(::BuildFlux, args...; kwargs...)
Flux cell. No extra parameters.
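A hypothetical config sketch tying the pieces together; only the "cell" key is documented above, and the "factors" key applies only to the factored build types:

```julia
# parsed dict for build_rnn_layer; the constructor named by "cell" is
# looked up in ActionRNNs, then Flux.
parsed = Dict(
    "cell" => "FacMAGRU",
    "factors" => 8,          # only read by the factored build types
    "init_style" => "ignore",
)
# using Random
# layer = build_rnn_layer(n_obs, n_actions, n_out, parsed, Random.default_rng())
```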
ActionRNNs.build_gating_network
— Function
build_gating_network
[[out, activation]]
Agents
Experience Replay Agents
ActionRNNs.AbstractERAgent
— Type
AbstractERAgent
The abstract struct for building experience replay agents.
An example agent:
mutable struct DRQNAgent{ER, Φ, Π, HS<:AbstractMatrix{Float32}} <: AbstractERAgent
    lu::LearningUpdate
    opt::O
    model::C
    target_network::CT
    build_features::F
    state_list::DataStructures.CircularBuffer{Φ}
    hidden_state_init::Dict{Symbol, HS}
    replay::ER
    update_timer::UpdateTimer
    target_update_timer::UpdateTimer
    batch_size::Int
    τ::Int
    s_t::Φ
    π::Π
    γ::Float32
    action::Int
    am1::Int
    action_prob::Float64
    hs_learnable::Bool
    beg::Bool
    cur_step::Int
    hs_tr_init::Dict{Symbol, HS}
end
Instantiations
ActionRNNs.DRQNAgent
— Type
DRQNAgent
An intense function... lol.
ActionRNNs.DRTDNAgent
— Type
Basic DRQNAgent.
Implementation details
ActionRNNs.get_replay
— Function
get_replay(agent::AbstractERAgent)
Get the replay buffer from the agent.
ActionRNNs.get_learning_update
— Function
get_learning_update(agent::AbstractERAgent)
Get the learning update from the agent.
ActionRNNs.get_device
— Function
get_device(agent::AbstractERAgent)
Get the current device from the agent.
ActionRNNs.get_action_and_prob
— Function
get_action_and_prob(π, values, rng)
Get an action and the associated probability of taking it.
ActionRNNs.get_model
— Function
get_model(agent::AbstractERAgent)
Return the model from the agent.
MinimalRLCore.start!
— Method
MinimalRLCore.start!(agent::AbstractERAgent, s, rng; kwargs...)
Start the agent for a new episode.
MinimalRLCore.step!
— Method
MinimalRLCore.step!(agent::AbstractERAgent, env_s_tp1, r, terminal, rng; kwargs...)
step! for an experience replay agent.
MinimalRLCore.step!
— Function
MinimalRLCore.step!(agent::AbstractERAgent, env_s_tp1, r, terminal, rng; kwargs...)
step! for an experience replay agent.
ActionRNNs.training_mode
— Function
training_mode(agent::AbstractERAgent)
Returns whether the agent is in training mode.
ActionRNNs.set_training_mode!
— Function
set_training_mode!(agent::AbstractERAgent, mode::Bool)
Sets the training mode to the given boolean value.
ActionRNNs.update!
— Method
update!(agent::AbstractERAgent{<:ControlUpdate}, rng)
Update the parameters of the model.
ActionRNNs.update!
— Method
update!(agent::AbstractERAgent{<:PredictionUpdate}, rng)
Update the parameters of the model.
ActionRNNs.update!
— Function
update!(agent::AbstractERAgent{<:ControlUpdate}, rng)
Update the parameters of the model.
update!(agent::AbstractERAgent{<:PredictionUpdate}, rng)
Update the parameters of the model.
ActionRNNs.update_target_network!
— Function
update_target_network!
Update the target network.
Online Agents
Tools/Utils
ActionRNNs.UpdateTimer
— Type
UpdateTimer
Keeps track of a timer for doing things in the agent on a schedule.
ActionRNNs.make_obs_list
— Function
make_obs_list
Makes the obs list and initial state used for recurrent networks in an agent. Uses an init function to define the init tuple.
ActionRNNs.build_new_feat
— Function
build_new_feat(agent, state, action)
Convenience for building a new feature vector.
Hidden state manipulation
ActionRNNs.HSStale
— Type
HSStale
ActionRNNs.HSMinimize
— Type
HSMinimize
ActionRNNs.HSRefil
— Type
HSRefil
ActionRNNs.get_hs_replay_strategy
— Function
get_hs_replay_strategy(agent::AbstractERAgent)
Get the replay strategy of the agent.
ActionRNNs.modify_hs_in_er!
— Function
modify_hs_in_er!(hs_strategy::Bool, args...; kwargs...)
Legacy method for hs_strategy as a boolean.
ActionRNNs.modify_hs_in_er_by_grad!
— Function
modify_hs_in_er_by_grad!
Updates the hidden state in the experience replay buffer.
ActionRNNs.reset!
— Function
reset!(m, h_init::Dict)
reset!(m::Flux.Recur, h_init)
Reset the hidden state according to the dict h_init, with keys from get_hs_symbol_list. If the model is a Recur, just replace the hidden state.
Replay buffer
ActionRNNs.CircularBuffer
— Type
CircularBuffer
Maintains a buffer of fixed size, without reallocating and deallocating memory, through a circular queue data structure.
ActionRNNs.StateBuffer
— Type
StateBuffer(size::Int, state_size)
A circular buffer for states. Typically used for images; can be used for state shapes up to 4d.
Base.length
— Method
length(buffer)
Returns the current amount of data in the circular buffer. If the full flag is true, returns the size of the whole buffer.
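The length rule can be illustrated with a toy circular buffer; this is a sketch of the behavior, not ActionRNNs.CircularBuffer:

```julia
# length is the number of items written until the buffer first wraps,
# then the fixed capacity.
mutable struct MiniCircular
    data::Vector{Float64}
    idx::Int     # next write position
    full::Bool
end
MiniCircular(n::Int) = MiniCircular(zeros(n), 1, false)

function Base.push!(b::MiniCircular, x)
    b.data[b.idx] = x
    if b.idx == length(b.data)
        b.full = true
        b.idx = 1
    else
        b.idx += 1
    end
    b
end
Base.length(b::MiniCircular) = b.full ? length(b.data) : b.idx - 1

b = MiniCircular(3)
push!(b, 1.0)
length(b)                           # == 1
foreach(x -> push!(b, x), 2.0:5.0)  # wraps past capacity
length(b)                           # == 3
```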
Base.push!
— Method
push!(buffer, data)
Adds data to the buffer, where data is an array of collections of the types defined in CircularBuffer.datatypes. Returns the row of the added data.
ActionRNNs.get_hs_details_for_er
— Function
get_hs_details_for_er(model)
Return the types, sizes, and symbols of the hidden state for the ER buffer.
ActionRNNs.hs_symbol_layer
— Function
hs_symbol_layer(l, idx)
Get the symbol of the current layer's hidden state.
ActionRNNs.get_hs_symbol_list
— Function
get_hs_symbol_list(model)
Get the list of hidden state symbols for all RNN layers.
ActionRNNs.get_state_from_experience
— Function
get_state_from_experience
Returns the state from experience sampled from an experience replay buffer. This assumes the replay has (:am1, :s, :a, :sp, :r, :t, :beg, hs_symbol...) as columns.
ActionRNNs.get_information_from_experience
— Function
get_information_from_experience(agent, exp)
Gets the tuple of required details for the update of the agent. This is dispatched on the type of learning update. You can use the helper abstract classes, or dispatch on your specific update.
ActionRNNs.make_replay
— Function
make_replay
ActionRNNs.get_hs_from_experience!
— Function
get_hs_from_experience!(model, exp::NamedTuple, hs_dict::Dict, device)
get_hs_from_experience!(model, exp::Vector, hs_dict::Dict, device)
Get the hidden state in the appropriate format from the experience (either a NamedTuple or a vector of NamedTuples).
ActionRNNs.capacity
— Function
capacity(buffer)
Returns the max number of elements the buffer can store.
Flux Chain Manipulation
ActionRNNs.contains_comp
— Function
contains_comp(comp::Function, model)
Check if a layer of a model returns true with comp.
ActionRNNs.find_layers_with_eq
— Function
find_layers_with_eq(eq::Function, model)
Takes a model and a function and returns the locations where the function returns true. This only supports chains composed twice (a chain of chains).
ActionRNNs.find_layers_with_recur
— Function
find_layers_with_recur(model)
Finds layers with recur. Uses find_layers_with_eq.
ActionRNNs.contains_rnn_type
— Function
contains_rnn_type(m, rnn_type)
Checks if the model has a specific RNN type.
ActionRNNs.needs_action_input
— Function
needs_action_input(m)
Checks if the model needs action input as a tuple.
ActionRNNs.contains_layer_type
— Function
contains_layer_type(model, type)
Check if the model has a specific layer type.
Policies
ActionRNNs.ϵGreedy
— Type
ϵGreedy(ϵ, action_set)
ϵGreedy(ϵ, num_actions)
Simple ϵ-greedy value policy.
ActionRNNs.ϵGreedyDecay
— Type
ϵGreedyDecay{AS}(ϵ_range, decay_period, warmup_steps, action_set::AS)
ϵGreedyDecay(ϵ_range, end_step, num_actions)
An acting policy that decays exploration linearly over time. This API will possibly change over time once a better way to specify decaying epsilon is figured out.
Arguments
- ϵ_range::Tuple{Float64, Float64}: (max epsilon, min epsilon)
- decay_period::Int: period over which epsilon decays
- warmup_steps::Int: number of steps before the decay starts
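The documented arguments imply a linear schedule along these lines; this is a sketch of the behavior, not the library's implementation:

```julia
# ϵ holds at ϵ_max for warmup_steps, then decays linearly to ϵ_min
# over decay_period steps and stays there.
function ϵ_at(step, ϵ_range, decay_period, warmup_steps)
    ϵ_max, ϵ_min = ϵ_range
    frac = clamp((step - warmup_steps) / decay_period, 0.0, 1.0)
    ϵ_max + frac * (ϵ_min - ϵ_max)
end

ϵ_at(0,    (1.0, 0.1), 100, 50)  # == 1.0 (still in warmup)
ϵ_at(75,   (1.0, 0.1), 100, 50)  # partway through the decay
ϵ_at(1000, (1.0, 0.1), 100, 50)  # ≈ 0.1 (fully decayed)
```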
ActionRNNs.get_prob
— Function
get_prob(ap::ϵGreedy, values, action)
Get the probability of an action according to the values.
ActionRNNs.sample
— Function
sample(ap::ϵGreedy, values, rng)
Select an action according to the values.
Feature Constructors
Environments
RingWorld
ActionRNNs.RingWorld
— Type
RingWorld
States: 1   2   3   ...   n
Vis:    1 <-> 0 <-> 0 <-> ... <-> 0 <-|
        ^-----------------------------|
chain_length: size (diameter) of the ring. actions: Forward or Backward.
LinkedChains
ActionRNNs.LinkedChainsV2
— Type
LinkedChains
termmode:
- CONT: No termination
- TERM: Terminate after chain
dynmode:
- STRAIGHT: high negative reward on wrong actions, but still progress through the chain
- JUMP: Jump to different chain on wrong action
- STUCK: Don't progress on wrong action
- JUMPSTUCK: Get "lost" with wrong actions, still being implemented.
TMaze
ActionRNNs.TMaze
— Type
TMaze
TMaze as defined by Bram Bakker.
DirectionalTMaze
ActionRNNs.DirectionalTMaze
— Type
DirectionalTMaze
Similar to ActionRNNs.TMaze but with a directional component overlaid on top. This also changes the observation structure: the agent must know which direction it is facing to get information about which goal is the good goal.
Masked Grid World
ActionRNNs.MaskedGridWorld
— Type
MaskedGridWorld
This grid world gives observations at a number of states, which are aliased (or not, depending on obs_strategy). The environment also has a pacman_wrapping flag which makes the edges wrap around.
- width::Int: width of the grid world
- height::Int: height of the grid world
- anchors::Int: number of anchors (Int), or a list of anchor states
- goals_or_rews: number of goals, a list of goals, or a list of rewards
- obs_strategy: which obs are returned, :seperate, :full, or :aliased
- pacman_wrapping::Bool: whether the walls are invisible and wrap around
ActionRNNs.MaskedGridWorldHelpers
— Module
MaskedGridWorldHelpers
Helper functions for the MaskedGridWorld environment.
Lunar Lander
ActionRNNs.LunarLander
— Type
LunarLander
FluxUtils Stuff
ActionRNNs.ExpUtils.FluxUtils.get_optimizer
— Function
get_optimizer
Return the Flux optimizer given a config dictionary. The optimizer name is found at the key "opt". The remaining parameters depend on the optimizer:
- OneParamInit: eta::Float
- TwoParamInit: eta::Float, rho::Float
- AdamParamInit: eta::Float, beta::Vector or (beta_m::Int, beta_v::Int)
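A hypothetical config sketch for a two-parameter optimizer; the "opt" key is documented above, while the specific optimizer name and values here are assumptions of the example:

```julia
# Config dict consumed by get_optimizer: the name lives at "opt" and
# the hyperparameters follow the parameter-init groups listed above.
config = Dict(
    "opt" => "RMSProp",  # a two-parameter (eta, rho) optimizer
    "eta" => 0.001,
    "rho" => 0.9,
)
# opt = ActionRNNs.ExpUtils.FluxUtils.get_optimizer(config)
```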
ActionRNNs.ExpUtils.FluxUtils.RMSPropTF
— Type
RMSPropTF(η, ρ)
Implements the RMSProp algorithm as implemented in TensorFlow.
- Learning Rate (η): defaults to 0.001
- Rho (ρ): defaults to 0.9
- Gamma (γ): defaults to 0.0
- Epsilon (ϵ): defaults to 1e-6
ActionRNNs.ExpUtils.FluxUtils.RMSPropTFCentered
— Type
RMSPropTFCentered(η, ρ)
Implements the centered version of the RMSProp algorithm as implemented in TensorFlow.
- Learning Rate (η): defaults to 0.001
- Rho (ρ): defaults to 0.9
- Gamma (γ): defaults to 0.0
- Epsilon (ϵ): defaults to 1e-6