A repository containing the Julia code used to create the majority of the results in “Importance Resampling for Off-policy Prediction”.

Core Idea

Our contribution is a new approach to using importance weights for off-policy prediction when learning general value functions. Specifically, we introduce Importance Resampling, the first application of sampling importance-resampling to reinforcement learning. Importance Resampling samples mini-batches from an experience replay buffer according to a probability mass function whose probabilities are proportional to each transition’s importance ratio between the target and behavior policies (\pi/\mu). We provide theory on the consistency of Importance Resampling under typical conditions in RL, and we show improved sample efficiency compared to reweighting.
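To make the core sampling step concrete, below is a minimal Julia sketch (Julia being the repository’s language). The `Transition` struct, its fields, and the `sample_minibatch` helper are hypothetical illustrations rather than the repository’s actual code; the sketch uses the weighted `sample` from StatsBase.jl.

```julia
# A minimal sketch of the resampling step, assuming a simple replay buffer.
# `Transition`, its fields, and `sample_minibatch` are illustrative names,
# not the repository's actual API.
using StatsBase  # provides `sample` and `Weights`

struct Transition
    s            # state
    a            # action
    r            # reward
    s′           # next state
    ρ::Float64   # importance ratio π(a|s) / μ(a|s), stored at insertion time
end

# Draw a mini-batch with probability proportional to each transition's
# importance ratio; the resulting updates need no per-sample reweighting.
function sample_minibatch(buffer::Vector{Transition}, batch_size::Int)
    ρs = [t.ρ for t in buffer]
    idx = sample(1:length(buffer), Weights(ρs), batch_size; replace=true)
    return buffer[idx]
end
```

Sampling with replacement keeps each draw an independent sample from the resampling distribution over the buffer, which is the setting the consistency analysis considers.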

Other Resources

These resources were used at the NeurIPS conference, which took place in December 2019.

Authors

Acknowledgments

We would like to thank Huawei for their support, and especially for allowing a portion of this work to be completed during Matthew’s internship in the summer of 2018. We would also like to acknowledge the University of Alberta, the Alberta Machine Intelligence Institute, and NSERC for their continued funding and support.

Errata