LSTM

tags
Recurrent Neural Network, Neural Network, Machine Learning
source
https://colah.github.io/posts/2015-08-Understanding-LSTMs/

Recurrent Neural Network

See other extensions: LSTM, wu2016, chandar2019, goudreau1994, sutskever2011, cho2014

chandar2019: Towards Non-saturating Recurrent Units for Modelling Long-term Dependencies

This paper introduces a new recurrent cell called the NRU, for Non-saturating Recurrent Unit. Two main contributions make this architecture distinct from other cells (e.g., the LSTM).
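
As a rough, purely illustrative sketch of what "non-saturating" means here (this is not the NRU architecture from the paper), compare the gradient of a saturating sigmoid gate with that of a ReLU-style activation for large pre-activations:

```python
import numpy as np

# Not the NRU itself: just an illustration of why saturating activations
# (sigmoid/tanh) shrink gradients for large pre-activations, while a
# non-saturating activation such as ReLU keeps them alive.
z = np.array([-6.0, -1.0, 0.0, 1.0, 6.0])

sigmoid = 1.0 / (1.0 + np.exp(-z))
sigmoid_grad = sigmoid * (1.0 - sigmoid)   # ~0 at the tails (saturation)
relu_grad = (z > 0).astype(float)          # stays 1 for any positive z

print(sigmoid_grad)   # approx [0.0025 0.1966 0.25 0.1966 0.0025]
print(relu_grad)      # [0. 0. 0. 1. 1.]
```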

chung2014: Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling

This paper presents an empirical evaluation of several recurrent units, including the LSTM hochreiter1997, the GRU cho2014, and the vanilla RNN. It also describes each of the cells tested and gives a nice high-level description of the generative model employed by RNNs.

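A minimal sketch of the two gated cells the paper compares, written with separate per-gate weight matrices for readability (sizes, initialization, and the omission of biases are arbitrary choices here, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
D, H = 8, 16   # input and hidden sizes (arbitrary for the demo)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step (hochreiter1997); biases omitted for brevity."""
    i = sigmoid(p["Wi"] @ x + p["Ui"] @ h_prev)   # input gate
    f = sigmoid(p["Wf"] @ x + p["Uf"] @ h_prev)   # forget gate
    o = sigmoid(p["Wo"] @ x + p["Uo"] @ h_prev)   # output gate
    g = np.tanh(p["Wg"] @ x + p["Ug"] @ h_prev)   # candidate cell content
    c = f * c_prev + i * g                        # cell state update
    h = o * np.tanh(c)                            # exposed hidden state
    return h, c

def gru_step(x, h_prev, p):
    """One GRU step (cho2014); biases omitted for brevity."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h_prev)              # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h_prev)              # reset gate
    h_tilde = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h_prev))  # candidate state
    return (1 - z) * h_prev + z * h_tilde                    # interpolated state

def params(names):
    # "W*" matrices read the input (H x D), "U*" matrices read the state (H x H).
    return {n: rng.normal(scale=0.1, size=(H, D if n.startswith("W") else H))
            for n in names}

lstm_p = params(["Wi", "Ui", "Wf", "Uf", "Wo", "Uo", "Wg", "Ug"])
gru_p = params(["Wz", "Uz", "Wr", "Ur", "Wh", "Uh"])

h_l, c_l, h_g = np.zeros(H), np.zeros(H), np.zeros(H)
for t in range(5):                      # run both cells over a short random sequence
    x_t = rng.normal(size=D)
    h_l, c_l = lstm_step(x_t, h_l, c_l, lstm_p)
    h_g = gru_step(x_t, h_g, gru_p)
print(h_l.shape, h_g.shape)             # (16,) (16,)
```

The structural contrast the sketch is meant to show: the LSTM keeps a separate cell state c alongside the hidden state h, while the GRU folds everything into a single state updated by interpolation between the old state and a candidate.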

sutskever2011: Generating text with recurrent neural networks

The main contribution of this paper is the application of RNNs to a hard language task (character-level text generation), demonstrating their potential for language and other sequence tasks. Instead of using the usual vanilla RNN or an LSTM, they introduce the idea of multiplicative RNNs and tensor RNNs, and find that these significantly improve performance on the task. They mention that multiplicative RNNs have some optimization issues, which are mitigated through the use of second-order optimization techniques.
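
A minimal sketch of the factored multiplicative-RNN update, where the effective hidden-to-hidden matrix depends on the current input through a layer of factor units (shapes and initialization are illustrative assumptions, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(0)
D, H, M = 8, 16, 16   # input, hidden, and factor ("multiplicative") sizes

def mrnn_step(x, h_prev, p):
    # Factored input-dependent recurrence: the effective hidden-to-hidden
    # matrix is W_hm @ diag(W_mx @ x) @ W_mh, realised via the factor units m.
    m = (p["W_mx"] @ x) * (p["W_mh"] @ h_prev)     # multiplicative factor units
    return np.tanh(p["W_hm"] @ m + p["W_hx"] @ x)  # new hidden state

p = {
    "W_mx": rng.normal(scale=0.1, size=(M, D)),
    "W_mh": rng.normal(scale=0.1, size=(M, H)),
    "W_hm": rng.normal(scale=0.1, size=(H, M)),
    "W_hx": rng.normal(scale=0.1, size=(H, D)),
}

h = np.zeros(H)
for t in range(5):                # roll the mRNN over a short random sequence
    h = mrnn_step(rng.normal(size=D), h, p)
print(h.shape)                    # (16,)
```

The factorization is what makes this practical: a full tensor RNN would give every input its own hidden-to-hidden matrix, while the multiplicative RNN approximates that with a rank-limited factor layer.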