1. Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press. In Chapter 10, "Sequence Modeling: Recurrent and Recursive Nets," Section 10.2.1, "Recurrent Neural Networks," it is explained: "This recurrence allows the network to have 'memory' as the output at a time step t is a function of the hidden state at t-1... The hidden state h(t) serves as a summary of the past sequence of inputs up to t." (pp. 371-372).
2. Stanford University. (n.d.). CS230: Deep Learning, Lecture Notes: Sequence Models. Retrieved from Stanford University Courseware. The notes describe the core RNN equation as h_t = g(W_h [h_{t-1}, x_t] + b_h), explicitly showing that the hidden state at time t (h_t) is a function of the previous hidden state (h_{t-1}), thus retaining context.
3. Lipton, Z. C., Berkowitz, J., & Elkan, C. (2015). A Critical Review of Recurrent Neural Networks for Sequence Learning. arXiv preprint arXiv:1506.00019. Section 2, "Recurrent Neural Network Models," states: "At each time step t, the hidden state h_t of the RNN is updated by a function f... taking as input the previous hidden state h_{t-1} and the current input x_t... This recurrent formulation allows the network to store information about the inputs it has processed so far in its hidden state." (p. 2). https://doi.org/10.48550/arXiv.1506.00019