Temporal Difference Learning
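Temporal difference (TD) learning is a method for learning value functions from experience: each value estimate is adjusted toward a target formed from the observed reward and the current estimate for the following state. It was first described by Sutton (1988).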
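As a rough illustration, the sketch below shows a single tabular TD(0) update, the simplest one-step member of the TD family. It is not taken from the cited paper: the function name `td0_update`, the dictionary-based value table, the step size `alpha`, the discount `gamma`, and the two-state episode are all assumptions made for this example.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=1.0, terminal=False):
    """Apply one tabular TD(0) update in place and return the TD error.

    `values` is a hypothetical dict mapping states to value estimates;
    unseen states are treated as having value 0.0.
    """
    current = values.get(state, 0.0)
    # A terminal transition has no successor value to bootstrap from.
    next_value = 0.0 if terminal else values.get(next_state, 0.0)
    # TD error: bootstrapped target minus the current estimate.
    td_error = reward + gamma * next_value - current
    # Move the estimate a fraction alpha of the way toward the target.
    values[state] = current + alpha * td_error
    return td_error


# Hypothetical episode: A -> B -> terminal, with reward 1.0 on the last step.
values = {}
for _ in range(2):
    td0_update(values, "A", 0.0, "B")
    td0_update(values, "B", 1.0, None, terminal=True)
```

On the first pass only the estimate for B changes, because A's target still relies on B's initial value of zero. On the second pass A's estimate starts to rise as well, which shows how information propagates backwards one step at a time under this kind of update.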
References
Sutton, Richard. 1988. “Learning to Predict by the Methods of Temporal Differences.” Machine Learning 3 (1): 9–44. doi:10.1007/BF00115009.