1. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
2. Goodfellow, I., Bengio, Y. & Courville, A. Deep Learning (MIT Press, 2016).
3. Robbins, H. & Monro, S. A stochastic approximation method. Ann. Math. Statist. 22, 400–407 (1951).
4. Bottou, L. Online algorithms and stochastic approximations. In Saad, D. (ed.) Online Learning and Neural Networks (Cambridge University Press, Cambridge, UK, 1998). Revised, Oct. 2012.
5. Sutskever, I., Martens, J., Dahl, G. & Hinton, G. On the importance of initialization and momentum in deep learning. In Proceedings of the 30th International Conference on Machine Learning - Volume 28, ICML'13, III–1139–III–1147 (JMLR.org, 2013).