Research Article

The Hessian by blocks for neural network by backward propagation

Article: 2327102 | Received 18 May 2023, Accepted 01 Mar 2024, Published online: 23 Apr 2024
