Width of Minima Reached by Stochastic Gradient Descent is Influenced by Learning Rate to Batch Size Ratio. Jastrzębski, S., Kenton, Z., Arpit, D., Ballas, N., Fischer, A., Bengio, Y., & Storkey, A. J. In Artificial Neural Networks and Machine Learning - ICANN 2018 - 27th International Conference on Artificial Neural Networks, Rhodes, Greece, October 4-7, 2018, Proceedings, Part III, volume 11141 of Lecture Notes in Computer Science, pages 392–402, 2018. Springer.
@inproceedings{DBLP:conf/icann/JastrzebskiKABF18,
title = {Width of Minima Reached by Stochastic Gradient Descent is Influenced 
by Learning Rate to Batch Size Ratio},
author = {Stanis{\l}aw Jastrz{\k{e}}bski and 
Zachary Kenton and 
Devansh Arpit and 
Nicolas Ballas and 
Asja Fischer and 
Yoshua Bengio and 
Amos J. Storkey},
url = {https://doi.org/10.1007/978-3-030-01424-7_39},
doi = {10.1007/978-3-030-01424-7_39},
year = {2018},
date = {2018-01-01},
booktitle = {Artificial Neural Networks and Machine Learning - ICANN 2018 - 27th 
International Conference on Artificial Neural Networks, Rhodes, Greece, 
October 4-7, 2018, Proceedings, Part III},
volume = {11141},
pages = {392--402},
publisher = {Springer},
series = {Lecture Notes in Computer Science},
keywords = {76-8485, sda-pub},
pubstate = {published},
tppubtype = {inproceedings}
}