Generalization Bounds: Perspectives from Information Theory and PAC-Bayes. Hellström, F., Durisi, G., Guedj, B., & Raginsky, M. 2024. Foundations and Trends in Machine Learning, accepted for publication.

Abstract: A fundamental question in theoretical machine learning is generalization. Over the past decades, the PAC-Bayesian approach has been established as a flexible framework to address the generalization capabilities of machine learning algorithms, and design new ones. Recently, it has garnered increased interest due to its potential applicability for a variety of learning algorithms, including deep neural networks. In parallel, an information-theoretic view of generalization has developed, wherein the relation between generalization and various information measures has been established. This framework is intimately connected to the PAC-Bayesian approach, and a number of results have been independently discovered in both strands. In this monograph, we highlight this strong connection and present a unified treatment of generalization. We present techniques and results that the two perspectives have in common, and discuss the approaches and interpretations that differ. In particular, we demonstrate how many proofs in the area share a modular structure, through which the underlying ideas can be intuited. We pay special attention to the conditional mutual information (CMI) framework; analytical studies of the information complexity of learning algorithms; and the application of the proposed methods to deep learning. This monograph is intended to provide a comprehensive introduction to information-theoretic generalization bounds and their connection to PAC-Bayes, serving as a foundation from which the most recent developments are accessible. It is aimed broadly towards researchers with an interest in generalization and theoretical machine learning.
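As a brief orientation for readers new to the area, the following standard results from the literature (illustrative only, not reproduced from the monograph) show the connection between the two strands. Write L_\mu(w) for the population risk, L_S(w) for the empirical risk on a training set S of n i.i.d. samples, and W for the hypothesis returned by the learning algorithm. The information-theoretic bound of Xu and Raginsky (2017) states that, if the loss \ell(w, Z) is \sigma-sub-Gaussian for every w, the expected generalization gap is controlled by the mutual information between hypothesis and data:

\[ \bigl| \mathbb{E}\bigl[ L_\mu(W) - L_S(W) \bigr] \bigr| \le \sqrt{\frac{2\sigma^2 \, I(W;S)}{n}}. \]

On the PAC-Bayesian side, a common form of McAllester's bound states that, for a loss bounded in [0,1] and any prior \pi chosen before seeing S, with probability at least 1-\delta over S, every posterior \rho satisfies

\[ \mathbb{E}_{W \sim \rho}\bigl[ L_\mu(W) \bigr] \le \mathbb{E}_{W \sim \rho}\bigl[ L_S(W) \bigr] + \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln(2\sqrt{n}/\delta)}{2n}}. \]

The two are linked by the identity \mathbb{E}_S\bigl[ \mathrm{KL}(P_{W|S} \,\|\, P_W) \bigr] = I(W;S): choosing the PAC-Bayesian prior as the marginal distribution of W and averaging over the data turns the KL complexity term into the mutual information appearing in the first bound.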
@unpublished{hellstrom2023generalisation,
  title = {Generalization Bounds: Perspectives from Information Theory and {PAC-Bayes}},
  author = {Fredrik Hellström and Giuseppe Durisi and Benjamin Guedj and Maxim Raginsky},
  year = {2024},
  journal = {Foundations and Trends in Machine Learning},
  note = {Accepted for publication.},
  abstract = {A fundamental question in theoretical machine learning is generalization. Over the past decades, the PAC-Bayesian approach has been established as a flexible framework to address the generalization capabilities of machine learning algorithms, and design new ones. Recently, it has garnered increased interest due to its potential applicability for a variety of learning algorithms, including deep neural networks. In parallel, an information-theoretic view of generalization has developed, wherein the relation between generalization and various information measures has been established. This framework is intimately connected to the PAC-Bayesian approach, and a number of results have been independently discovered in both strands. In this monograph, we highlight this strong connection and present a unified treatment of generalization. We present techniques and results that the two perspectives have in common, and discuss the approaches and interpretations that differ. In particular, we demonstrate how many proofs in the area share a modular structure, through which the underlying ideas can be intuited. We pay special attention to the conditional mutual information (CMI) framework; analytical studies of the information complexity of learning algorithms; and the application of the proposed methods to deep learning. This monograph is intended to provide a comprehensive introduction to information-theoretic generalization bounds and their connection to PAC-Bayes, serving as a foundation from which the most recent developments are accessible. It is aimed broadly towards researchers with an interest in generalization and theoretical machine learning.},
  url = {https://arxiv.org/abs/2309.04381},
  url_PDF = {https://arxiv.org/pdf/2309.04381.pdf},
  doi = {10.48550/arXiv.2309.04381},
  eprint = {2309.04381},
  archivePrefix = {arXiv},
  primaryClass = {cs.LG},
  copyright = {Creative Commons Attribution Non Commercial Share Alike 4.0 International},
  keywords = {mine}
}
{"_id":"WJAjYJwJuC5uZu96p","bibbaseid":"hellstrm-durisi-guedj-raginsky-generalizationboundsperspectivesfrominformationtheoryandpacbayes-2024","author_short":["Hellström, F.","Durisi, G.","Guedj, B.","Raginsky, M."],"bibdata":{"bibtype":"unpublished","type":"unpublished","title":"Generalization Bounds: Perspectives from Information Theory and PAC-Bayes","author":[{"firstnames":["Fredrik"],"propositions":[],"lastnames":["Hellström"],"suffixes":[]},{"firstnames":["Giuseppe"],"propositions":[],"lastnames":["Durisi"],"suffixes":[]},{"firstnames":["Benjamin"],"propositions":[],"lastnames":["Guedj"],"suffixes":[]},{"firstnames":["Maxim"],"propositions":[],"lastnames":["Raginsky"],"suffixes":[]}],"year":"2024","journal":"Foundations and Trends in Machine Learning","publisher":"","note":"Accepted for publication.","abstract":"A fundamental question in theoretical machine learning is generalization. Over the past decades, the PAC-Bayesian approach has been established as a flexible framework to address the generalization capabilities of machine learning algorithms, and design new ones. Recently, it has garnered increased interest due to its potential applicability for a variety of learning algorithms, including deep neural networks. In parallel, an information-theoretic view of generalization has developed, wherein the relation between generalization and various information measures has been established. This framework is intimately connected to the PAC-Bayesian approach, and a number of results have been independently discovered in both strands. In this monograph, we highlight this strong connection and present a unified treatment of generalization. We present techniques and results that the two perspectives have in common, and discuss the approaches and interpretations that differ. In particular, we demonstrate how many proofs in the area share a modular structure, through which the underlying ideas can be intuited. We pay special attention to the conditional mutual information (CMI) framework; analytical studies of the information complexity of learning algorithms; and the application of the proposed methods to deep learning. This monograph is intended to provide a comprehensive introduction to information-theoretic generalization bounds and their connection to PAC-Bayes, serving as a foundation from which the most recent developments are accessible. It is aimed broadly towards researchers with an interest in generalization and theoretical machine learning.","url":"https://arxiv.org/abs/2309.04381","url_pdf":"https://arxiv.org/pdf/2309.04381.pdf","doi":"10.48550/arXiv.2309.04381","eprint":"2309.04381","archiveprefix":"arXiv","primaryclass":"cs.LG","copyright":"Creative Commons Attribution Non Commercial Share Alike 4.0 International","keywords":"mine","bibtex":"@unpublished{hellstrom2023generalisation,\ntitle = {Generalization Bounds: Perspectives from Information Theory and {PAC-Bayes}},\nauthor={Fredrik Hellström and Giuseppe Durisi and Benjamin Guedj and Maxim Raginsky},\nyear={2024},\njournal = {Foundations and Trends in Machine Learning},\npublisher = {},\nnote = \"Accepted for publication.\",\nabstract = {A fundamental question in theoretical machine learning is generalization. Over the past decades, the PAC-Bayesian approach has been established as a flexible framework to address the generalization capabilities of machine learning algorithms, and design new ones. 
Recently, it has garnered increased interest due to its potential applicability for a variety of learning algorithms, including deep neural networks. In parallel, an information-theoretic view of generalization has developed, wherein the relation between generalization and various information measures has been established. This framework is intimately connected to the PAC-Bayesian approach, and a number of results have been independently discovered in both strands. In this monograph, we highlight this strong connection and present a unified treatment of generalization. We present techniques and results that the two perspectives have in common, and discuss the approaches and interpretations that differ. In particular, we demonstrate how many proofs in the area share a modular structure, through which the underlying ideas can be intuited. We pay special attention to the conditional mutual information (CMI) framework; analytical studies of the information complexity of learning algorithms; and the application of the proposed methods to deep learning. This monograph is intended to provide a comprehensive introduction to information-theoretic generalization bounds and their connection to PAC-Bayes, serving as a foundation from which the most recent developments are accessible. It is aimed broadly towards researchers with an interest in generalization and theoretical machine learning.},\nurl = {https://arxiv.org/abs/2309.04381},\nurl_PDF = {https://arxiv.org/pdf/2309.04381.pdf},\ndoi = {10.48550/arXiv.2309.04381},\neprint={2309.04381},\narchivePrefix={arXiv},\nprimaryClass={cs.LG},\ncopyright = {Creative Commons Attribution Non Commercial Share Alike 4.0 International},\nkeywords={mine}\n}\n\n","author_short":["Hellström, F.","Durisi, G.","Guedj, B.","Raginsky, M."],"key":"hellstrom2023generalisation","id":"hellstrom2023generalisation","bibbaseid":"hellstrm-durisi-guedj-raginsky-generalizationboundsperspectivesfrominformationtheoryandpacbayes-2024","role":"author","urls":{"Paper":"https://arxiv.org/abs/2309.04381"," pdf":"https://arxiv.org/pdf/2309.04381.pdf"},"keyword":["mine"],"metadata":{"authorlinks":{}},"downloads":12,"html":""},"bibtype":"unpublished","biburl":"https://bguedj.github.io/files/bguedj-publications.bib","dataSources":["suE7RgYeZEnSYr5Fy"],"keywords":["mine"],"search_terms":["generalization","bounds","perspectives","information","theory","pac","bayes","hellström","durisi","guedj","raginsky"],"title":"Generalization Bounds: Perspectives from Information Theory and PAC-Bayes","year":2024,"downloads":12}