A Fast Learning Algorithm for Deep Belief Nets. Hinton, G. E., Osindero, S., & Teh, Y. W. Neural Computation, 18(7):1527–1554, July 2006.
Abstract: We show how to use “complementary priors” to eliminate the explaining away effects that make inference difficult in densely-connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modelled by long ravines in the free-energy landscape of the top-level associative memory and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.
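The greedy algorithm trains the network as a stack of restricted Boltzmann machines (RBMs): each layer is learned with one-step contrastive divergence (CD-1), and its hidden activations then serve as the training data for the layer above. Below is a minimal illustrative sketch of that pretraining phase only (it omits the contrastive wake-sleep fine-tuning and the label units); it assumes binary logistic units, and the function names, layer sizes, learning rate, and epoch counts are hypothetical choices, not the authors' code.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=5, lr=0.05, batch=100):
    """Train one RBM with CD-1; returns (W, b_vis, b_hid)."""
    n_vis = data.shape[1]
    W = 0.01 * rng.standard_normal((n_vis, n_hidden))
    b_vis = np.zeros(n_vis)
    b_hid = np.zeros(n_hidden)
    for _ in range(epochs):
        for i in range(0, len(data), batch):
            v0 = data[i:i + batch]
            # Positive phase: hidden probabilities given the data.
            h0 = sigmoid(v0 @ W + b_hid)
            h_sample = (rng.random(h0.shape) < h0).astype(float)
            # One Gibbs step: reconstruct visibles, re-infer hiddens.
            v1 = sigmoid(h_sample @ W.T + b_vis)
            h1 = sigmoid(v1 @ W + b_hid)
            # CD-1 update: <v h>_data minus <v h>_reconstruction.
            W += lr * (v0.T @ h0 - v1.T @ h1) / len(v0)
            b_vis += lr * (v0 - v1).mean(axis=0)
            b_hid += lr * (h0 - h1).mean(axis=0)
    return W, b_vis, b_hid

def greedy_pretrain(data, layer_sizes):
    """Stack RBMs one layer at a time: each new RBM is trained on
    the hidden activations of the layer beneath it."""
    layers, x = [], data
    for n_hidden in layer_sizes:
        W, b_vis, b_hid = train_rbm(x, n_hidden)
        layers.append((W, b_vis, b_hid))
        x = sigmoid(x @ W + b_hid)  # input to the next layer
    return layers

For the digit model described in the paper, the call would be something like greedy_pretrain(images, [500, 500, 2000]); in the full model the 2000-unit top layer, together with the label units, forms the undirected associative memory that is then fine-tuned with the contrastive wake-sleep procedure.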
@article{hinton_fast_2006,
title = {A {Fast} {Learning} {Algorithm} for {Deep} {Belief} {Nets}},
volume = {18},
issn = {0899-7667, 1530-888X},
url = {https://direct.mit.edu/neco/article/18/7/1527-1554/7065},
doi = {10.1162/neco.2006.18.7.1527},
abstract = {We show how to use “complementary priors” to eliminate the explaining away effects that make inference difficult in densely-connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modelled by long ravines in the free-energy landscape of the top-level associative memory and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.},
language = {en},
number = {7},
urldate = {2025-10-03},
journal = {Neural Computation},
author = {Hinton, Geoffrey E. and Osindero, Simon and Teh, Yee-Whye},
month = jul,
year = {2006},
pages = {1527--1554},
}
{"_id":{"_str":"534259570e946d920a000986"},"__v":19,"authorIDs":["2ACfCTBEv4pRPLwBb","4DQrsTmafKuPbvKom","5457dd852abc8e9f3700082c","546eb8ddec3c47a518000fdb","5de76c4f179cbdde01000135","5de7b92fbc280fdf01000192","5de7e861c8f9f6df01000188","5de7ff309b61e8de0100005f","5de917d35d589edf01000025","5de93bf8b8c3f8de010000a3","5de95819d574c6de010000d5","5de96615d574c6de010001ab","5de9faf7fac96fde01000039","5dea1112fac96fde01000194","5deb75f49e04d1df010000c8","5deb8542b62591df0100002d","5deb946fb62591df010000ef","5decb37d93ac84df01000108","5dece9a3619535de010000f9","5dee20da584fb4df0100023f","5dee5ebb773914de01000077","5dee6b12773914de0100015a","5deea5af0ceb4cdf01000193","5deee4cc66e59ade01000133","5def23c6e83f7dde0100003c","5def2e39e83f7dde010000a6","5def601cfe2024de01000084","5defdd35090769df01000181","5df0938cf651f5df01000056","5df0980df651f5df010000a2","5df0c74096fa76de01000024","5df0eda045b054df010000fb","5df2008fe4cb4ede01000035","5df2583563aac8df010000ad","5df25ae963aac8df010000dd","5df28978cf8320de0100001f","5df3756223fb6fdf010000fe","5df38d112b1f8ade01000086","5df3f9cad1756cdf01000039","5df4ca0755b997de0100009a","5df4cd8055b997de010000c2","5df53e56fd245cde01000125","5df60b78a37a40df01000156","5df62fce38e915de0100004b","5df6491ddf30fcdf0100003d","5df67503797ba9de01000104","5df6983872bbd4df01000160","5df6b0e031a37ade01000178","5df789d35c8a36df010000f7","5df7c23392a8e4df010000da","5df7dafbdc100cde010000e1","5df7e65edc100cde010001c6","5df89d4010b1d1de01000088","5df8b0cee6b510df01000021","5df93745d04b27df01000185","5df9d77138a7afde01000084","5dfa483ced5baede0100011b","5dfa67a37d1403df01000123","5dfbc3f34705b7de01000022","5dfcc5cc7a3608de0100004f","5dfe49bfbfbabdde01000004","5e1dc9478d71ddde0100015d","5e29d9d0888177df0100011e","5e48c117f1ed39de0100008d","5e555c0ee89e5fde010000e6","5e55fa1c819fabdf0100003a","5e5b04db6e568ade0100001f","5hGMdsfN7BrXW6K8T","5vmPz2jJcYQdtZPiZ","6yoSqPPyPrLdz8e5Q","BYkXaBeGZENiggkom","Bm98SYMoSNDbYwKGj","EsmZfHTQHAoi4zrJ2","N6cuxqTfG9ybhWDqZ","PXRdnhZs2CXY9NLhX","Q7zrKooGeSy8NTBjC","QxWxCp32GcmNqJ9K2","WnMtdN4pbnNcAtJ9C","e3ZEg6YfZmhHyjxdZ","exw99o2vqr9d3BXtB","fnGMsMDrpkcjCLZ5X","gN5Lfqjgx8P4c7HJT","gxtJ9RRRnpW2hQdtv","hCHC3WLvySqxwH4eZ","jN4BRAzEpDg6bmHmM","mBpuinLcpSzpxcFaz","n3Tju5NZ6trek5XEM","n3hXojCsQTaqGTPyY","ovEhxZqGLG9hGfrun","rnZ6cT67qkowNdLgz","u6Fai3nvyHwLKZpPn","vcz5Swk9goZXRki2G","x9kDqsoXq57J2bEu5","xmZk6XEacSsFbo2Sy","xufS6EqKGDqRQs47H"],"author_short":["Hinton, G. E.","Osindero, S.","Teh, Y."],"bibbaseid":"hinton-osindero-teh-afastlearningalgorithmfordeepbeliefnets-2006","bibdata":{"bibtype":"article","type":"article","title":"A Fast Learning Algorithm for Deep Belief Nets","volume":"18","issn":"0899-7667, 1530-888X","url":"https://direct.mit.edu/neco/article/18/7/1527-1554/7065","doi":"10.1162/neco.2006.18.7.1527","abstract":"We show how to use “complementary priors” to eliminate the explaining away effects that make inference difficult in densely-connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. 
This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modelled by long ravines in the free-energy landscape of the top-level associative memory and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.","language":"en","number":"7","urldate":"2025-10-03","journal":"Neural Computation","author":[{"propositions":[],"lastnames":["Hinton"],"firstnames":["Geoffrey","E."],"suffixes":[]},{"propositions":[],"lastnames":["Osindero"],"firstnames":["Simon"],"suffixes":[]},{"propositions":[],"lastnames":["Teh"],"firstnames":["Yee-Whye"],"suffixes":[]}],"month":"July","year":"2006","pages":"1527–1554","bibtex":"@article{hinton_fast_2006,\n\ttitle = {A {Fast} {Learning} {Algorithm} for {Deep} {Belief} {Nets}},\n\tvolume = {18},\n\tissn = {0899-7667, 1530-888X},\n\turl = {https://direct.mit.edu/neco/article/18/7/1527-1554/7065},\n\tdoi = {10.1162/neco.2006.18.7.1527},\n\tabstract = {We show how to use “complementary priors” to eliminate the explaining away effects that make inference difficult in densely-connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modelled by long ravines in the free-energy landscape of the top-level associative memory and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.},\n\tlanguage = {en},\n\tnumber = {7},\n\turldate = {2025-10-03},\n\tjournal = {Neural Computation},\n\tauthor = {Hinton, Geoffrey E. and Osindero, Simon and Teh, Yee-Whye},\n\tmonth = jul,\n\tyear = {2006},\n\tpages = {1527--1554},\n}\n\n\n\n","author_short":["Hinton, G. E.","Osindero, S.","Teh, Y."],"key":"hinton_fast_2006","id":"hinton_fast_2006","bibbaseid":"hinton-osindero-teh-afastlearningalgorithmfordeepbeliefnets-2006","role":"author","urls":{"Paper":"https://direct.mit.edu/neco/article/18/7/1527-1554/7065"},"metadata":{"authorlinks":{"hinton, g":"https://bibbase.org/show?bib=www.cs.toronto.edu/~fritz/master3.bib&theme=side"}},"downloads":4},"bibtype":"article","biburl":"https://bibbase.org/zotero-group/schulzkx/5158478","downloads":4,"keywords":[],"search_terms":["fast","learning","algorithm","deep","belief","nets","hinton","osindero","teh"],"title":"A Fast Learning Algorithm for Deep Belief Nets","year":2006,"dataSources":["avdRdTCKoXoyxo2tQ","GtChgCdrAm62yoP3L","Lou3KcoFGgAb5MdCt","C5FtkvWWggFfMJTFX","cx4WvnDhXJhiLqdQo","nZHrFJKyxKKDaWYM8","qAPjQpsx8e9aJNrSa","yw8ZqdtKGJAFyrDpm","JFDnASMkoQCjjGL8E"]}