Mixture of Experts with Entropic Regularization for Data Classification. Peralta, B., Saavedra, A., Caro, L., & Soto, A. Entropy, 2019. Paper: https://www.mdpi.com/1099-4300/21/2/190

Abstract: Today, there is growing interest in the automatic classification of a variety of tasks, such as weather forecasting, product recommendation, intrusion detection, and people recognition. “Mixture-of-experts” is a well-known classification technique: a probabilistic model consisting of local expert classifiers weighted by a gate network, typically based on softmax functions, that together allow the model to learn complex patterns in the data. In this scheme, each data point is influenced by only one expert; as a result, training can be misguided on real datasets in which complex data need to be explained by multiple experts. In this work, we propose a variant of the regular mixture-of-experts model in which the classification cost is penalized by the Shannon entropy of the gating network's output, in order to avoid a “winner-takes-all” behavior of the gate. Experiments on several real datasets show the advantage of our approach, with improvements in mean accuracy of 3–6% on some datasets. In future work, we plan to embed feature selection into this model.
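To make the regularization concrete, the following is a minimal PyTorch sketch, not the authors' implementation: the linear experts, the mixture likelihood, the coefficient lam, and the sign convention (subtracting the gate's Shannon entropy from the loss, so that spread-out, non-winner-takes-all gate outputs are rewarded) are all illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class EntropicMoE(nn.Module):
    """Hypothetical sketch: linear experts mixed by a softmax gate."""
    def __init__(self, in_dim, n_classes, n_experts):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Linear(in_dim, n_classes) for _ in range(n_experts)])
        self.gate = nn.Linear(in_dim, n_experts)

    def forward(self, x):
        g = F.softmax(self.gate(x), dim=-1)                        # (B, K) gate weights
        logits = torch.stack([e(x) for e in self.experts], dim=1)  # (B, K, C)
        probs = F.softmax(logits, dim=-1)                          # per-expert class probabilities
        mix = torch.einsum('bk,bkc->bc', g, probs)                 # gate-weighted mixture
        return mix, g

def entropic_moe_loss(mix, g, y, lam=0.1):
    # Negative log-likelihood of the mixture prediction.
    nll = F.nll_loss(torch.log(mix + 1e-12), y)
    # Shannon entropy of the gate distribution; subtracting it
    # penalizes low-entropy ("winner-takes-all") gate outputs.
    entropy = -(g * torch.log(g + 1e-12)).sum(dim=-1).mean()
    return nll - lam * entropy

# Usage on random data (shapes are arbitrary):
model = EntropicMoE(in_dim=20, n_classes=3, n_experts=4)
x, y = torch.randn(32, 20), torch.randint(0, 3, (32,))
mix, g = model(x)
loss = entropic_moe_loss(mix, g, y)
loss.backward()

Here lam trades classification fit against how evenly the gate spreads responsibility across experts; lam = 0 recovers the regular mixture-of-experts objective.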
@article{Peralta:EtAl:2019,
  author   = {B. Peralta and A. Saavedra and L. Caro and A. Soto},
  title    = {Mixture of Experts with Entropic Regularization for Data Classification},
  journal  = {Entropy},
  volume   = {21},
  number   = {2},
  year     = {2019},
  abstract = {Today, there is growing interest in the automatic classification of a variety of tasks, such as weather forecasting, product recommendation, intrusion detection, and people recognition. “Mixture-of-experts” is a well-known classification technique: a probabilistic model consisting of local expert classifiers weighted by a gate network, typically based on softmax functions, that together allow the model to learn complex patterns in the data. In this scheme, each data point is influenced by only one expert; as a result, training can be misguided on real datasets in which complex data need to be explained by multiple experts. In this work, we propose a variant of the regular mixture-of-experts model in which the classification cost is penalized by the Shannon entropy of the gating network's output, in order to avoid a “winner-takes-all” behavior of the gate. Experiments on several real datasets show the advantage of our approach, with improvements in mean accuracy of 3–6\% on some datasets. In future work, we plan to embed feature selection into this model.},
  url      = {https://www.mdpi.com/1099-4300/21/2/190}
}
{"_id":"PTWkmw5HtfNfHXKiT","bibbaseid":"peralta-saavedra-caro-soto-mixtureofexpertswithentropicregularizationfordataclassification-2019","authorIDs":["32ZR23o2BFySHbtQK","3ear6KFZSRqbj6YeT","4Pq6KLaQ8jKGXHZWH","54578d9a2abc8e9f370004f0","5e126ca5a4cabfdf01000053","5e158f76f1f31adf01000118","5e16174bf67f7dde010003ad","5e1f631ae8f5ddde010000eb","5e1f7182e8f5ddde010001ff","5e26da3642065ede01000066","5e3acefaf2a00cdf010001c8","5e62c3aecb259cde010000f9","5e65830c6e5f4cf3010000e7","5e666dfc46e828de010002c9","6cMBYieMJhf6Nd58M","6w6sGsxYSK2Quk6yZ","7xDcntrrtC62vkWM5","ARw5ReidxxZii9TTZ","DQ4JRTTWkvKXtCNCp","GbYBJvxugXMriQwbi","HhRoRmBvwWfD4oLyK","JFk6x26H6LZMoht2n","JvArGGu5qM6EvSCvB","LpqQBhFH3PxepH9KY","MT4TkSGzAp69M3dGt","QFECgvB5v2i4j2Qzs","RKv56Kes3h6FwEa55","Rb9TkQ3KkhGAaNyXq","RdND8NxcJDsyZdkcK","SpKJ5YujbHKZnHc4v","TSRdcx4bbYKqcGbDg","W8ogS2GJa6sQKy26c","WTi3X2fT8dzBN5d8b","WfZbctNQYDBaiYW6n","XZny8xuqwfoxzhBCB","Xk2Q5qedS5MFHvjEW","bbARiTJLYS79ZMFbk","cBxsyeZ37EucQeBYK","cFyFQps7W3Sa2Wope","dGRBfr8zhMmbwK6eP","eRLgwkrEk7T7Lmzmf","fMYSCX8RMZap548vv","g6iKCQCFnJgKYYHaP","h2hTcQYuf2PB3oF8t","h83jBvZYJPJGutQrs","jAtuJBcGhng4Lq2Nd","pMoo2gotJcdDPwfrw","q5Zunk5Y2ruhw5vyq","rzNGhqxkbt2MvGY29","uC8ATA8AfngWpYLBq","uoJ7BKv28Q6TtPmPp","vMiJzqEKCsBxBEa3v","vQE6iTPpjxpuLip2Z","wQDRsDjhgpMJDGxWX","wbNg79jvDpzX9zHLK","wk86BgRiooBjy323E","zCbPxKnQGgDHiHMWn","zf9HENjsAzdWLMDAu"],"author_short":["Peralta, B.","Saavedra, A.","Caro, L.","Soto, A."],"bibdata":{"bibtype":"article","type":"article","author":[{"firstnames":["B."],"propositions":[],"lastnames":["Peralta"],"suffixes":[]},{"firstnames":["A."],"propositions":[],"lastnames":["Saavedra"],"suffixes":[]},{"firstnames":["L."],"propositions":[],"lastnames":["Caro"],"suffixes":[]},{"firstnames":["A."],"propositions":[],"lastnames":["Soto"],"suffixes":[]}],"title":"Mixture of Experts with Entropic Regularization for Data Classification","journal":"Entropy","volume":"21","number":"2","year":"2019","abstract":"Today, there is growing interest in the automatic classification of a variety of tasks, such as weather forecasting, product recommendations, intrusion detection, and people recognition.“Mixture-of-experts” is a well-known classification technique; it is a probabilistic model consisting of local expert classifiers weighted by a gate network that is typically based on softmax functions, combined with learnable complex patterns in data. In this scheme, one data point is influenced by only one expert; as a result, the training process can be misguided in real datasets for which complex data need to be explained by multiple experts. In this work, we propose a variant of the regular mixture-of-experts model. In the proposed model, the cost classification is penalized by the Shannon entropy of the gating network in order to avoid a “winner-takes-all” output for the gating network. Experiments show the advantage of our approach using several real datasets, with improvements in mean accuracy of 3–6% in some datasets. In future work, we plan to embed feature selection into this model.","url":"https://www.mdpi.com/1099-4300/21/2/190","bibtex":"@article{Peralta:EtAl:2019,\n Author = {B. Peralta and A. Saavedra and L. Caro and A. 
Soto},\n Title = {Mixture of Experts with Entropic Regularization for Data Classification},\n Journal = {Entropy},\n Volume = {21},\n Number = {2},\n Year = {2019},\n abstract = {Today, there is growing interest in the automatic classification of a variety of tasks, such as weather forecasting, product recommendations, intrusion detection, and people recognition.“Mixture-of-experts” is a well-known classification technique; it is a probabilistic model consisting of local expert classifiers weighted by a gate network that is typically based on softmax functions, combined with learnable complex patterns in data. In this scheme, one data point is influenced by only one expert; as a result, the training process can be misguided in real datasets for which complex data need to be explained by multiple experts. In this work, we propose a variant of the regular mixture-of-experts model. In the proposed model, the cost classification is penalized by the Shannon entropy of the gating network in order to avoid a “winner-takes-all” output for the gating network. Experiments show the advantage of our approach using several real datasets, with improvements in mean accuracy of 3–6\\% in some datasets. In future work, we plan to embed feature selection into this model.},\nurl = {https://www.mdpi.com/1099-4300/21/2/190}\n}\n\n","author_short":["Peralta, B.","Saavedra, A.","Caro, L.","Soto, A."],"key":"Peralta:EtAl:2019","id":"Peralta:EtAl:2019","bibbaseid":"peralta-saavedra-caro-soto-mixtureofexpertswithentropicregularizationfordataclassification-2019","role":"author","urls":{"Paper":"https://www.mdpi.com/1099-4300/21/2/190"},"metadata":{"authorlinks":{"soto, a":"https://asoto.ing.puc.cl/publications/"}}},"bibtype":"article","biburl":"https://asoto.ing.puc.cl/AlvaroPapers.bib","creationDate":"2019-07-23T03:49:04.553Z","downloads":1,"keywords":[],"search_terms":["mixture","experts","entropic","regularization","data","classification","peralta","saavedra","caro","soto"],"title":"Mixture of Experts with Entropic Regularization for Data Classification","year":2019,"dataSources":["3YPRCmmijLqF4qHXd","m8qFBfFbjk9qWjcmJ","QjT2DEZoWmQYxjHXS"]}