Unsupervised Anomaly Detection in Large Databases Using Bayesian Networks. Cansado, A. & Soto, A. *Applied Artificial Intelligence*, 22(4):309-330, 2008.

Today, there has been a massive proliferation of huge databases storing valuable information. The opportunities of an effective use of these new data sources are enormous; however, the huge size and dimensionality of current large databases call for new ideas to scale up current statistical and computational approaches. This paper presents an application of Artificial Intelligence technology to the problem of automatic detection of candidate anomalous records in a large database. We build our approach with three main goals in mind: 1) an effective detection of the records that are potentially anomalous, 2) a suitable selection of the subset of attributes that explains what makes a record anomalous, and 3) an efficient implementation that allows us to scale the approach to large databases. Our algorithm, called Bayesian Network Anomaly Detector (BNAD), uses the joint probability density function (pdf) provided by a Bayesian Network (BN) to achieve these goals. By using appropriate data structures, advanced caching techniques, the flexibility of Gaussian Mixture models, and the efficiency of BNs to model joint pdfs, BNAD manages to efficiently learn a suitable BN from a large dataset. We test BNAD using synthetic and real databases, the latter from the fields of manufacturing and astronomy, obtaining encouraging results.
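The general idea in the abstract, scoring each record by the joint density that a Bayesian network assigns to it and then asking which attribute contributes most to a low score, can be illustrated with a toy sketch. This is not the authors' BNAD implementation: the two-variable network `x -> y`, the linear-Gaussian conditional (swapped in for the Gaussian Mixture models the abstract mentions), the synthetic data, and all function names are assumptions made for illustration.

```python
import numpy as np

def gaussian_logpdf(x, mean, var):
    """Log density of a univariate Gaussian, evaluated elementwise."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def fit_chain(data):
    """Fit a hypothetical two-node network x -> y by maximum likelihood:
    x ~ N(mu, s2) and y | x ~ N(a * x + b, r2)."""
    x, y = data[:, 0], data[:, 1]
    mu, s2 = x.mean(), x.var()
    a, b = np.polyfit(x, y, 1)            # slope and intercept of y on x
    r2 = np.mean((y - (a * x + b)) ** 2)  # residual variance
    return mu, s2, a, b, r2

def log_terms(data, params):
    """Per-record log-density terms, one column per network node.
    The joint log density is the row sum (chain rule of the network)."""
    mu, s2, a, b, r2 = params
    x, y = data[:, 0], data[:, 1]
    lx = gaussian_logpdf(x, mu, s2)
    ly = gaussian_logpdf(y, a * x + b, r2)
    return np.column_stack([lx, ly])

# Synthetic data (assumed): y is close to 2x, plus one injected anomaly.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 500)
y = 2.0 * x + rng.normal(0.0, 0.1, 500)
data = np.column_stack([x, y])
data[0] = [0.0, 5.0]  # x is typical, but y breaks the y ~ 2x relation

params = fit_chain(data)
terms = log_terms(data, params)
joint = terms.sum(axis=1)

most_anomalous = int(np.argmin(joint))          # record with lowest joint density
blamed = int(np.argmin(terms[most_anomalous]))  # node contributing the lowest term
print(most_anomalous, blamed)
```

Because the joint log density factorizes into one term per node, the same terms that rank the records also give a rough indication of which attribute makes a record unusual, which mirrors the second goal listed in the abstract.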

@Article{ cansado:soto:2008,
author = {A. Cansado and A. Soto},
title = {Unsupervised Anomaly Detection in Large Databases Using
Bayesian Networks},
journal = {Applied Artificial Intelligence},
volume = {22},
number = {4},
pages = {309--330},
year = {2008},
abstract = {Today, there has been a massive proliferation of huge
databases storing valuable information. The opportunities
of an effective use of these new data sources are enormous;
however, the huge size and dimensionality of current large
databases call for new ideas to scale up current
statistical and computational approaches. This paper
presents an application of Artificial Intelligence
technology to the problem of automatic detection of
candidate anomalous records in a large database. We build
our approach with three main goals in mind: 1) an effective
detection of the records that are potentially anomalous,
2) a suitable selection of the subset of attributes that
explains what makes a record anomalous, and 3) an efficient
implementation that allows us to scale the approach to
large databases. Our algorithm, called Bayesian Network
Anomaly Detector (BNAD), uses the joint probability density
function (pdf) provided by a Bayesian Network (BN) to
achieve these goals. By using appropriate data structures,
advanced caching techniques, the flexibility of Gaussian
Mixture models, and the efficiency of BNs to model joint
pdfs, BNAD manages to efficiently learn a suitable BN
from a large dataset. We test BNAD using synthetic and real
databases, the latter from the fields of manufacturing and
astronomy, obtaining encouraging results.},
url = {http://saturno.ing.puc.cl/media/papers_alvaro/Cansado-Soto-AAI-2007.pdf}
}
