Arrays of (locality-sensitive) Count Estimators (ACE): High-Speed Anomaly Detection via Cache Lookups

Arrays of (locality-sensitive) Count Estimators (ACE): High-Speed Anomaly Detection via Cache Lookups. Luo, C. & Shrivastava, A. arXiv:1706.06664 [cs, stat], June, 2017. arXiv: 1706.06664

Paper abstract bibtex

Anomaly detection is one of the frequent and important subroutines deployed in large-scale data processing systems. Even being a well-studied topic, existing techniques for unsupervised anomaly detection require storing significant amounts of data, which is prohibitive from memory and latency perspective. In the big-data world existing methods fail to address the new set of memory and latency constraints. In this paper, we propose ACE (Arrays of (locality-sensitive) Count Estimators) algorithm that can be 60x faster than the ELKI package\textasciitilde\textbackslashcite\DBLP:conf/ssd/AchtertBKSZ09\, which has the fastest implementation of the unsupervised anomaly detection algorithms. ACE algorithm requires less than \$4MB\$ memory, to dynamically compress the full data information into a set of count arrays. These tiny \$4MB\$ arrays of counts are sufficient for unsupervised anomaly detection. At the core of the ACE algorithm, there is a novel statistical estimator which is derived from the sampling view of Locality Sensitive Hashing(LSH). This view is significantly different and efficient than the widely popular view of LSH for near-neighbor search. We show the superiority of ACE algorithm over 11 popular baselines on 3 benchmark datasets, including the KDD-Cup99 data which is the largest available benchmark comprising of more than half a million entries with ground truth anomaly labels.

@article{luo_arrays_2017,
	title = {Arrays of (locality-sensitive) {Count} {Estimators} ({ACE}): {High}-{Speed} {Anomaly} {Detection} via {Cache} {Lookups}},
	shorttitle = {Arrays of (locality-sensitive) {Count} {Estimators} ({ACE})},
	url = {http://arxiv.org/abs/1706.06664},
	abstract = {Anomaly detection is one of the frequent and important subroutines deployed in large-scale data processing systems. Even being a well-studied topic, existing techniques for unsupervised anomaly detection require storing significant amounts of data, which is prohibitive from memory and latency perspective. In the big-data world existing methods fail to address the new set of memory and latency constraints. In this paper, we propose ACE (Arrays of (locality-sensitive) Count Estimators) algorithm that can be 60x faster than the ELKI package{\textasciitilde}{\textbackslash}cite\{DBLP:conf/ssd/AchtertBKSZ09\}, which has the fastest implementation of the unsupervised anomaly detection algorithms. ACE algorithm requires less than \$4MB\$ memory, to dynamically compress the full data information into a set of count arrays. These tiny \$4MB\$ arrays of counts are sufficient for unsupervised anomaly detection. At the core of the ACE algorithm, there is a novel statistical estimator which is derived from the sampling view of Locality Sensitive Hashing(LSH). This view is significantly different and efficient than the widely popular view of LSH for near-neighbor search. We show the superiority of ACE algorithm over 11 popular baselines on 3 benchmark datasets, including the KDD-Cup99 data which is the largest available benchmark comprising of more than half a million entries with ground truth anomaly labels.},
	urldate = {2019-03-23TZ},
	journal = {arXiv:1706.06664 [cs, stat]},
	author = {Luo, Chen and Shrivastava, Anshumali},
	month = jun,
	year = {2017},
	note = {arXiv: 1706.06664},
	keywords = {⛔ No DOI found}
}

Downloads: 0

{"_id":"WCygCPPdiS4YxNAPb","bibbaseid":"luo-shrivastava-arraysoflocalitysensitivecountestimatorsacehighspeedanomalydetectionviacachelookups-2017","authorIDs":[],"author_short":["Luo, C.","Shrivastava, A."],"bibdata":{"bibtype":"article","type":"article","title":"Arrays of (locality-sensitive) Count Estimators (ACE): High-Speed Anomaly Detection via Cache Lookups","shorttitle":"Arrays of (locality-sensitive) Count Estimators (ACE)","url":"http://arxiv.org/abs/1706.06664","abstract":"Anomaly detection is one of the frequent and important subroutines deployed in large-scale data processing systems. Even being a well-studied topic, existing techniques for unsupervised anomaly detection require storing significant amounts of data, which is prohibitive from memory and latency perspective. In the big-data world existing methods fail to address the new set of memory and latency constraints. In this paper, we propose ACE (Arrays of (locality-sensitive) Count Estimators) algorithm that can be 60x faster than the ELKI package\\textasciitilde\\textbackslashcite\\DBLP:conf/ssd/AchtertBKSZ09\\, which has the fastest implementation of the unsupervised anomaly detection algorithms. ACE algorithm requires less than \\$4MB\\$ memory, to dynamically compress the full data information into a set of count arrays. These tiny \\$4MB\\$ arrays of counts are sufficient for unsupervised anomaly detection. At the core of the ACE algorithm, there is a novel statistical estimator which is derived from the sampling view of Locality Sensitive Hashing(LSH). This view is significantly different and efficient than the widely popular view of LSH for near-neighbor search. We show the superiority of ACE algorithm over 11 popular baselines on 3 benchmark datasets, including the KDD-Cup99 data which is the largest available benchmark comprising of more than half a million entries with ground truth anomaly labels.","urldate":"2019-03-23TZ","journal":"arXiv:1706.06664 [cs, stat]","author":[{"propositions":[],"lastnames":["Luo"],"firstnames":["Chen"],"suffixes":[]},{"propositions":[],"lastnames":["Shrivastava"],"firstnames":["Anshumali"],"suffixes":[]}],"month":"June","year":"2017","note":"arXiv: 1706.06664","keywords":"⛔ No DOI found","bibtex":"@article{luo_arrays_2017,\n\ttitle = {Arrays of (locality-sensitive) {Count} {Estimators} ({ACE}): {High}-{Speed} {Anomaly} {Detection} via {Cache} {Lookups}},\n\tshorttitle = {Arrays of (locality-sensitive) {Count} {Estimators} ({ACE})},\n\turl = {http://arxiv.org/abs/1706.06664},\n\tabstract = {Anomaly detection is one of the frequent and important subroutines deployed in large-scale data processing systems. Even being a well-studied topic, existing techniques for unsupervised anomaly detection require storing significant amounts of data, which is prohibitive from memory and latency perspective. In the big-data world existing methods fail to address the new set of memory and latency constraints. In this paper, we propose ACE (Arrays of (locality-sensitive) Count Estimators) algorithm that can be 60x faster than the ELKI package{\\textasciitilde}{\\textbackslash}cite\\{DBLP:conf/ssd/AchtertBKSZ09\\}, which has the fastest implementation of the unsupervised anomaly detection algorithms. ACE algorithm requires less than \\$4MB\\$ memory, to dynamically compress the full data information into a set of count arrays. These tiny \\$4MB\\$ arrays of counts are sufficient for unsupervised anomaly detection. At the core of the ACE algorithm, there is a novel statistical estimator which is derived from the sampling view of Locality Sensitive Hashing(LSH). This view is significantly different and efficient than the widely popular view of LSH for near-neighbor search. We show the superiority of ACE algorithm over 11 popular baselines on 3 benchmark datasets, including the KDD-Cup99 data which is the largest available benchmark comprising of more than half a million entries with ground truth anomaly labels.},\n\turldate = {2019-03-23TZ},\n\tjournal = {arXiv:1706.06664 [cs, stat]},\n\tauthor = {Luo, Chen and Shrivastava, Anshumali},\n\tmonth = jun,\n\tyear = {2017},\n\tnote = {arXiv: 1706.06664},\n\tkeywords = {⛔ No DOI found}\n}\n","author_short":["Luo, C.","Shrivastava, A."],"key":"luo_arrays_2017","id":"luo_arrays_2017","bibbaseid":"luo-shrivastava-arraysoflocalitysensitivecountestimatorsacehighspeedanomalydetectionviacachelookups-2017","role":"author","urls":{"Paper":"http://arxiv.org/abs/1706.06664"},"keyword":["⛔ No DOI found"],"downloads":0},"bibtype":"article","biburl":"https://bibbase.org/zotero/ky25","creationDate":"2019-05-11T17:47:04.252Z","downloads":0,"keywords":["⛔ no doi found"],"search_terms":["arrays","locality","sensitive","count","estimators","ace","high","speed","anomaly","detection","via","cache","lookups","luo","shrivastava"],"title":"Arrays of (locality-sensitive) Count Estimators (ACE): High-Speed Anomaly Detection via Cache Lookups","year":2017,"dataSources":["XxiQtwZYfozhQmvGR"]}