An Incremental Clustering-Based Fault Detection Algorithm for Class-Imbalanced Process Data

Kwak, J., Lee, T., & Kim, C. O. IEEE Transactions on Semiconductor Manufacturing, 28(3):318–328, August 2015.

Training a fault detection model requires advanced data-mining algorithms when process data grow rapidly and normal-class data greatly outnumber fault-class data. Most standard classification algorithms, such as support vector machines (SVMs), can handle moderately sized training sets and assume balanced class distributions. When the class sizes are highly imbalanced, these algorithms strongly favor the majority class and, as a result, detect few minority-class instances. In this paper, we propose an online fault detection algorithm based on incremental clustering. The algorithm accurately finds wafer faults even under severe class distribution skew and processes massive sensor data efficiently by reducing the required storage. We tested our algorithm on illustrative examples and an industrial example. The algorithm performed well on the illustrative examples, which included imbalanced class distributions of Gaussian and non-Gaussian types and process drifts. In the industrial example, which simulated real data from a plasma etcher, we verified that the algorithm outperformed the standard SVM, the one-class SVM, and three instance-based fault detection algorithms that are commonly used in the literature.
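The abstract does not give the algorithm's details, so the following is only a generic sketch of the incremental-clustering idea, not the authors' method. It assumes that each cluster keeps just a running mean and a count, that distances are Euclidean, and that a fixed `radius` decides when a point starts a new cluster; the names `Cluster`, `IncrementalClusterDetector`, and `radius` are hypothetical. It illustrates the two properties the abstract highlights: raw points are discarded after updating a cluster summary, and the minority class keeps its own clusters regardless of how few samples it has.

```python
import math


class Cluster:
    """Summary of the points absorbed so far: a count and a running mean."""

    def __init__(self, point, label):
        self.n = 1
        self.mean = list(point)
        self.label = label

    def distance(self, point):
        # Euclidean distance from the cluster mean (an assumed choice).
        return math.dist(self.mean, point)

    def absorb(self, point):
        # Update the running mean in place; the raw point is not stored.
        self.n += 1
        self.mean = [m + (x - m) / self.n for m, x in zip(self.mean, point)]


class IncrementalClusterDetector:
    def __init__(self, radius):
        self.radius = radius  # hypothetical threshold for opening a new cluster
        self.clusters = []

    def learn(self, point, label):
        # Only clusters of the same class may absorb the point, so the
        # minority class is never merged into majority-class clusters.
        same = [c for c in self.clusters if c.label == label]
        nearest = min(same, key=lambda c: c.distance(point), default=None)
        if nearest is not None and nearest.distance(point) <= self.radius:
            nearest.absorb(point)
        else:
            self.clusters.append(Cluster(point, label))

    def predict(self, point):
        # Label of the nearest cluster; raises ValueError if nothing was learned.
        return min(self.clusters, key=lambda c: c.distance(point)).label


# Imbalanced stream: 1000 normal points and only 5 fault points.
detector = IncrementalClusterDetector(radius=1.0)
for i in range(1000):
    detector.learn((0.001 * i, 0.0), "normal")
for i in range(5):
    detector.learn((5.0 + 0.1 * i, 5.0), "fault")
```

In this toy stream the 1005 points collapse into a handful of cluster summaries, which is the kind of storage reduction the abstract refers to. A fuller version would likely also track per-cluster covariance and adapt to process drift, both of which are left out here.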

@article{kwak_incremental_2015,
	title = {An {Incremental} {Clustering}-{Based} {Fault} {Detection} {Algorithm} for {Class}-{Imbalanced} {Process} {Data}},
	volume = {28},
	issn = {1558-2345},
	doi = {10.1109/TSM.2015.2445380},
	abstract = {Training a fault detection model requires advanced data-mining algorithms when process data grow rapidly and normal-class data greatly outnumber fault-class data. Most standard classification algorithms, such as support vector machines (SVMs), can handle moderately sized training sets and assume balanced class distributions. When the class sizes are highly imbalanced, these algorithms strongly favor the majority class and, as a result, detect few minority-class instances. In this paper, we propose an online fault detection algorithm based on incremental clustering. The algorithm accurately finds wafer faults even under severe class distribution skew and processes massive sensor data efficiently by reducing the required storage. We tested our algorithm on illustrative examples and an industrial example. The algorithm performed well on the illustrative examples, which included imbalanced class distributions of Gaussian and non-Gaussian types and process drifts. In the industrial example, which simulated real data from a plasma etcher, we verified that the algorithm outperformed the standard SVM, the one-class SVM, and three instance-based fault detection algorithms that are commonly used in the literature.},
	number = {3},
	journal = {IEEE Transactions on Semiconductor Manufacturing},
	author = {Kwak, Jueun and Lee, Taehyung and Kim, Chang Ouk},
	month = aug,
	year = {2015},
	keywords = {Algorithm design and analysis, Class imbalance data, Classification algorithms, Clustering algorithms, Covariance matrices, Data mining, Fault detection, Incremental clustering, Process drift, Standards, Support vector machines},
	pages = {318--328},
}

