Discrimination Neural Network Model for Binary Classification Tasks on Tabular Data

Discrimination Neural Network Model for Binary Classification Tasks on Tabular Data. Munkhdalai, L., Munkhdalai, T., Hong, J. E., Pham, V. H., Theera-Umpon, N., & Ryu, K. H. IEEE Access, 11:15404–15418, 2023. Publisher: Institute of Electrical and Electronics Engineers (IEEE)

For the classification task, neural network-based approaches attempt to distinguish between two distributions by determining the joint distribution of input variables for each class. However, the most challenging task is still to classify the observations in the overlapping region of two classes. In this work, we propose a new discrimination neural network (DiscNN) architecture to address this issue. Our DiscNN learns to embed the initial input into more informative representations with better discriminability between the two distributions based on the cosine embedding loss. We also train our proposed model using the few-shot learning method to extract better-generalized representations from the initial input. We applied the DiscNN model to 35 tabular datasets from the OpenML-CC18 benchmark for a binary classification task. Our model showed superior performance on 28 of them. In addition, we performed experiments on 95 imbalanced datasets from the KEEL repository. The experimental results showed that the DiscNN outperformed the state-of-the-art models, including CatBoost, LightGBM, TabNet, VIME, and Scarf, by around 0.23% AUC, 0.20% G-mean, and 1.06% F1 score.
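
The abstract names the cosine embedding loss but does not spell it out. As a rough illustration, the sketch below uses the common textbook form of that loss for a single pair of embeddings: pairs from the same class are pulled toward cosine similarity 1, and pairs from different classes are penalized only when their similarity exceeds a margin. The function names, the `margin` parameter, and the pairing scheme are assumptions made for illustration; the paper's exact formulation may differ.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_embedding_loss(u, v, same_class, margin=0.0):
    """Standard cosine embedding loss for one pair (illustrative, not DiscNN itself).

    same_class=True  -> loss is 1 - cos(u, v), minimized when u and v align.
    same_class=False -> loss is max(0, cos(u, v) - margin), zero once the
                        pair is at least as dissimilar as the margin requires.
    """
    cos = cosine_similarity(u, v)
    if same_class:
        return 1.0 - cos
    return max(0.0, cos - margin)


if __name__ == "__main__":
    # Hypothetical 2-D embeddings, chosen only to show the two branches.
    print(cosine_embedding_loss([1.0, 0.0], [1.0, 0.0], same_class=True))
    print(cosine_embedding_loss([1.0, 0.0], [1.0, 0.0], same_class=False))
    print(cosine_embedding_loss([1.0, 0.0], [0.0, 1.0], same_class=False))
```

Under this form, aligned embeddings from the same class incur no loss, while aligned embeddings from different classes incur the largest penalty, which is the sense in which such a loss encourages better discriminability between the two class distributions.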

@article{Munkhdalai_2023,
	title = {Discrimination {Neural} {Network} {Model} for {Binary} {Classification} {Tasks} on {Tabular} {Data}},
	volume = {11},
	issn = {21693536},
	url = {http://dx.doi.org/10.1109/ACCESS.2023.3243919},
	doi = {10.1109/ACCESS.2023.3243919},
	abstract = {For the classification task, neural network-based approaches attempt to distinguish between two distributions by determining the joint distribution of input variables for each class. However, the most challenging task is still to classify the observations in the overlapping region of two classes. In this work, we propose a new discrimination neural network (DiscNN) architecture to address this issue. Our DiscNN learns to embed the initial input into more informative representations with better discriminability between the two distributions based on the cosine embedding loss. We also train our proposed model using the few-shot learning method to extract better-generalized representations from the initial input. We applied the DiscNN model to 35 tabular datasets from the OpenML-CC18 benchmark for a binary classification task. Our model showed superior performances on 28 datasets of them. In addition, we also performed experiments on 95 imbalanced datasets from the KEEL repository. The experiment results showed that the DiscNN outperformed the state-of-the-art models, including CatBoost, LightGBM, TabNet, VIME and Scarf, by around 0.23\% AUC, 0.20\% G-mean, and 1.06\% F1 score.},
	journal = {IEEE Access},
	author = {Munkhdalai, Lkhagvadorj and Munkhdalai, Tsendsuren and Hong, Jang Eui and Pham, Van Huy and Theera-Umpon, Nipon and Ryu, Keun Ho},
	year = {2023},
	note = {Publisher: Institute of Electrical and Electronics Engineers (IEEE)},
	keywords = {Neural network, classification task, cosine similarity, imbalanced problem, tabular data},
	pages = {15404--15418},
}
