March, 2022. arXiv:2203.01606 [cs, math]

In this work we study binary classification problems where we assume that our training data is subject to uncertainty, i.e. the precise data points are not known. To tackle this issue in the field of robust machine learning the aim is to develop models which are robust against small perturbations in the training data. We study robust support vector machines (SVM) and extend the classical approach by an ensemble method which iteratively solves a non-robust SVM on different perturbations of the dataset, where the perturbations are derived by an adversarial problem. Afterwards for classification of an unknown data point we perform a majority vote of all calculated SVM solutions. We study three different variants for the adversarial problem, the exact problem, a relaxed variant and an efficient heuristic variant. While the exact and the relaxed variant can be modeled using integer programming formulations, the heuristic one can be implemented by an easy and efficient algorithm. All derived methods are tested on random and realistic datasets and the results indicate that the derived ensemble methods have a much more stable behaviour when changing the protection level compared to the classical robust SVM model.
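The ensemble idea in the abstract (repeatedly train a non-robust SVM on a perturbed copy of the data, then classify by majority vote) can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration: the function names, the fixed-step subgradient trainer, and in particular the perturbation rule, which simply shifts each point by `radius` toward the current separating hyperplane. It is a stand-in and not one of the three adversarial variants studied in the paper.

```python
from math import sqrt

def train_linear_svm(X, y, epochs=200, eta=0.01, lam=0.01):
    """Non-robust linear SVM: fixed-step subgradient descent on the hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [(1.0 - eta * lam) * wj for wj in w]  # L2 regularization shrink
            if margin < 1.0:  # point violates the margin: take a hinge step
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b

def fit_ensemble(X, y, radius=0.5, rounds=5):
    """Train `rounds` SVMs, each on a perturbation of the ORIGINAL data.

    Assumed heuristic (not from the paper): move every point by `radius`
    against its own label along the latest weight vector.
    """
    models = []
    Xp = [list(x) for x in X]
    for _ in range(rounds):
        w, b = train_linear_svm(Xp, y)
        models.append((w, b))
        norm = sqrt(sum(wj * wj for wj in w)) or 1.0
        Xp = [[xj - radius * yi * wj / norm for xj, wj in zip(xi, w)]
              for xi, yi in zip(X, y)]
    return models

def predict(models, x):
    """Majority vote over all trained SVMs; labels are +1 / -1."""
    votes = sum(1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1
                for w, b in models)
    return 1 if votes >= 0 else -1

# Toy data: two well-separated clusters.
X = [[2, 2], [3, 2], [2, 3], [3, 3], [-2, -2], [-3, -2], [-2, -3], [-3, -3]]
y = [1, 1, 1, 1, -1, -1, -1, -1]
models = fit_ensemble(X, y)
```

Using an odd number of rounds avoids ties in the vote. Each perturbation is computed from the original points, so every perturbed dataset stays within distance `radius` of the given data.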

@misc{kurtz_ensemble_2022,
  title = {Ensemble {Methods} for {Robust} {Support} {Vector} {Machines} using {Integer} {Programming}},
  url = {http://arxiv.org/abs/2203.01606},
  abstract = {In this work we study binary classification problems where we assume that our training data is subject to uncertainty, i.e. the precise data points are not known. To tackle this issue in the field of robust machine learning the aim is to develop models which are robust against small perturbations in the training data. We study robust support vector machines (SVM) and extend the classical approach by an ensemble method which iteratively solves a non-robust SVM on different perturbations of the dataset, where the perturbations are derived by an adversarial problem. Afterwards for classification of an unknown data point we perform a majority vote of all calculated SVM solutions. We study three different variants for the adversarial problem, the exact problem, a relaxed variant and an efficient heuristic variant. While the exact and the relaxed variant can be modeled using integer programming formulations, the heuristic one can be implemented by an easy and efficient algorithm. All derived methods are tested on random and realistic datasets and the results indicate that the derived ensemble methods have a much more stable behaviour when changing the protection level compared to the classical robust SVM model.},
  language = {en},
  urldate = {2023-10-21},
  publisher = {arXiv},
  author = {Kurtz, Jannis},
  month = mar,
  year = {2022},
  note = {arXiv:2203.01606 [cs, math]},
  keywords = {Computer Science - Machine Learning, Mathematics - Optimization and Control},
}