Online ensemble learning with abstaining classifiers for drifting and noisy data streams. Krawczyk, B. & Cano, A. Applied Soft Computing, 68:677–692, July, 2018.
Mining data streams is among the most vital contemporary topics in machine learning. Such a scenario requires adaptive algorithms that are able to process constantly arriving instances, adapt to potential changes in data, use limited computational resources, and be robust to any atypical events that may appear. Ensemble learning has proven itself to be an effective solution, as combining learners leads to improved predictive power, more flexible drift handling, and ease of implementation in high-performance computing environments. In this paper, we propose an enhancement of popular online ensembles by augmenting them with an abstaining option. Instead of relying on traditional voting, classifiers are allowed to abstain from contributing to the final decision. Their confidence level is monitored for each incoming instance, and only learners that exceed a certain threshold are selected. We introduce a dynamic and self-adapting threshold that is able to adapt to changes in the data stream by monitoring outputs of the ensemble and exploiting the underlying diversity in order to efficiently anticipate drifts. Additionally, we show that forcing uncertain classifiers to abstain from making a prediction is especially useful for noisy data streams. Our proposal is a lightweight enhancement that can be applied to any online ensemble method, improving its robustness to drifts and noise. Thorough experimental analysis validated through statistical tests proves the usefulness of the proposed approach.
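The abstaining idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `abstaining_vote`, the use of the maximum class probability as the confidence measure, the fixed threshold, and returning `None` when every member abstains are all assumptions made for illustration. The dynamic, self-adapting threshold mentioned in the abstract is not reproduced here.

```python
from collections import Counter
from typing import Optional, Sequence


def abstaining_vote(
    probabilities: Sequence[Sequence[float]],
    threshold: float,
) -> Optional[int]:
    """Majority vote over ensemble members whose confidence exceeds threshold.

    probabilities[i][c] is member i's estimated probability of class c.
    Confidence is taken to be the largest class probability (an assumption).
    Returns the winning class index, or None if every member abstains.
    """
    votes = Counter()
    for member_probs in probabilities:
        confidence = max(member_probs)
        # Members at or below the threshold abstain and cast no vote.
        if confidence > threshold:
            votes[list(member_probs).index(confidence)] += 1
    if not votes:
        return None  # hypothetical fallback: no member was confident enough
    return votes.most_common(1)[0][0]


# Three hypothetical ensemble members scoring one instance over two classes.
member_outputs = [[0.9, 0.1], [0.45, 0.55], [0.4, 0.6]]
print(abstaining_vote(member_outputs, threshold=0.5))  # all three members vote
print(abstaining_vote(member_outputs, threshold=0.7))  # only the first member votes
```

With the lower threshold the two weakly confident members outvote the confident one, while the higher threshold makes them abstain and lets the confident member decide. This is the effect the abstract attributes to abstaining on uncertain predictions.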
@article{krawczyk_online_2018,
	title = {Online ensemble learning with abstaining classifiers for drifting and noisy data streams},
	volume = {68},
	issn = {1568-4946},
	url = {http://www.sciencedirect.com/science/article/pii/S1568494617307238},
	doi = {10.1016/j.asoc.2017.12.008},
	abstract = {Mining data streams is among the most vital contemporary topics in machine learning. Such a scenario requires adaptive algorithms that are able to process constantly arriving instances, adapt to potential changes in data, use limited computational resources, and be robust to any atypical events that may appear. Ensemble learning has proven itself to be an effective solution, as combining learners leads to improved predictive power, more flexible drift handling, and ease of implementation in high-performance computing environments. In this paper, we propose an enhancement of popular online ensembles by augmenting them with an abstaining option. Instead of relying on traditional voting, classifiers are allowed to abstain from contributing to the final decision. Their confidence level is monitored for each incoming instance, and only learners that exceed a certain threshold are selected. We introduce a dynamic and self-adapting threshold that is able to adapt to changes in the data stream by monitoring outputs of the ensemble and exploiting the underlying diversity in order to efficiently anticipate drifts. Additionally, we show that forcing uncertain classifiers to abstain from making a prediction is especially useful for noisy data streams. Our proposal is a lightweight enhancement that can be applied to any online ensemble method, improving its robustness to drifts and noise. Thorough experimental analysis validated through statistical tests proves the usefulness of the proposed approach.},
	language = {en},
	urldate = {2020-12-12},
	journal = {Applied Soft Computing},
	author = {Krawczyk, Bartosz and Cano, Alberto},
	month = jul,
	year = {2018},
	keywords = {Abstaining classifier, Concept drift, Data stream mining, Diversity, Ensemble learning, Machine learning},
	pages = {677--692},
}
