Correcting the Usage of the Hoeffding Inequality in Stream Mining. Matuszyk, P., Krempl, G., & Spiliopoulou, M. In Tucker, A., Höppner, F., Siebes, A., & Swift, S., editors, Advances in Intelligent Data Analysis XII, Lecture Notes in Computer Science, pages 298–309, Berlin, Heidelberg, 2013. Springer.
Many stream classification algorithms use the Hoeffding Inequality to identify the best split attribute during tree induction. We show that the prerequisites of the Inequality are violated by these algorithms, and we propose corrective steps. The new stream classification core, correctedVFDT, satisfies the prerequisites of the Hoeffding Inequality and thus provides the expected performance guarantees. The goal of our work is not to improve accuracy, but to guarantee a reliable and interpretable error bound. Nonetheless, we show that our solution achieves lower error rates regarding split attributes and sooner split decisions while maintaining a similar level of accuracy.
@inproceedings{matuszyk_correcting_2013,
	address = {Berlin, Heidelberg},
	series = {Lecture {Notes} in {Computer} {Science}},
	title = {Correcting the {Usage} of the {Hoeffding} {Inequality} in {Stream} {Mining}},
	isbn = {978-3-642-41398-8},
	doi = {10.1007/978-3-642-41398-8_26},
	abstract = {Many stream classification algorithms use the Hoeffding Inequality to identify the best split attribute during tree induction. We show that the prerequisites of the Inequality are violated by these algorithms, and we propose corrective steps. The new stream classification core, correctedVFDT, satisfies the prerequisites of the Hoeffding Inequality and thus provides the expected performance guarantees. The goal of our work is not to improve accuracy, but to guarantee a reliable and interpretable error bound. Nonetheless, we show that our solution achieves lower error rates regarding split attributes and sooner split decisions while maintaining a similar level of accuracy.},
	language = {en},
	booktitle = {Advances in {Intelligent} {Data} {Analysis} {XII}},
	publisher = {Springer},
	author = {Matuszyk, Pawel and Krempl, Georg and Spiliopoulou, Myra},
	editor = {Tucker, Allan and Höppner, Frank and Siebes, Arno and Swift, Stephen},
	year = {2013},
	keywords = {Concept Drift, Incorrect Decision, Information Gain, Split Attribute, Split Function},
	pages = {298--309},
}
