Active Learning from Data Streams. Zhu, X., Zhang, P., Lin, X., & Shi, Y. In Seventh IEEE International Conference on Data Mining (ICDM 2007), pages 757–762, October, 2007. ISSN: 2374-8486
In this paper, we address a new research problem on active learning from data streams where data volumes grow continuously and labeling all data is considered expensive and impractical. The objective is to label a small portion of stream data from which a model is derived to predict newly arrived instances as accurate as possible. In order to tackle the challenges raised by data streams' dynamic nature, we propose a classifier ensembling based active learning framework which selectively labels instances from data streams to build an accurate classifier. A minimal variance principle is introduced to guide instance labeling from data streams. In addition, a weight updating rule is derived to ensure that our instance labeling process can adaptively adjust to dynamic drifting concepts in the data. Experimental results on synthetic and real-world data demonstrate the performances of the proposed efforts in comparison with other simple approaches.
@inproceedings{zhu_active_2007,
	title = {Active {Learning} from {Data} {Streams}},
	doi = {10.1109/ICDM.2007.101},
	abstract = {In this paper, we address a new research problem on active learning from data streams where data volumes grow continuously and labeling all data is considered expensive and impractical. The objective is to label a small portion of stream data from which a model is derived to predict newly arrived instances as accurate as possible. In order to tackle the challenges raised by data streams' dynamic nature, we propose a classifier ensembling based active learning framework which selectively labels instances from data streams to build an accurate classifier. A minimal variance principle is introduced to guide instance labeling from data streams. In addition, a weight updating rule is derived to ensure that our instance labeling process can adaptively adjust to dynamic drifting concepts in the data. Experimental results on synthetic and real-world data demonstrate the performances of the proposed efforts in comparison with other simple approaches.},
	booktitle = {Seventh {IEEE} {International} {Conference} on {Data} {Mining} ({ICDM} 2007)},
	author = {Zhu, Xingquan and Zhang, Peng and Lin, Xiaodong and Shi, Yong},
	month = oct,
	year = {2007},
	note = {ISSN: 2374-8486},
	keywords = {Accuracy, Association rules, Computer science, Data engineering, Data mining, Decision making, Labeling, Predictive models, USA Councils, Uncertainty},
	pages = {757--762},
}
