Estimating mutual information. Kraskov, A., Stögbauer, H., & Grassberger, P. Physical Review E, 69(6):066138, June, 2004. Publisher: American Physical Society
We present two classes of improved estimators for mutual information M(X,Y), from samples of random points distributed according to some joint probability density μ(x,y). In contrast to conventional estimators based on binnings, they are based on entropy estimates from k-nearest neighbor distances. This means that they are data efficient (with k=1 we resolve structures down to the smallest possible scales), adaptive (the resolution is higher where data are more numerous), and have minimal bias. Indeed, the bias of the underlying entropy estimates is mainly due to nonuniformity of the density at the smallest resolved scale, giving typically systematic errors which scale as functions of k/N for N points. Numerically, we find that both families become exact for independent distributions, i.e., the estimator M̂(X,Y) vanishes (up to statistical fluctuations) if μ(x,y)=μ(x)μ(y). This holds for all tested marginal distributions and for all dimensions of x and y. In addition, we give estimators for redundancies between more than two random variables. We compare our algorithms in detail with existing algorithms. Finally, we demonstrate the usefulness of our estimators for assessing the actual independence of components obtained from independent component analysis (ICA), for improving ICA, and for estimating the reliability of blind source separation.
@article{kraskov_estimating_2004-1,
	title = {Estimating mutual information},
	volume = {69},
	url = {https://link.aps.org/doi/10.1103/PhysRevE.69.066138},
	doi = {10.1103/PhysRevE.69.066138},
	abstract = {We present two classes of improved estimators for mutual information M(X,Y), from samples of random points distributed according to some joint probability density μ(x,y). In contrast to conventional estimators based on binnings, they are based on entropy estimates from k-nearest neighbor distances. This means that they are data efficient (with k=1 we resolve structures down to the smallest possible scales), adaptive (the resolution is higher where data are more numerous), and have minimal bias. Indeed, the bias of the underlying entropy estimates is mainly due to nonuniformity of the density at the smallest resolved scale, giving typically systematic errors which scale as functions of k/N for N points. Numerically, we find that both families become exact for independent distributions, i.e., the estimator M̂(X,Y) vanishes (up to statistical fluctuations) if μ(x,y)=μ(x)μ(y). This holds for all tested marginal distributions and for all dimensions of x and y. In addition, we give estimators for redundancies between more than two random variables. We compare our algorithms in detail with existing algorithms. Finally, we demonstrate the usefulness of our estimators for assessing the actual independence of components obtained from independent component analysis (ICA), for improving ICA, and for estimating the reliability of blind source separation.},
	number = {6},
	urldate = {2022-11-10},
	journal = {Physical Review E},
	author = {Kraskov, Alexander and Stögbauer, Harald and Grassberger, Peter},
	month = jun,
	year = {2004},
	note = {Publisher: American Physical Society},
	pages = {066138},
}
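
The abstract above describes a k-nearest-neighbor mutual information estimator. Below is a minimal illustrative sketch of that idea (the first of the paper's two estimator families: distances to the k-th neighbor in the joint space under the max-norm, marginal neighbor counts, and digamma corrections), written with NumPy/SciPy. The function name `ksg_mutual_information` and all implementation details are this page's own assumptions, not code from the paper.

```python
# Sketch of a KSG-style kNN mutual information estimator (assumed implementation).
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma

def ksg_mutual_information(x, y, k=3):
    """Estimate I(X;Y) in nats from paired samples x of shape (N, dx) and y of shape (N, dy)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    n = len(x)
    xy = np.hstack([x, y])

    # Max-norm distance to the k-th nearest neighbor in the joint (x, y) space;
    # k + 1 because the query point itself is returned at distance 0.
    d_joint, _ = cKDTree(xy).query(xy, k=k + 1, p=np.inf)
    eps = d_joint[:, -1]

    # Count neighbors strictly inside eps in each marginal space (excluding the point itself).
    tree_x, tree_y = cKDTree(x), cKDTree(y)
    nx = np.array([len(tree_x.query_ball_point(x[i], eps[i] - 1e-12, p=np.inf)) - 1
                   for i in range(n)])
    ny = np.array([len(tree_y.query_ball_point(y[i], eps[i] - 1e-12, p=np.inf)) - 1
                   for i in range(n)])

    # I(X;Y) ≈ ψ(k) + ψ(N) - <ψ(n_x + 1) + ψ(n_y + 1)>
    return digamma(k) + digamma(n) - np.mean(digamma(nx + 1) + digamma(ny + 1))

# Usage example: correlated Gaussians, where the true value is -0.5 * log(1 - rho**2).
rng = np.random.default_rng(0)
rho = 0.8
x = rng.normal(size=(2000, 1))
y = rho * x + np.sqrt(1 - rho**2) * rng.normal(size=(2000, 1))
print(ksg_mutual_information(x, y, k=3))
```

For independent samples this sketch should return values near zero (up to statistical fluctuations), consistent with the behavior the abstract reports for the paper's estimators.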
