April, 2014.
@article{shlensTutorialIndependentComponent2014,
title = {A Tutorial on Independent Component Analysis},
author = {Shlens, Jonathon},
year = {2014},
month = apr,
abstract = {Independent component analysis (ICA) has become a standard data analysis technique applied to an array of problems in signal processing and machine learning. This tutorial provides an introduction to ICA based on linear algebra formulating an intuition for ICA from first principles. The goal of this tutorial is to provide a solid foundation on this advanced topic so that one might learn the motivation behind ICA, learn why and when to apply this technique and in the process gain an introduction to this exciting field of active research.

[Excerpt: Introduction]

Measurements often do not reflect the very thing intended to be measured. Measurements are corrupted by random noise - but that is only one piece of the story. Often, measurements can not be made in isolation, but reflect the combination of many distinct sources. For instance, try to record a person's voice on a city street. The faint crackle of the tape can be heard in the recording but so are the sounds of cars, other pedestrians, footsteps, etc. Sometimes the main obstacle preventing a clean measurement is not just noise in the traditional sense (e.g. faint crackle) but independent signals arising from distinct, identifiable sources (e.g. cars, footsteps).

[] The distinction is subtle. We could view a measurement as an estimate of a single source corrupted by some random fluctuations (e.g. additive white noise). Instead, we assert that a measurement can be a combination of many distinct sources - each different from random noise. The broad topic of separating mixed sources has a name - blind source separation (BSS). As of today's writing, solving an arbitrary BSS problem is often intractable. However, a small subset of these types of problem have been solved only as recently as the last two decades - this is the provenance of independent component analysis (ICA).

[] Solving blind source separation using ICA has two related interpretations - filtering and dimensional reduction. If each source can be identified, a practitioner might choose to selectively delete or retain a single source (e.g. a person's voice, above). This is a filtering operation in the sense that some aspect of the data is selectively removed or retained. A filtering operation is equivalent to projecting out some aspect (or dimension) of the data - in other words a prescription for dimensional reduction. Filtering data based on ICA has found many applications including the analysis of photographic images, medical signals (e.g. EEG, MEG, MRI, etc.), biological assays (e.g. micro-arrays, gene chips, etc.) and most notably audio signal processing.

[] ICA can be applied to data in a naive manner treating the technique as a sophisticated black box that essentially performs ``magic''. While empowering, deploying a technique in this manner is fraught with peril. For instance, how does one judge the success of ICA? When will ICA fail? When are other methods more appropriate? I believe that understanding these questions and the method itself are necessary for appreciating when and how to apply ICA. It is for these reasons that I write this tutorial.

[] This tutorial is not a scholarly paper. Nor is it thorough. The goal of this paper is simply to educate. That said, the ideas in this tutorial are sophisticated. I presume that the reader is comfortable with linear algebra, basic probability and statistics as well as the topic of principal component analysis (PCA). This paper does not shy away from informal explanations but also stresses the mathematics when they shed insight on to the problem. As always, please feel free to email me with any comments, concerns or corrections.

[] [...]},
archivePrefix = {arXiv},
eprint = {1404.2986},
eprinttype = {arxiv},
}