8803 Machine Learning Theory: The Winnow Algorithm. Balcan, M. Technical Report.
We now turn to an algorithm called the Winnow Algorithm, developed by Nick Littlestone, that performs especially well when many of the features given to the learner turn out to be irrelevant. Like the Perceptron Training Procedure discussed in the previous lectures, Winnow uses linear threshold functions as hypotheses and performs incremental updates to its current hypothesis. Unlike the Perceptron Training Procedure, Winnow uses multiplicative rather than additive updates. Winnow was developed with the goal of providing a significant mistake-bound improvement when the number r of relevant variables is much smaller than the total number n of features, i.e., when r ≪ n. We will analyze the Winnow algorithm for learning the class C of monotone disjunctions of r variables. Note that Winnow, or generalizations of Winnow, can handle other concept classes (e.g., non-monotone disjunctions, majority functions, linear separators), but the analysis is simplest in the case of monotone disjunctions.

The Algorithm

Both the Winnow and Perceptron Algorithms use the same classification scheme:

• w · x ≥ θ ⇒ positive classification
• w · x < θ ⇒ negative classification

For convenience, we assume that θ = n, and we initialize w to the "all ones" vector, i.e., w_i = 1 for all i. (Note that the Perceptron used a threshold of 0, but here we use a threshold of n.)

The Winnow Algorithm differs from the Perceptron Algorithm in its update scheme. When it misclassifies a positive training example x (i.e., the prediction was negative because w · x was too small), it doubles the weights of the active features:

∀ i with x_i = 1 : w_i ← 2w_i.

When it misclassifies a negative training example x (i.e., the prediction was positive because w · x was too large), it halves them:

∀ i with x_i = 1 : w_i ← w_i/2.
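To make the update scheme concrete, here is a minimal Python sketch of the doubling/halving loop just described. The function names (winnow, predict) and the small example stream are illustrative assumptions, not taken from the notes.

    # A minimal sketch of the Winnow loop above (illustrative names; not from the notes).

    def predict(w, x, theta):
        """Classify x as positive iff w . x >= theta."""
        return sum(wi * xi for wi, xi in zip(w, x)) >= theta

    def winnow(examples, n):
        """Run Winnow over a stream of (x, label) pairs with x in {0,1}^n.

        Threshold theta = n; weights start at the all-ones vector.
        Returns the final weights and the number of mistakes made.
        """
        theta = n
        w = [1.0] * n
        mistakes = 0
        for x, label in examples:
            if predict(w, x, theta) == label:
                continue  # mistake-driven: no update on a correct prediction
            mistakes += 1
            if label:
                # Mistake on a positive example (w . x too small): double active weights.
                w = [2 * wi if xi == 1 else wi for wi, xi in zip(w, x)]
            else:
                # Mistake on a negative example (w . x too large): halve active weights.
                w = [wi / 2 if xi == 1 else wi for wi, xi in zip(w, x)]
        return w, mistakes

    # Hypothetical example: learn the monotone disjunction x1 OR x3 over n = 5 features.
    n = 5
    stream = [
        ([1, 0, 0, 0, 0], True),
        ([0, 1, 0, 1, 0], False),
        ([0, 0, 1, 0, 1], True),
        ([0, 1, 0, 0, 1], False),
    ]
    w, mistakes = winnow(stream * 4, n)  # a few passes over the data

Since weights change only on mistakes, and every change multiplies an active weight by 2 or 1/2, the hypothesis remains a linear threshold function throughout, exactly as in the classification scheme above.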
