Sparsity vs. Large Margins for Linear Classifiers. Herbrich, R., Graepel, T., & Shawe-Taylor, J. In Proceedings of the Thirteenth Annual Conference on Computational Learning Theory, pages 304--308, Palo Alto, 2000.
We provide small sample size bounds on the generalisation error of linear classifiers that take advantage of large observed margins on the training set and of sparsity in the data-dependent expansion coefficients. It is already known from results in the luckiness framework that each criterion independently has a large impact on the generalisation error. Our new results show that the two can be combined, which theoretically justifies learning algorithms like the Support Vector Machine or the Relevance Vector Machine. In contrast to previous studies, we avoid the classical technique of symmetrisation by a ghost sample and instead use the sparsity directly to estimate the generalisation error. We demonstrate that our result leads to practically useful bounds even for small sample sizes, provided the training set witnesses our prior belief in sparsity and large margins.
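The two quantities the bound combines can be read off a trained kernel-expansion classifier: the number of nonzero expansion coefficients (sparsity) and the smallest geometric margin on the training set. A minimal sketch, using a plain kernel perceptron with a linear kernel on an invented toy dataset (the data, the perceptron learner, and all names here are illustrative assumptions, not the paper's method):

```python
import math

# Toy linearly separable data: (point, label). Illustrative values only --
# the paper itself is theoretical and contains no such dataset.
data = [((2.0, 1.0), 1), ((1.5, 2.0), 1), ((3.0, 2.5), 1),
        ((-2.0, -1.0), -1), ((-1.5, -2.0), -1), ((-3.0, -2.5), -1)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Kernel perceptron with a linear kernel: the hypothesis is the
# data-dependent expansion f(x) = sum_i alpha_i * y_i * <x_i, x>.
alpha = [0] * len(data)
for _ in range(100):                      # epochs
    mistakes = 0
    for j, (x, y) in enumerate(data):
        f = sum(a * yi * dot(xi, x) for a, (xi, yi) in zip(alpha, data))
        if y * f <= 0:                    # mistake: add x_j to the expansion
            alpha[j] += 1
            mistakes += 1
    if mistakes == 0:                     # converged on separable data
        break

# Sparsity: number of nonzero expansion coefficients.
sparsity = sum(1 for a in alpha if a > 0)

# Observed geometric margin: min_i y_i <w, x_i> / ||w||, where
# w = sum_i alpha_i * y_i * x_i is the weight vector the expansion represents.
w = [sum(a * y * x[k] for a, (x, y) in zip(alpha, data)) for k in range(2)]
margin = min(y * dot(w, x) for x, y in data) / math.sqrt(dot(w, w))

print(sparsity, round(margin, 3))
```

On this toy set the perceptron needs a single update, so only one coefficient is nonzero and the observed margin is large relative to the data scale: exactly the "lucky" situation in which the paper's combined bound is tight.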
