A Theoretical Analysis of Using Gradient Data for Sobolev Training in RKHS. ul Abdeen, Z., Jia, R., Kekatos, V., & Jin, M. IFAC World Congress, 2023.
Recent works have empirically demonstrated that incorporating target derivatives, in addition to the conventional target values, during training improves the predictor's accuracy and data efficiency. Despite the successful application of gradient data in the learning process, little is understood theoretically about its performance guarantees. In this paper, our goal is to highlight (i) the limitations on the performance guarantees of gradient data, especially in low-data regimes, and (ii) the extent to which gradients affect the learning rate. Our results imply that in a low-data regime, if the Lipschitz constant of the target function is below a threshold, Sobolev training with gradient data outperforms classical training in terms of sample efficiency. For a target function with a large Lipschitz constant, there is a threshold on the training data size beyond which gradient data perform better than conventional training. We study the convergence behavior of Sobolev training with gradient data and derive a learning rate of order $\mathcal{O}(n^{-\frac{1}{2}+\epsilon})$. Experiments are conducted to assess the effect of gradient data on the learning process.
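As a rough illustration of the idea behind Sobolev training (a minimal sketch of our own, not the paper's RKHS construction): the model is fit to target values *and* target derivatives by stacking both sets of linear equations into a single least-squares problem. Here the model is a degree-7 polynomial, the target is `sin`, and the sample size is deliberately small to mimic the low-data regime discussed in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target function and its derivative (chosen for illustration).
f, df = np.sin, np.cos

n, deg = 10, 7                         # low-data regime: 10 samples
x = rng.uniform(-np.pi, np.pi, size=n)
y, dy = f(x), df(x)

# Value features phi_k(x) = x^k and their derivatives k * x^(k-1).
k = np.arange(deg + 1)
Phi = x[:, None] ** k                  # shape (n, deg + 1)
dPhi = np.zeros((n, deg + 1))
dPhi[:, 1:] = k[1:] * x[:, None] ** (k[1:] - 1)

# Sobolev training: stack value and gradient equations and solve
# one joint least-squares problem for the coefficients c.
A = np.vstack([Phi, dPhi])
t = np.concatenate([y, dy])
c, *_ = np.linalg.lstsq(A, t, rcond=None)

# Held-out error of the fitted polynomial on a dense grid.
xt = np.linspace(-np.pi, np.pi, 200)
pred = (xt[:, None] ** k) @ c
rmse = float(np.sqrt(np.mean((pred - f(xt)) ** 2)))
print(f"held-out RMSE: {rmse:.4f}")
```

Dropping the `dPhi` rows from `A` recovers classical (value-only) training, which makes the sample-efficiency comparison in the paper easy to probe empirically on toy targets.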
