A Theoretical Analysis of Using Gradient Data for Sobolev Training in RKHS. ul Abdeen, Z., Jia, R., Kekatos, V., & Jin, M. *IFAC World Congress*, 2023.

Abstract: Recent works have empirically demonstrated that incorporating target derivatives, in addition to the conventional use of target values, during training improves the accuracy of the predictor and data efficiency. Despite the successful application of gradient data in the learning process, little is understood theoretically about their performance guarantees. In this paper, our goal is to highlight (i) the limitations of gradient data with respect to performance guarantees, especially in low-data regimes, and (ii) the extent to which gradients affect the learning rate. Our results imply that in a low-data regime, if the Lipschitz constant of the target function is below a threshold, Sobolev training with gradient data outperforms classical training in terms of sample efficiency. For a target function with a large Lipschitz constant, there is a threshold on the training-data size beyond which gradient data perform better than conventional training. The convergence behavior of Sobolev training with gradient data is studied, and a learning rate of order $\mathcal{O}(n^{-\frac{1}{2}+\epsilon})$ is derived. Experiments are conducted to determine the effect of gradient data in the learning process.

@article{2023_2C_SobolevTrain,
title={A Theoretical Analysis of Using Gradient Data for Sobolev Training in RKHS},
author={Zain ul Abdeen and Ruoxi Jia and Vassilis Kekatos and Ming Jin},
year={2023},
journal = {IFAC World Congress},
url_pdf={Sobolev_training2022.pdf},
keywords = {Machine Learning, Optimization},
abstract={Recent works have empirically demonstrated that incorporating target derivatives, in addition to the conventional use of target values, during training improves the accuracy of the predictor and data efficiency. Despite the successful application of gradient data in the learning process, little is understood theoretically about their performance guarantees. In this paper, our goal is to highlight (i) the limitations of gradient data with respect to performance guarantees, especially in low-data regimes, and (ii) the extent to which gradients affect the learning rate. Our results imply that in a low-data regime, if the Lipschitz constant of the target function is below a threshold, Sobolev training with gradient data outperforms classical training in terms of sample efficiency. For a target function with a large Lipschitz constant, there is a threshold on the training-data size beyond which gradient data perform better than conventional training. The convergence behavior of Sobolev training with gradient data is studied, and a learning rate of order $\mathcal{O}(n^{-\frac{1}{2}+\epsilon})$ is derived. Experiments are conducted to determine the effect of gradient data in the learning process.}
}
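The Sobolev training objective discussed in the abstract fits both target values and target derivatives. As an illustration only (not the paper's experimental setup), the sketch below approximates an RBF-kernel RKHS with random Fourier features and compares ridge regression on values alone against ridge regression on stacked value-and-gradient observations; the target function, bandwidth, sample size, and regularization strength are all assumed for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: f(x) = sin(2x), with gradient f'(x) = 2 cos(2x)
n = 10                                 # low-data regime
X = rng.uniform(-np.pi, np.pi, size=n)
y = np.sin(2 * X)                      # value observations
g = 2 * np.cos(2 * X)                  # gradient observations

# Random Fourier features approximating an RBF-kernel RKHS
D = 200
W = rng.normal(scale=2.0, size=D)      # frequencies (bandwidth is an assumption)
b = rng.uniform(0, 2 * np.pi, size=D)

def phi(x):
    # Feature map, shape (len(x), D)
    return np.sqrt(2.0 / D) * np.cos(np.outer(x, W) + b)

def dphi(x):
    # Derivative of each feature w.r.t. x
    return -np.sqrt(2.0 / D) * np.sin(np.outer(x, W) + b) * W

lam = 1e-6

# Sobolev training: stack value and gradient equations, solve ridge system
A = np.vstack([phi(X), dphi(X)])
t = np.concatenate([y, g])
w_sob = np.linalg.solve(A.T @ A + lam * np.eye(D), A.T @ t)

# Classical training: value equations only
P = phi(X)
w_cls = np.linalg.solve(P.T @ P + lam * np.eye(D), P.T @ y)

# Compare test error on a dense grid
Xt = np.linspace(-np.pi, np.pi, 400)
err_sob = np.mean((phi(Xt) @ w_sob - np.sin(2 * Xt)) ** 2)
err_cls = np.mean((phi(Xt) @ w_cls - np.sin(2 * Xt)) ** 2)
print(f"Sobolev MSE: {err_sob:.4f}  classical MSE: {err_cls:.4f}")
```

In this toy setting the gradient observations double the number of linear constraints per sample point, which is the sample-efficiency effect the paper analyzes; how the comparison resolves depends on the target's Lipschitz constant and the data size, per the thresholds described in the abstract.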
