Improved cross-validation for classifiers that make algorithmic choices to minimise runtime without compromising output correctness

Improved cross-validation for classifiers that make algorithmic choices to minimise runtime without compromising output correctness. Florescu, D. & England, M. arXiv:1911.12672 [cs], November, 2019. arXiv: 1911.12672


Our topic is the use of machine learning to improve software by making choices which do not compromise the correctness of the output, but do affect the time taken to produce such output. We are particularly concerned with computer algebra systems (CASs), and in particular, our experiments are for selecting the variable ordering to use when performing a cylindrical algebraic decomposition of $n$-dimensional real space with respect to the signs of a set of polynomials. In our prior work we explored the different ML models that could be used, and how to identify suitable features of the input polynomials. In the present paper we both repeat our prior experiments on problems which have more variables (and thus exponentially more possible orderings), and examine the metric which our ML classifiers target. The natural metric is computational runtime, with classifiers trained to pick the ordering which minimises this. However, this leads to the situation where models do not distinguish between any of the non-optimal orderings, whose runtimes may still vary dramatically. In this paper we investigate a modification to the cross-validation algorithms of the classifiers so that they do distinguish these cases, leading to improved results.
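The point about non-optimal orderings can be illustrated with a small sketch. This is not the authors' implementation: the timing values, the two candidate "models", and the function names `accuracy_score` and `total_runtime` are all hypothetical, chosen only to show how a score based on "picked the fastest ordering or not" can tie two candidates that a runtime-based score separates.

```python
from itertools import permutations

# All 3! = 6 orderings of three variables (hypothetical toy setting).
orderings = ["".join(p) for p in permutations("xyz")]

# Hypothetical runtimes in seconds for two problems, one value per ordering.
runtimes = [
    dict(zip(orderings, [1.0, 2.0, 50.0, 3.0, 4.0, 5.0])),
    dict(zip(orderings, [9.0, 0.5, 0.6, 30.0, 8.0, 7.0])),
]


def accuracy_score(choices, runtimes):
    """Fraction of problems where the chosen ordering is the fastest one."""
    hits = sum(1 for c, r in zip(choices, runtimes) if r[c] == min(r.values()))
    return hits / len(runtimes)


def total_runtime(choices, runtimes):
    """Sum of the runtimes of the chosen orderings (lower is better)."""
    return sum(r[c] for c, r in zip(choices, runtimes))


# Two hypothetical candidate models: both are right on problem 1 and wrong
# on problem 2, but A's wrong choice is nearly optimal and B's is very slow.
choices_a = ["xyz", "yxz"]
choices_b = ["xyz", "yzx"]

print(accuracy_score(choices_a, runtimes), accuracy_score(choices_b, runtimes))
print(total_runtime(choices_a, runtimes), total_runtime(choices_b, runtimes))
```

Under these assumed numbers both candidates have the same accuracy, so an accuracy-driven model selection cannot prefer one over the other, while a runtime-based criterion favours candidate A.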

@article{florescu_improved_2019,
	title = {Improved cross-validation for classifiers that make algorithmic choices to minimise runtime without compromising output correctness},
	url = {http://arxiv.org/abs/1911.12672},
	abstract = {Our topic is the use of machine learning to improve software by making choices which do not compromise the correctness of the output, but do affect the time taken to produce such output. We are particularly concerned with computer algebra systems (CASs), and in particular, our experiments are for selecting the variable ordering to use when performing a cylindrical algebraic decomposition of \$n\$-dimensional real space with respect to the signs of a set of polynomials. In our prior work we explored the different ML models that could be used, and how to identify suitable features of the input polynomials. In the present paper we both repeat our prior experiments on problems which have more variables (and thus exponentially more possible orderings), and examine the metric which our ML classifiers target. The natural metric is computational runtime, with classifiers trained to pick the ordering which minimises this. However, this leads to the situation where models do not distinguish between any of the non-optimal orderings, whose runtimes may still vary dramatically. In this paper we investigate a modification to the cross-validation algorithms of the classifiers so that they do distinguish these cases, leading to improved results.},
	urldate = {2019-12-08},
	journal = {arXiv:1911.12672 [cs]},
	author = {Florescu, Dorian and England, Matthew},
	month = nov,
	year = {2019},
	note = {arXiv: 1911.12672},
	keywords = {68W30, 68T05, 03C10, I.1.0, I.2.6, machine learning, mentions sympy, symbolic computation},
}

