Bilevel optimization for feature selection in the data-driven newsvendor problem. Serrano, B., Minner, S., Schiffer, M., & Vidal, T. Technical Report arXiv:2209.05093, 2022.

We study the feature-based newsvendor problem, in which a decision-maker has access to historical data consisting of demand observations and exogenous features. In this setting, we investigate feature selection, aiming to derive sparse, explainable models with improved out-of-sample performance. Up to now, state-of-the-art methods utilize regularization, which penalizes the number of selected features or the norm of the solution vector. As an alternative, we introduce a novel bilevel programming formulation. The upper-level problem selects a subset of features that minimizes an estimate of the out-of-sample cost of ordering decisions based on a held-out validation set. The lower-level problem learns the optimal coefficients of the decision function on a training set, using only the features selected by the upper-level. We present a mixed integer linear program reformulation for the bilevel program, which can be solved to optimality with standard optimization solvers. Our computational experiments show that the method accurately recovers ground-truth features already for instances with a sample size of a few hundred observations. In contrast, regularization-based techniques often fail at feature recovery or require thousands of observations to obtain similar accuracy. Regarding out-of-sample generalization, we achieve improved or comparable cost performance.
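
The two-level structure described in the abstract can be written out roughly as follows. This is only a sketch under assumed notation, not the paper's own formulation: the symbols $T$, $V$, $x_i$, $d_i$, $b$, $h$, $z$, $\beta$, and the linear order rule $q_\beta$ are introduced here for illustration, and the paper may differ in details such as how ties among lower-level optima are resolved.

```latex
% Illustrative notation (assumed here, not taken from the paper):
%   T, V       index sets of training and validation observations
%   x_i, d_i   feature vector in R^p and observed demand of observation i
%   b, h       unit underage and overage costs
%   z_j        1 if feature j is selected, 0 otherwise
%   \beta      coefficients of the linear order rule q_\beta
\begin{align*}
q_\beta(x) &= \beta_0 + \sum_{j=1}^{p} \beta_j x_j, \\
C_S(\beta) &= \frac{1}{|S|} \sum_{i \in S} \Big[ b\,\big(d_i - q_\beta(x_i)\big)^{+}
              + h\,\big(q_\beta(x_i) - d_i\big)^{+} \Big], \quad S \in \{T, V\}, \\
\text{(upper level)} \quad & \min_{z \in \{0,1\}^{p},\ \beta} \; C_V(\beta) \\
& \text{s.t.} \quad \beta \in \operatorname*{arg\,min}_{\beta' \in \mathbb{R}^{p+1}}
  \big\{ C_T(\beta') : \beta'_j (1 - z_j) = 0,\ j = 1, \dots, p \big\}
  \quad \text{(lower level)}.
\end{align*}
```

For a fixed selection $z$, the lower level in this sketch is a linear program: minimizing the empirical newsvendor cost over a linear order rule is quantile regression at the critical ratio $b/(b+h)$. The abstract states that the authors reformulate the bilevel program as a mixed integer linear program; the sketch above stops at the bilevel form and does not attempt to reproduce that reformulation.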
@techreport{Serrano2022,
abstract = {We study the feature-based newsvendor problem, in which a decision-maker has access to historical data consisting of demand observations and exogenous features. In this setting, we investigate feature selection, aiming to derive sparse, explainable models with improved out-of-sample performance. Up to now, state-of-the-art methods utilize regularization, which penalizes the number of selected features or the norm of the solution vector. As an alternative, we introduce a novel bilevel programming formulation. The upper-level problem selects a subset of features that minimizes an estimate of the out-of-sample cost of ordering decisions based on a held-out validation set. The lower-level problem learns the optimal coefficients of the decision function on a training set, using only the features selected by the upper-level. We present a mixed integer linear program reformulation for the bilevel program, which can be solved to optimality with standard optimization solvers. Our computational experiments show that the method accurately recovers ground-truth features already for instances with a sample size of a few hundred observations. In contrast, regularization-based techniques often fail at feature recovery or require thousands of observations to obtain similar accuracy. Regarding out-of-sample generalization, we achieve improved or comparable cost performance.},
archivePrefix = {arXiv},
arxivId = {2209.05093},
author = {Serrano, B. and Minner, S. and Schiffer, M. and Vidal, T.},
eprint = {2209.05093},
institution = {arXiv:2209.05093},
title = {{Bilevel optimization for feature selection in the data-driven newsvendor problem}},
url = {https://arxiv.org/pdf/2209.05093.pdf},
year = {2022}
}