Shriver, D. Assessing the Quality and Stability of Recommender Systems. Master's thesis, University of Nebraska - Lincoln, 2018.
@mastersthesis{shriver_assessing_2018,
	title = {Assessing the {Quality} and {Stability} of {Recommender} {Systems}},
	url = {https://digitalcommons.unl.edu/computerscidiss/147},
	abstract = {Recommender systems help users to find products they may like when lacking
personal experience or facing an overwhelmingly large set of items.
However, assessing the quality and stability of recommender systems can
present challenges for developers. First, traditional accuracy metrics,
such as precision and recall, for validating the quality of
recommendations, offer only a coarse, one-dimensional view of the system
performance. Second, assessing the stability of a recommender system
requires generating new data and retraining a system, which is expensive.
In this work, we present two new approaches for assessing the quality and
stability of recommender systems to address these challenges. We first
present a general and extensible approach for assessing the quality of the
behavior of a recommender system using logical property templates. The
approach is general in that it defines recommendation systems in terms of
sets of rankings, ratings, users, and items on which property templates
are defined. It is extensible in that these property templates define a
space of properties that can be instantiated and parameterized to
characterize a recommendation system. We study the application of the
approach to several recommendation systems. Our findings demonstrate the
potential of these properties, illustrating the insights they can provide
about the different algorithms and evolving datasets. We also present an
approach for influence-guided fuzz testing of recommender system
stability. We infer influence models for aspects of a dataset, such as
users or items, from the recommendations produced by a recommender system
and its training data. We define dataset fuzzing heuristics that use these
influence models for generating modifications to an original dataset and
we present a test oracle based on a threshold of acceptable instability.
We implement our approach and evaluate it on several recommender
algorithms using the MovieLens dataset and we find that influence-guided
fuzzing can effectively find small sets of modifications that cause
significantly more instability than random approaches.},
	urldate = {2018-05-08},
	school = {University of Nebraska - Lincoln},
	author = {Shriver, David},
	collaborator = {Elbaum, Sebastian},
	year = {2018},
	note = {Publication Title: Computer Science and Engineering},
}
