A Framework and Toolkit for Testing the Correctness of Recommendation Algorithms. Michiels, L., Verachtert, R., Ferraro, A., Falk, K., & Goethals, B. ACM Trans. Recomm. Syst., April 2023. New York, NY, USA: Association for Computing Machinery.
Evaluating recommender systems adequately and thoroughly is an important task. Significant efforts are dedicated to proposing metrics, methods and protocols for doing so. However, there has been little discussion in the recommender systems’ literature on the topic of testing. In this work, we adopt and adapt concepts from the software testing domain, e.g., code coverage, metamorphic testing, or property-based testing, to help researchers to detect and correct faults in recommendation algorithms. We propose a test suite that can be used to validate the correctness of a recommendation algorithm, and thus identify and correct issues that can affect the performance and behavior of these algorithms. Our test suite contains both black box and white box tests at every level of abstraction, i.e., system, integration and unit. To facilitate adoption, we release RecPack Tests, an open-source Python package containing template test implementations. We use it to test four popular Python packages for recommender systems: RecPack, PyLensKit, Surprise and Cornac. Despite the high test coverage of each of these packages, we find that we are still able to uncover undocumented functional requirements and even some bugs. This validates our thesis that testing the correctness of recommendation algorithms can complement traditional methods for evaluating recommendation algorithms.
@article{michiels_framework_2023,
	title = {A {Framework} and {Toolkit} for {Testing} the {Correctness} of {Recommendation} {Algorithms}},
	url = {https://doi.org/10.1145/3591109},
	doi = {10.1145/3591109},
	abstract = {Evaluating recommender systems adequately and thoroughly is an important
task. Significant efforts are dedicated to proposing metrics, methods and
protocols for doing so. However, there has been little discussion in the
recommender systems’ literature on the topic of testing. In this work, we
adopt and adapt concepts from the software testing domain, e.g., code
coverage, metamorphic testing, or property-based testing, to help
researchers to detect and correct faults in recommendation algorithms. We
propose a test suite that can be used to validate the correctness of a
recommendation algorithm, and thus identify and correct issues that can
affect the performance and behavior of these algorithms. Our test suite
contains both black box and white box tests at every level of abstraction,
i.e., system, integration and unit. To facilitate adoption, we release
RecPack Tests, an open-source Python package containing template test
implementations. We use it to test four popular Python packages for
recommender systems: RecPack, PyLensKit, Surprise and Cornac. Despite the
high test coverage of each of these packages, we find that we are still
able to uncover undocumented functional requirements and even some bugs.
This validates our thesis that testing the correctness of recommendation
algorithms can complement traditional methods for evaluating
recommendation algorithms.},
	journal = {ACM Trans. Recomm. Syst.},
	author = {Michiels, Lien and Verachtert, Robin and Ferraro, Andres and Falk, Kim and Goethals, Bart},
	month = apr,
	year = {2023},
	note = {Place: New York, NY, USA
Publisher: Association for Computing Machinery},
}
