A Framework and Toolkit for Testing the Correctness of Recommendation Algorithms. Michiels, L., Verachtert, R., Ferraro, A., Falk, K., & Goethals, B. ACM Trans. Recomm. Syst., April 2023. New York, NY, USA: Association for Computing Machinery.
Evaluating recommender systems adequately and thoroughly is an important task. Significant efforts are dedicated to proposing metrics, methods and protocols for doing so. However, there has been little discussion in the recommender systems’ literature on the topic of testing. In this work, we adopt and adapt concepts from the software testing domain, e.g., code coverage, metamorphic testing, or property-based testing, to help researchers to detect and correct faults in recommendation algorithms. We propose a test suite that can be used to validate the correctness of a recommendation algorithm, and thus identify and correct issues that can affect the performance and behavior of these algorithms. Our test suite contains both black box and white box tests at every level of abstraction, i.e., system, integration and unit. To facilitate adoption, we release RecPack Tests, an open-source Python package containing template test implementations. We use it to test four popular Python packages for recommender systems: RecPack, PyLensKit, Surprise and Cornac. Despite the high test coverage of each of these packages, we find that we are still able to uncover undocumented functional requirements and even some bugs. This validates our thesis that testing the correctness of recommendation algorithms can complement traditional methods for evaluating recommendation algorithms.
@article{michiels_framework_2023,
	title = {A {Framework} and {Toolkit} for {Testing} the {Correctness} of {Recommendation} {Algorithms}},
	url = {https://doi.org/10.1145/3591109},
	doi = {10.1145/3591109},
	abstract = {Evaluating recommender systems adequately and thoroughly is an important
task. Significant efforts are dedicated to proposing metrics, methods and
protocols for doing so. However, there has been little discussion in the
recommender systems’ literature on the topic of testing. In this work, we
adopt and adapt concepts from the software testing domain, e.g., code
coverage, metamorphic testing, or property-based testing, to help
researchers to detect and correct faults in recommendation algorithms. We
propose a test suite that can be used to validate the correctness of a
recommendation algorithm, and thus identify and correct issues that can
affect the performance and behavior of these algorithms. Our test suite
contains both black box and white box tests at every level of abstraction,
i.e., system, integration and unit. To facilitate adoption, we release
RecPack Tests, an open-source Python package containing template test
implementations. We use it to test four popular Python packages for
recommender systems: RecPack, PyLensKit, Surprise and Cornac. Despite the
high test coverage of each of these packages, we find that we are still
able to uncover undocumented functional requirements and even some bugs.
This validates our thesis that testing the correctness of recommendation
algorithms can complement traditional methods for evaluating
recommendation algorithms.},
	journal = {ACM Trans. Recomm. Syst.},
	author = {Michiels, Lien and Verachtert, Robin and Ferraro, Andres and Falk, Kim and Goethals, Bart},
	month = apr,
	year = {2023},
	note = {Place: New York, NY, USA
Publisher: Association for Computing Machinery},
}
