Evaluating the performance-deviation of itemKNN in RecBole and LensKit

Evaluating the performance-deviation of itemKNN in RecBole and LensKit. Schmidt, M., Nitschke, J., & Prinz, T. July, 2024. arXiv:2407.13531 [cs]


This study evaluates performance differences between the item-based k-nearest neighbors (ItemKNN) implementations in two recommender system libraries, RecBole and LensKit. Using four datasets (Anime, ModCloth, ML-100K, and ML-1M), we explore the efficiency, accuracy, and scalability of each library’s implementation of ItemKNN. The study involves replicating and reproducing experiments to ensure the reliability of results. We evaluate performance with normalized discounted cumulative gain (nDCG), precision, and recall, with nDCG as our main focus. Our initial findings indicate that RecBole outperforms LensKit on two out of three metrics: it achieved an 18% higher nDCG and a 14% higher precision, but a 35% lower recall. To ensure a fair comparison, we adjusted LensKit’s nDCG calculation to match RecBole’s approach. After the nDCG calculations were aligned, the performance of the two libraries became more comparable. Using implicit feedback, LensKit achieved an nDCG value of 0.2540, whereas RecBole attained a value of 0.2674. Further analysis revealed that the remaining deviations were caused by differences in how the similarity matrix is calculated. Our findings show that RecBole’s implementation outperforms the LensKit algorithm on three out of our four datasets. After implementing a similarity matrix calculation that retains only the top K similar items for each item (a method already incorporated in RecBole’s ItemKNN), we observed nearly identical nDCG values across all four of our datasets. For example, on the ML-1M dataset with the random seed set to 42, LensKit and RecBole both achieved an nDCG value of 0.2586. With LensKit’s original ItemKNN implementation, a higher nDCG value was obtained only on the ModCloth dataset.
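The top-K pruning step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code and does not use the RecBole or LensKit APIs; the function names `cosine_item_similarity` and `keep_top_k`, the choice of cosine similarity, and the toy interaction matrix are all assumptions made for illustration.

```python
import numpy as np


def cosine_item_similarity(interactions: np.ndarray) -> np.ndarray:
    """Cosine similarity between the item columns of a user x item matrix.

    Hypothetical helper for illustration; the paper's libraries may use a
    different similarity function or normalization.
    """
    norms = np.linalg.norm(interactions, axis=0)
    norms[norms == 0] = 1.0  # avoid dividing by zero for items nobody interacted with
    normalized = interactions / norms
    sim = normalized.T @ normalized
    np.fill_diagonal(sim, 0.0)  # an item should not count as its own neighbor
    return sim


def keep_top_k(sim: np.ndarray, k: int) -> np.ndarray:
    """Return a copy of `sim` with all but the k largest entries per row set to zero."""
    if k >= sim.shape[1]:
        return sim.copy()
    pruned = np.zeros_like(sim)
    # Column indices of the k largest similarities in each row (order within the k is arbitrary).
    top_idx = np.argpartition(sim, -k, axis=1)[:, -k:]
    rows = np.arange(sim.shape[0])[:, None]
    pruned[rows, top_idx] = sim[rows, top_idx]
    return pruned


# Toy implicit-feedback matrix: 4 users (rows) x 4 items (columns), 1 = interaction.
interactions = np.array(
    [
        [1, 1, 0, 0],
        [1, 1, 1, 0],
        [0, 1, 1, 1],
        [0, 0, 1, 1],
    ],
    dtype=float,
)

sim = cosine_item_similarity(interactions)
pruned = keep_top_k(sim, k=2)
print(np.count_nonzero(pruned, axis=1))  # at most 2 retained neighbors per item
```

Scoring with `pruned` instead of `sim` means each item contributes only through its K nearest neighbors, which is the kind of difference the abstract identifies as the source of the deviation between the two libraries.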

@misc{schmidt_evaluating_2024,
	title = {Evaluating the performance-deviation of {itemKNN} in {RecBole} and {LensKit}},
	url = {http://arxiv.org/abs/2407.13531},
	language = {en},
	urldate = {2024-08-15},
	publisher = {arXiv},
	author = {Schmidt, Michael and Nitschke, Jannik and Prinz, Tim},
	month = jul,
	year = {2024},
	note = {arXiv:2407.13531 [cs]},
}
