Scaling Collaborative Filtering to Large-Scale Bipartite Rating Graphs Using Lenskit and Spark

Scaling Collaborative Filtering to Large-Scale Bipartite Rating Graphs Using Lenskit and Spark. Sardianos, C., Varlamis, I., & Eirinaki, M. Pages 70–79, April 2017.


Popular social networking applications such as Facebook, Twitter, and Friendster generate very large graphs with different characteristics. These social networks are huge, comprising millions of nodes and edges that push existing graph mining algorithms and architectures to their limits. In product-rating graphs, users connect with each other and rate items in tandem. In such bipartite graphs, users and items are the nodes and ratings are the edges; collaborative filtering algorithms use the edge information (i.e., user ratings for items) to suggest items of potential interest to users. Existing algorithms can hardly scale up to the size of the entire graph and demand prohibitive resources to finish. This work employs a machine learning method for predicting the performance of collaborative filtering algorithms from the structural features of the bipartite graphs. Using a fast graph partitioning algorithm and information from the user friendship graph, the original bipartite graph is partitioned under different schemes (i.e., sets of smaller bipartite graphs). The schemes are evaluated against the predicted performance of the collaborative filtering algorithm, and the best partitioning scheme is employed for generating the recommendations. As a result, the collaborative filtering algorithms are applied to smaller bipartite graphs, using limited resources and allowing the problem to scale or be parallelized. Tests on a large, real-life rating graph show that the proposed method allows the collaborative filtering algorithms to run in parallel and complete using limited resources.
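
The partitioning idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: connected components of the friendship graph (found with union-find) stand in for the paper's unspecified fast graph partitioning algorithm, and rating density is used as one example of a structural feature. All function names and the toy data are hypothetical.

```python
from collections import defaultdict


def user_groups(friend_edges, users):
    """Assign each user a group label: the root of its connected component
    in the friendship graph (assumed stand-in for a real partitioner)."""
    parent = {u: u for u in users}
    for a, b in friend_edges:
        parent.setdefault(a, a)
        parent.setdefault(b, b)

    def find(u):
        # Follow parent links to the root, halving the path as we go.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for a, b in friend_edges:
        parent[find(a)] = find(b)
    return {u: find(u) for u in parent}


def partition_ratings(ratings, group_of):
    """Split (user, item, score) edges into smaller bipartite graphs,
    one per user group."""
    parts = defaultdict(list)
    for user, item, score in ratings:
        parts[group_of[user]].append((user, item, score))
    return list(parts.values())


def density(part):
    """Fraction of possible user-item pairs that carry a rating."""
    users = {u for u, _, _ in part}
    items = {i for _, i, _ in part}
    return len(part) / (len(users) * len(items))


# Toy data (hypothetical): four users, four items, two friendship edges.
ratings = [
    ("u1", "i1", 5.0), ("u2", "i1", 3.0), ("u2", "i2", 4.0),
    ("u3", "i3", 2.0), ("u4", "i3", 5.0), ("u4", "i4", 1.0),
]
friends = [("u1", "u2"), ("u3", "u4")]

groups = user_groups(friends, {u for u, _, _ in ratings})
parts = partition_ratings(ratings, groups)
densities = [density(p) for p in parts]
```

Each element of `parts` is a smaller bipartite graph that could be handed to a separate collaborative filtering run; features such as `densities` are the kind of input a performance predictor might use when comparing partitioning schemes.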

@inproceedings{sardianos_scaling_2017,
	title = {Scaling {Collaborative} {Filtering} to {Large}-{Scale} {Bipartite} {Rating} {Graphs} {Using} {Lenskit} and {Spark}},
	url = {http://dx.doi.org/10.1109/BigDataService.2017.28},
	doi = {10.1109/BigDataService.2017.28},
	abstract = {Popular social networking applications such as Facebook, Twitter,
Friendster, etc. generate very large graphs with different
characteristics. These social networks are huge, comprising millions of
nodes and edges that push existing graph mining algorithms and
architectures to their limits. In product-rating graphs, users connect
with each other and rate items in tandem. In such bipartite graphs users
and items are the nodes and ratings are the edges and collaborative
filtering algorithms use the edge information (i.e. user ratings for
items) in order to suggest items of potential interest to users. Existing
algorithms can hardly scale up to the size of the entire graph and require
unlimited resources to finish. This work employs a machine learning method
for predicting the performance of Collaborative Filtering algorithms using
the structural features of the bipartite graphs. Using a fast graph
partitioning algorithm and information from the user friendship graph, the
original bipartite graph is partitioned into different schemes (i.e. sets
of smaller bipartite graphs). The schemes are evaluated against the
predicted performance of the Collaborative Filtering algorithm and the
best partitioning scheme is employed for generating the recommendations.
As a result, the Collaborative Filtering algorithms are applied to smaller
bipartite graphs, using limited resources and allowing the problem to
scale or be parallelized. Tests on a large, real-life, rating graph, show
that the proposed method allows the collaborative filtering algorithms to
run in parallel and complete using limited resources.},
	author = {Sardianos, C and Varlamis, I and Eirinaki, M},
	month = apr,
	year = {2017},
	keywords = {Bipartite graph, Collaboration, Collaborative Filtering, Graph Metrics, Graph Partitioning, Lenskit, Machine learning algorithms, Partitioning algorithms, Prediction algorithms, Recommender Systems, Recommender systems, Social Networks, Social network services, Spark, bipartite graphs, collaborative filtering, collaborative filtering algorithms, data mining, fast graph partitioning algorithm, graph theory, large-scale bipartite rating graphs, learning (artificial intelligence), machine learning, product-rating graphs, social networking (online), social networking applications, structural features, user-friendship graph},
	pages = {70--79},
}
