Search Personalization using Machine Learning. Yoganarasimhan, H.
Firms typically use query-based search to help consumers find information/products on their websites. We consider the problem of optimally ranking a set of results shown in response to a query. We propose a personalized ranking mechanism based on a user’s search and click history. Our machine learning framework consists of three modules – (a) Feature generation, (b) NDCG-based LambdaMART algorithm, and (c) Feature selection wrapper. We deploy our framework on large-scale data from a leading search engine using Amazon EC2 servers. Personalization improves clicks to the top position by 3.5% and reduces the Average Error in Rank of a Click by 9.43% over the baseline. Personalization based on short-term history or within-session behavior is shown to be less valuable than long-term or across-session personalization. We find that there is significant heterogeneity in returns to personalization as a function of user history and query type. The quality of personalized results increases monotonically with the length of a user’s history. Queries can be classified based on user intent as Do-Know-Go, i.e., transactional, informational, or navigational, where the latter two benefit more from personalization. We also find that returns to personalization are negatively correlated with a query’s past average performance. Finally, we demonstrate the scalability of our framework and derive the set of optimal features that maximizes accuracy while minimizing computing time.
@article{yoganarasimhan_search_nodate,
	title = {Search {Personalization} using {Machine} {Learning}},
	abstract = {Firms typically use query-based search to help consumers find information/products on their websites. We consider the problem of optimally ranking a set of results shown in response to a query. We propose a personalized ranking mechanism based on a user’s search and click history. Our machine learning framework consists of three modules – (a) Feature generation, (b) NDCG-based LambdaMART algorithm, and (c) Feature selection wrapper. We deploy our framework on large-scale data from a leading search engine using Amazon EC2 servers. Personalization improves clicks to the top position by 3.5\% and reduces the Average Error in Rank of a Click by 9.43\% over the baseline. Personalization based on short-term history or within-session behavior is shown to be less valuable than long-term or across-session personalization. We find that there is significant heterogeneity in returns to personalization as a function of user history and query type. The quality of personalized results increases monotonically with the length of a user’s history. Queries can be classified based on user intent as Do-Know-Go, i.e., transactional, informational, or navigational, where the latter two benefit more from personalization. We also find that returns to personalization are negatively correlated with a query’s past average performance. Finally, we demonstrate the scalability of our framework and derive the set of optimal features that maximizes accuracy while minimizing computing time.},
	language = {en},
	author = {Yoganarasimhan, Hema},
	pages = {52}
}
