Blend and Match: Distilling Semantic Search Models with Different Inductive Biases and Model Architectures. Bonab, H., Joshi, A., Bhatia, R., Gandhi, A., Huddar, V., Naik, J., Al-Darabsah, M., Teo, C. H., May, J., Agarwal, T., & Petricek, V. In Companion Proceedings of the ACM Web Conference 2023 (WWW '23 Companion), pages 869–877, New York, NY, USA, 2023. Association for Computing Machinery.
Commercial search engines use different semantic models to augment lexical matches. These models provide candidate items for a user’s query from a target space of millions to billions of items. Models with different inductive biases provide relatively different predictions, making it desirable to launch multiple semantic models in production. However, latency and resource constraints make simultaneously deploying multiple models impractical. In this paper, we introduce a distillation approach, called Blend and Match (BM), to unify two different semantic search models into a single model. We use a Bi-encoder semantic matching model as our primary model and propose a novel loss function to incorporate eXtreme Multi-label Classification (XMC) predictions as the secondary model. Our experiments conducted on two large-scale datasets, collected from a popular e-commerce store, show that our proposed approach significantly improves the recall of the primary Bi-encoder model by 11% to 17% with a minimal loss in precision. We show that traditional knowledge distillation approaches result in a sub-optimal performance for our problem setting, and our BM approach yields comparable rankings with strong Rank Fusion (RF) methods used only if one could deploy multiple models.
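The abstract describes training a primary Bi-encoder on labeled query-item pairs while a novel loss term incorporates a secondary XMC model's predictions. The paper's actual loss is not reproduced here; the following is a minimal, hypothetical sketch of one conventional way to blend an in-batch contrastive objective with a KL-based distillation term, assuming PyTorch and illustrative names (blended_loss, alpha, tau) throughout.

# Hypothetical sketch only: blends a bi-encoder's in-batch contrastive loss
# with a KL distillation term toward a secondary model's scores. Names and
# hyperparameters are illustrative, not taken from the paper.
import torch
import torch.nn.functional as F

def blended_loss(query_emb, item_emb, labels, teacher_scores, alpha=0.5, tau=2.0):
    # query_emb, item_emb: (B, d) L2-normalized bi-encoder embeddings.
    # labels: (B,) index of each query's positive item (in-batch negatives).
    # teacher_scores: (B, B) secondary-model scores for the same query-item grid.
    student_scores = query_emb @ item_emb.T  # (B, B) similarity matrix
    # Primary objective: softmax cross-entropy against the true positives.
    ce = F.cross_entropy(student_scores, labels)
    # Secondary objective: match the teacher's softened score distribution.
    kd = F.kl_div(
        F.log_softmax(student_scores / tau, dim=-1),
        F.softmax(teacher_scores / tau, dim=-1),
        reduction="batchmean",
    ) * tau ** 2
    return (1 - alpha) * ce + alpha * kd

# Toy usage with random tensors standing in for real encoder/teacher outputs.
B, d = 8, 64
q = F.normalize(torch.randn(B, d), dim=-1)
i = F.normalize(torch.randn(B, d), dim=-1)
y = torch.arange(B)            # diagonal positives
t = torch.randn(B, B)          # stand-in teacher scores
print(blended_loss(q, i, y, t).item())

Here alpha trades off fidelity to the labels against agreement with the secondary model, and tau softens both score distributions before comparison; both values are placeholders, not figures from the paper.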
@inproceedings{10.1145/3543873.3587629,
author = {Bonab, Hamed and Joshi, Ashutosh and Bhatia, Ravi and Gandhi, Ankit and Huddar, Vijay and Naik, Juhi and Al-Darabsah, Mutasem and Teo, Choon Hui and May, Jonathan and Agarwal, Tarun and Petricek, Vaclav},
title = {Blend and Match: Distilling Semantic Search Models with Different Inductive Biases and Model Architectures},
year = {2023},
isbn = {9781450394192},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3543873.3587629},
doi = {10.1145/3543873.3587629},
abstract = {Commercial search engines use different semantic models to augment lexical matches. These models provide candidate items for a user’s query from a target space of millions to billions of items. Models with different inductive biases provide relatively different predictions, making it desirable to launch multiple semantic models in production. However, latency and resource constraints make simultaneously deploying multiple models impractical. In this paper, we introduce a distillation approach, called Blend and Match (BM), to unify two different semantic search models into a single model. We use a Bi-encoder semantic matching model as our primary model and propose a novel loss function to incorporate eXtreme Multi-label Classification (XMC) predictions as the secondary model. Our experiments conducted on two large-scale datasets, collected from a popular e-commerce store, show that our proposed approach significantly improves the recall of the primary Bi-encoder model by 11\% to 17\% with a minimal loss in precision. We show that traditional knowledge distillation approaches result in a sub-optimal performance for our problem setting, and our BM approach yields comparable rankings with strong Rank Fusion (RF) methods used only if one could deploy multiple models.},
booktitle = {Companion Proceedings of the ACM Web Conference 2023},
pages = {869--877},
numpages = {9},
keywords = {Semantic Search, Ranking Distillation, Product Search, Model Blending},
location = {Austin, TX, USA},
series = {WWW '23 Companion}
}
