Information Retrieval Models: Foundations and Relationships

Information Retrieval Models: Foundations and Relationships. Roelleke, T. Morgan & Claypool, 2013.

Information Retrieval (IR) models are a core component of IR research and IR systems. The past decade brought a consolidation of the family of IR models, which by 2000 consisted of relatively isolated views on TF-IDF (Term-Frequency times Inverse-Document-Frequency) as the weighting scheme in the vector-space model (VSM), the probabilistic relevance framework (PRF), the binary independence retrieval (BIR) model, BM25 (Best-Match Version 25, the main instantiation of the PRF/BIR), and language modelling (LM). Also, the early 2000s saw the arrival of divergence from randomness (DFR). Regarding intuition and simplicity, though LM is clear from a probabilistic point of view, several people stated: "It is easy to understand TF-IDF and BM25. For LM, however, we understand the math, but we do not fully understand why it works." This book takes a horizontal approach, gathering the foundations of TF-IDF, PRF, BIR, Poisson, BM25, LM, probabilistic inference networks (PINs), and divergence-based models. The aim is to create a consolidated and balanced view of the main models. A particular focus of this book is on the "relationships between models." This includes an overview of the main frameworks (PRF, logical IR, VSM, generalized VSM) and a pairing of TF-IDF with other models. It becomes evident that TF-IDF and LM measure the same thing, namely the dependence (overlap) between document and query. The Poisson probability helps to establish probabilistic, non-heuristic roots for TF-IDF, and the Poisson parameter, average term frequency, is a binding link between several retrieval models and model parameters.

Table of Contents: List of Figures / Preface / Acknowledgments / Introduction / Foundations of IR Models / Relationships Between IR Models / Summary & Research Outlook / Bibliography / Author's Biography / Index

@book{Roelleke2013,
	title = {Information {Retrieval} {Models}: {Foundations} and {Relationships}},
	abstract = {Information Retrieval (IR) models are a core component of IR research and IR systems. The past decade brought a consolidation of the family of IR models, which by 2000 consisted of relatively isolated views on TF-IDF (Term-Frequency times Inverse-Document-Frequency) as the weighting scheme in the vector-space model (VSM), the probabilistic relevance framework (PRF), the binary independence retrieval (BIR) model, BM25 (Best-Match Version 25, the main instantiation of the PRF/BIR), and language modelling (LM). Also, the early 2000s saw the arrival of divergence from randomness (DFR). Regarding intuition and simplicity, though LM is clear from a probabilistic point of view, several people stated: "It is easy to understand TF-IDF and BM25. For LM, however, we understand the math, but we do not fully understand why it works." This book takes a horizontal approach, gathering the foundations of TF-IDF, PRF, BIR, Poisson, BM25, LM, probabilistic inference networks (PINs), and divergence-based models. The aim is to create a consolidated and balanced view of the main models. A particular focus of this book is on the "relationships between models." This includes an overview of the main frameworks (PRF, logical IR, VSM, generalized VSM) and a pairing of TF-IDF with other models. It becomes evident that TF-IDF and LM measure the same thing, namely the dependence (overlap) between document and query. The Poisson probability helps to establish probabilistic, non-heuristic roots for TF-IDF, and the Poisson parameter, average term frequency, is a binding link between several retrieval models and model parameters. Table of Contents: List of Figures / Preface / Acknowledgments / Introduction / Foundations of IR Models / Relationships Between IR Models / Summary \& Research Outlook / Bibliography / Author's Biography / Index},
	publisher = {Morgan \& Claypool},
	author = {Roelleke, Thomas},
	year = {2013},
	doi = {10.2200/S00494ED1V01Y201304ICR027},
	keywords = {information retrieval, models},
}
