Models in Information Retrieval. Fuhr, N. In Agosti, M, Crestani, F, & Pasi, G, editors, Lectures in Information Retrieval, pages 21–50. Springer, Heidelberg et al., 2001.
Retrieval models form the theoretical basis for computing the answer to a query. They differ not only in the syntax and expressiveness of the query language, but also in the representation of the documents. Following van Rijsbergen's approach of regarding IR as uncertain inference, we can distinguish models according to the expressiveness of the underlying logic and the way uncertainty is handled. Classical retrieval models are based on propositional logic. Boolean retrieval ignores uncertainty, whereas fuzzy retrieval uses fuzzy logic for this purpose, and probabilistic retrieval is based on probability theory. In the vector space model, documents and queries are represented as vectors in a vector space spanned by the index terms, and uncertainty is modelled by considering geometric similarity. Probabilistic models make assumptions about the distribution of terms in relevant and nonrelevant documents in order to estimate the probability of relevance of a document for a query. Language models compute the probability that the query is generated from a document. For IR applications dealing not only with texts, but also with multimedia or factual data, propositional logic is not sufficient. Therefore, advanced IR models use restricted forms of predicate logic as a basis. Terminological/description logics are rooted in semantic networks and terminological languages such as KL-ONE. Datalog uses function-free Horn clauses. Probabilistic versions of both approaches are able to cope with the intrinsic uncertainty of IR.
@incollection{Fuhr:00a,
	address = {Heidelberg et al.},
	title = {Models in {Information} {Retrieval}},
	abstract = {Retrieval models form the theoretical basis for
computing the answer to a query. They differ not only
in the syntax and expressiveness of the query language,
but also in the representation of the documents.
Following van Rijsbergen's approach of regarding IR as
uncertain inference, we can distinguish models
according to the expressiveness of the underlying logic
and the way uncertainty is handled. Classical retrieval
models are based on propositional logic. Boolean
retrieval ignores uncertainty, whereas fuzzy retrieval
uses fuzzy logic for this purpose, and probabilistic
retrieval is based on probability theory. In the vector
space model, documents and queries are represented as
vectors in a vector space spanned by the index terms,
and uncertainty is modelled by considering geometric
similarity. Probabilistic models make assumptions about
the distribution of terms in relevant and nonrelevant
documents in order to estimate the probability of
relevance of a document for a query. Language models
compute the probability that the query is generated
from a document. For IR applications dealing not only
with texts, but also with multimedia or factual data,
propositional logic is not sufficient. Therefore,
advanced IR models use restricted forms of predicate
logic as a basis. Terminological/description logics are
rooted in semantic networks and terminological
languages such as KL-ONE. Datalog uses function-free
Horn clauses. Probabilistic versions of both approaches
are able to cope with the intrinsic uncertainty of IR.},
	booktitle = {Lectures in {Information} {Retrieval}},
	publisher = {Springer},
	author = {Fuhr, Norbert},
	editor = {Agosti, M and Crestani, F and Pasi, G},
	year = {2001},
	pages = {21--50},
}
