Deep Query Likelihood Model for Information Retrieval

Shengyao Zhuang, Hang Li, Guido Zuccon

Published in Advances in Information Retrieval - 43rd European Conference on IR Research, ECIR 2021, 2021, Short paper


Abstract

The query likelihood model (QLM) for information retrieval has been thoroughly investigated and utilised. At the basis of this method is the representation of queries and documents as language models; retrieval then corresponds to evaluating the likelihood that the query could be generated by the document. Several approaches have arisen to compute this probability, including maximum likelihood estimation, smoothing, and translation probabilities from related terms. In this paper, we consider estimating this likelihood using modern pre-trained deep language models, and in particular the text-to-text transfer transformer (T5), giving rise to the QLM-T5. This approach is evaluated on the passage ranking task of the MS MARCO dataset; empirical results show that QLM-T5 significantly outperforms traditional QLM methods, as well as a recent ad-hoc method that exploits T5 for this task.
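
To make the query likelihood idea concrete, the following is a minimal sketch of a traditional QLM scorer, not the paper's QLM-T5 implementation. It assumes Dirichlet smoothing, which is one common choice (the abstract does not say which smoothing the baselines use), and the function name, the toy documents, and the value of mu are all hypothetical. A deep variant in the spirit of the abstract would presumably replace the smoothed per-term probability with token probabilities from a pre-trained language model conditioned on the document.

```python
import math
from collections import Counter


def dirichlet_qlm_score(query_terms, doc_terms, collection_tf, collection_len, mu=2000.0):
    """Log-likelihood of the query under a Dirichlet-smoothed document language model.

    Assumption: Dirichlet smoothing with parameter mu; other smoothing
    schemes would change only the per-term probability below.
    """
    doc_tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        p_collection = collection_tf.get(term, 0) / collection_len
        if p_collection == 0.0:
            # Term never occurs in the collection: skip it (a common convention).
            continue
        # Smoothed estimate of P(term | document).
        p_term = (doc_tf[term] + mu * p_collection) / (doc_len + mu)
        score += math.log(p_term)
    return score


# Toy collection (hypothetical data, for illustration only).
docs = {
    "d1": "deep language models rank passages".split(),
    "d2": "cooking pasta at home".split(),
}
collection_tf = Counter(term for terms in docs.values() for term in terms)
collection_len = sum(collection_tf.values())

query = "language models".split()
# Rank documents by decreasing query log-likelihood.
ranking = sorted(
    docs,
    key=lambda d: dirichlet_qlm_score(query, docs[d], collection_tf, collection_len),
    reverse=True,
)
print(ranking)
```

Scores are summed in log space so that long queries do not underflow; only the relative order of the scores matters for ranking.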

Download paper here

Code