An initial investigation of query expansion bias

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Query expansion is a useful retrieval mechanism for creating more verbose queries from the user's initial keyword search. Query expansion methods generally have multiple parameters that allow the user to define how many terms are introduced to the expanded query and where those terms come from. However, the idea that query expansion may introduce bias into the system by selecting terms from overly retrievable documents has never been formally evaluated. In this work, the relationship between performance and retrievability bias is explored when various query expansion methods are employed to aid retrieval. Several parameters are altered independently to identify those that have an impact on bias. The parameters altered include Rocchio's beta, length normalisation parameters, the number of terms added and the number of documents those terms are extracted from. The evaluation performed here identifies a strong correlation between performance and retrievability bias, suggesting that performance is increased by making the system more biased and thus more likely to pick terms from a set of overly retrievable documents.
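
To make the parameters named in the abstract concrete, the following is a minimal sketch of Rocchio-style pseudo-relevance feedback. It is not the authors' implementation: the function name `rocchio_expand`, the arguments `fb_docs` and `fb_terms`, the fixed alpha of 1, the omission of negative feedback, and the simple division by document length are all assumptions made for illustration.

```python
from collections import Counter


def rocchio_expand(query_terms, ranked_docs, beta=0.4, fb_docs=2, fb_terms=3):
    """Expand a query with a simplified Rocchio update.

    Assumptions (illustrative only): alpha is fixed to 1, there is no
    negative feedback, and term frequencies are normalised by document
    length. `ranked_docs` is a list of tokenised documents, best first.
    """
    # Original query vector: raw term counts as floats.
    weights = {t: float(c) for t, c in Counter(query_terms).items()}

    # Feedback set: the top-ranked documents that terms are extracted from.
    top = [doc for doc in ranked_docs[:fb_docs] if doc]
    if not top:
        return weights

    # Centroid of length-normalised term frequencies over the feedback set.
    centroid = Counter()
    for doc in top:
        for term, count in Counter(doc).items():
            centroid[term] += count / len(doc) / len(top)

    # New terms: the highest-weighted centroid terms not already in the
    # query, with ties broken alphabetically so the result is deterministic.
    candidates = sorted(
        (t for t in centroid if t not in weights),
        key=lambda t: (-centroid[t], t),
    )[:fb_terms]

    # Rocchio update: q' = q + beta * centroid, limited to the kept terms.
    for term in weights:
        weights[term] += beta * centroid.get(term, 0.0)
    for term in candidates:
        weights[term] = beta * centroid[term]
    return weights


if __name__ == "__main__":
    docs = [
        ["query", "expansion", "bias", "bias"],
        ["retrievability", "bias", "query", "terms"],
        ["unrelated", "document"],
    ]
    print(rocchio_expand(["query", "bias"], docs, beta=0.5, fb_docs=2, fb_terms=1))
```

In this sketch, `beta` controls how strongly the feedback documents influence the expanded query, `fb_terms` is the number of terms added, and `fb_docs` is the number of documents those terms are extracted from. If the top-ranked documents are themselves overly retrievable, every added term comes from that set, which is the mechanism the abstract is concerned with.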

Original language: English
Title of host publication: ICTIR 2017 - Proceedings of the 2017 ACM SIGIR International Conference on the Theory of Information Retrieval
Place of Publication: New York
Pages: 285-288
Number of pages: 4
DOIs
State: Published - 1 Oct 2017
Event: 7th ACM SIGIR International Conference on the Theory of Information Retrieval, ICTIR 2017 - Amsterdam, Netherlands

Conference

Conference: 7th ACM SIGIR International Conference on the Theory of Information Retrieval, ICTIR 2017
Country: Netherlands
City: Amsterdam
Period: 1/10/17 to 4/10/17

    Research areas

  • performance, retrievability bias, information retrieval
