Disambiguated query suggestions and personalized content-similarity and novelty ranking of clustered results to optimize web searches

PSAILA, Giuseppe
2012-05-01

Abstract

In this paper, we address the so-called "ranked list problem" of Web searches, which occurs when users submit short requests to search engines. Because terms are often ambiguous and polysemous, users typically engage in long cycles of query reformulation in an attempt to bring relevant information into the top-ranked results. The overall objective of the proposal is to help users optimize Web searches by reducing the need for long search iterations. Specifically, we describe an iterative query disambiguation mechanism that follows three main phases. (1) The results of a Web search, performed by the user by submitting a query to a search engine, are clustered. (2) The clusters are ranked according to a personalized balance of their content similarity to the query and their novelty. (3) From each cluster, a disambiguated query that highlights the main contents of the cluster is generated, so that the new query can potentially retrieve documents not previously retrieved; the disambiguated queries serve as suggestions for possibly new and more focused searches. The paper describes the proposal and illustrates a sample application of the mechanism. Finally, the paper presents a user evaluation experiment of the proposed approach, comparing it with the common practice of using search engines directly.
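The three phases in the abstract can be pictured with a minimal Python sketch. It is not the method of the paper: the abstract gives no formulas, so everything here is an assumption made for illustration. This includes the whitespace tokenizer, cosine similarity over term counts, novelty measured as the fraction of a cluster's URLs not seen in earlier iterations, the linear weight `alpha` standing in for the "personalized balance", and the hypothetical names `rank_clusters` and `suggest_query`. Clustering (phase 1) is taken as already done.

```python
from collections import Counter
from math import sqrt


def tokens(text):
    """Lowercase whitespace tokenizer (assumption: no stemming, no stop words)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_clusters(query, clusters, seen_urls, alpha=0.5):
    """Phase 2 (illustrative): score = alpha * similarity + (1 - alpha) * novelty.

    clusters:  dict mapping a cluster label to a list of (url, snippet) pairs.
    seen_urls: set of URLs already retrieved in earlier search iterations.
    alpha:     hypothetical user preference in [0, 1]; 1 favors similarity only.
    """
    q = Counter(tokens(query))
    scored = []
    for label, docs in clusters.items():
        bag = Counter(t for _, snippet in docs for t in tokens(snippet))
        similarity = cosine(q, bag)
        # Novelty: share of the cluster's documents not retrieved before.
        novelty = sum(url not in seen_urls for url, _ in docs) / len(docs) if docs else 0.0
        scored.append((alpha * similarity + (1 - alpha) * novelty, label))
    return [label for _, label in sorted(scored, reverse=True)]


def suggest_query(query, docs, extra_terms=2):
    """Phase 3 (illustrative): extend the query with frequent cluster terms."""
    q_terms = set(tokens(query))
    counts = Counter(
        t for _, snippet in docs for t in tokens(snippet) if t not in q_terms
    )
    return " ".join([query] + [t for t, _ in counts.most_common(extra_terms)])


# Toy data: two clusters for the ambiguous query "jaguar".
clusters = {
    "animal": [("u1", "jaguar cat habitat"), ("u2", "jaguar cat prey")],
    "car": [("u3", "jaguar car dealer"), ("u4", "jaguar car price car")],
}
seen = {"u1", "u2"}  # the "animal" results were already retrieved

ranking_balanced = rank_clusters("jaguar", clusters, seen, alpha=0.5)
ranking_similarity_only = rank_clusters("jaguar", clusters, seen, alpha=1.0)
suggestion = suggest_query("jaguar", clusters["car"], extra_terms=1)
```

With this toy data, an even balance puts the unseen "car" cluster first, while `alpha=1.0` ranks by similarity alone and puts "animal" first. The single-term suggestion for the "car" cluster is "jaguar car".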
Journal article
1 May 2012
Bordogna, Gloria; Campi, Alessandro; Psaila, Giuseppe; Ronchi, Stefania
This article is published under a Creative Commons license.

Use this identifier to cite or link to this document: https://hdl.handle.net/10446/28461
Citations
  • Scopus 20
  • ISI 12