The crowd can be an incredible source of information. In particular, this is true for reviews about products of any kind, freely provided by customers through specialized web sites. In other words, they are social knowledge, that can be exploited by other customers. The Hints From the Crowd (HFC) prototype, presented in this paper, is a NoSQL database system for large collections of product reviews; the database is queried by expressing a natural language sentence; the result is a list of products ranked based on the relevance of reviews w.r.t. the natural language sentence. The best ranked products in the result list can be seen as the best hints for the user based on crowd opinions (the reviews). In this paper, we mainly describe the query engine, and we show that our prototype obtains good performance in terms of execution time, demonstrating that our approach is feasible. The IMDb dataset, that includes more than 2 million reviews for more than 100,000 movies, is used to evaluate performance.
(2013). Hints from the Crowd: A Novel NoSQL Database [conference presentation - intervento a convegno]. Retrieved from http://hdl.handle.net/10446/30152
Hints from the Crowd: A Novel NoSQL Database
FOSCI, Paolo;PSAILA, Giuseppe;
2013-01-01
Abstract
The crowd can be an incredible source of information. In particular, this is true for reviews about products of any kind, freely provided by customers through specialized web sites. In other words, they are social knowledge, that can be exploited by other customers. The Hints From the Crowd (HFC) prototype, presented in this paper, is a NoSQL database system for large collections of product reviews; the database is queried by expressing a natural language sentence; the result is a list of products ranked based on the relevance of reviews w.r.t. the natural language sentence. The best ranked products in the result list can be seen as the best hints for the user based on crowd opinions (the reviews). In this paper, we mainly describe the query engine, and we show that our prototype obtains good performance in terms of execution time, demonstrating that our approach is feasible. The IMDb dataset, that includes more than 2 million reviews for more than 100,000 movies, is used to evaluate performance.File | Dimensione del file | Formato | |
---|---|---|---|
MEDI2013.pdf
accesso aperto
Descrizione: draft - bozza
Dimensione del file
361.08 kB
Formato
Adobe PDF
|
361.08 kB | Adobe PDF | Visualizza/Apri |
Pubblicazioni consigliate
Aisberg ©2008 Servizi bibliotecari, Università degli studi di Bergamo | Terms of use/Condizioni di utilizzo