(2025). Detecting Semantic Relationships Among Datasets. Retrieved from https://hdl.handle.net/10446/310987

Detecting Semantic Relationships Among Datasets

Fosci, Paolo; Psaila, Giuseppe
2025-01-01

Abstract

The Big Data era has shown that classical relational databases are not suited to managing a wide variety of datasets: novel platforms have become necessary, as demonstrated by the popularity of “data lakes” and “data lakehouses”. A common problem in modern data platforms is detecting pairs of datasets that concern the same topic. Purely syntactic matching is not effective; modern AI techniques for Natural-Language Processing, such as word embedding and sentence embedding, promise to address the problem in a (more or less) semantic way. The contribution of the paper is a novel methodology (called “TopicRank”) for flexibly querying data platforms, so as to find pairs of datasets that concern the same topic on the basis of the textual descriptions that accompany datasets as metadata. The paper presents the results of a preliminary experiment conducted on a real pool of datasets.
paolo.fosci@unibg.it
2025
English
Flexible Query Answering Systems. 16th International Conference, FQAS 2025, Burgas, Bulgaria, September 11–13, 2025, Proceedings
De Tré, Guy; Sotirov, Sotir; Kacprzyk, Janusz; Psaila, Giuseppe; Smits, Grégory; Andreasen, Troels; Bordogna, Gloria; Larsen, Henrik Legind
9783032056061
978-3-032-05607-8
16119
219
231
print
online
Switzerland
Cham
Springer
FQAS 2025: 16th International Conference on Flexible Query Answering Systems, Burgas, Bulgaria, 11-13 September 2025
16th
Burgas, Bulgaria
11-13 September 2025
international
contribution
Sector IINF-05/A - Information Processing Systems
Big Data Platforms; Flexible Query on Datasets; Language Models for Information Retrieval; Methodology for Topic Detection; Semantic Topic Detection
   Growing Resilient Inclusive And Sustainable (GRINS)
   GRINS
   MUR - MINISTERO DELL'UNIVERSITA' E DELLA RICERCA - Segretariato generale Direzione generale della ricerca - Ufficio IV
info:eu-repo/semantics/conferenceObject
7
Fosci, Paolo; Carbone, Vincenzo; Leo, Matteo; Marmorato, Andrea; Psaila, Giuseppe; Rosa, Giampiero; Torabi, Mohammadsadegh
1.4 Contributions in conference proceedings::1.4.01 Conference presentations
reserved
Not defined
273
Files attached to this record:
File: Detecting Semantic Relationships Fosci Paolo_ridotto.pdf
Access: archive managers only
Version: publisher's version
License: Aisberg default license
File size: 239.04 kB
Format: Adobe PDF
Use this identifier to cite or link to this document: https://hdl.handle.net/10446/310987