(2025). Automated Phenotype-Based Clustering of Clinical Reports Using Large Language Models. Retrieved from https://hdl.handle.net/10446/306126

Automated Phenotype-Based Clustering of Clinical Reports Using Large Language Models

Saletta, Martina; Bombarda, Andrea; Cazzaniga, Paolo; Savo, Domenico Fabio
2025-01-01

Abstract

Large Language Models (LLMs) have shown significant potential in natural language processing tasks, including various applications in clinical and biomedical domains. This study explores the use of LLMs for analyzing a real dataset from Italian clinical reports and proposes a pipeline for automatically clustering these reports based on the described symptoms. The pipeline incorporates two approaches: (1) direct analysis of textual descriptions in the clinical reports, and (2) standardized processing through the automatic extraction of Human Phenotype Ontology terms using LLM-based methods. The obtained clusters will serve as the foundation for further predictive analyses, such as estimating the likelihood of a patient carrying specific genetic mutations. Our investigation compares the performance of direct text analysis against phenotype-standardized descriptions, highlighting the strengths and limitations of each approach.
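
The abstract does not say which clustering algorithm or similarity measure the pipeline uses. As a rough illustration of approach (2) only, the sketch below assumes the Human Phenotype Ontology terms have already been extracted from each report, and groups reports whose term sets overlap. The report IDs, the toy term sets, the Jaccard measure, the threshold, and the function names are all assumptions made for this example, not details taken from the study.

```python
from itertools import combinations

# Hypothetical toy data: report ID -> HPO term IDs assumed to have been
# extracted upstream (e.g. by an LLM). Not taken from the study's dataset.
reports = {
    "r1": {"HP:0001250", "HP:0001263"},
    "r2": {"HP:0001250", "HP:0001263", "HP:0000252"},
    "r3": {"HP:0001629", "HP:0001631"},
    "r4": {"HP:0001629"},
}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_reports(reports, threshold=0.5):
    """Single-linkage grouping: reports end up in the same cluster when a
    chain of pairwise similarities >= threshold connects them."""
    parent = {rid: rid for rid in reports}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(reports, 2):
        if jaccard(reports[a], reports[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for rid in reports:
        groups.setdefault(find(rid), []).append(rid)
    return sorted(sorted(group) for group in groups.values())


print(cluster_reports(reports))
```

On this toy input the two reports sharing neurological terms fall into one group and the two sharing cardiac terms into another. A real pipeline would likely also need to account for the ontology hierarchy, which plain set overlap ignores.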
2025
Saletta, Martina; Bombarda, Andrea; Bellini, Matteo; Goisis, Lucrezia; Cazzaniga, Paolo; Iascone, Maria; Savo, Domenico Fabio
Files attached to this record:
File: 978-3-031-95841-0 chapter.pdf
Access: archive managers only
Version: publisher's version
License: Aisberg default license
File size: 706.38 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/10446/306126