(2025). Automated Phenotype-Based Clustering of Clinical Reports Using Large Language Models. Retrieved from https://hdl.handle.net/10446/306126

Automated Phenotype-Based Clustering of Clinical Reports Using Large Language Models

Saletta, Martina; Bombarda, Andrea; Cazzaniga, Paolo; Savo, Domenico Fabio
2025-01-01

Abstract

Large Language Models (LLMs) have shown significant potential in natural language processing tasks, including various applications in clinical and biomedical domains. This study explores the use of LLMs for analyzing a real dataset from Italian clinical reports and proposes a pipeline for automatically clustering these reports based on the described symptoms. The pipeline incorporates two approaches: (1) direct analysis of textual descriptions in the clinical reports, and (2) standardized processing through the automatic extraction of Human Phenotype Ontology terms using LLM-based methods. The obtained clusters will serve as the foundation for further predictive analyses, such as estimating the likelihood of a patient carrying specific genetic mutations. Our investigation compares the performance of direct text analysis against phenotype-standardized descriptions, highlighting the strengths and limitations of each approach.
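
As a rough illustration of approach (2), the sketch below groups reports by the overlap of their Human Phenotype Ontology term sets. It is not the authors' method: this record does not state which similarity measure or clustering algorithm the paper uses, so the Jaccard distance, the single-linkage grouping, the `max_distance` threshold, the function names, and the four toy reports are all assumptions made for the example. The LLM-based extraction step is omitted, and the term sets are taken as already extracted.

```python
from __future__ import annotations

from itertools import combinations


def jaccard_distance(a: set[str], b: set[str]) -> float:
    """Return 1 - |a ∩ b| / |a ∪ b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_reports(reports: dict[str, set[str]],
                    max_distance: float = 0.5) -> list[set[str]]:
    """Single-linkage grouping: link every pair of reports whose Jaccard
    distance is <= max_distance and return the connected components."""
    parent = {rid: rid for rid in reports}

    def find(x: str) -> str:
        # Follow parent links to the component root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(reports, 2):
        if jaccard_distance(reports[a], reports[b]) <= max_distance:
            parent[find(a)] = find(b)

    components: dict[str, set[str]] = {}
    for rid in reports:
        components.setdefault(find(rid), set()).add(rid)
    # Sort for a deterministic output order.
    return sorted(components.values(), key=lambda c: sorted(c))


# Invented toy data: report IDs mapped to HPO term IDs (assumed already extracted).
reports = {
    "r1": {"HP:0001250", "HP:0001263", "HP:0000252"},  # seizure, developmental delay, microcephaly
    "r2": {"HP:0001250", "HP:0001263"},
    "r3": {"HP:0001631", "HP:0001629"},                # atrial and ventricular septal defect
    "r4": {"HP:0001629", "HP:0001631", "HP:0001250"},
}

clusters = cluster_reports(reports)
print(clusters)
```

With these toy inputs, the two neurological reports and the two cardiac reports fall into separate groups. Approach (1) would instead need a representation of the free text itself, which this sketch does not cover.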
Year: 2025
Language: English
Book title: Artificial Intelligence in Medicine. 23rd International Conference, AIME 2025, Pavia, Italy, June 23–26, 2025, Proceedings, Part II
ISBN: 9783031958403
Volume: 15735 LNAI
Pages: 345-350
Medium: print; online
Country of publication: Switzerland
Publisher: Springer
Conference: AIME 2025: 23rd International Conference on Artificial Intelligence in Medicine; Pavia, Italy, 23-26 June 2025
Conference edition: 23rd
Conference location: Pavia, Italy
Conference dates: 23-26 June 2025
Scope: international
Type: contribution
Scientific sector: Sector IINF-05/A - Information Processing Systems
Keywords: Large Language Models; Phenotype Clustering; Human Phenotype Ontology; Clinical Reports
   ANTHEM - AdvaNced Technologies for Human-centrEd Medicine
   ANTHEM
   MUR - MINISTERO DELL'UNIVERSITA' E DELLA RICERCA - Segretariato generale Direzione generale della ricerca - Ufficio IV
info:eu-repo/semantics/conferenceObject
7
Saletta, Martina; Bombarda, Andrea; Bellini, Matteo; Goisis, Lucrezia; Cazzaniga, Paolo; Iascone, Maria; Savo, Domenico Fabio
1.4 Contributions in conference proceedings::1.4.01 Conference presentations
reserved
Not defined
273
Files attached to this record:
File: 978-3-031-95841-0 chapter.pdf
Access: archive managers only
Version: publisher's version
License: Aisberg default license
File size: 706.38 kB
Format: Adobe PDF
Use this identifier to cite or link to this document: https://hdl.handle.net/10446/306126