Antonini, L., Manzoni, V., Giardini, C., & Quarto, M. (2025). Applications of large language models to customer satisfaction survey for summarization and topic extraction in manufacturing [journal article]. Results in Engineering. Retrieved from https://hdl.handle.net/10446/309950

Applications of large language models to customer satisfaction survey for summarization and topic extraction in manufacturing

Antonini, Laura; Manzoni, Vincenzo; Giardini, Claudio; Quarto, Mariangela
2025-01-01

Abstract

This study explores the application of Large Language Models (LLMs) to the analysis of unstructured data from customer satisfaction surveys in the manufacturing sector. The analysis proceeds in several stages: first, the models summarize textual comments, identifying strengths and areas for improvement. Topic extraction and sentiment analysis are then performed with GPT-4o, Gemini 1.5 Pro, Claude 3.5, Llama 3.1, and Llama 3.3. The task consists of identifying the topics covered in each comment and determining their sentiment (positive, negative, neutral, or mixed). To assess response quality, the results are compared against a ground truth produced by a human analyst; the metrics chosen to measure the similarity between the human and model classifications are the Jaccard index and accuracy. Prompt engineering is also evaluated: a more structured few-shot prompt is built to test whether more detailed topic explanations improve model performance. Finally, the study analyses the cost-benefit trade-off, comparing the performance, response times, and computational costs of the different models. The results show that more advanced models offer higher accuracy at a higher cost, while open-source models are a cheaper alternative with lower performance.
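The two comparison metrics named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the topic sets and sentiment labels below are hypothetical examples, not data from the paper.

```python
# Illustrative sketch: comparing model output against a human-annotated
# ground truth using the Jaccard index (for topic sets) and accuracy
# (for sentiment labels). All data below is made up for illustration.

def jaccard_index(predicted: set, ground_truth: set) -> float:
    """Size of the intersection over the size of the union of two label sets."""
    if not predicted and not ground_truth:
        return 1.0  # two empty sets are treated as a perfect match
    return len(predicted & ground_truth) / len(predicted | ground_truth)

# Topics a model might extract from one survey comment vs. the analyst's labels.
model_topics = {"delivery time", "product quality"}
human_topics = {"delivery time", "product quality", "customer service"}
topic_similarity = jaccard_index(model_topics, human_topics)  # 2/3

# Sentiment accuracy over a small batch of comments
# (positive / negative / neutral / mixed, as in the abstract).
model_sentiments = ["positive", "negative", "neutral", "mixed"]
human_sentiments = ["positive", "negative", "positive", "mixed"]
accuracy = sum(m == h for m, h in zip(model_sentiments, human_sentiments)) / len(human_sentiments)

print(f"Jaccard index: {topic_similarity:.3f}")  # 0.667
print(f"Accuracy: {accuracy:.2f}")               # 0.75
```

Averaging the per-comment Jaccard index over a whole survey gives a single similarity score per model, which can then be weighed against that model's response time and cost.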
Attached file: 1-s2.0-S2590123025032335-main.pdf (Adobe PDF, 2.94 MB)
Access: open access
Version: publisher's version
License: Creative Commons