Penalized Model-Based Clustering with Group-Dependent Shrinkage Estimation

Gaussian mixture models (GMMs) are the most widely used approach to model-based clustering of continuous features. Unfortunately, with the increasing availability of high-dimensional datasets, their direct applicability is called into question: GMMs suffer from the curse of dimensionality, as the number of parameters grows quadratically with the number of variables. To address this, a methodological link between Gaussian mixtures and Gaussian graphical models has recently been established, providing a framework for penalized model-based clustering in the presence of large precision matrices. However, current methodologies do not account for the fact that groups may be under- or over-connected, and thus implicitly assume similar levels of sparsity across clusters. We overcome this limitation by defining data-driven, component-specific penalty factors that automatically account for different degrees of connectivity within groups. A real-data experiment on handwritten digit recognition demonstrates the validity of our proposal.
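
To make the quadratic growth mentioned above concrete, the following minimal sketch counts the free parameters of a Gaussian mixture with unconstrained covariance matrices. The function name `gmm_free_parameters` and the chosen dimensions are illustrative assumptions, not part of the original work; the count itself is the standard one (mixing proportions, mean vectors, and symmetric covariance matrices).

```python
def gmm_free_parameters(n_features: int, n_components: int) -> int:
    """Count free parameters of a GMM with full, unconstrained covariances.

    Hypothetical helper for illustration only.
    """
    # Mixing proportions sum to one, so one of them is determined by the rest.
    mixing = n_components - 1
    # One mean vector of length n_features per component.
    means = n_components * n_features
    # One symmetric n_features x n_features covariance matrix per component.
    covariances = n_components * n_features * (n_features + 1) // 2
    return mixing + means + covariances


if __name__ == "__main__":
    # With 3 components, the covariance term dominates as dimension grows.
    for p in (10, 100, 1000):
        print(p, gmm_free_parameters(p, n_components=3))
```

The covariance term, of order K p^2 / 2 for K components and p variables, is what penalized estimation of sparse precision matrices aims to keep under control.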

Casa, A., Cappozzo, A., & Fop, M. (2023). Penalized Model-Based Clustering with Group-Dependent Shrinkage Estimation. Retrieved from https://hdl.handle.net/10446/269567

Casa, Alessandro; Cappozzo, Andrea; Fop, Michael

2023-01-01

Publication date: 2023
All authors: Casa, Alessandro; Cappozzo, Andrea; Fop, Michael
In collections: 1.4.01 Contributi in atti di convegno - Conference presentations

Files attached to this record:

File	File size	Format
Casa et al_Building Bridges.pdf (access restricted to archive managers; version: publisher's version; license: Aisberg default license)	382.48 kB	Adobe PDF
