Argiento, Raffaele; Cremaschi, Andrea; Guglielmi, Alessandra (2014). A “Density-Based” Algorithm for Cluster Analysis Using Species Sampling Gaussian Mixture Models [journal article]. In Journal of Computational and Graphical Statistics. Retrieved from http://hdl.handle.net/10446/193124

A “Density-Based” Algorithm for Cluster Analysis Using Species Sampling Gaussian Mixture Models

Argiento, Raffaele; Cremaschi, Andrea; Guglielmi, Alessandra
2014-01-01

Abstract

We propose a new model for cluster analysis in a Bayesian nonparametric framework. Our model combines two ingredients: species sampling mixture models of Gaussian distributions on the one hand, and a deterministic clustering procedure (DBSCAN) on the other. Here, two observations from the underlying species sampling mixture model share the same cluster if the distance between the densities corresponding to their latent parameters is smaller than a threshold; this yields a random partition that is coarser than the one induced by the species sampling mixture. Since this procedure depends on the value of the threshold, we suggest a strategy for choosing it. In addition, we discuss the implementation and applications of the model, and we compare it with more standard clustering algorithms. Supplementary materials for the article are available online.
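The thresholding rule described in the abstract can be illustrated with a minimal sketch. This record does not say which distance between densities the article uses, nor how DBSCAN is configured, so everything below is an assumption made for illustration: the latent parameters are taken to be univariate Gaussian (mean, standard deviation) pairs, the distance is the Hellinger distance (which has a closed form for Gaussians), and clusters are formed by chaining every pair closer than the threshold `eps`. The names `hellinger`, `coarsen`, `params`, and `eps` are hypothetical.

```python
import math


def hellinger(m1, s1, m2, s2):
    """Hellinger distance between N(m1, s1^2) and N(m2, s2^2) (assumed distance)."""
    var_sum = s1 * s1 + s2 * s2
    # Bhattacharyya coefficient of two univariate Gaussians.
    bc = math.sqrt(2.0 * s1 * s2 / var_sum) * math.exp(-((m1 - m2) ** 2) / (4.0 * var_sum))
    return math.sqrt(max(0.0, 1.0 - bc))


def coarsen(params, eps):
    """Label observations so that any two whose latent densities are linked by
    a chain of pairwise distances below eps receive the same label."""
    n = len(params)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if hellinger(*params[i], *params[j]) < eps:
                parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]


# Hypothetical latent parameters (mean, standard deviation), one per observation.
# Observations 0 and 1 share the same latent parameter, as in a mixture cluster.
params = [(0.0, 1.0), (0.0, 1.0), (0.3, 1.0), (5.0, 1.0), (5.2, 1.1)]
labels = coarsen(params, eps=0.2)
```

Observations that share a latent parameter are at distance zero and are therefore always merged, which is why the resulting partition can only be coarser than the one induced by the mixture. A very small `eps` recovers the mixture partition, while a larger `eps` merges nearby components.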
Type: journal article
Year: 2014
Authors: Argiento, Raffaele; Cremaschi, Andrea; Guglielmi, Alessandra
Files attached to this record:
12-3_JCGS_4aperto.pdf

Open Access since 22/10/2015

Description: "This is an Accepted Manuscript version of the following article, accepted for publication in Journal of Computational and Graphical Statistics. 'Raffaele Argiento, Andrea Cremaschi, Marina Vannucci. (2020) Hierarchical Normalized Completely Random Measures to Cluster Grouped Data. Journal of the American Statistical Association 115:529, pages'. It is deposited under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way."
Version: postprint - refereed version/accepted without refereeing
License: Creative Commons
File size: 1.41 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/10446/193124
Citations
  • Scopus 17
  • ISI 17