(2014). A Semi-Parametric Method for Robust Multivariate Error Detection in Skewed Functional Data with Application to Historical Radiosonde Winds [conference presentation]. Retrieved from http://hdl.handle.net/10446/31703

A Semi-Parametric Method for Robust Multivariate Error Detection in Skewed Functional Data with Application to Historical Radiosonde Winds

2014-01-01

Abstract

Quality control methods for multivariate data are generally based on robust estimates of the parameters of the multivariate normal (MVN) distribution. However, many multivariate data-generating processes do not produce elliptical contours, and in such cases, error detection using the MVN distribution would lead to many legitimate observations being erroneously flagged. In this work, we develop a semi-parametric method for identifying errors in skewed multivariate data that also have a functional component. In the first step, we remove potential outliers by assigning each multivariate observation or function a depth score and removing those observations that fall beyond a given threshold. The remaining observations are used to estimate the parameters of a multivariate skew-t (MVST) distribution, and this estimated distribution is used to assign every observation a probability of having been generated from this MVST. We test the performance of this two-step approach in simulation against a more common MVN method adapted for functional data. When the observations are skewed, our approach has a higher percentage of correctly identified outliers and a lower percentage of false positives. Finally, we show how our method can be used in practice with horizontal and vertical wind components measured at 8 vertical pressure levels from radiosonde launches at a station in Denver, Colorado.
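The first (trimming) step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which depth function or threshold the authors use, so everything here is an assumption made for illustration: the Mahalanobis depth 1 / (1 + d^2), the restriction to bivariate points, the `trim_fraction` parameter, and the function names `mahalanobis_depth` and `trim_by_depth`. The second step, fitting an MVST distribution to the retained observations, is not attempted.

```python
from statistics import mean


def mahalanobis_depth(points):
    """Return the Mahalanobis depth 1 / (1 + d^2) of each 2-D point.

    Assumption: this depth function is chosen for illustration only;
    the source does not specify which depth the authors use.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    n = len(points)
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    depths = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Squared Mahalanobis distance via the closed-form 2x2 inverse.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        depths.append(1.0 / (1.0 + d2))
    return depths


def trim_by_depth(points, trim_fraction=0.1):
    """Drop the shallowest `trim_fraction` of points (hypothetical threshold rule).

    Returns the retained points and the sorted indices of the dropped ones.
    """
    depths = mahalanobis_depth(points)
    n_drop = int(len(points) * trim_fraction)
    # Indices ordered from shallowest (most outlying) to deepest.
    order = sorted(range(len(points)), key=lambda i: depths[i])
    dropped = set(order[:n_drop])
    kept = [p for i, p in enumerate(points) if i not in dropped]
    return kept, sorted(dropped)


# Toy data: a 5 x 4 grid of points plus one gross error at index 20.
points = [(i % 5, i // 5) for i in range(20)] + [(50, -40)]
kept, dropped = trim_by_depth(points, trim_fraction=0.05)
```

In the approach the abstract describes, the retained observations (`kept` in this sketch) would then be used to estimate the MVST parameters, and all observations, including the trimmed ones, would be scored against the fitted distribution.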
2014
Sun, Y.; Hering, A. S.; Nychka, D.
Files attached to this record:
File: 3188-6566-1-PB.pdf
Access: open access
Description: publisher's version
File size: 866.55 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/10446/31703