This paper recasts the problem of missing values in the covariates of a regression model as a latent Gaussian Markov random field (GMRF) model in a fully Bayesian framework. The proposed approach is based on the definition of the covariate imputation sub-model as a latent effect with a GMRF structure. This formulation works for continuous covariates; for categorical covariates, a typical multiple imputation approach is employed instead. Both techniques can be easily combined when continuous and categorical variables both have missing values. The resulting Bayesian hierarchical model fits naturally within the integrated nested Laplace approximation (INLA) framework, which is used for model fitting. Hence, this work fills an important gap in the INLA methodology, as it makes it possible to handle models with missing values in the covariates. As in any other fully Bayesian framework, relying on INLA for model fitting makes it possible to formulate a joint model for the data, the imputed covariates and their missingness mechanism. In this way, the more general problem of assessing the missingness mechanism can be tackled by conducting a sensitivity analysis on the different alternatives for modeling the non-observed covariates. Finally, the proposed approach is illustrated in two examples on modeling health risk factors and disease mapping.
(2022). Missing data analysis and imputation via latent Gaussian Markov random fields [journal article]. In SORT. Retrieved from https://hdl.handle.net/10446/233872
Missing data analysis and imputation via latent Gaussian Markov random fields
Cameletti, Michela; Blangiardo, Marta
2022-01-01
File | Access | Version | License | File size | Format
---|---|---|---|---|---
46.2.3.Gómez-Rubio-etal.prov.pdf | open access | publisher's version | Creative Commons | 803.53 kB | Adobe PDF