Bias Correction in Clustered Underreported Data

Data quality in poor and socially deprived regions has given rise to many statistical challenges. One of them is the underreporting of vital events, which leads to biased estimates of the associated risks. To deal with underreported count data, models based on compound Poisson distributions are commonly assumed. To be identifiable, such models usually require additional, strong information about the probability of reporting the event in all areas of interest, which is not always available. We introduce a novel approach for the compound Poisson model that assumes the areas are clustered according to their data quality. We leverage these clusters to create a hierarchical structure in which the reporting probabilities decrease as we move from the best group to the worst ones. We obtain constraints for model identifiability and prove that prior information is required only about the reporting probability in the areas with the best data quality. Several approaches to modeling the uncertainty about the reporting probabilities are presented, including reference priors. Different features of the proposed methodology are studied through simulation. We apply our model to map early neonatal mortality risks in Minas Gerais, a Brazilian state with heterogeneous characteristics and substantial socioeconomic inequality.
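The identifiability problem mentioned in the abstract can be illustrated with a minimal sketch. It assumes one common formulation of the compound Poisson model for underreporting, which may differ from the paper's exact specification: true counts are Poisson with mean `expected * theta`, and each event is reported independently with probability `eps`, so the reported counts are Poisson with mean `expected * theta * eps`. The function names, parameter names, and data below are hypothetical and are not taken from the paper.

```python
import math


def poisson_logpmf(y: int, mean: float) -> float:
    """Log of the Poisson probability mass function evaluated at y."""
    return y * math.log(mean) - mean - math.lgamma(y + 1)


def observed_loglik(y, expected, theta, eps):
    """Log-likelihood of reported counts under Poisson thinning.

    Assumed model (illustrative only): reported counts in area i are
    Poisson(expected[i] * theta * eps), where theta is the relative risk
    and eps is the reporting probability.
    """
    return sum(
        poisson_logpmf(yi, ei * theta * eps) for yi, ei in zip(y, expected)
    )


# Hypothetical reported counts and expected counts for four areas.
y = [3, 7, 2, 5]
expected = [4.0, 9.0, 3.5, 6.0]

# Two different (risk, reporting probability) pairs with the same product.
ll_a = observed_loglik(y, expected, theta=1.0, eps=0.8)
ll_b = observed_loglik(y, expected, theta=2.0, eps=0.4)

# The naive estimate targets theta * eps, so it understates theta
# whenever eps < 1.
naive_risk = sum(y) / sum(expected)

print(ll_a, ll_b, naive_risk)
```

Under this assumed model the two log-likelihoods coincide, because the data depend on `theta` and `eps` only through their product. Without outside information about the reporting probability, the risk cannot be separated from the underreporting.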

(2022). Bias Correction in Clustered Underreported Data [journal article]. In BAYESIAN ANALYSIS. Retrieved from http://hdl.handle.net/10446/193475

Lopes de Oliveira, Guilherme; Argiento, Raffaele; Loschi, Rosangela Helena; Martins Assuncao, Renato; Ruggeri, Fabrizio; D’Elia Branco, Marcia

2022-01-01

Article type: article
Publication date: 2022
Journal in ANCE: BAYESIAN ANALYSIS
All authors: Lopes de Oliveira, Guilherme; Argiento, Raffaele; Loschi, Rosangela Helena; Martins Assuncao, Renato; Ruggeri, Fabrizio; D’Elia Branco, Marcia
Collections: 1.1.01 Articoli/Saggi in rivista - Journal Articles/Essays

Files attached to this record:

File	File size	Format
20-BA1244.pdf (open access; version: publisher's version; license: Creative Commons)	4.53 MB	Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/10446/193475
