(2015). Statistical models for species richness in the Ross Sea [conference presentation]. Retrieved from http://hdl.handle.net/10446/48758
Statistical models for species richness in the Ross Sea
2015-01-01
Abstract
In recent years, a large international effort has gone into compiling a complete list of Antarctic mollusc distributional records, based both on historical occurrences dating back to 1899 and on newly collected data. The resulting dataset is highly uneven in the quality of the information it contains, owing to the variety of sampling gears used and to differences in the amount of information recorded at each sampling station (e.g. sampling gear used, sieve mesh size). The dataset invites the use of the full range of statistical tools for data representation, estimation, clustering and prediction. In this paper we aim to select an appropriate statistical model for this dataset in order to explain species richness (i.e. the number of observed species) as a function of several covariates, such as gear used and latitude. Since the response is a count, we first fit a Poisson regression model and then extend it to a Negative Binomial regression to handle over-dispersion. Generalized linear mixed models (GLMM) and generalized additive models (GAM) are also explored to capture any additional explanatory power of the covariates. However, preliminary results from these models suggest that more sophisticated models are needed. We therefore introduce a hierarchical Bayesian model with a nonparametric component, obtained by assuming random effects with a Dirichlet Process prior.
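The move from a Poisson to a Negative Binomial regression described above is typically motivated by an over-dispersion check. The following is a minimal sketch of such a check, not the authors' code: it fits a Poisson model with a log link and a single covariate by Newton-Raphson, then computes the Pearson dispersion statistic. The data are invented for illustration, and the names `fit_poisson`, `pearson_dispersion`, `x` and `y` are hypothetical.

```python
import math


def fit_poisson(x, y, iters=50, tol=1e-10):
    """Fit log(mu_i) = b0 + b1 * x_i by Newton-Raphson (illustrative helper)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: gradient of the Poisson log-likelihood
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix (2 x 2, symmetric)
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system H * d = g explicitly
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def pearson_dispersion(x, y, b0, b1):
    """Pearson chi-square divided by residual degrees of freedom (n - 2)."""
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / (len(y) - 2)


# Invented example: x plays the role of a standardised covariate
# (e.g. latitude), y the species count observed at each station.
x = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5,
     -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
y = [1, 0, 3, 2, 6, 4, 15, 8, 14, 0, 21, 1, 30, 3]

b0, b1 = fit_poisson(x, y)
phi = pearson_dispersion(x, y, b0, b1)
print(f"intercept={b0:.3f}, slope={b1:.3f}, dispersion={phi:.2f}")
```

Under a correctly specified Poisson model the dispersion statistic should be close to 1. A value well above 1 indicates that the variance exceeds the mean, which is the situation a Negative Binomial model, with variance mu + mu^2/theta in its common parameterisation, is meant to accommodate.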