BIRDIE DST: Model summary
v11-birdie-dst-model-summary.Rmd
Introduction
The species distributions module (DST) of the BIRDIE pipeline has four main steps: data preparation, model fitting, model diagnostics and model summary. See the BIRDIE: basics and BIRDIE: species distributions vignettes for general details about BIRDIE and about the DST module, respectively. In this vignette, we go through the tasks performed during the final step of the DST module: model summary.
The main function used for summarising a model fit is ppl_summarise_occu(). This is a ppl_ function, and therefore it doesn’t do much processing itself (see BIRDIE: basics if this is confusing), but it does call the right functions to do the work.
This step is relatively simple. Here we need to conduct two tasks: predict occurrence and detection probabilities for all pentads in South Africa from a fitted model and summarise these predictions.
Predicting from a model fit
predictSpOccu() is used for predicting from a model fit and it is basically a wrapper around spOccupancy::predict.PGOcc(). However, it has a few data preparation steps to make sure the data we pass on to the predict function has the same variables, on the same scale, as the data we used for fitting the model. Models are fitted using only those pentads that were visited in any given year, but we want to predict for all pentads in South Africa. To maintain the same scale, we make use of the covariate scale information that we stored in the model fit object (see BIRDIE DST: model fitting vignette).
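The rescaling idea can be illustrated with a minimal base R sketch. It is not the code inside predictSpOccu(): the covariate names (elev, prcp), the simulated values and the way the scaling constants are stored are all assumptions made for illustration. The point is only that the centring and scaling constants come from the fitting data and are reused, not re-estimated, for the prediction data.

```r
set.seed(1)

# Hypothetical covariates for the visited pentads used in model fitting
fit_covts <- data.frame(elev = rnorm(50, mean = 1200, sd = 300),
                        prcp = rnorm(50, mean = 600, sd = 150))

# Standardise and keep the centring and scaling constants, as a stand-in
# for the scale information stored with the model fit
fit_scaled <- scale(fit_covts)
covt_center <- attr(fit_scaled, "scaled:center")
covt_scale <- attr(fit_scaled, "scaled:scale")

# Hypothetical covariates for all pentads we want to predict for
pred_covts <- data.frame(elev = rnorm(200, mean = 1000, sd = 400),
                         prcp = rnorm(200, mean = 500, sd = 200))

# Reuse the constants from fitting; they are not re-estimated from pred_covts
pred_scaled <- scale(pred_covts[, names(covt_center)],
                     center = covt_center, scale = covt_scale)
```

Because the constants come from the fitting data, the columns of pred_scaled will generally not have mean 0 and standard deviation 1, which is expected.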
The output of predictSpOccu() is a list with two elements containing posterior predictive samples for \(\psi\) (psi, probability of occurrence) and \(p\) (probability of detection) for each pentad.
Summarising predictions
In this step, we use the function summariseSpOccu() to extract the 0.025, 0.5 and 0.975 quantiles of the predictive samples for \(\psi\) and \(p\) obtained from predictSpOccu(). We store these quantiles, rather than all of the posterior predictive samples, for display on the BIRDIE website.
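As a rough sketch of this kind of summary (not the actual body of summariseSpOccu()), the quantiles can be computed column by column in base R. The matrix layout (rows are posterior samples, columns are pentads), the simulated values and the output column names are assumptions made for illustration.

```r
set.seed(42)
n_samples <- 1000
n_pentads <- 5

# Simulated stand-in for posterior predictive samples of psi
# (assumed layout: rows = posterior samples, columns = pentads)
psi_samples <- matrix(rbeta(n_samples * n_pentads, 2, 5),
                      nrow = n_samples,
                      dimnames = list(NULL, paste0("pentad_", seq_len(n_pentads))))

# One row per pentad with the 0.025, 0.5 and 0.975 quantiles
psi_quants <- t(apply(psi_samples, 2, quantile, probs = c(0.025, 0.5, 0.975)))
colnames(psi_quants) <- c("psi_lb", "psi_med", "psi_ub")  # hypothetical names

psi_quants
```

The same call applied to the samples of \(p\) would give the corresponding detection summaries.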
We also compute the realized occupancy from the posterior predictive samples and the data. The realized occupancy is the probability of occurrence conditional on the observed data, such that
\[P[occu \mid obs = 1] = 1\]
\[P[occu \mid obs = 0] = \frac{\psi q}{1 - \psi + \psi q}\]
where \(obs = 1\) when the species was detected at a site on at least one visit, and \(obs = 0\) when the species was not detected at the site on any visit, \(\psi\) is the probability of occurrence estimated by our model and \(q = \prod_{i=1}^N (1 - p_i)\), with \(p_i\) being the probability of detection on visit \(i\) and \(N\) the total number of visits to a given site.
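The formula above can be written as a small helper function. This is a sketch for a single site and a single value of \(\psi\); the function name and arguments are hypothetical and are not part of the BIRDIE package.

```r
# Realized occupancy for one site (hypothetical helper)
#   psi:      probability of occurrence at the site
#   p:        vector of detection probabilities, one per visit
#   detected: TRUE if the species was detected on at least one visit
realized_occu <- function(psi, p, detected) {
  if (detected) {
    return(1)
  }
  q <- prod(1 - p)  # probability of missing the species on every visit
  psi * q / (1 - psi + psi * q)
}

# Three visits without detection, each with detection probability 0.3
realized_occu(psi = 0.6, p = c(0.3, 0.3, 0.3), detected = FALSE)
```

In this example \(q = 0.7^3 = 0.343\), so the realized occupancy is \(0.2058 / 0.6058\), roughly 0.34: three visits without a detection lower the probability of occurrence from 0.6. With no visits at all, \(q = 1\) and the function returns \(\psi\) unchanged.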
All these summaries are stored in a file called analysis/output/occu_pred_spOccupancy_YYYY_sp_code.csv, where YYYY is the year of the data we are fitting a model to and sp_code is the species code. These files will be exported to the database for storage and display on the BIRDIE website.
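For illustration, a file path following this naming pattern could be assembled as shown here. The year and species code are hypothetical values, and the pipeline may build the path differently.

```r
year <- 2010    # hypothetical year of the data
sp_code <- 6    # hypothetical species code

# Build the path following the pattern described above
pred_file <- file.path("analysis", "output",
                       paste0("occu_pred_spOccupancy_", year, "_", sp_code, ".csv"))
pred_file
```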
If no model could be fitted because there was not enough data, we use the raw data to create the prediction file mentioned above. The file still has the exact same structure, but now \(\psi\) and \(p\) will be NA for all pentads (because they could not be calculated). Raw data will be treated as realized occupancy and no confidence limits will be estimated. The function that performs these tasks is createPredFromAbap().
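A minimal sketch of this fallback, under stated assumptions, is shown here. It is not the implementation of createPredFromAbap(): the input table, the column names and the rule that a pentad counts as occupied when the species was detected on at least one visit are all assumptions made for illustration.

```r
# Hypothetical raw visit data: one row per visit, detected = 1 or 0
visits <- data.frame(pentad = c("A", "A", "B", "B", "B", "C"),
                     detected = c(0, 1, 0, 0, 0, 1))

# A pentad is treated as observed-occupied if any visit detected the species
obs <- aggregate(detected ~ pentad, data = visits, FUN = max)

# Same structure as a model-based prediction table, but psi and p are NA
# because no model was fitted (column names are hypothetical)
pred <- data.frame(pentad = obs$pentad,
                   psi = NA_real_,
                   p = NA_real_,
                   real_occu = obs$detected)
pred
```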