Is my SDM fit for purpose?

Just back from ICCB 2015 in beautiful Montpellier. I really enjoyed it! Lots of interesting talks and engaging discussions, plus I met many old friends and made new ones… what else could one ask from a conference? 🙂

At ICCB, José presented a paper on species distribution models (SDMs) that we published with several colleagues earlier this year (Guillera-Arroita et al. 2015, GEB). People seemed to enjoy his talk, so I thought I could write a bit about this work in this space too.

In our paper, we looked at the properties of species occurrence data types in terms of their information content about a species' distribution, and the implications that this has for different applications of SDMs. We looked at presence-background data (only presence records plus information about the environmental conditions in the area), presence-absence data (presence and absence records) and detection data (presence-absence data collected in a way that allows modeling the detection process). Our work provides a synthesis of issues that have been discussed in the literature, which we summarized in this figure:

[Figure: quantities an SDM can estimate, by data type and conditions]

The figure shows the different quantities that an SDM can estimate depending on the type of data and conditions. The dark arrows indicate what can be estimated by default with a given type of data. All of this assumes that many other things hold, such as good predictors, an adequate model structure, a sufficient sample size, etc. By the way, psi = occupancy and p* = cumulative detectability.
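To make p* a bit more concrete, here is a minimal sketch. It assumes something the post does not state: that each site is surveyed a fixed number of times, and that visits are independent with the same per-visit detection probability. The function names and the example numbers are made up for illustration.

```python
def cumulative_detectability(p: float, n_visits: int) -> float:
    """Probability of detecting the species at least once at an occupied site.

    Assumes n_visits independent surveys, each with detection probability p
    (an assumption of this sketch, not something taken from the paper).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if n_visits < 0:
        raise ValueError("n_visits must be non-negative")
    return 1.0 - (1.0 - p) ** n_visits


def apparent_occupancy(psi: float, p: float, n_visits: int) -> float:
    """Probability that a site is occupied AND the species is detected there."""
    return psi * cumulative_detectability(p, n_visits)


if __name__ == "__main__":
    # Hypothetical values: 60% of sites occupied, 50% detection per visit.
    for visits in (1, 2, 5):
        print(visits, cumulative_detectability(0.5, visits),
              apparent_occupancy(0.6, 0.5, visits))
```

Under these assumptions p* approaches 1 as the number of visits grows, so the product psi × p* approaches psi itself.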

I am not going to repeat the paper here, but just highlight a few key issues to remember:

  • Presence-background data are prone to problems with sampling bias
  • Presence-background data at best can only provide a relative measure of occurrence probability (yes, that’s it, one cannot estimate actual species occupancy probability with Maxent, or other PB methods!) *
  • Presence-absence data give estimates of species occurrence probability if detectability is perfect, but not otherwise. This is particularly an issue if detectability varies with environmental covariates (see also this paper)
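The last point can be illustrated with a small simulation. This is only a sketch under made-up assumptions: two hypothetical habitats ("open" and "dense") with identical true occupancy but different detectability, a single visit per site, and a naive estimate that simply takes the proportion of sites with a detection. None of the numbers come from the paper.

```python
import random


def simulate_naive_occupancy(psi, p, n_sites, n_visits, rng):
    """Proportion of sites with at least one detection.

    Each site is occupied with probability psi; at an occupied site each
    visit detects the species with probability p, independently.
    """
    detected_sites = 0
    for _ in range(n_sites):
        occupied = rng.random() < psi
        if occupied and any(rng.random() < p for _ in range(n_visits)):
            detected_sites += 1
    return detected_sites / n_sites


rng = random.Random(42)  # fixed seed so the run is repeatable

# Same true occupancy (0.5) in both habitats; only detectability differs.
naive_open = simulate_naive_occupancy(0.5, 0.8, 20000, 1, rng)
naive_dense = simulate_naive_occupancy(0.5, 0.2, 20000, 1, rng)

print("open habitat, naive estimate: ", naive_open)
print("dense habitat, naive estimate:", naive_dense)
```

The two naive estimates should land close to 0.4 and 0.1, even though the species is equally common in both habitats. A model fitted to such data would wrongly suggest a habitat preference that is really a difference in detectability.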

So the thing is that, depending on the type of data we use for our SDM, we might be estimating one thing or another. Now, the important question is whether this matters for your application. With this in mind, we reviewed a large number of applications that use SDMs from the point of view of the information that they require from the SDM (e.g. does the application need actual probabilities, or is knowledge of the relative likelihood of occurrence enough?). We constructed a (gigantic!) table (in Appendix S3) that we hope can help SDM users evaluate whether the data they have at hand are suitable for their needs. And we explored five of those applications in detail via simulations.

… oh yes, and we also talked about the widespread practice of reducing SDM outputs to a binary map by applying a threshold. But I’ll write about that some other time!


* Note: There has been some work showing ways to do so (e.g. here), but the methods are very sensitive to mild deviations from parametric assumptions (see an example in Appendix S2). So in practice we think it is best to assume that only a relative estimate is obtained. This makes sense: if absence data are unavailable we do not have information about the prevalence of the species… how could we tell whether a species with few records is rare or whether the sampling was very sparse?
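The rare-versus-sparsely-sampled question can be put in numbers with a toy calculation. This sketch assumes a deliberately simple setting that is not from the paper: a landscape of equal sites, a constant prevalence, and a constant probability that an occupied site ends up as a presence record. The names and values are hypothetical.

```python
import math


def expected_presence_records(n_sites: int, prevalence: float,
                              record_prob: float) -> float:
    """Expected number of presence records in a presence-only data set.

    prevalence  : proportion of sites that are occupied
    record_prob : probability that an occupied site yields a record
                  (sampling effort and detection combined)
    """
    return n_sites * prevalence * record_prob


# Scenario A: a rare species that has been sampled intensively.
rare_well_sampled = expected_presence_records(10000, 0.1, 0.5)

# Scenario B: a common species that has been sampled sparsely.
common_sparsely_sampled = expected_presence_records(10000, 0.5, 0.1)

print(rare_well_sampled, common_sparsely_sampled)
print(math.isclose(rare_well_sampled, common_sparsely_sampled))
```

Both scenarios give the same expected number of records, because only the product of prevalence and recording probability enters the calculation. Without absences (or some other extra information) the two factors cannot be separated in this toy setting.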
