Online subject on Species Distribution Modelling! Interested?

Are you interested in modelling? Are you a graduate student, and your project involves studying species distributions? Or maybe you are a research professional or a manager wanting to expand your quantitative skills? Species distribution modelling is one of the most highly cited areas of ecological research. And it is not just about research […]

via Wanting to learn Species Distribution Modelling? Consider enrolling in our online subject! — The Quantitative & Applied Ecology Group

Posted in Uncategorized | Leave a comment

Accounting for imperfect detection in the modelling of species distributions, range dynamics and communities

Wanting to catch up with the methods? Check this paper out!

Understanding where species currently are, and where they are likely to be, is a central question in ecology. One way to obtain such knowledge is to build correlative models that describe how species respond to environmental conditions. Apart from helping answer scientific questions, models of species distributions and range dynamics are frequently used to support different types of environmental decisions.

Now, taking a bunch of species observation records and a bunch of predictors and fitting a model can be done very quickly, supported by a wide range of user-friendly software packages and tools. A challenge remains, however: ensuring that models are carefully built, so that they yield useful predictions. There are many important issues that require attention during model construction (e.g. selection of predictors, resolution, extent…). In the species distribution modelling literature, there is a good selection of papers covering many of these critical topics. One aspect that also needs attention is imperfect detection of species. As species detectability may vary in space and/or time, disregarding the likelihood of species detection can lead to biased inference about species distributions and their dynamics, which can potentially misguide environmental decisions [see this and this].


Whether a species tends to be more or less difficult to detect depends on several factors (Fig 2 in the paper)

Over the past 10-15 years, there have been significant methodological advances addressing this problem. A modelling framework has been developed, with models that explicitly describe the observation process, in addition to the latent ecological process (the species distribution, and its drivers). Imperfect detection is a central theme in my research, so last year, I decided to write this review paper (now published in Ecography), with the aim of providing a comprehensive overview of advances in this area. The paper summarizes modelling developments, discusses evidence about effects of imperfect detection and the difficulties of working with it, and concludes with the current outlook for future research and application of these methods.


Models have two components: one that describes the distribution of the species as a function of environmental covariates; and one that describes how that distribution pattern is observed, which can depend both on environmental covariates at the site level and on the characteristics of the specific survey visit (part of Fig 1 in the paper) 
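To make the two components concrete, here is a minimal Python sketch of how data could arise under such a model. It is only an illustration: the covariate names (`elevation`, `wind`) and all coefficient values are made up, and the logit links are an assumption rather than something taken from the paper.

```python
import math
import random


def inv_logit(x):
    """Map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_detection_data(n_sites, n_visits, rng):
    """Simulate detection histories from a two-component model."""
    data = []
    for _site in range(n_sites):
        # Ecological process: occupancy depends on a site-level covariate.
        elevation = rng.gauss(0.0, 1.0)            # hypothetical covariate
        psi = inv_logit(0.5 + 1.0 * elevation)     # made-up coefficients
        z = 1 if rng.random() < psi else 0         # latent presence/absence

        # Observation process: detection depends on site and visit covariates.
        history = []
        for _visit in range(n_visits):
            wind = rng.gauss(0.0, 1.0)             # hypothetical covariate
            p = inv_logit(-0.5 - 0.8 * wind + 0.5 * elevation)
            detected = z == 1 and rng.random() < p  # no false positives
            history.append(1 if detected else 0)
        data.append({"elevation": elevation, "z": z, "history": history})
    return data


rng = random.Random(1)
sites = simulate_detection_data(n_sites=200, n_visits=3, rng=rng)
occupied = sum(s["z"] for s in sites)
detected = sum(1 for s in sites if any(s["history"]))
print(f"occupied sites: {occupied}, sites with at least one detection: {detected}")
```

In real data only `history` is observed; `z` is hidden, which is why the number of sites with detections understates the number of occupied sites.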

I wanted this paper to be a helpful tool, not only as a reference for those familiar with the topic, but also as a quick point of entry for those wanting to catch up with these methods. For that purpose, I included this table that summarises at a glance some of the key papers in the discipline. I initially built this table as a quick reference guide for myself. I found it useful, so I realised it was worth sharing. I hope some of you find it a handy resource too!

[Table image: summary of key papers in the discipline]


PhD opportunity (ecological modelling and monitoring)

I’m seeking applications from highly motivated candidates interested in conducting PhD research on wildlife monitoring or species distribution modelling, particularly from a methodological angle. Work could involve understanding how existing modelling tools work, evaluating how they perform under different circumstances, and developing extensions and guidelines for their use.

As the research will develop at the interface between ecology and statistics, desirable candidates include ecologists with skills in statistical modelling, as well as statisticians and mathematicians with a strong interest in ecological applications. The successful candidate would ideally start in early 2017 and will be co-supervised by other QAECO principal investigators. The successful candidate must secure a scholarship through the University of Melbourne (APA or MRS).

To apply: send a one-page statement outlining your research interests and ideas, together with a CV, academic transcript and contact details for 2 academic references to gguillera[at]unimelb.edu.au. Candidates should contact me to discuss details about this opportunity at least several weeks prior to the deadline for applications (28 October; or 30 September for international applicants).

Full advertisement here



Talk at the International Biometric Conference

Last week, the 2016 International Biometric Conference was held in Victoria, BC (Canada). As recipients of the ‘2014 best JABES paper’ award, my co-authors Byron Morgan and Martin Ridout and I had been invited to present our work at the conference. Unfortunately, this time none of us could make it to the conference in person, which is a pity, because judging from the program there were lots of interesting sessions. Yet, the organisers gave us the chance to present in video format. And this is the result! :o)


Modelling course in Spain

Yesterday José Lahoz, Marc Kéry and I finished teaching our 5-day course on Modelling the distribution of species and communities accounting for detection using R and BUGS/JAGS. The course was hosted by the Population Ecology Group of the Mediterranean Institute for Advanced Studies (IMEDEA) in Esporles, Mallorca (Spain)… a really beautiful place!  


with Marc and José

We had a fantastic and varied group of attendees with a range of backgrounds. They were from 12 different countries (Spain, Portugal, France, UK, Netherlands, Italy, Germany, Switzerland, Greece, Brazil, Estonia and Canada) and came with plenty of interesting questions and ideas for discussion, so I really enjoyed the week!


The group

In the course, we first reviewed the bases of statistical inference (maximum likelihood and Bayesian). We then discussed and applied the occupancy-detection modelling framework for modelling species distribution patterns, range dynamics and communities. We had a bit of time dedicated to pracs, and Stefano Canessa assisted us with those, which was really helpful.


In the class…

The course was intensive and we worked hard… but we also had time to enjoy ourselves. We made new friends and enjoyed the Mallorcan cuisine!


… and having fun!


Is my SDM fit for purpose?

Just back from ICCB 2015 in beautiful Montpellier. I really enjoyed it! Lots of interesting talks and engaging discussions, plus I met many old friends and made new ones… what else could one ask from a conference? 🙂

At ICCB, José talked about a paper about species distribution models (SDMs) that we published with several colleagues early this year (Guillera-Arroita et al. 2015, GEB). People seemed to enjoy his talk so I thought I could write a bit about this work in this space too.

In our paper, we looked at the properties of species occurrence data types in terms of their information content about a species distribution, and the implications that this has for different applications of SDMs. We looked at presence-background data (only presence records plus information about the environmental conditions in the area), presence-absence data (presence and absence records) and detection data (presence-absence data collected in a way that allows modelling the detection process). Our work provides a synthesis about issues that have been discussed in the literature, which we summarized in this figure:

[Figure: quantities that an SDM can estimate for each data type]

The figure shows the different quantities that an SDM can estimate depending on the type of data and conditions. The dark arrows indicate what can be estimated by default with a given type of data. All this assuming many other things, such as good predictors, model structure, sample size, etc, etc…. By the way, psi = occupancy and p* = cumulative detectability.

I am not going to repeat the paper here, but just highlight a few key issues to remember:

  • Presence-background data are prone to problems with sampling bias
  • Presence-background data at best can only provide a relative measure of occurrence probability (yes, that’s it, one cannot estimate actual species occupancy probability with Maxent, or other PB methods!) *
  • Presence-absence data give estimates of species occurrence probability if detectability is perfect, but not otherwise. This is particularly an issue if detectability varies with environmental covariates (see also this paper)
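The last point can be illustrated with a little arithmetic. The sketch below assumes the simplest case, where occupancy (psi) and per-visit detectability (p) are constant and visits are independent, so that cumulative detectability after K visits is p* = 1 - (1 - p)^K and a presence-absence analysis that ignores detection targets psi × p* rather than psi. The function names and parameter values are just for illustration.

```python
def cumulative_detectability(p, n_visits):
    """Probability of detecting the species at least once in n_visits
    independent visits to an occupied site."""
    return 1.0 - (1.0 - p) ** n_visits


def naive_occupancy(psi, p, n_visits):
    """Expected proportion of sites with at least one detection, i.e. what a
    presence-absence analysis estimates when detection is ignored."""
    return psi * cumulative_detectability(p, n_visits)


psi = 0.6  # illustrative true occupancy
for p in (0.2, 0.5, 0.9):
    for n_visits in (1, 2, 5):
        p_star = cumulative_detectability(p, n_visits)
        naive = naive_occupancy(psi, p, n_visits)
        print(f"p={p:.1f} K={n_visits}: p*={p_star:.3f}, "
              f"naive occupancy={naive:.3f} (true psi={psi})")
```

The naive quantity only approaches psi as p* approaches 1; if p varies with the environment, the size of the bias varies across sites too.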

So the thing is that, depending on the type of data we use for our SDM, we might be estimating one thing or another. Now, the important question is whether this matters for your application. With this in mind, we reviewed a large number of applications that use SDMs from the point of view of the information that they require from the SDM (e.g. does it need information about actual probabilities, or is knowledge about relative likelihood of occurrence enough?). We constructed a (gigantic!) table (in Appendix S3) that we hope can help SDM users evaluate whether the data they have at hand are suitable for their needs. And we explored five of those applications in detail via simulations.

… oh yes, and we also talked about the widespread practice of reducing SDM outputs to a binary map by applying a threshold. But I’ll write about that some other time!

G

* Note: There has been some work showing ways to do so (e.g. here), but the methods are very sensitive to mild deviations from parametric assumptions (see an example in Appendix S2). So in practice we think it is best to assume that only a relative estimation is obtained. This makes sense: if absence data are unavailable we do not have information about the prevalence of the species… how could we tell whether a species with few records is rare or whether the sampling was very sparse?


SODA: Occupancy-detection simulations

I’m going to use this post to rescue a little piece of code that I produced for one of my very first papers, one that I wrote as part of my PhD (Guillera-Arroita et al., 2010, MEE). I find this piece of code quite useful, both for my own work and as a teaching tool. So I thought I should share it on this site, too! This is an R function that runs simulations to assess the performance of the constant occupancy-detection model, given a scenario specified by the user. Very (un)creatively, I decided to call it SODA, for species occupancy model design assistant…

Occupancy-detection models are an extension of logistic regression to account for imperfect detection (MacKenzie et al., 2002; Tyre et al., 2003). Their aim is to estimate species occurrence while accounting for the fact that the species might be missed during surveys at sites it occupies (there are models for false positives too, but I’m not considering those here). Data are needed to describe the detection process, and this is often (although not necessarily) achieved by conducting repeat surveys at the sampling sites.
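For the constant model (no covariates, no false positives, independent repeat visits), the likelihood is short enough to write out. The following is a minimal Python sketch, not code from SODA; the function names and the toy detection histories are made up for illustration.

```python
import math


def site_likelihood(history, psi, p):
    """Likelihood of one site's detection history (list of 0/1 per visit)."""
    n_visits = len(history)
    n_detections = sum(history)
    # Site occupied, with this particular sequence of detections and misses.
    occupied_term = psi * p ** n_detections * (1.0 - p) ** (n_visits - n_detections)
    if n_detections > 0:
        return occupied_term
    # Never detected: either occupied but always missed, or truly unoccupied.
    return occupied_term + (1.0 - psi)


def log_likelihood(histories, psi, p):
    """Log-likelihood of the constant model over all sites."""
    return sum(math.log(site_likelihood(h, psi, p)) for h in histories)


# Toy data: four sites, three visits each.
histories = [[1, 0, 1], [0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(log_likelihood(histories, psi=0.6, p=0.4))
```

The all-zero histories are what separate this from plain logistic regression: their likelihood mixes the "occupied but missed" and "unoccupied" explanations instead of treating them as absences.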

In SODA, a simulation scenario is defined by the probability of species occupancy (psi), the probability of detecting the species during a survey at a site it occupies (p), the number of replicate visits per site (K) and the total survey budget (E). The latter is expressed in units of repeat survey visits. Increased costs for the first survey visit can also be accommodated with a separate parameter (C1, so that E = S * (K - 1 + C1) where S is the number of sites; if C1 = 1 then E = S*K). Once a scenario is defined, SODA quickly computes estimator bias, precision and accuracy… but, most importantly, it produces cool plots!! … well, at least I think they are cool ;-). The plots are quite intuitive, in that one can immediately get a sense of how biased/imprecise (or not) the estimator is in a given scenario. The dots represent potential outcomes of a study carried out in such conditions. In other words, each dot shows the (maximum-likelihood) estimates for occupancy (y-axis) and detectability (x-axis) obtained when analyzing one of the potential datasets collected given the survey design (number of sampling sites S and replicate visits K) and species characteristics (occupancy psi and detectability p). The color scheme indicates which outcomes are more likely (hot) than others (cold).
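The idea can be sketched in a few lines of Python. To be clear, this is not the SODA function itself (which is written in R) but a rough illustration of the same kind of simulation. It assumes the constant model, for which the maximum-likelihood estimates depend only on the number of sites with detections and the total number of detections. The function names, the scenario values and the number of simulations are all made up.

```python
import random


def sites_from_budget(budget, n_visits, first_visit_cost=1.0):
    """Number of sites S affordable when E = S * (K - 1 + C1)."""
    return int(budget // (n_visits - 1 + first_visit_cost))


def fit_constant_model(n_sites, n_visits, sites_detected, detections):
    """Maximum-likelihood (psi, p) for the constant model.
    Returns None when there are no detections at all."""
    if detections == 0:
        return None
    p_boundary = detections / (n_sites * n_visits)
    prop_empty = (n_sites - sites_detected) / n_sites
    if prop_empty <= (1.0 - p_boundary) ** n_visits:
        return 1.0, p_boundary  # boundary estimate: psi = 1
    # Interior solution: solve K*p / (1 - (1-p)^K) = detections / sites_detected
    # by bisection (the left-hand side increases with p).
    target = detections / sites_detected
    lo, hi = 1e-9, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if n_visits * mid / (1.0 - (1.0 - mid) ** n_visits) < target:
            lo = mid
        else:
            hi = mid
    p_hat = (lo + hi) / 2.0
    p_star = 1.0 - (1.0 - p_hat) ** n_visits
    return sites_detected / (n_sites * p_star), p_hat


def simulate_estimates(psi, p, n_sites, n_visits, n_sims, rng):
    """Simulate n_sims studies and return the (psi_hat, p_hat) of each one."""
    estimates = []
    for _sim in range(n_sims):
        sites_detected = 0
        detections = 0
        for _site in range(n_sites):
            if rng.random() < psi:  # site is occupied
                d = sum(1 for _visit in range(n_visits) if rng.random() < p)
                detections += d
                sites_detected += 1 if d > 0 else 0
        fit = fit_constant_model(n_sites, n_visits, sites_detected, detections)
        if fit is not None:
            estimates.append(fit)
    return estimates


rng = random.Random(42)
n_visits = 2
n_sites = sites_from_budget(budget=100, n_visits=n_visits)
for psi, p in ((0.8, 0.6), (0.2, 0.2)):  # illustrative scenarios
    est = simulate_estimates(psi, p, n_sites, n_visits, n_sims=2000, rng=rng)
    mean_psi = sum(e[0] for e in est) / len(est)
    at_boundary = sum(1 for e in est if e[0] == 1.0) / len(est)
    print(f"psi={psi}, p={p}, S={n_sites}, K={n_visits}: "
          f"mean psi_hat={mean_psi:.2f}, share with psi_hat=1: {at_boundary:.2f}")
```

Each element of `est` corresponds to one dot in a SODA-style plot: the pair of estimates one would get from one possible dataset under that design.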

SODA’s plots give a sense of how many data are needed to get meaningful results. Looking at them can be eye opening. For instance, the plots below show how K=2 surveys at S=50 sites lead to decent performance when detectability and occupancy are high (left), but how this same amount of survey effort is unlikely to be sufficient for rare and elusive species (right). In the latter case, the estimates have great spread and obtaining boundary estimates (psi=1) is quite possible.

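[Figure: SODA plots for K=2 surveys at S=50 sites; left: high occupancy and detectability; right: rare and elusive species]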

SODA’s plots can also help assess trade-offs. For instance, look how increasing the number of replicates is more efficient than increasing the number of sites in the latter scenario:
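A quick complementary check of this kind of trade-off is the standard large-sample approximation for the variance of the occupancy estimator under the constant model (MacKenzie & Royle 2005). The Python sketch below uses it to compare doubling the sites against doubling the replicates. The scenario values (psi = 0.2, p = 0.2) are an assumption chosen to represent a rare and elusive species, not the values behind the plots, and the formula needs K ≥ 2.

```python
import math


def asymptotic_se_psi(psi, p, n_sites, n_visits):
    """Large-sample standard error of the occupancy estimator for the
    constant model (requires n_visits >= 2)."""
    p_star = 1.0 - (1.0 - p) ** n_visits
    denom = p_star - n_visits * p * (1.0 - p) ** (n_visits - 1)
    variance = (psi / n_sites) * ((1.0 - psi) + (1.0 - p_star) / denom)
    return math.sqrt(variance)


psi, p = 0.2, 0.2  # illustrative rare and elusive species
designs = {
    "baseline (S=50, K=2)": (50, 2),
    "double the sites (S=100, K=2)": (100, 2),
    "double the replicates (S=50, K=4)": (50, 4),
}
for label, (n_sites, n_visits) in designs.items():
    se = asymptotic_se_psi(psi, p, n_sites, n_visits)
    print(f"{label}: approximate SE of psi_hat = {se:.3f}")
```

Large-sample approximations can be unreliable for small designs like these, which is where simulating the actual distribution of estimates becomes useful.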

[Figure: SODA plots comparing more replicates with more sites for the rare and elusive species scenario]

Unarguably SODA’s simulations are simplistic (constant model, model assumptions perfectly met, same effort applied to all sites), yet SODA can be a handy tool to assist in survey design. It quickly provides a basic understanding of whether a given survey effort is appropriate or not in “ideal” conditions. So it sets an important lower bound for survey effort requirements. I trust someone out there will find SODA useful too!
