Maxent Is Not a Presence-Absence Method: A Comment on Thibaud et al. Guillera-Arroita, G., Lahoz-Monfort, J. J., & Elith, J. Methods in Ecology and Evolution, 5(11):1192–1197, November, 2014.
[Summary] [::1] Thibaud et al. (Methods in Ecology and Evolution 2014) present a framework for simulating species and evaluating the relative effects of factors affecting the predictions from species distribution models (SDMs). They demonstrate their approach by generating presence-absence data sets for different simulated species and analysing them using four modelling methods: three presence-absence methods and Maxent, which is a presence-background modelling tool. One of their results is striking: that their use of Maxent performs well in estimating occupancy probabilities and even outperforms the other methods on small sample sizes. This result is of concern to us, because it suggests that Maxent directly offers a useful alternative for modelling presence-absence data, which may prompt widespread adoption of this use of Maxent. In this paper, we explore why this would be a mistake.

[::2] We draw on the theory underlying how the Maxent model operates and on simulations to discover: (i) why Maxent appears to fare as well as it does in their evaluation and (ii) why the best-suited presence-absence method for data analysis (the generating model; a GLM) does not perform as well as we would expect.

[::3] We demonstrate that (i) the good performance observed for Maxent is largely a coincidence; the simulated species match well the arbitrary default parameter that Maxent applies to map its relative output into a 0-1 scale, but errors are much larger for other species we simulate; (ii) the performance of the GLM is poorer than expected because Thibaud et al. do not use model selection and fit a model that is too complex for the amount of data available.

[::4] Maxent is a presence-background method and only provides estimates of relative suitability regardless of how the background sample is specified. When presence-absence data are available, one can transform Maxent's relative estimates into estimates of occupancy probability, and we provide methods to do so. However, this requires the user to post-process Maxent's output. Proper PA methods such as GLMs can perform well under small sample sizes, provided care is taken during modelling to avoid overfitting. We demonstrate an effective method using regularisation with the R package glmnet.
@article{guillera-arroitaMaxentNotPresenceabsence2014,
  title = {Maxent Is Not a Presence-Absence Method: A Comment on {{Thibaud}} et~Al},
  author = {{Guillera-Arroita}, Gurutzeta and {Lahoz-Monfort}, Jos{\'e} J. and Elith, Jane},
  year = {2014},
  month = nov,
  volume = {5},
  pages = {1192--1197},
  issn = {2041-210X},
  doi = {10.1111/2041-210x.12252},
  abstract = {[Summary] [::1] Thibaud et al. (Methods in Ecology and Evolution 2014) present a framework for simulating species and evaluating the relative effects of factors affecting the predictions from species distribution models (SDMs). They demonstrate their approach by generating presence-absence data sets for different simulated species and analysing them using four modelling methods: three presence-absence methods and Maxent, which is a presence-background modelling tool. One of their results is striking: that their use of Maxent performs well in estimating occupancy probabilities and even outperforms the other methods on small sample sizes. This result is of concern to us, because it suggests that Maxent directly offers a useful alternative for modelling presence-absence data, which may prompt widespread adoption of this use of Maxent. In this paper, we explore why this would be a mistake.

[::2] We draw on the theory underlying how the Maxent model operates and on simulations to discover: (i) why Maxent appears to fare as well as it does in their evaluation and (ii) why the best-suited presence-absence method for data analysis (the generating model; a GLM) does not perform as well as we would expect.

[::3] We demonstrate that (i) the good performance observed for Maxent is largely a coincidence; the simulated species match well the arbitrary default parameter that Maxent applies to map its relative output into a 0-1 scale, but errors are much larger for other species we simulate; (ii) the performance of the GLM is poorer than expected because Thibaud et al. do not use model selection and fit a model that is too complex for the amount of data available.

[::4] Maxent is a presence-background method and only provides estimates of relative suitability regardless of how the background sample is specified. When presence-absence data are available, one can transform Maxent's relative estimates into estimates of occupancy probability, and we provide methods to do so. However, this requires the user to post-process Maxent's output. Proper PA methods such as GLMs can perform well under small sample sizes, provided care is taken during modelling to avoid overfitting. We demonstrate an effective method using regularisation with the R package glmnet.},
  journal = {Methods in Ecology and Evolution},
  keywords = {*imported-from-citeulike-INRMM,~INRMM-MiD:c-14538022,computational-science,computational-science-literacy,ecology,environmental-modelling,maxent,model-comparison,modelling,modelling-uncertainty,presence-absence,presence-background,presence-only,uncertainty},
  lccn = {INRMM-MiD:c-14538022},
  number = {11}
}
