23 min read

Maxent Best Practices

Sampling Bias

Sampling bias in occurrence data is an issue because it means we can’t be sure a species is detected under certain conditions because that’s it’s preferred habitat, or because those are the conditions in the locations we prefer to search.

“The uniform sampling assumption does not require a uniformly random sample from geographic space, but instead that environmental conditions are sampled in proportion to their availability, regardless of their spatial pattern” (Merow et al., 2013).

This problem can be addressed by thinning records (also called spatial filtering, Radosavljevic and Anderson (2014)), such that multiple records from within the same area are represented by only one or a few of the total records. This is a bit crude, but should remove the worst biases, such as a particular field station getting preferentially sampled by recurring visits from scientists or students, or general biases towards sampling roadsides and popular trails.

Lee-Yaw et al. (2018) developed their own method to thin species records, using kernel smoothing estimates to reduce the number of samples from a neighbourhood, and selecting which samples to keep via novel environments. I don’t think this is widespread, and feels a bit like overkill.

Subsampling based on raster grids is a simpler, more intuitive approach provided by Hijmans et al. (2017). It doesn’t account for the possibility that local density may be an accurate reflection of the niche requirements of a species, as the approach of Lee-Yaw et al. (2018) does.


### create an empty raster
r <- raster::raster(trich) 
###  set the resolution:
###   to match the environmental layers
res(r) <- res(env)

###   to set to one minute, ca 2km square
res(r) <- 1/60 

###   double the environmental layer
res(r) <- 2 * res(env) 

### IMPORTANT: extend it outwards one cell
### otherwise, points at the edge of the range
### might get dropped!!
r <- extend(r, extent(r) + 1) 

### sample one point per cell:
trichsel <- gridSample(trich, r, n = 1) 

Aiello-Lammens et al. (2015) provides an alternative approach based on imposing a minimum permissible nearest-neighbour distance, and then finding the set that retains the most samples through repeated random samples.

Thinning by Nearest-Neighbour:

## thin.par sets minimum distance in km
trichthin <- thin(data.frame(LAT = coordinates(trich)[, "Y"],
                             LONG = coordinates(trich)[, "X"],
                             SPEC = rep("tplan", nrow(trich))),
                  thin.par = 2, reps = 1, write.files = FALSE,
                  locs.thinned.list.return = TRUE) 

Radosavljevic and Anderson (2014) show that unfiltered/unthinned data produces elevated assessment of model performance, as a consequence of over-fitting to spatially auto-correlated data. So filtering works.

See also Boria et al. 2014, Varela et al. 2014 (unread)

Merow et al. (2013) provide two more rigorous approaches, depending on whether or not data on search effort is available. When search effort is known, it can be used to construct a biased prior.

When search effort is unknown, we can create a biased background sample to account for bias in presence data, via Target Group Sampling. Under TGS, records that are collected using the same surveys/methods as the focal species are form the background points. i.e., the set of all herbarium records in GBIF may be an appropriately biased background for any one of those plant species. This assumes that the target plant is collected/detected at the same rate as the reference set. It may be appropriate to subset the reference set to increase the likelihood of this being true: use only graminoids as biased background for sedges, or woody plants as background for a tree?

Study Extent

Discussed extensively in Barve et al. (2011). They identified three general approaches to consider:

  1. Biotic regions (ecozones etc). A good compromise between biological realism and tractability
  2. Niche-model reconstructions: back-project a niche model over the appropriate time period (i.e., previous glacial maximum or interglacial) to identify the area that the species could have occupied over an extended period. Nice idea, but a real risk of circularity?
  3. Detailed simulations. Sounds great, but I think if we had enough data to properly parameterize such a model, we wouldn’t need to resort to sdms in the first place.

If you wanted to improve on biotic regions, things to consider in developing a more rigorous approach should include:

  1. Dispersal characteristics of the species
  2. Crude estimate of the niche (again, circularity?)
  3. Establish relevant time span
  4. Identify relevant environmental changes

Soberón (2010) is often cited together with Barve et al. (2011), but the latter provides more explicit discussion of best practices for sdm model construction. I think the deference to Soberon is probably due to their creation of the BAM model (in earlier publications), which Barve’s system is based on (Biotic, Abiotic, Movement).

Merow et al. (2013) provide a shorter discussion, and emphasize matching the study extent to the biological question of interest. Prioritizing sites for protection within the range of a species should constrain the extent to the existing range of the species; evaluating invasion potential should use an extent large enough to encompass the areas of concern (i.e., global, or continental scale for novel invasives).

Variable Selection

Variables == predictors, the spatial layers used as the environmental/dependent variables in the model.

Interesting discussion in Guisan et al. (2017) (section 6.4, page 102+): variables that are measured most accurately often/usually are only indirectly related to a species’ niche; e.g., elevation, slope, aspect. Very precise and accurate spatial layers are available for these.

Variables with a direct relationship to a species niche are usually created through interpolation from sparse reference points (weather stations), and this involves unavoidable error propagation and imprecision.

Over small extents, it may be preferable to use indirect variables, as they offer greater precision in quantifying the local environment. However, as extent increases, the relative value of direct variables increases. The indirect variables are likely not stationary on large scales - a species relationship to slope and elevation are likely different in southern US vs northern Canada, for instance. On the other hand, a species relationship to temperature, however coarsely it is mapped, is likely similar across its geographic range.

Merow et al. (2013) identify two general approaches to selecting variables. The machine learning approach is based on the understanding that the Maxent algorithm will, by design, select the most useful variables and features, so we can include all reasonable variables.

However, this probably only applies when the objective is to provide accurate predictions of occurrences in the same context in which the model is built. Efforts to understand the environmental constraints on that distribution, or projecting it to a new context, will be potentially confounded when the model includes correlated variables.

To minimize this problem, Merow et al. (2013) recommends taking a statistical approach (i.e., treating a Maxent model as a ‘conventional’ statistical model). In this case, they recommend prescreening variables to limit colinearity, and emphasize biologically relevant variables. This should produce more parsimonious and interpretable models.

Pairwise correlations can be used to identify pairs or groups of variables that are highly correlated. ENMTools (Warren et al., 2019) provides several helper functions for this, including raster.cor.matrix, raster.cor.plot.

I prefer using hclust based on 1 - abs(cor) to visualize correlated groups:

## "predictors" is a raster stack

## Calculate correlations:
cors <- raster.cor.matrix(predictors) # from ENMTools

threshold = 0.7  ## the maximum permissible correlation

dists <- as.dist(1 - abs(cors))

clust <- hclust(dists)
groups <- cutree(clust, h = 1 - threshold)

## Visualize groups:
abline(h = 1 - threshold, col = 'red', lwd = 2)

## Print the groups:
##  bio1 bio12 bio16 bio17  bio5  bio6  bio7  bio8 biome 
##     1     2     2     3     4     1     5     4     6

After running this, groups will identify which cluster each variable belongs to. Keep at most one variable from each group. This doesn’t absolutely guarantee that variables in different groups will have correlations less than threshold, but in most cases this will be true. When it isn’t, the highest inter-group correlation will still be very close to the threshold. If you’re concerned, double check the correlations of the variables after you’ve picked them.

However, this won’t address multicollinearity among three or more variables. Guisan et al. (2017) suggest using the function usdm::vif instead, which calculates variable inflation. They recommend keeping the vif values under 10, but different authors will use cutoffs from 5-20.

Petitpierre et al. (2017) explicitly tested different approaches to model selection for use in projecting models in space and time. Their results support Merow’s statistical approach: modelers should use a small number of ‘proximal’ variables (i.e., variables known to be biologically relevant to the species in question), or the first few PCA axes of a larger set of environmental variables. PCA axes are orthogonal (i.e., not collinear) by construction, but interpretation may be tricky if they incorporate a large number of variables.

Feature Selection

Features == the statistical models used to fit the variables to the response variables (presences). i.e., linear, quadratic, product, hinge, threshold, categorical.

Merow et al. (2013) recommend selecting features on biological grounds. They provide a short discussion, noting that the fundamental niche is likely quadratic for most variables over a large enough extent, but may be better approximated by a linear function if the study extent is truncated with respect to the species’ tolerance for that variable (ala Whittaker). Interesting ideas, but not much to go on unless you actually do know a fair bit about your species.

Warren and Seifert (2011) describe a process for selecting features to keep/include in the model (linear, quadratic, polynomial, hinge, threshold, categorical). It uses the AIC to identify the optimal combination. Easy and quick to do with the ENMEval package (note that many references cite ENMTools for these tests, but they’ve been moved to ENMEval nowadays).

NB applying different spatial filtering/thinning to your data can produce different ‘optimal’ models (i.e., different retained features and regularization value), as determined by the AIC criterion. ## Regularization Regularization is used to penalize complexity. Low values will produce models with many predictors and features, with 0 leading to all features and variables being included. This can lead to problems with over-fitting and interpretation. Higher regularization values will lead to ‘smoother’, and hopefully more general and transferable models. There will be a trade-off between over- and under-fitting.

The default values in Maxent are based on empirical tests on a large number of species. These are probably not unreasonable, but it’s pretty standard to mention that they’re a compromise, and we improved them for our the needs of our particular species and context by doing X (for various values of X).

The approach of Warren and Seifert (2011) (see previous) can be used here as well, testing a range of regularization (aka beta) values, and selecting the one that generates the lowest AIC. It may also be worth selecting the simplest model that is within a certain similarity of the ‘best’ model? That’s more to explain to reviewers though.

Warren and Seifert’s simulations demonstrate that models with a similar number of parameters to the true model produce more accurate models, in terms of suitability, variable assessment, and ranking of habitat suitability, both for the training extent and for models projected in space/time. Furthermore, AIC and BIC are the most effective approaches to model tuning to achieve the correct number of parameters.

Radosavljevic and Anderson (2014) also consider the impact of the regularization parameter on over-fitting. They find that the default value often leads to over-fitting, especially when spatial auto-correlation is not accounted for in model fitting. They conclude that regularization should be set deliberately for a study, following the results of experiments exploring a range of potential values.

Note that specifying the regularization is done via the betamultiplier argument, which applies to each of the different feature classes. That is, the actual regularization value will be set by Maxent automatically for each class, subject to the multiplier value specified by the user. We don’t set the regularization values for each class directly (which is possible via the options beta_lqp, beta_threshold etc. (Phillips, 2017), although Radosavljevic and Anderson (2014) suggest experiments to explore this should be done.

Output type

Raw: values are Relative Occurrence Rate (ROR), which will sum to 1 over the extent of the study.

Cumulative: the sum of all cells with <= to the raw value of the cell. Rescaled to range from 0-100.

Logistic: Not sure what this actually means.

Merow et al. (2013) recommends sticking to Raw whenever possible, which means using the same species in the same extent. Note that the raw values will change for different extents, even for identical models, so they can’t be compared across projections without additional post-processing.



AUC assesses the success of the model in correctly ranking a random background point and a random presence point; that is, it should predict the suitability of the presence point higher than the background point. It is threshold-independent.

Lobo et al. (2008) identified five problems with AUC:

  1. it ignores the predicted probability values and the goodness-of-fit of the model;
  2. it summarises the test performance over regions of the ROC space in which one would rarely operate;
  3. it weights omission and commission errors equally;
  4. it does not give information about the spatial distribution of model errors; and, most importantly,
  5. the total extent to which models are carried out highly influences the rate of well-predicted absences and the AUC scores.

Additionally, Radosavljevic and Anderson (2014) point out that AUC doesn’t assess over-fitting or goodness-of-fit; rather, it is a measure of discrimination capacity.

However, comparing the difference in AUC for the training and testing data does give an estimate of overfitting. If the model fit perfectly, without overfitting, the AUC should be identical. It won’t be, and the difference reflects the degree to which the model is over-fit on the training data. In other words, the extent to which the model is fit to noise in the data, or environmental bias, if geographic masking is used in the k-fold partitions.


Hirzel et al. (2006) evaluated a variety of sdm evaluation measures; on data sets with more than 50 presences, most evaluators had > 0.70 correlation with each other. Which is a little reassuring I suppose? They used AUC on presence/absence data as the ‘gold standard’, and found that the continuous Boyce index (which uses presence-only data) performed best.

This can be calculated with the function ecospat::boyce(), which takes a raster of suitability values and a matrix or data.frame containing the coordinates of presences.


Radosavljevic and Anderson (2014) Threshold-dependent evaluation requires identifying a threshold in values predicted by the model to generate a binary suitable/unsuitable map. Setting the threshold to the lowest predicted value for a presence location may produce undesireable results if the lowest values is associated with an observation from an extreme outlier. More robust is setting the threshold to a particular quantile (10%), to exclude weirdos from establishing what’s suitable.

Again, if the model is perfectly fit, the omission rate in the testing data should be the same as in the training data. That is, setting the threshold at 10% to create the binary suitability map, we expect the omission rate in the test data to be 10%. Higher omission in the testing data reflects over-fitting (noise and/or bias).

For presence-only data commission error is unknown/unknowable. Accordingly, Radosavljevic and Anderson (2014) defined an optimal model as one that “(1) reduced omission rates to the lowest observed value (or near it) and minimized the difference between calibration and evaluation AUC [i.e., minimized over-fitting]; and (2) still led to maximal or near maximal observed values for the evaluation AUC (which assesses discriminatory ability). When more than one regularization multiplier fulfilled these criteria equally well, we chose the lowest one, to promote discriminatory ability (and hence, counter any tendency towards underfitting).”


Radosavljevic and Anderson (2014) evaluated cross-validation using random k-fold partitions, geographical structuring, and geographic masking of partitions. Random partitions suffer from preserving biases in the training data in the testing data.

Geographic structuring, which uses occurrences from a pre-defined geographic area (rather than a random sample) as the test set, introduces additional spatial bias, and should be avoided. However, geographic structuring combined with masking (which excludes both presences and background from the specified geographic region from the test set) may substantially reduce overfitting, and yields more realistic models than random partitions.

Checkerboard partitions offer a nice compromise - this is geographic structuring and masking on a fine scale, and so should reduce spatial correlation between training and testing data. A version was used by Pearson et al. (2013), without a lot of discussion. Functions to do checkerboard cross validation are provided by Muscarella et al. (2014), but without a lot of discussion. The cited references suggest this might be intended more for species with limited occurrence data? Also, as implemented it looks like they only allow for 2-fold and 4-fold cross-validation. I’m not sure there’s any reason not to use checkerboards to do 9- or 16- fold cross validation?

Spatial Null Models

Not often used, but see Rodríguez-Rey et al. (2013), and discussion of Bahn and McGill (2007).

The dismo package (Hijmans et al., 2017) provides the function geoDist to serve as a spatial null model. Provided a matrix of occurence points, it generates a simple model of occurence based on the distribution of those points (i.e., the likelihood of an occurence at a location is inversely proportional to the distance of that location from known occurences).

geoDist returns a raster of ‘suitability’ values that can be evaluated just as the output from an SDM model projection.


Complex models, with large numbers of predictors, or that use complex features, are more likely to be overfit. Overfitting sacrifices generality (i.e., capacity to project to different scenarios) in favour of better fit to training data. Consequently, for the purposes of projection, we may want to emphasize smaller, simpler models. See also Petitpierre et al. (2017) for discussion of variable selection.

Guisan et al. (2017): two related issues to consider when comparing the environment in the training region to that in the region into which the model is projected:

  • availability: are environment ‘types’ similarly abundant and available?

  • analog: are the environment ‘types’ in the projected range also present in the training range?

The act of projection implicitly assumes environments are fully analagous with equal availability.

This is to a large extent unavoidable, so we have to take measures to reduce it in data preparation, and/or account for it in interpretation of results.


Maxent includes the option of “clamping” projections. This constrains the values for environmental values in the projected range to the limit of that variable that is found in the training range. This has the effect of setting the predicted value of all non-analog cells to the value for the most extreme environments that are found in the training region.

This will reduce the occurence of unrealistic patterns emerging from the extension of complex models beyond the range of values they were trained on. It’s probably better than not clamping, but no reason to expect it’s particularly realistic.

Guisan et al. (2017) doesn’t mention clamping at all. Instead, they recommend explicitly identifying non-analog environments and excluding them from the projection; or at least, identifying them clearly and giving them due consideration in the interpretation of the results.


Multivariate Environmental Similarity Surfaces, Elith et al. (2010), provide a way to identify non-analog environments, and to quantify the extent to which they differ from environments in the training area. The functions dismo::mess and ecospat::ecospat.mess can be used to calculate MESS values. MESS is typically applied in geographic space.


The ecospat package also provides functions to contrast environmental conditions between training and projection regions in E-space. E-space and G-space provide complementary views of the niche. E-space shows the environmental distribution of a species in the context of all possible environmental combinations; G-space shows the distribution of the species, constrained to environmental conditions that actually exist in the landscape.

ecospat provides all the functions necessary to test niche differences in E-space and G-space, in the COUE framework (centroid, overlap, underfilling and expansion).

Climate Models

Climate projections are not predictions of future conditions—they are model-derived descriptions of possible future climates under a given set of plausible scenarios of climate forcings. The intention of simulating future climate is not to make accurate predictions regarding the future state of the climate system at any given point in time but to represent the range of plausible futures and establish the envelope that the future climate could conceivably occupy.

Harris et al. (2014)


From Harris et al. (2014) (originally presented as 9 points):

  1. Include a high and a low emissions scenario, to capture ‘best-case’ and ‘worst-case’ scenarios.

    Note that as of 2014 we are trending close to the high emissions RCP8.5 scenario. RCP2.6 represents an increasingly unlikely aggressive mitigation approach.

  2. Time and resources may limit us to one or two emissions scenarios, but more than one GCM should be used.

    Different GCMs have different strengths, weaknesses and biases. Some models are known to be ‘wet’ or ‘dry’ or ‘hot’, compared to the mean of all GCMs. These biases are also spatially variable: some GCMs may be ‘hot’ for Africa, and ‘cold’ for North America.

    Ostenibly different GCMs may share code and assumptions, and thus share biases in their projections.

  3. Consider the most appropriate way to present the output from multiple climate models;

    • If the multi-model mean is presented, report the range or standard deviation;
    • best-case and worst-case scenarios can be presented using envelopes/binary maps: envelopes based on locations where any of the models predict suitable habitat show worst case (i.e., all areas identified as being suitable habitat in any model), envelopes based on locations identified by all models show best case (i.e., only areas that all models agree are suitable are identified) [good approach for IAS].
  4. Choose a baseline time period appropriate to the study

    • baselines are preferably defined on data amalgamated over 30 years, to account for stochastic inter-annual variation. Shorter time periods may be unacceptably influenced by noisy weather.
    • baseline climate should correspond to the time period in which the data (i.e., observations) were collected
    • NB: different climate sources use different baseline periods!
    • Further discussed in Roubicek et al. (2010). They tested the sensitivity of SDMs trained on a different baseline than the one used to simulate the data, and found those models were signficantly worse than ones trained on the correct baseline.
  5. Be aware of the real resolution of the climate data used;

    • GCMs are coarse resolution, and need to be down-scaled for use with SDMs.
  6. Maintain a dialog with climate modelers, to keep up-to-date with developments in climate models.


WorldClim 2.1 has CMIP6 data: 9 GCMs for four SSP, at resolutions down to 2.5 minutes (30 seconds overdue for release in March 2020), projected to 2040, 2060, 2080 and 2100.

WorldClim 1.4 has CMIP5 data: 19 GCMs, four RCP, projected to 2050 and 2070, at resolutions as low as 30 seconds.


Assessment Report for the IPCC. AR4 was released in 2007, AR5 in 2014, AR6 is scheduled for 2022
Coupled Model Intercomparison Project, IPCC collection of GCMs, Numbered to correspond to the AR (eg. CMIP5 for AR5). “Models are only admitted to the CMIP archive if they meet a suite of rigorous requirements, including consistency with relevant observations, both past and present, and with fundamental physical principles.” IPCC doesn’t judge the models beyond being deemed fit for service. As such, the collection of models represents set of plausible future climates under a given emissions scenario.
General Circulation Model: 3D numerical representation of the climate system. 50-250km cell size, 10-20 layers in the atmosphere. Includes Atmosphere-Ocean GCM (AOGCM), incorportating interactions between the oceans and atmosphere, and Earth System Models (ESM), GCMs that incorporate biogeochemical cycles (eg. carbon cycle); ESM may also contain dynamic global vegetation models (DGVM)
Special Report on Emissions Scenarios, socioeconomic analysis of future climate emissions under varying conditions, used in AR4. Updated to Representative Concentration Pathways, in AR5. Updated to Shared Socionomic Pathways, in AR6. RCP2.6/ SSP 126 is aggressive mitigation (best case); RCP8.5/SSP 585 is ~current trajectory (worst case)


GCMs have a native resolution in the range of 50-250km2, much coarser than what is used for SDMs. To use for model projections, we need to downscale them to the appropriate scale. Three options:

Dynamic Downscaling
uses the coarse-resolution GCMs as the input for similarly complex fine-scale models for the region of interest. Lots of added value compared to the GCMs, but computationally expensive, requiring expert skill to produce and interpret. Thus availability is limited.
Statistical Downscaling
uses past coars and fine-scale data to establish statistical/numerical models mapping regional data to local data. Then uses the resulting model to create future local data from future projections. Easier to do than Dynamic Downscaling, but less realistic, more implicit assumptions.
Simple Scaling
break coarse pixels up into smaller pixels containing the same value as the parent; doesn’t create any new values, or introduce any error.




Boria, R. A. et al. 2014. Spatial filtering to reduce sampling bias can improve the performance of ecological niche models. – Ecol. Model. 275: 73–77.

Varela, S. et al. 2014. Environmental filters reduce the effects of sampling bias and improve predictions of ecological niche models. – Ecography 37: 1084–1091.


Aiello-Lammens, M. E., R. A. Boria, A. Radosavljevic, B. Vilela, and R. P. Anderson. 2015. spThin: An R package for spatial thinning of species occurrence records for use in ecological niche models. Ecography 38: 541–545.

Bahn, V., and B. J. McGill. 2007. Can niche-based distribution models outperform spatial interpolation? Global Ecology and Biogeography 16: 733–742.

Barve, N., V. Barve, A. Jiménez-Valverde, A. Lira-Noriega, S. P. Maher, A. T. Peterson, J. Soberón, and F. Villalobos. 2011. The crucial role of the accessible area in ecological niche modeling and species distribution modeling. Ecological Modelling 222: 1810–1819.

Elith, J., M. Kearney, and S. Phillips. 2010. The art of modelling range-shifting species. Methods in Ecology and Evolution 1: 330–342.

Guisan, A., W. Thuiller, and N. E. Zimmermann. 2017. Habitat Suitability and Distribution Models: With Applications in R. Cambridge University Press.

Harris, R. M. B., M. R. Grose, G. Lee, N. L. Bindoff, L. L. Porfirio, and P. Fox-Hughes. 2014. Climate projections for ecologists. WIREs Climate Change 5: 621–637.

Hijmans, R. J., S. Phillips, J. Leathwick, and J. Elith. 2017. Dismo R package.

Hirzel, A. H., G. Le Lay, V. Helfer, C. Randin, and A. Guisan. 2006. Evaluating the ability of habitat suitability models to predict species presences. Ecological Modelling 199: 142–152.

Lee-Yaw, J. A., M. Fracassetti, and Y. Willi. 2018. Environmental marginality and geographic range limits: A case study with Arabidopsis lyrata ssp. Lyrata. Ecography 41: 622–634.

Lobo, J. M., A. Jiménez-Valverde, and R. Real. 2008. AUC: A misleading measure of the performance of predictive distribution models. Global Ecology and Biogeography 17: 145–151.

Merow, C., M. J. Smith, and J. A. Silander. 2013. A practical guide to MaxEnt for modeling species’ distributions: What it does, and why inputs and settings matter. Ecography 36: 1058–1069.

Muscarella, R., P. J. Galante, M. Soley-Guardia, R. A. Boria, J. M. Kass, M. Uriarte, and R. P. Anderson. 2014. ENMeval: An R package for conducting spatially independent evaluations and estimating optimal model complexity for Maxent ecological niche models. Methods in Ecology and Evolution 5: 1198–1205.

Pearson, R. G., S. J. Phillips, M. M. Loranty, P. S. A. Beck, T. Damoulas, S. J. Knight, and S. J. Goetz. 2013. Shifts in Arctic vegetation and associated feedbacks under climate change. Nature Climate Change 3: 673–677.

Petitpierre, B., O. Broennimann, C. Kueffer, C. Daehler, and A. Guisan. 2017. Selecting predictors to maximize the transferability of species distribution models: Lessons from cross-continental plant invasions. Global Ecology and Biogeography 26: 275–287.

Phillips, S. J. 2017. A Brief Tutorial on Maxent.

Radosavljevic, A., and R. P. Anderson. 2014. Making better Maxent models of species distributions: Complexity, overfitting and evaluation. Journal of Biogeography 41: 629–643.

Rodríguez-Rey, M., A. Jiménez-Valverde, and P. Acevedo. 2013. Species distribution models predict range expansion better than chance but not better than a simple dispersal model. Ecological Modelling 256: 1–5.

Roubicek, A. J., J. VanDerWal, L. J. Beaumont, A. J. Pitman, P. Wilson, and L. Hughes. 2010. Does the choice of climate baseline matter in ecological niche modelling? Ecological Modelling 221: 2280–2286.

Soberón, J. M. 2010. Niche and area of distribution modeling: A population ecology perspective. Ecography 33: 159–167.

Warren, D. L., N. Matzke, M. Cardillo, J. Baumgartner, L. Beaumont, N. Huron, M. Simões, et al. 2019. ENMTools R Package.

Warren, D. L., and S. N. Seifert. 2011. Ecological niche modeling in Maxent: The importance of model complexity and the performance of model selection criteria. Ecological Applications 21: 335–342.