<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Notebooks on plantarum.ca</title>
    <link>https://plantarum.ca/notebooks/</link>
    <description>Recent content in Notebooks on plantarum.ca</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-us</language>
    <lastBuildDate>Mon, 22 Jun 2020 00:00:00 +0000</lastBuildDate>
    
        <atom:link href="https://plantarum.ca/notebooks/index.xml" rel="self" type="application/rss+xml" />
    
    
    <item>
      <title>Rawtherapee Notebook</title>
      <link>https://plantarum.ca/notebooks/rawtherapee-notebook/</link>
      <pubDate>Sun, 15 Jan 2023 00:00:00 +0000</pubDate>
      
      <guid>https://plantarum.ca/notebooks/rawtherapee-notebook/</guid>
      <description>&lt;!-- markdown-toc start - Don&#39;t edit this section. Run M-x markdown-toc-refresh-toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#default-profile&#34;&gt;Default Profile&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#lightening-an-image&#34;&gt;Lightening an Image&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#noise-reduction&#34;&gt;Noise Reduction&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;!-- markdown-toc end --&gt;
&lt;h1 id=&#34;default-profile&#34;&gt;Default Profile&lt;/h1&gt;
&lt;p&gt;Notes based on the default profile described by &lt;a href=&#34;https://www.wildlifeinpixels.net/&#34;&gt;Andy
Astbury&lt;/a&gt; on
&lt;a href=&#34;https://youtu.be/310rCQZe0NI&#34;&gt;YouTube&lt;/a&gt; (posted 9 September 2021).&lt;/p&gt;
&lt;p&gt;Andy recommends the following settings as your default RAW processing
profile. His video describes his rationale in some detail.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;rt-profile.jpg&#34; alt=&#34;The RawTherapee History window showing all the changes described in this section&#34; title=&#34;RawTherapee History Window&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;start-from-the-built-in-neutral-profile&#34;&gt;Start From the Built-in &amp;ldquo;Neutral&amp;rdquo; profile&lt;/h2&gt;
&lt;p&gt;Open a raw file in RawTherapee.&lt;/p&gt;
&lt;p&gt;Select the &lt;code&gt;(Neutral)&lt;/code&gt; profile in the profile selector at the upper right
side of the editor window. This will give you a very &amp;lsquo;flat&amp;rsquo; (linear) view
of the image. This is desirable, as it allows you to see all of the detail
available in your file, in order to better decide how to approach your
processing. From this linear beginning you&amp;rsquo;ll set the exposure etc. to suit
your image.&lt;/p&gt;
&lt;h2 id=&#34;set-the-output-profile&#34;&gt;Set the Output Profile&lt;/h2&gt;
&lt;p&gt;tl;dr: use &lt;code&gt;RTv4_sRGB&lt;/code&gt; for finished photos to be posted online, or for which
you don&amp;rsquo;t control the software used to view them; use &lt;code&gt;ProPhoto&lt;/code&gt; (or maybe
&lt;code&gt;RTv4_Large&lt;/code&gt;) when you are going to further edit your images in Gimp,
Photoshop, Lightroom, or edit/print/view with other programs that
understand wide gamut profiles.&lt;/p&gt;
&lt;h3 id=&#34;rtv4_srgb-for-the-web-prophoto-for-photoshopgimp&#34;&gt;RTv4_sRGB for the web, ProPhoto for Photoshop/Gimp&lt;/h3&gt;
&lt;p&gt;If you&amp;rsquo;re exporting from RawTherapee to Gimp, Photoshop, or Lightroom,
Astbury recommends using the ProPhoto colour profile, in order to maintain
maximum compatibility with Photoshop and Lightroom. This is also the
default &lt;em&gt;working&lt;/em&gt; profile (NB: not the default &lt;em&gt;output&lt;/em&gt; profile!) for
RawTherapee. However, by default, RawTherapee doesn&amp;rsquo;t offer ProPhoto as an
&lt;code&gt;output profile&lt;/code&gt;. In order to use it, you need to install the
&lt;code&gt;ProPhoto.icc&lt;/code&gt; file.&lt;/p&gt;
&lt;p&gt;Download the file &lt;code&gt;ICCProfiles.zip&lt;/code&gt; from
&lt;a href=&#34;https://sites.google.com/site/chromasoft/icmprofiles?pli=1&#34;&gt;chromasoft&lt;/a&gt;.
Unzip the files and copy them to the profile directory for your system. You
can find this (and change it if you like) in the &lt;code&gt;Preferences&lt;/code&gt; dialog,
under &lt;code&gt;Colour Management&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;You could also use &lt;code&gt;RTv4_Large&lt;/code&gt;, which is provided with &lt;code&gt;RawTherapee&lt;/code&gt; and
provides a similar extended gamut to &lt;code&gt;ProPhoto&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Note that Astbury&amp;rsquo;s video includes the aside:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Only do this if you want to use and maintain a fully ProPhotoRGB workflow
between RawTherapee, Photoshop, and Lightroom.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is because the &lt;code&gt;ProPhoto&lt;/code&gt; profile is not understood by many programs,
so what your audience sees will depend on how they view it. Consequently,
regardless of what your workflow is, when you export a &amp;lsquo;finished&amp;rsquo; image for
display online, you should use a sRGB profile. RawTherapee provides
&lt;a href=&#34;http://rawpedia.rawtherapee.com/Color_Management#Output_Profile&#34;&gt;RTv4_sRGB&lt;/a&gt;
for this purpose. It&amp;rsquo;s an extended sRGB profile that degrades gracefully to
plain sRGB in programs that don&amp;rsquo;t understand the full &lt;code&gt;RTv4_sRGB&lt;/code&gt; profile.&lt;/p&gt;
&lt;p&gt;As an example of the potential problems with profiles, on my android phone,
Firefox, and also the Flickr app, don&amp;rsquo;t provide accurate renderings of the
&lt;code&gt;ProPhoto&lt;/code&gt; or &lt;code&gt;RTv4_Large&lt;/code&gt; profiles (shown on the right), but are just fine
with &lt;code&gt;sRGB&lt;/code&gt; and &lt;code&gt;RTv4_sRGB&lt;/code&gt; (shown on the left).&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;profile_comparison.jpg&#34; alt=&#34;The same image displayed four times, two on the left in vibrant colour, two on the right in more muted tones&#34; title=&#34;Comparing
output colour profiles&#34;&gt;&lt;/p&gt;
&lt;p&gt;This is hard to anticipate, because the same images, on the same website,
may appear differently on different browsers, OS, or other combinations. In
this case, my Linux laptop with Firefox displays the images properly, as
does Chrome on Android. You can see the actual images on my
&lt;a href=&#34;https://flic.kr/s/aHBqjAqYkA&#34;&gt;Flickr&lt;/a&gt; account to see for yourself.&lt;/p&gt;
&lt;h2 id=&#34;set-the-demosaicing-algorithm&#34;&gt;Set the Demosaicing Algorithm&lt;/h2&gt;
&lt;p&gt;The default demosaicing algorithm is &lt;code&gt;AMaZE&lt;/code&gt;. This is very good for high
frequency detail (i.e., areas in sharp focus, detailed subjects), but not
so good for low frequency detail (i.e., out of focus areas, soft
background). Astbury prefers &lt;code&gt;RCD&lt;/code&gt; for sharp detail, and &lt;code&gt;VNG4&lt;/code&gt; for out of
focus areas.&lt;/p&gt;
&lt;p&gt;You get both together with the setting &lt;code&gt;RCD+VNG4&lt;/code&gt;. With this setting, a
contrast setting is used to determine which areas of the image have high
contrast, and will have &lt;code&gt;RCD&lt;/code&gt; applied; and which have low contrast, and
will get &lt;code&gt;VNG4&lt;/code&gt; applied. This value is set automatically when the box is
checked. Leave the box checked.&lt;/p&gt;
&lt;h2 id=&#34;capture-sharpening&#34;&gt;Capture Sharpening&lt;/h2&gt;
&lt;p&gt;Further down in the &lt;code&gt;raw&lt;/code&gt; tab, turn on the &lt;code&gt;Capture Sharpening&lt;/code&gt; option, and
leave it at the default setting (i.e., the &lt;code&gt;automatic&lt;/code&gt; check box is
ticked). This will address details lost due to &lt;a href=&#34;http://rawpedia.rawtherapee.com/Capture_Sharpening&#34;&gt;in-camera
blurring&lt;/a&gt; (diffraction,
anti-aliasing etc). You will usually want &lt;code&gt;Capture Sharpening&lt;/code&gt; applied,
which will often be combined with either &lt;code&gt;Unsharp Mask&lt;/code&gt; or &lt;code&gt;RL Deconvolution&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id=&#34;auto-chromatic-aberration-correction&#34;&gt;Auto Chromatic Aberration Correction&lt;/h2&gt;
&lt;p&gt;Located in the &lt;code&gt;Raw&lt;/code&gt; tab: turn on this checkbox, but turn off &lt;code&gt;Avoid colour shift&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Andy discusses &lt;code&gt;Avoid Colour Shift&lt;/code&gt; as a &amp;ldquo;Top Tip&amp;rdquo; in his video on &lt;a href=&#34;https://youtu.be/vvmBfGFAxXM?list=PLnIcpm2W3TX_kcxfxeZdfW6R_4FYh-KjS&amp;amp;t=106&#34;&gt;local
adjustments&lt;/a&gt;.
The issue is that, in some circumstances, &lt;code&gt;Avoid Colour Shift&lt;/code&gt; can actually
&lt;strong&gt;cause a colour shift&lt;/strong&gt;. That&amp;rsquo;s counter-intuitive, and you probably
wouldn&amp;rsquo;t think to turn off this feature to reduce colour tinting. So better
to leave it off by default, and only apply it intentionally such that you
might detect how it improves (or degrades) a particular image.&lt;/p&gt;
&lt;h2 id=&#34;turn-on-defringe&#34;&gt;Turn on Defringe&lt;/h2&gt;
&lt;p&gt;In the &lt;code&gt;Detail&lt;/code&gt; tab. Use default settings.&lt;/p&gt;
&lt;h2 id=&#34;turn-on-noise-reduction&#34;&gt;Turn on Noise Reduction&lt;/h2&gt;
&lt;p&gt;Also in the &lt;code&gt;Detail&lt;/code&gt; tab. Turn it on, set colour space to &lt;code&gt;L*a*b*&lt;/code&gt;, mode to
&lt;code&gt;Conservative&lt;/code&gt;, Luminance to &lt;code&gt;slider&lt;/code&gt;, with the actual sliders set to 0
(these should be the default settings).&lt;/p&gt;
&lt;h2 id=&#34;uncheck-clip-out-of-gamut-colors&#34;&gt;Uncheck &amp;lsquo;Clip Out of Gamut Colors&amp;rsquo;&lt;/h2&gt;
&lt;p&gt;In the exposure tab. Mentioned in the &lt;code&gt;Pinned comment&lt;/code&gt; below the video.&lt;/p&gt;
&lt;h2 id=&#34;set-the-profiled-lens-correction-to-auto&#34;&gt;Set the Profiled Lens Correction to Auto&lt;/h2&gt;
&lt;p&gt;Not mentioned by Astbury, this is under the &lt;code&gt;Transform&lt;/code&gt; tab. I set it to
&lt;code&gt;Automatically selected&lt;/code&gt;, which usually sets the correction to the camera
at least. I have to select the lens manually, but this at least saves one
step.&lt;/p&gt;
&lt;h2 id=&#34;dont-turn-on-dead-pixels-and-hot-pixel-filters&#34;&gt;&lt;strong&gt;Don&amp;rsquo;t&lt;/strong&gt; turn on Dead Pixels and Hot Pixel Filters&lt;/h2&gt;
&lt;p&gt;Astbury previously recommended this, but no longer considers it a good
default setting. These options are in the &lt;code&gt;Raw&lt;/code&gt; tab, in the
&lt;code&gt;Preprocessing&lt;/code&gt; section.&lt;/p&gt;
&lt;p&gt;Specifically, Andy has explained (in the comments below the default profile
video):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I no longer class [dead pixels and hot pixel filters] as essential - they
only need activating if an individual image needs them. If they are not
needed they could prove slightly detrimental.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;To which KuruGDI added:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;When I take images with small grained details it&amp;rsquo;s sometimes degrading
the image quality for me with these settings. The software thinks that
these tiny spots should be fixed while in reality it&amp;rsquo;s just a small
detail. I could see this problem especially in area that were just
slightly out of focus.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&#34;save-the-profile&#34;&gt;Save the Profile&lt;/h2&gt;
&lt;p&gt;Save the profile! Click the &lt;code&gt;disk&lt;/code&gt; icon in the profile dialog at the upper
right, and give the profile a useful name, like &lt;code&gt;myBase&lt;/code&gt;, or &lt;code&gt;AstburyBase&lt;/code&gt;
or something like that.&lt;/p&gt;
&lt;h2 id=&#34;set-the-profile-as-your-default&#34;&gt;Set the Profile as your default&lt;/h2&gt;
&lt;p&gt;Open the &lt;code&gt;Preferences&lt;/code&gt; dialog, open the &lt;code&gt;Image Processing&lt;/code&gt; tab, and set the
default profile for raw photos to your new profile.&lt;/p&gt;
&lt;p&gt;At this point, this &lt;code&gt;neutral++&lt;/code&gt; profile will be applied to every Raw image
the first time you open it, providing the base corrections for the image.
All additional changes you make for that particular image are stored in the
image-specific profile, so that on the second and subsequent times that you
open the image, you&amp;rsquo;ll see everything you&amp;rsquo;ve added in addition to the base
profile.&lt;/p&gt;
&lt;h1 id=&#34;lightening-an-image&#34;&gt;Lightening an Image&lt;/h1&gt;
&lt;p&gt;Notes from &lt;a href=&#34;https://youtu.be/-30NwsK5TuY&#34;&gt;RawTherapee Basics: 3 Ways to Lighten
Images&lt;/a&gt;, posted September 15, 2021.&lt;/p&gt;
&lt;p&gt;Astbury prefers the third method, as it is most &amp;lsquo;tractable&amp;rsquo; - provides you
with a great deal of fine-control over brightness, without making a mess of
contrast and saturation. However, each of the three has its place, and
you may end up using a different one (or a combination) for different
images.&lt;/p&gt;
&lt;h2 id=&#34;method-1-exposure-compensation&#34;&gt;Method 1: Exposure Compensation&lt;/h2&gt;
&lt;p&gt;The exposure slider increases or decreases exposure for the entire image.
Getting your midtones where you want may result in blown highlights.&lt;/p&gt;
&lt;p&gt;You can correct for this with:&lt;/p&gt;
&lt;h3 id=&#34;highlight-reconstruction&#34;&gt;Highlight Reconstruction&lt;/h3&gt;
&lt;p&gt;Guesses what the best colour for the clipped highlights should be based on
adjacent pixels, or data in other channels (i.e., the values in the red and
green channel can be used to infer the blue channel, if only the blue
channel is blown out). Primarily for details that are clipped (i.e. blown
out in the RAW), but also used in combination with Highlight Compression to
protect highlights clipped during editing.&lt;/p&gt;
&lt;p&gt;Options include:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Luminance Recovery: details will be neutral gray, doesn&amp;rsquo;t attempt to
restore colors.&lt;/li&gt;
&lt;li&gt;Color propagation: recovers the luminance of highlights, and also the
colors, using adjacent colors.&lt;/li&gt;
&lt;li&gt;CIELab: reduces luminance, and then restores colors.&lt;/li&gt;
&lt;li&gt;Blend: replaces clipped colors with values from nearby areas.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;highlight-compression&#34;&gt;Highlight Compression&lt;/h3&gt;
&lt;p&gt;Recovers highlights that are present (i.e., not blown out) in the raw file, but
have been clipped by editing, such as exposure compensation. Works in
conjunction with highlight reconstruction.&lt;/p&gt;
&lt;p&gt;A general rule of thumb is the histogram for each colour channel should
reach the right and left side of the histogram panel, but should not be
clipped at either end. Unless you have a very flat image (e.g. misty/foggy
landscape).&lt;/p&gt;
&lt;h3 id=&#34;the-problem-with-exposure-compensation&#34;&gt;The Problem With Exposure Compensation&lt;/h3&gt;
&lt;p&gt;Exposure compensation works in the RGB space, applying a uniform shift to
exposure &lt;strong&gt;AND&lt;/strong&gt; contrast &lt;strong&gt;AND&lt;/strong&gt; saturation to the entire image. This
makes it difficult to tweak brightness independent from contrast, and vice
versa, and often leads to garish/glossy images. So Astbury recommends one
of the following two options instead.&lt;/p&gt;
&lt;h2 id=&#34;method-2-lab-adjustments&#34;&gt;Method 2: L*A*B* Adjustments&lt;/h2&gt;
&lt;p&gt;Turning on the LAB Adjustments tab, you can independently modify
brightness, contrast, and chromaticity (saturation), with the three
sliders. Or if you want to really fine-tune it, you can use the curves and
equalizers below.&lt;/p&gt;
&lt;h2 id=&#34;method-3-abstract-profile&#34;&gt;Method 3: Abstract Profile&lt;/h2&gt;
&lt;p&gt;New tool available in RawTherapee 5.9, found in the Colour Management tab,
between Working Profile and Output Profile.&lt;/p&gt;
&lt;p&gt;Set the Abstract Profile to &lt;code&gt;Custom&lt;/code&gt;, and you will be presented with two
sliders:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Gamma : Controls the highlights, without modifying the darker tones.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Slope : Controls the darker areas, without modifying the lighter tones.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These sliders interact - moving the Slope up will push some of the midtones
up such that they are bright enough to be influenced by Gamma. Meaning you
have to play with both sliders in concert to find a balance that works.&lt;/p&gt;
&lt;h1 id=&#34;noise-reduction&#34;&gt;Noise Reduction&lt;/h1&gt;
&lt;p&gt;Details panel -&amp;gt; Noise Reduction -&amp;gt; change slider to curve. Increase the
height of the curve to increase noise reduction. Slide the bottom of the
noise reduction tone curve to the left to restrict noise reduction to
darker tones, or to the right to include lighter tones&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
&lt;section class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34; role=&#34;doc-endnote&#34;&gt;
&lt;p&gt;Thanks to Wayne Sutton for correcting the previous version! &lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/section&gt;
</description>
    </item>
    
    <item>
      <title>Spatial Ordinations</title>
      <link>https://plantarum.ca/notebooks/spatial-ordination/</link>
      <pubDate>Fri, 24 Jul 2020 00:00:00 +0000</pubDate>
      
      <guid>https://plantarum.ca/notebooks/spatial-ordination/</guid>
      <description>
&lt;script src=&#34;https://plantarum.ca/rmarkdown-libs/header-attrs/header-attrs.js&#34;&gt;&lt;/script&gt;

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#spatial-statistics&#34;&gt;Spatial Statistics&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#morans-i&#34;&gt;Moran’s &lt;code&gt;I&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#spatial-weighting-matrix&#34;&gt;Spatial Weighting Matrix&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#spatial-lag&#34;&gt;Spatial Lag&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#gearys-c&#34;&gt;Geary’s &lt;code&gt;c&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#multispati-analysis&#34;&gt;MULTISPATI Analysis&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#co-inertia-analysis&#34;&gt;Co-inertia Analysis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#spatial-principal-components-analysis&#34;&gt;Spatial Principal Components Analysis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#eigenvalues&#34;&gt;Eigenvalues&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#packaging-oddities-significance-testing&#34;&gt;Packaging Oddities &amp;amp; Significance Testing&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#references&#34;&gt;References&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;spatial-statistics&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Spatial Statistics&lt;/h1&gt;
&lt;div id=&#34;morans-i&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Moran’s &lt;code&gt;I&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;A univariate measure of global spatial autocorrelation.&lt;/p&gt;
&lt;p&gt;From &lt;span class=&#34;citation&#34;&gt;&lt;a href=&#34;#ref-DrayEtAl_2008&#34; role=&#34;doc-biblioref&#34;&gt;Dray, Saïd, and Débias&lt;/a&gt; (&lt;a href=&#34;#ref-DrayEtAl_2008&#34; role=&#34;doc-biblioref&#34;&gt;2008&lt;/a&gt;)&lt;/span&gt;:&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;math display&#34;&gt;\[I(x) = \frac{n\sum_{(2)}c_{ij}(x_i - \bar{x})(x_j - \bar{x})}
{\sum_{(2)}c_{ij}\sum_{i=1}^{n}(x_i - \bar{x})^2}\]&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;where&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class=&#34;math inline&#34;&gt;\(x^t = [x_1, \dots, x_n]\)&lt;/span&gt; is a vector of the variable of interest,
for which spatial coordinates are available&lt;/li&gt;
&lt;li&gt;&lt;span class=&#34;math inline&#34;&gt;\(C = [c_{i,j}]\)&lt;/span&gt; is a spatial connectivity matrix&lt;/li&gt;
&lt;li&gt;&lt;span class=&#34;math inline&#34;&gt;\(\sum_{(2)} = \sum_{i=1}^{n} \sum_{j=1}^{n}\)&lt;/span&gt; with &lt;span class=&#34;math inline&#34;&gt;\(i \neq j\)&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If we define &lt;span class=&#34;math inline&#34;&gt;\(z^t = [z_i] = [x_i - \bar{x}]\)&lt;/span&gt;, the centered variable values,
this becomes:&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;math display&#34;&gt;\[I(x) = \frac{n\sum_{(2)}c_{ij}z_iz_j}
{\sum_{(2)}c_{ij}\sum_{i=1}^{n}z_i^2}\]&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Alternative definition from &lt;a href=&#34;https://en.wikipedia.org/wiki/Moran%27s_I&#34;&gt;Wikipedia&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;math display&#34;&gt;\[I = \frac{N}{W} \frac{\sum_i \sum_j w_{ij}(x_i-\bar{x}) (x_j-\bar{x})} {\sum_i(x_i-\bar{x})^2}\]&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class=&#34;math inline&#34;&gt;\(N\)&lt;/span&gt; is the number of spatial units indexed by &lt;span class=&#34;math inline&#34;&gt;\(i\)&lt;/span&gt; and &lt;span class=&#34;math inline&#34;&gt;\(j\)&lt;/span&gt;;&lt;/li&gt;
&lt;li&gt;&lt;span class=&#34;math inline&#34;&gt;\(x\)&lt;/span&gt; is the variable of interest;&lt;/li&gt;
&lt;li&gt;&lt;span class=&#34;math inline&#34;&gt;\(\bar x\)&lt;/span&gt; is the mean of &lt;span class=&#34;math inline&#34;&gt;\(x\)&lt;/span&gt;;&lt;/li&gt;
&lt;li&gt;&lt;span class=&#34;math inline&#34;&gt;\(w_{ij}\)&lt;/span&gt; is a matrix of spatial weights with zeroes on the diagonal
(i.e., &lt;span class=&#34;math inline&#34;&gt;\(w_{ii} = 0\)&lt;/span&gt;);&lt;/li&gt;
&lt;li&gt;&lt;span class=&#34;math inline&#34;&gt;\(W\)&lt;/span&gt; is the sum of all &lt;span class=&#34;math inline&#34;&gt;\(w_{ij}\)&lt;/span&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is a variation of covariance/correlation, where the contributions of
individual comparisons are weighted by the spatial weighting/connectivity
matrix.&lt;/p&gt;
&lt;p&gt;The value ranges from -1 for complete negative autocorrelation (presence in
one cell indicates absence in adjacent cells) to 1 for positive
autocorrelation (presence in one cell indicates presence in adjacent
cells). 0 indicates no spatial autocorrelation (presence in a cell is
unrelated to presences in adjacent cells).&lt;/p&gt;
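&lt;p&gt;As a minimal sketch of the first formula above, Moran’s &lt;code&gt;I&lt;/code&gt; can be computed in base R for a toy data set. The five-site transect, the binary adjacency matrix &lt;code&gt;C&lt;/code&gt;, and all object names are made up for illustration:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;## Hypothetical data: five sites along a transect with a smooth gradient
x &amp;lt;- c(1, 2, 3, 4, 5)
n &amp;lt;- length(x)

## Binary connectivity: sites are neighbours if adjacent on the transect
## (zeroes on the diagonal, so i != j is handled automatically)
C &amp;lt;- matrix(0, n, n)
C[abs(row(C) - col(C)) == 1] &amp;lt;- 1

## Centred values z_i = x_i - mean(x)
z &amp;lt;- x - mean(x)

## Moran&amp;#39;s I: n * sum(c_ij * z_i * z_j) / (sum(c_ij) * sum(z_i^2))
I &amp;lt;- (n * sum(C * outer(z, z))) / (sum(C) * sum(z^2))
I&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For this gradient the result is 0.5, i.e., positive autocorrelation, as expected when neighbouring sites have similar values.&lt;/p&gt;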
&lt;/div&gt;
&lt;div id=&#34;spatial-weighting-matrix&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Spatial Weighting Matrix&lt;/h2&gt;
&lt;p&gt;Can take any form deemed appropriate by the investigator. Binary matrices
are often used to limit comparisons to spatially contiguous observations. A
common transformation is row-standardization, where elements of the matrix
are divided by the sum of their row. Using such a matrix, Moran’s &lt;code&gt;I&lt;/code&gt; can
be simplified to:&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;math display&#34;&gt;\[ I(x) = \frac{\sum_i\sum_j w_{ij}(x_i - \bar{x})(x_j - \bar{x})}
{\sum_i(x_i - \bar{x})^2} \]&lt;/span&gt;&lt;/p&gt;
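&lt;p&gt;A short check of why the leading term disappears, using the notation of the Wikipedia form above and assuming every observation has at least one neighbour: each row of a row-standardized matrix sums to one, so the total of all weights equals the number of observations.&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;math display&#34;&gt;\[ \sum_j w_{ij} = 1 \;\Rightarrow\; W = \sum_i \sum_j w_{ij} = N \;\Rightarrow\; \frac{N}{W} = 1 \]&lt;/span&gt;&lt;/p&gt;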
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;&lt;a href=&#34;#ref-ThioulouseEtAl_2018&#34; role=&#34;doc-biblioref&#34;&gt;Thioulouse et al.&lt;/a&gt; (&lt;a href=&#34;#ref-ThioulouseEtAl_2018&#34; role=&#34;doc-biblioref&#34;&gt;2018&lt;/a&gt;)&lt;/span&gt; describe options in detail, and include walk-throughs of the main steps in R, based on the &lt;code&gt;spdep&lt;/code&gt; package:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;&lt;p&gt;Create a spatial neighbourhood object (class &lt;code&gt;nb&lt;/code&gt;), using criteria of distance (&lt;code&gt;dnearneigh&lt;/code&gt;), adjacent polygons (&lt;code&gt;poly2nb&lt;/code&gt;), ‘rook’ or ‘queen’ patterns on a grid (&lt;code&gt;cell2nb&lt;/code&gt;), or nearest neighbours (&lt;code&gt;knearneigh&lt;/code&gt;, which can be asymmetrical). More complex approaches include Delaunay Triangulation (&lt;code&gt;deldir::tri2nb&lt;/code&gt;) and Gabriel Graphs (&lt;code&gt;gabrielneigh&lt;/code&gt;), which are described in detail by &lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-LegendreLegendre_2012&#34; role=&#34;doc-biblioref&#34;&gt;Legendre and Legendre 2012&lt;/a&gt;)&lt;/span&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Spatial Neighbourhoods (&lt;code&gt;nb&lt;/code&gt;) are converted to Spatial Weighting Matrices (class &lt;code&gt;listw&lt;/code&gt;). This allows for binary associations to be weighted by spatial distances, including transformations such as row or total standardization.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Applying spatial weights to the neighbourhood graphs is not clearly
explained in the package. This is how it’s done &lt;span class=&#34;citation&#34;&gt;(after &lt;a href=&#34;#ref-ThioulouseEtAl_2018&#34; role=&#34;doc-biblioref&#34;&gt;Thioulouse et al. 2018&lt;/a&gt;)&lt;/span&gt;&lt;/p&gt;
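&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(spdep)

## coords: matrix of point coordinates; MAXDIST: distance cutoff
## (both supplied by the user)

## Create a neighbourhood network based on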
## point-point distance less than MAXDIST 
nbd &amp;lt;- dnearneigh(coords, d1 = 0.001, d2 = MAXDIST)

## calculate inverse distance between observations
invdist &amp;lt;- lapply(nbdists(nbd, coords), function(x) 1/x)

## Apply inverse distances as weights to the
## neighbourhood network:
swm &amp;lt;- nb2listw(nbd, glist = invdist, style = &amp;quot;W&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Adespatial provides the function &lt;code&gt;listw.explore&lt;/code&gt; to interactively examine
the alternatives available. It didn’t work when I tried it though.
Creating a neighbourhood based on distance is the most intuitive
approach. Neighbourhoods can be edited directly, which might make
sense for distributions that span geographic boundaries (e.g., lakes or
mountains).&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;spatial-lag&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Spatial Lag&lt;/h2&gt;
&lt;p&gt;The spatial lag is a smoothed estimate of the value of a cell, calculated
as the weighted average of all the cell’s neighbours:&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;math display&#34;&gt;\[ \tilde{x}_i = \sum_jw_{ij}x_j \]&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;&lt;a href=&#34;#ref-Lee_2001&#34; role=&#34;doc-biblioref&#34;&gt;Lee&lt;/a&gt; (&lt;a href=&#34;#ref-Lee_2001&#34; role=&#34;doc-biblioref&#34;&gt;2001&lt;/a&gt;)&lt;/span&gt; shows that Moran’s &lt;code&gt;I&lt;/code&gt; is very similar to Pearson’s correlation
between the value of a cell and the spatial lag for that cell, i.e.,&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;math display&#34;&gt;\[ I(x) = \frac{\sum_i(x_i - \bar{x})(\tilde{x}_i - \bar{x})}
{\sqrt{\sum_i(x_i - \bar{x})^2} \sqrt{\sum_i(x_i - \bar{x})^2}} \]&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;and&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;math display&#34;&gt;\[ r_{X, \tilde{X}} = \frac{\sum_i(x_i - \bar{x})
    (\tilde{x}_i - \bar{\tilde{x}})}
{\sqrt{\sum_i(x_i - \bar{x})^2} \sqrt{\sum_i(\tilde{x}_i -
                                     \bar{\tilde{x}})^2}} \]&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;It follows that Moran’s &lt;code&gt;I&lt;/code&gt; is the correlation between a variable and its
spatial lag, scaled by the root of the ratio between the variance of the
spatial lag and the variance of the original variable. (really it does, see
&lt;span class=&#34;citation&#34;&gt;&lt;a href=&#34;#ref-Lee_2001&#34; role=&#34;doc-biblioref&#34;&gt;Lee&lt;/a&gt; (&lt;a href=&#34;#ref-Lee_2001&#34; role=&#34;doc-biblioref&#34;&gt;2001&lt;/a&gt;)&lt;/span&gt; for the math). This ratio is the “spatial smoothing scalar,” aka
&lt;strong&gt;SSS&lt;/strong&gt;, or at least it is when the spatial weighting matrix is
row-standardized.&lt;/p&gt;
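&lt;p&gt;A sketch of why this holds, using only the two expressions above: the centred values &lt;span class=&#34;math inline&#34;&gt;\(x_i - \bar{x}\)&lt;/span&gt; sum to zero, so the cross-product in the numerator is unchanged when &lt;span class=&#34;math inline&#34;&gt;\(\bar{x}\)&lt;/span&gt; is replaced by &lt;span class=&#34;math inline&#34;&gt;\(\bar{\tilde{x}}\)&lt;/span&gt;, and the two numerators are equal. Dividing one expression by the other then leaves only the ratio of the denominators:&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;math display&#34;&gt;\[ \frac{I(x)}{r_{X, \tilde{X}}} =
\frac{\sqrt{\sum_i(\tilde{x}_i - \bar{\tilde{x}})^2}}
{\sqrt{\sum_i(x_i - \bar{x})^2}} \]&lt;/span&gt;&lt;/p&gt;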
&lt;p&gt;SSS thus measures the degree of smoothing of a variable when it is
represented by its spatial lag. Spatial clustering produces larger SSS
values, because increased clustering reduces the difference between a
variable and its spatial lag (i.e., increases correlation). The SSS value
can be interpreted as the reduction in variance in the spatial lag relative
to the original variable.&lt;/p&gt;
&lt;p&gt;Together, this allows us to decompose Moran’s &lt;code&gt;I&lt;/code&gt; into a correlation value,
and a measure of variance reduction (SSS).&lt;/p&gt;
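&lt;p&gt;The decomposition can be checked numerically in base R. This is only a sketch on made-up data: the transect, the weights matrix &lt;code&gt;W&lt;/code&gt;, and the name &lt;code&gt;ratio&lt;/code&gt; (standing in for the variance ratio described above) are my own assumptions, and the exact definition of SSS in the source paper may differ in detail.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;## Hypothetical data: six sites along a transect
x &amp;lt;- c(2, 4, 3, 7, 8, 6)
n &amp;lt;- length(x)

## Binary adjacency, then row-standardize so each row sums to 1
C &amp;lt;- matrix(0, n, n)
C[abs(row(C) - col(C)) == 1] &amp;lt;- 1
W &amp;lt;- C / rowSums(C)

## Spatial lag: weighted average of each site&amp;#39;s neighbours
x_lag &amp;lt;- as.vector(W %*% x)

## Moran&amp;#39;s I with a row-standardized matrix
z &amp;lt;- x - mean(x)
I &amp;lt;- sum(W * outer(z, z)) / sum(z^2)

## Correlation between x and its lag, and the variance ratio
r &amp;lt;- cor(x, x_lag)
ratio &amp;lt;- var(x_lag) / var(x)

## I should equal r scaled by the root of the variance ratio
all.equal(I, r * sqrt(ratio))&lt;/code&gt;&lt;/pre&gt;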
&lt;/div&gt;
&lt;div id=&#34;gearys-c&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Geary’s &lt;code&gt;c&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;All the papers and books about this stuff mention Geary’s &lt;code&gt;c&lt;/code&gt;. It doesn’t
appear to be used in any of the analyses, but since everyone else is
talking about it, I will too.&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;math display&#34;&gt;\[c(x) = \frac{(n - 1)\sum_{(2)}c_{ij}(x_i - x_j)^2}
{2\sum_{(2)}c_{ij}\sum_{i=1}^n(x_i - \bar{x})^2}\]&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;From &lt;a href=&#34;https://en.wikipedia.org/wiki/Geary%27s_C&#34;&gt;Wikipedia&lt;/a&gt;, Geary’s &lt;code&gt;c&lt;/code&gt; is
sensitive to local spatial autocorrelation, capturing relationships among
adjacent observations. This contrasts with Moran’s &lt;code&gt;I&lt;/code&gt;, which is a measure
of global spatial autocorrelation. The two measures are related, but not
exactly inverse of each other.&lt;/p&gt;
&lt;p&gt;Geary’s &lt;code&gt;c&lt;/code&gt; ranges from 0 (positive spatial autocorrelation) to 1 (no
spatial autocorrelation) to values larger than 1 (negative
autocorrelation).&lt;/p&gt;
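&lt;p&gt;For comparison, a minimal base R sketch of the formula above, applied to a made-up five-site transect with binary adjacency (the data and object names are illustrative only):&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;## Hypothetical data: five sites along a transect with a smooth gradient
x &amp;lt;- c(1, 2, 3, 4, 5)
n &amp;lt;- length(x)

## Binary connectivity between adjacent sites
C &amp;lt;- matrix(0, n, n)
C[abs(row(C) - col(C)) == 1] &amp;lt;- 1

## Squared differences between every pair of sites
d2 &amp;lt;- outer(x, x, FUN = function(a, b) (a - b)^2)

## Geary&amp;#39;s c: (n - 1) * sum(c_ij * (x_i - x_j)^2) /
##            (2 * sum(c_ij) * sum((x_i - mean(x))^2))
geary_c &amp;lt;- ((n - 1) * sum(C * d2)) / (2 * sum(C) * sum((x - mean(x))^2))
geary_c&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The result here is 0.2, well below 1, which is consistent with positive spatial autocorrelation along the gradient.&lt;/p&gt;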
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;multispati-analysis&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;MULTISPATI Analysis&lt;/h1&gt;
&lt;p&gt;MULTISPATI analysis is co-inertia analysis of a multivariate data matrix
&lt;span class=&#34;math inline&#34;&gt;\(X\)&lt;/span&gt;, and the corresponding lag matrix &lt;span class=&#34;math inline&#34;&gt;\(\tilde{X} = WX\)&lt;/span&gt;. Which means we need
to understand what co-inertia analysis does.&lt;/p&gt;
&lt;div id=&#34;co-inertia-analysis&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Co-inertia Analysis&lt;/h2&gt;
&lt;p&gt;Co-inertia analysis is a form of symmetrical canonical ordination. It is
closely related to Procrustes Analysis, and has some parallels to Canonical
Correlation Analysis &lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-LegendreLegendre_2012&#34; role=&#34;doc-biblioref&#34;&gt;Legendre and Legendre 2012&lt;/a&gt;)&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;&lt;a href=&#34;#ref-LegendreLegendre_2012&#34; role=&#34;doc-biblioref&#34;&gt;Legendre and Legendre&lt;/a&gt; (&lt;a href=&#34;#ref-LegendreLegendre_2012&#34; role=&#34;doc-biblioref&#34;&gt;2012&lt;/a&gt;)&lt;/span&gt;: “variables of both data sets are projected onto the
axes obtained by eigen-analysis of the cross-set covariance matrix.” The
total co-inertia is the sum of the squared cross-set covariances. The
original objects are projected into the co-inertial space. Since each
sample is represented in both data sets, we can then compare their relative
location in the co-inertial space. Paired observations that are close to
each other in this projection indicate that they have similar relative
positions in both data sets &lt;span class=&#34;citation&#34;&gt;(&lt;span&gt;“good agreement between”&lt;/span&gt; data sets, &lt;a href=&#34;#ref-ThioulouseEtAl_2018&#34; role=&#34;doc-biblioref&#34;&gt;Thioulouse et al. 2018&lt;/a&gt;)&lt;/span&gt;. Similarly, points that are close together in one
data set, but have diverging arrows, indicate similarities in the first
data set are not reflected in similarities in the second.&lt;/p&gt;
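&lt;p&gt;To make the eigen-analysis step a little more concrete, here is a rough base R sketch on random data. It assumes uniform row weights and centred, unscaled variables, which is a simplification of my own and not necessarily what &lt;code&gt;ade4&lt;/code&gt; does internally; all object names are hypothetical.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;set.seed(1)

## Two made-up data sets measured on the same 10 samples, column-centred
X &amp;lt;- scale(matrix(rnorm(40), nrow = 10, ncol = 4), scale = FALSE)
Y &amp;lt;- scale(matrix(rnorm(30), nrow = 10, ncol = 3), scale = FALSE)

## Cross-set covariance matrix (uniform row weights of 1/n)
cross_cov &amp;lt;- crossprod(X, Y) / nrow(X)

## Singular value decomposition gives the co-inertia axes
sv &amp;lt;- svd(cross_cov)

## Total co-inertia: sum of squared cross-set covariances,
## which equals the sum of the squared singular values
total_coinertia &amp;lt;- sum(cross_cov^2)
all.equal(total_coinertia, sum(sv$d^2))

## Project the samples from each data set onto the shared axes
scores_X &amp;lt;- X %*% sv$u
scores_Y &amp;lt;- Y %*% sv$v&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Plotting &lt;code&gt;scores_X&lt;/code&gt; and &lt;code&gt;scores_Y&lt;/code&gt; together, with an arrow joining the two positions of each sample, gives the kind of paired display described above.&lt;/p&gt;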
&lt;p&gt;Full details in &lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-DrayEtAl_2003&#34; role=&#34;doc-biblioref&#34;&gt;Dray, Chessel, and Thioulouse 2003&lt;/a&gt;; &lt;a href=&#34;#ref-DrayEtAl_2008&#34; role=&#34;doc-biblioref&#34;&gt;Dray, Saïd, and Débias 2008&lt;/a&gt;; &lt;a href=&#34;#ref-ThioulouseEtAl_2018&#34; role=&#34;doc-biblioref&#34;&gt;Thioulouse et al. 2018&lt;/a&gt;)&lt;/span&gt;.
Very heavy on matrix algebra, hard to develop an intuitive understanding.
For the moment, I shall hum along and pretend it is just CCorA.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;spatial-principal-components-analysis&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Spatial Principal Components Analysis&lt;/h2&gt;
&lt;p&gt;sPCA is the application of co-inertia analysis to a data matrix and its
spatial lag. This is a specific case of MULTISPATI, developed particularly
for application in population genetics. &lt;code&gt;adegenet::spca()&lt;/code&gt; is a high-level
wrapper that handles the details, but also obscures some of the steps.&lt;/p&gt;
&lt;p&gt;In the case of a &lt;code&gt;genpop&lt;/code&gt; object, it does the following:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;## Extract the allele frequencies from the genpop object:
X &amp;lt;- tab(obj, freq = TRUE, NA.method = &amp;quot;mean&amp;quot;)

## coordinates retrieved from the genpop or nb object, or
## supplied by the user:
xy

## regular pca of allele frequencies, default center = TRUE,
## scale = FALSE:
x_pca &amp;lt;- ade4::dudi.pca(X, center = center, scale = scale,
                       scannf = FALSE) 

## Connection network from user or generated interactively:
resCN

## defaults
## scannf = TRUE, nfposi = 1, nfnega = 1

out &amp;lt;- ade4::multispati(dudi = x_pca, listw = resCN,
                       scannf = scannf, nfposi = nfposi,
                       nfnega = nfnega)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This returns an object of class &lt;code&gt;spca&lt;/code&gt;, which contains:&lt;/p&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;code&gt;eig&lt;/code&gt;:&lt;/dt&gt;
&lt;dd&gt;a numeric vector of eigenvalues.
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;nfposi&lt;/code&gt;:&lt;/dt&gt;
&lt;dd&gt;an integer giving the number of global structures retained.
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;nfnega&lt;/code&gt;:&lt;/dt&gt;
&lt;dd&gt;an integer giving the number of local structures retained.
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;c1&lt;/code&gt;:&lt;/dt&gt;
&lt;dd&gt;a data.frame of allele loadings for each axis.
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;li&lt;/code&gt;:&lt;/dt&gt;
&lt;dd&gt;a data.frame of row (individuals or populations) coordinates onto the
sPCA axes.
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;ls&lt;/code&gt;:&lt;/dt&gt;
&lt;dd&gt;a data.frame of lag vectors of the row coordinates; useful to clarify
maps of global scores.
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;as&lt;/code&gt;:&lt;/dt&gt;
&lt;dd&gt;a data.frame giving the coordinates of the PCA axes onto the sPCA axes.
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;call&lt;/code&gt;:&lt;/dt&gt;
&lt;dd&gt;the matched call.
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;xy&lt;/code&gt;:&lt;/dt&gt;
&lt;dd&gt;a matrix of spatial coordinates.
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;lw&lt;/code&gt;:&lt;/dt&gt;
&lt;dd&gt;a list of spatial weights of class &lt;code&gt;listw&lt;/code&gt;.
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;tab&lt;/code&gt;:&lt;/dt&gt;
&lt;dd&gt;the original data supplied to the PCA (possibly scaled/centered)
&lt;/dd&gt;
&lt;/dl&gt;
&lt;p&gt;Note that the PCA here is based on allele frequencies, which define a
Euclidean distance among populations &lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-JombartEtAl_2008&#34; role=&#34;doc-biblioref&#34;&gt;T. Jombart et al. 2008&lt;/a&gt;)&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;Example:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(adegenet)
library(utils)
library(graphics)
data(spcaIllus)
spca2A &amp;lt;- spca(spcaIllus$dat2A,
              xy = spcaIllus$dat2A$other$xy, ask=FALSE,
              type = 1, plot = FALSE, scannf = FALSE,
              nfposi = 2, nfnega = 0) &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;spca&lt;/code&gt; always warns that it is deprecated. The &lt;a href=&#34;https://github.com/thibautjombart/adegenet/issues/266&#34;&gt;package author says that’s
ok&lt;/a&gt;, and it should
be, since behind the scenes it just calls the function the warning tells
you to use anyways.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;li&lt;/code&gt; contains the observations, projected onto the coinertia space. &lt;code&gt;ls&lt;/code&gt;
provides the lagged values in the same space. We can compare them using
&lt;code&gt;arrows&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;plot(spca2A$li, pch = 16, asp = 1)
arrows(x0 = spca2A$li[, 1], x1 = spca2A$ls[, 1],
       y0 = spca2A$li[, 2], y1 = spca2A$ls[, 2],
       length = 0.1)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://plantarum.ca/notebooks/spatial_files/figure-html/coinertia%20plot-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Here, pairs with short arrows indicate sites where the composition of the
site is well-described by the composition of its neighbours; they
demonstrate high spatial autocorrelation. In contrast, long arrows denote
sites where the actual composition of the site differs substantially from
what you would expect based on its nearest neighbours.&lt;/p&gt;
&lt;p&gt;These plots don’t appear to be used for sPCA, as they are for other
co-inertia analyses. Instead, the values from the first axis are plotted
geographically, to illustrate the spatial component of the data:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;s.value(spca2A$xy, spca2A$li[,1], include.origin = FALSE,
        addaxes = FALSE, clegend = 0, csize = 0.6) &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://plantarum.ca/notebooks/spatial_files/figure-html/spca%20map-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;This shows us the spatial pattern present in the subset of the original
data that is spatially structured (on the first axis). We could plot the
lagged data the same way, but I’m not sure what that would show us.&lt;/p&gt;
&lt;p&gt;We can also contrast this with a spatial plotting of the standard PCA
analysis:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;allFreq &amp;lt;- tab(spcaIllus$dat2A, freq = TRUE)

pca &amp;lt;- ade4::dudi.pca(allFreq, center = TRUE, scale = FALSE,
                     scannf = FALSE) 

s.value(spca2A$xy, pca$li[,1], include.origin = FALSE,
        addaxes = FALSE, clegend = 0, csize = 0.6) &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://plantarum.ca/notebooks/spatial_files/figure-html/pca%20map-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;The main difference is that the sPCA more clearly shows the spatial
pattern; that’s what it does. It isolates the spatial pattern in the data,
and discards non-spatial structure. There should be a way to quantify how
much of the original variation is spatially structured?&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;eigenvalues&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Eigenvalues&lt;/h2&gt;
&lt;p&gt;From &lt;span class=&#34;citation&#34;&gt;&lt;a href=&#34;#ref-Jombart_2008&#34; role=&#34;doc-biblioref&#34;&gt;Thibaut Jombart&lt;/a&gt; (&lt;a href=&#34;#ref-Jombart_2008&#34; role=&#34;doc-biblioref&#34;&gt;2008&lt;/a&gt;)&lt;/span&gt;: Regular PCA decomposes total variance into decreasing,
orthogonal components. sPCA (and MULTISPATI generally, I think) does
something different. The eigenvalues are the product of variance and
autocorrelation. Axes with high variance and positive autocorrelation
indicate strong global structures. That is, they capture a relatively high
proportion of the variation in the data, and it has a relatively high
proportion of spatial structure (gradient, clustering etc). Axes with low
variance represent only a small amount of the total variation, and so
aren’t very interesting.&lt;/p&gt;
&lt;p&gt;Axes with high variance but high negative autocorrelation indicate strong
local structures in the data. That is, spatial repulsion, where
neighbouring locations are more likely to have different compositions
(alleles, species composition etc).&lt;/p&gt;
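&lt;p&gt;A toy illustration of how the sign of the variance × autocorrelation
product separates global from local structures, using base R only. The
neighbour structure (ten sites on a line) and the helper names
&lt;code&gt;moran_I&lt;/code&gt;, &lt;code&gt;score_var&lt;/code&gt; and &lt;code&gt;eig_like&lt;/code&gt; are assumptions made for this sketch,
not functions from &lt;code&gt;adegenet&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;## Hypothetical neighbour structure: 10 sites on a line, each
## linked to its adjacent sites; weights are row-standardised
n &amp;lt;- 10
A &amp;lt;- matrix(0, n, n)
A[abs(row(A) - col(A)) == 1] &amp;lt;- 1
W &amp;lt;- A / rowSums(A)

## Moran I of a score vector under row-standardised weights:
moran_I &amp;lt;- function(v, W) {
  v &amp;lt;- v - mean(v)
  sum(v * (W %*% v)) / sum(v^2)
}

## variance of the scores (1/n convention):
score_var &amp;lt;- function(v) {
  v &amp;lt;- v - mean(v)
  sum(v^2) / length(v)
}

## the quantity that plays the role of an sPCA eigenvalue:
eig_like &amp;lt;- function(v, W) score_var(v) * moran_I(v, W)

gradient &amp;lt;- seq_len(n)                       # smooth trend
alternating &amp;lt;- rep(c(-1, 1), length.out = n) # neighbours differ

eig_like(gradient, W)     # positive: global structure
eig_like(alternating, W)  # negative: local structure&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The gradient scores resemble their neighbours, so the product is positive;
the alternating scores are the opposite of their neighbours, so it is
negative.&lt;/p&gt;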
&lt;p&gt;Deciding which axes are interesting seems to be a bit of an art. In
general, we look for a large drop between the first or second (or more)
axes and the rest, to indicate which ones are worth examining. And
similarly at the negative end. The &lt;code&gt;adegenet&lt;/code&gt; package &lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-Jombart_2008&#34; role=&#34;doc-biblioref&#34;&gt;Thibaut Jombart 2008&lt;/a&gt;)&lt;/span&gt;
provides the function &lt;code&gt;screeplot&lt;/code&gt; for jointly visualizing the variance and
spatial component. Axes that are separated from the main cloud of points
are considered interesting/interpretable:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(stats)
screeplot(spca2A)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://plantarum.ca/notebooks/spatial_files/figure-html/screeplot-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Here, axis 1 is clearly separated, and should be interpreted. Axis 79 is
less obvious. And the idea of spatial repulsion is going to be difficult to
interpret biologically in most cases I think.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;packaging-oddities-significance-testing&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Packaging Oddities &amp;amp; Significance Testing&lt;/h2&gt;
&lt;p&gt;Note that despite both packages bearing the prefix &lt;code&gt;ade&lt;/code&gt;, &lt;code&gt;adespatial&lt;/code&gt; and
&lt;code&gt;ade4&lt;/code&gt; do &lt;em&gt;NOT&lt;/em&gt; use the same implementation of co-inertia analysis. &lt;code&gt;ade4&lt;/code&gt;
provides the function &lt;code&gt;coinertia()&lt;/code&gt; to compute co-inertia on any two data
sets. &lt;code&gt;adespatial&lt;/code&gt; implements its own co-inertia code within the function
&lt;code&gt;multispati&lt;/code&gt;. The package &lt;code&gt;adegenet&lt;/code&gt; provides the wrapper &lt;code&gt;spca&lt;/code&gt;, which
uses &lt;code&gt;multispati&lt;/code&gt; to complete the co-inertia analysis it needs. Using the
function &lt;code&gt;spca&lt;/code&gt; generates a warning, indicating that this function is
deprecated and the user ought to use &lt;code&gt;multispati&lt;/code&gt; instead. However, &lt;code&gt;spca&lt;/code&gt;
actually calls &lt;code&gt;multispati&lt;/code&gt; itself, so it’s really just a
convenience wrapper.&lt;/p&gt;
&lt;p&gt;The consequence is that analyses completed via &lt;code&gt;spca&lt;/code&gt; and &lt;code&gt;multispati&lt;/code&gt;
do not return (or contain) an object of class &lt;code&gt;coinertia&lt;/code&gt;, and functions in
&lt;code&gt;ade4&lt;/code&gt; that process such objects won’t work on their output. This includes
the function &lt;code&gt;ade4::randtest&lt;/code&gt;, which might otherwise provide a significance
test for the existence of spatial structure in the data.&lt;/p&gt;
&lt;p&gt;But wait! There is a function &lt;code&gt;global.rtest&lt;/code&gt; that does provide a
randomization test for sPCA objects, and packages the results in a
&lt;code&gt;randtest&lt;/code&gt; object. It is based on Moran’s Eigenvector Maps. MEMs are part
of an alternative approach to spatial ordination from the Legendre group.
So, I’m not sure if this is actually testing the significance of the sPCA
(i.e., coinertia between the data and its spatial lag matrix), or rather
the equivalent in the MEM framework?&lt;/p&gt;
&lt;p&gt;Both &lt;code&gt;adespatial&lt;/code&gt; and &lt;code&gt;adegenet&lt;/code&gt; have their own version of &lt;code&gt;global.rtest&lt;/code&gt;,
of course. They are nearly identical, but use different code to generate
the MEM values. The results are close, but not identical. Can’t say if this
is an artifact of them being randomization tests, or if there’s some
substantive difference in the implementation.&lt;/p&gt;
&lt;p&gt;The co-existence of multiple loosely linked packages with
overlapping/duplicated functions and documentation scattered in books,
online tutorials, and, only secondarily, in the package documentation
itself, makes figuring this out more work than it needs to be.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;references&#34; class=&#34;section level1 unnumbered&#34;&gt;
&lt;h1&gt;References&lt;/h1&gt;
&lt;div id=&#34;refs&#34; class=&#34;references csl-bib-body hanging-indent&#34;&gt;
&lt;div id=&#34;ref-DrayEtAl_2003&#34; class=&#34;csl-entry&#34;&gt;
Dray, Stéphane, Daniel Chessel, and Jean Thioulouse. 2003. &lt;span&gt;“Co-Inertia Analysis and the Linking of Ecological Data Tables.”&lt;/span&gt; &lt;em&gt;Ecology&lt;/em&gt; 84 (11): 3078–89. &lt;a href=&#34;https://doi.org/10.1890/03-0178&#34;&gt;https://doi.org/10.1890/03-0178&lt;/a&gt;.
&lt;/div&gt;
&lt;div id=&#34;ref-DrayEtAl_2008&#34; class=&#34;csl-entry&#34;&gt;
Dray, Stéphane, Sonia Saïd, and François Débias. 2008. &lt;span&gt;“Spatial Ordination of Vegetation Data Using a Generalization of Wartenberg’s Multivariate Spatial Correlation.”&lt;/span&gt; &lt;em&gt;Journal of Vegetation Science&lt;/em&gt; 19 (1): 45–56. &lt;a href=&#34;https://doi.org/10.3170/2007-8-18312&#34;&gt;https://doi.org/10.3170/2007-8-18312&lt;/a&gt;.
&lt;/div&gt;
&lt;div id=&#34;ref-JombartEtAl_2008&#34; class=&#34;csl-entry&#34;&gt;
Jombart, T., S. Devillard, A.-B. Dufour, and D. Pontier. 2008. &lt;span&gt;“Revealing Cryptic Spatial Patterns in Genetic Variability by a New Multivariate Method.”&lt;/span&gt; &lt;em&gt;Heredity&lt;/em&gt; 101 (1): 92–103. &lt;a href=&#34;https://doi.org/10.1038/hdy.2008.34&#34;&gt;https://doi.org/10.1038/hdy.2008.34&lt;/a&gt;.
&lt;/div&gt;
&lt;div id=&#34;ref-Jombart_2008&#34; class=&#34;csl-entry&#34;&gt;
Jombart, Thibaut. 2008. &lt;span&gt;“Adegenet: A R Package for the Multivariate Analysis of Genetic Markers.”&lt;/span&gt; &lt;em&gt;Bioinformatics&lt;/em&gt; 24 (11): 1403–5. &lt;a href=&#34;https://doi.org/10.1093/bioinformatics/btn129&#34;&gt;https://doi.org/10.1093/bioinformatics/btn129&lt;/a&gt;.
&lt;/div&gt;
&lt;div id=&#34;ref-Lee_2001&#34; class=&#34;csl-entry&#34;&gt;
Lee, Sang-Il. 2001. &lt;span&gt;“Developing a Bivariate Spatial Association Measure: An Integration of Pearson’s r and Moran’s I.”&lt;/span&gt; &lt;em&gt;Journal of Geographical Systems&lt;/em&gt; 3 (4): 369–85. &lt;a href=&#34;https://doi.org/10.1007/s101090100064&#34;&gt;https://doi.org/10.1007/s101090100064&lt;/a&gt;.
&lt;/div&gt;
&lt;div id=&#34;ref-LegendreLegendre_2012&#34; class=&#34;csl-entry&#34;&gt;
Legendre, Pierre, and Louis Legendre. 2012. &lt;em&gt;Numerical Ecology&lt;/em&gt;. 3rd Edition. New York: Elsevier.
&lt;/div&gt;
&lt;div id=&#34;ref-ThioulouseEtAl_2018&#34; class=&#34;csl-entry&#34;&gt;
Thioulouse, Jean, Stéphane Dray, Anne-Béatrice Dufour, Aurélie Siberchicot, Thibaut Jombart, and Sandrine Pavoine. 2018. &lt;em&gt;Multivariate Analysis of Ecological Data with Ade4&lt;/em&gt;. New York, NY: Springer New York. &lt;a href=&#34;https://doi.org/10.1007/978-1-4939-8850-1&#34;&gt;https://doi.org/10.1007/978-1-4939-8850-1&lt;/a&gt;.
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Maxent Best Practices</title>
      <link>https://plantarum.ca/notebooks/maxent/</link>
      <pubDate>Mon, 15 Jun 2020 00:00:00 +0000</pubDate>
      
      <guid>https://plantarum.ca/notebooks/maxent/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#key-resources&#34; id=&#34;toc-key-resources&#34;&gt;Key Resources&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#technical-summaries&#34; id=&#34;toc-technical-summaries&#34;&gt;Technical Summaries&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#sampling-bias&#34; id=&#34;toc-sampling-bias&#34;&gt;Sampling Bias&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#spatial-bias&#34; id=&#34;toc-spatial-bias&#34;&gt;Spatial Bias&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#study-extent&#34; id=&#34;toc-study-extent&#34;&gt;Study Extent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#background-points-number-and-bias-grids&#34; id=&#34;toc-background-points-number-and-bias-grids&#34;&gt;Background Points: Number and Bias Grids&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#variable-selection&#34; id=&#34;toc-variable-selection&#34;&gt;Variable Selection&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#worldclim&#34; id=&#34;toc-worldclim&#34;&gt;Worldclim&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#transferability&#34; id=&#34;toc-transferability&#34;&gt;Transferability&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#beware-temperatureprecipitation-ratios&#34; id=&#34;toc-beware-temperatureprecipitation-ratios&#34;&gt;Beware temperature/precipitation ratios!&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#filtering-collinear-variables&#34; id=&#34;toc-filtering-collinear-variables&#34;&gt;Filtering Collinear Variables&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#feature-selection&#34; id=&#34;toc-feature-selection&#34;&gt;Feature Selection&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#regularization&#34; id=&#34;toc-regularization&#34;&gt;Regularization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#output-type&#34; id=&#34;toc-output-type&#34;&gt;Output type&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#evaluation&#34; id=&#34;toc-evaluation&#34;&gt;Evaluation&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#auc&#34; id=&#34;toc-auc&#34;&gt;AUC&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#boyce&#34; id=&#34;toc-boyce&#34;&gt;Boyce&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#thresholds&#34; id=&#34;toc-thresholds&#34;&gt;Thresholds&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#cross-validation&#34; id=&#34;toc-cross-validation&#34;&gt;Cross-validation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#spatial-null-models&#34; id=&#34;toc-spatial-null-models&#34;&gt;Spatial Null Models&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#projection&#34; id=&#34;toc-projection&#34;&gt;Projection&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#clamping&#34; id=&#34;toc-clamping&#34;&gt;Clamping&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#mess&#34; id=&#34;toc-mess&#34;&gt;MESS&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#extrapolation-detection&#34; id=&#34;toc-extrapolation-detection&#34;&gt;Extrapolation Detection&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#coue&#34; id=&#34;toc-coue&#34;&gt;COUE&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#climate-models&#34; id=&#34;toc-climate-models&#34;&gt;Climate Models&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#recommendations&#34; id=&#34;toc-recommendations&#34;&gt;Recommendations&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#worldclim-1&#34; id=&#34;toc-worldclim-1&#34;&gt;WorldClim&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#definitions&#34; id=&#34;toc-definitions&#34;&gt;Definitions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#downscaling&#34; id=&#34;toc-downscaling&#34;&gt;Downscaling&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#ensembles&#34; id=&#34;toc-ensembles&#34;&gt;Ensembles&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#to-read&#34; id=&#34;toc-to-read&#34;&gt;TO-READ&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#references&#34; id=&#34;toc-references&#34;&gt;References&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;key-resources&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Key Resources&lt;/h1&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Merow et al. (&lt;a href=&#34;#ref-MerowEtAl_2013&#34;&gt;2013&lt;/a&gt;)&lt;/span&gt; provides a very thorough introduction to Maxent modeling,
and especially to what the various settings mean and how to set them.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://rspatial.org/&#34;&gt;RSpatial&lt;/a&gt; is a (nearly complete) set of lessons
covering spatial data analysis in R, and including good tutorials for
Maxent and other SDM approaches.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://earthskysea.org/best-practices-in-species-distribution-modeling/&#34;&gt;Best Practices in Species Distribution Modeling&lt;/a&gt;
is another set of online notes, including R code.&lt;/p&gt;
&lt;p&gt;From the same author (Adam B. Smith), a &lt;a href=&#34;https://docs.google.com/spreadsheets/d/1qsK1MNLsi-QjK9VptgeV2HfrB7mWUVxjZdaj_P9Qs-Y/edit#gid=975168115&#34;&gt;database of biodiversity databases&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Another workshop, from Lee-Yaw, is available on
&lt;a href=&#34;https://github.com/jullee/UBC-ENM-Workshop-Fall-2016&#34;&gt;github&lt;/a&gt;. It’s a
little old now, and doesn’t appear to be as comprehensive as Smith’s
workshop above (maybe?). However, some people may prefer the more condensed
format. Both authors have demonstrated expertise in the subject, so I don’t
doubt that these are both reliable sources.&lt;/p&gt;
&lt;div id=&#34;technical-summaries&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Technical Summaries&lt;/h2&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Phillips et al. (&lt;a href=&#34;#ref-PhillipsEtAl_2017&#34;&gt;2017&lt;/a&gt;)&lt;/span&gt; provides a short discussion of recent developments of
the Maxent application, and related statistical methods. See papers cited
there for (even more) detailed information.&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Elith et al. (&lt;a href=&#34;#ref-ElithEtAl_2011&#34;&gt;2011&lt;/a&gt;)&lt;/span&gt; provides a statistical interpretation of the Maxent
process; possibly superseded by &lt;span class=&#34;citation&#34;&gt;Phillips et al. (&lt;a href=&#34;#ref-PhillipsEtAl_2017&#34;&gt;2017&lt;/a&gt;)&lt;/span&gt; and the work cited
therein, which was done post-2010.&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Valavi et al. (&lt;a href=&#34;#ref-ValaviEtAl_2022&#34;&gt;2022&lt;/a&gt;)&lt;/span&gt; reproduce and extend the seminal paper of Elith et al.
&lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-ElithEtAl_2006&#34;&gt;2006&lt;/a&gt;)&lt;/span&gt;, showing that 16 years later Maxent remains one of the
top performing SDM approaches (although some newer entrants may be slightly
better now).&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;sampling-bias&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Sampling Bias&lt;/h1&gt;
&lt;p&gt;Sampling bias in occurrence data is an issue because it means we can’t be
sure whether a species is detected under certain conditions because that’s its
preferred habitat, or because those are the conditions in the locations we
prefer to search.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The uniform sampling assumption does not require a uniformly random
sample from geographic space, but instead that environmental conditions
are sampled in proportion to their availability, regardless of their
spatial pattern
&lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-MerowEtAl_2013&#34;&gt;Merow et al., 2013&lt;/a&gt;)&lt;/span&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This problem can be addressed by thinning records &lt;span class=&#34;citation&#34;&gt;(also called spatial
filtering, &lt;a href=&#34;#ref-RadosavljevicAnderson_2014&#34;&gt;Radosavljevic and Anderson, 2014&lt;/a&gt;)&lt;/span&gt;, such that multiple records from
within the same area are represented by only one or a few of the total
records. This is a bit crude, but should remove the worst biases, such as a
particular field station getting preferentially sampled by recurring visits
from scientists or students, or general biases towards sampling roadsides
and popular trails.&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Lee‐Yaw et al. (&lt;a href=&#34;#ref-Lee-YawEtAl_2018&#34;&gt;2018&lt;/a&gt;)&lt;/span&gt; developed their own method to thin species records, using
kernel smoothing estimates to reduce the number of samples from a
neighbourhood, and selecting which samples to keep by identifying which
occupy novel environments. &lt;del&gt;I don’t think this is widespread, and feels a
bit like overkill.&lt;/del&gt; 2025-05-05: This still isn’t widespread, but there is
evidence to support the importance of environmental filtering in model
accuracy &lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-VarelaEtAl_2014&#34;&gt;Varela et al., 2014&lt;/a&gt;)&lt;/span&gt;. I need to investigate this further!&lt;/p&gt;
&lt;p&gt;Another interesting approach is that of &lt;span class=&#34;citation&#34;&gt;Atwater et al. (&lt;a href=&#34;#ref-AtwaterEtAl_2018&#34;&gt;2018&lt;/a&gt;)&lt;/span&gt;, who used a large dataset
of GBIF records to establish geographic (and presumably also environmental)
bias in the full set of records, and used that to correct bias for
individual species. Simply put, occurrence records for each species are
weighted by the proportion of records in the entire set that are found in that
location (either geographic or a cell in a climate grid).&lt;/p&gt;
&lt;p&gt;Subsampling based on raster grids is a simpler, more intuitive approach
provided by &lt;span class=&#34;citation&#34;&gt;Hijmans et al. (&lt;a href=&#34;#ref-HijmansEtAl_2017&#34;&gt;2017&lt;/a&gt;)&lt;/span&gt;. It doesn’t account for the possibility
that local density may be an accurate reflection of the niche requirements
of a species, as the approach of &lt;span class=&#34;citation&#34;&gt;Lee‐Yaw et al. (&lt;a href=&#34;#ref-Lee-YawEtAl_2018&#34;&gt;2018&lt;/a&gt;)&lt;/span&gt; does.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;NB: See my &lt;a href=&#34;https://plantarum.ca/2021/10/26/r-gridsample&#34;&gt;extended discussion of thinning records on a
grid&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Grid-sampling:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(terra)
## `occs` is a SpatVector object containing the occurrence
## records as points

## `wclim` is a SpatRaster object containing the environmental
## variables

## select one occurrence record per cell in the
## environmental raster layers:
lsOccs &amp;lt;- spatSample(occs, size = 1, strata = wclim)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Aiello‐Lammens et al. (&lt;a href=&#34;#ref-Aiello-LammensEtAl_2015&#34;&gt;2015&lt;/a&gt;)&lt;/span&gt; provides an alternative approach based on
imposing a minimum permissible nearest-neighbour distance, and then finding
the set that retains the most samples through repeated random samples.&lt;/p&gt;
&lt;p&gt;Thinning by Nearest-Neighbour:&lt;/p&gt;
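&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(spThin) ## provides thin()
library(sp)     ## provides coordinates(); assumes `trich` is an sp points object
## thin.par sets minimum distance in km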
trichthin &amp;lt;- thin(data.frame(LAT = coordinates(trich)[, &amp;quot;Y&amp;quot;],
                             LONG = coordinates(trich)[, &amp;quot;X&amp;quot;],
                             SPEC = rep(&amp;quot;tplan&amp;quot;, nrow(trich))),
                  thin.par = 2, reps = 1, write.files = FALSE,
                  locs.thinned.list.return = TRUE) &lt;/code&gt;&lt;/pre&gt;
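&lt;p&gt;To make the idea concrete, here is a rough base-R sketch of distance-based
thinning. It is an illustration only, not the &lt;code&gt;spThin&lt;/code&gt; code: the function
name &lt;code&gt;thin_sketch&lt;/code&gt; is made up, coordinates are assumed to be planar (e.g.,
already projected, in km), and a single pass is run rather than repeated
random replicates:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;## xy: two-column matrix of planar coordinates
## min_dist: minimum permitted nearest-neighbour distance,
##           in the same units as xy
thin_sketch &amp;lt;- function(xy, min_dist) {
  keep &amp;lt;- seq_len(nrow(xy))
  repeat {
    d &amp;lt;- as.matrix(dist(xy[keep, , drop = FALSE]))
    ## number of other records closer than min_dist
    ## (subtract 1 to exclude each record itself):
    n_close &amp;lt;- rowSums(d &amp;lt; min_dist) - 1
    if (all(n_close == 0)) break
    ## drop one of the most crowded records, ties broken at random:
    worst &amp;lt;- which(n_close == max(n_close))
    keep &amp;lt;- keep[-worst[sample.int(length(worst), 1)]]
  }
  keep  # row indices of the retained records
}

set.seed(1)
xy &amp;lt;- cbind(runif(200, 0, 10), runif(200, 0, 10))
kept &amp;lt;- thin_sketch(xy, min_dist = 1)
length(kept)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Because ties are broken at random, different runs can retain different
numbers of records, which is why repeating the procedure and keeping the
largest retained set is useful.&lt;/p&gt;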
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Radosavljevic and Anderson (&lt;a href=&#34;#ref-RadosavljevicAnderson_2014&#34;&gt;2014&lt;/a&gt;)&lt;/span&gt; show that unfiltered/unthinned data produce
inflated assessments of model performance, as a consequence of over-fitting
to spatially auto-correlated data. So filtering works.&lt;/p&gt;
&lt;p&gt;See also Boria et al. 2014 (unread), &lt;span class=&#34;citation&#34;&gt;Varela et al. (&lt;a href=&#34;#ref-VarelaEtAl_2014&#34;&gt;2014&lt;/a&gt;)&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Merow et al. (&lt;a href=&#34;#ref-MerowEtAl_2013&#34;&gt;2013&lt;/a&gt;)&lt;/span&gt; provide two more rigorous approaches, depending on
whether or not data on search effort is available. When search effort is
known, it can be used to construct a biased prior.&lt;/p&gt;
&lt;p&gt;When search effort is unknown, we can create a biased background sample to
account for bias in presence data, via Target Group Sampling. Under TGS,
records that are collected using the same surveys/methods as the focal
species form the background points. For example, the set of all herbarium
records in GBIF may be an appropriately biased background for any one of
those plant species. This assumes that the target plant is
collected/detected at the same rate as the reference set. It may be
appropriate to subset the reference set to increase the likelihood of this
being true: use only graminoids as biased background for sedges, or woody
plants as background for a tree?&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;spatial-bias&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Spatial Bias&lt;/h1&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Elith et al. (&lt;a href=&#34;#ref-ElithEtAl_2011&#34;&gt;2011&lt;/a&gt;)&lt;/span&gt; – The areas of individual cells in raster layers projected
in Lat-Lon coordinates are not equal. This can be corrected by:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;project the grids to an equal area projection&lt;/li&gt;
&lt;li&gt;create a ‘bias grid’ that can be used to weight background samples&lt;/li&gt;
&lt;li&gt;create a background sample with appropriate sampling weights&lt;/li&gt;
&lt;/ul&gt;
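&lt;p&gt;The last option can be sketched in base R. On a sphere, the area of a cell
in a regular Lat-Lon grid is proportional to the cosine of the latitude at
its centre, so that cosine can serve as a sampling weight. The 1-degree
global grid and the object names here are assumptions for the example only:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;set.seed(1)

## cell centres of a hypothetical 1-degree global grid:
cells &amp;lt;- expand.grid(lon = seq(-179.5, 179.5, by = 1),
                     lat = seq(-89.5, 89.5, by = 1))

## relative cell area: cosine of the latitude (in radians)
w &amp;lt;- cos(cells$lat * pi / 180)

## draw background cells with probability proportional to area:
bg &amp;lt;- cells[sample(nrow(cells), size = 10000, prob = w), ]

## high-latitude cells are now sampled less often than they would
## be under uniform sampling of cells:
mean(abs(bg$lat))
mean(abs(cells$lat))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With real data the same weights would be computed from the latitudes of the
cells in the environmental layers, skipping any cells without data.&lt;/p&gt;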
&lt;/div&gt;
&lt;div id=&#34;study-extent&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Study Extent&lt;/h1&gt;
&lt;p&gt;Discussed extensively in &lt;span class=&#34;citation&#34;&gt;Barve et al. (&lt;a href=&#34;#ref-BarveEtAl_2011&#34;&gt;2011&lt;/a&gt;)&lt;/span&gt;. They identified three general
approaches to consider:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Biotic regions (ecozones etc). A good compromise between biological
realism and tractability&lt;/li&gt;
&lt;li&gt;Niche-model reconstructions: back-project a niche model over the
appropriate time period (i.e., previous glacial maximum or interglacial)
to identify the area that the species could have occupied over an
extended period. Nice idea, but a real risk of circularity?&lt;/li&gt;
&lt;li&gt;Detailed simulations. Sounds great, but I think if we had enough data to
properly parameterize such a model, we wouldn’t need to resort to SDMs
in the first place.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;If you wanted to improve on biotic regions, things to consider in
developing a more rigorous approach should include:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Dispersal characteristics of the species&lt;/li&gt;
&lt;li&gt;Crude estimate of the niche (again, circularity?)&lt;/li&gt;
&lt;li&gt;Establish relevant time span&lt;/li&gt;
&lt;li&gt;Identify relevant environmental changes&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Soberón (&lt;a href=&#34;#ref-Soberon_2010&#34;&gt;2010&lt;/a&gt;)&lt;/span&gt; is often cited together with &lt;span class=&#34;citation&#34;&gt;Barve et al. (&lt;a href=&#34;#ref-BarveEtAl_2011&#34;&gt;2011&lt;/a&gt;)&lt;/span&gt;, but the latter
provides more explicit discussion of best practices for SDM model
construction. I think the deference to Soberón is probably due to their
creation of the BAM model (in earlier publications), which Barve’s system
is based on (Biotic, Abiotic, Movement).&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Merow et al. (&lt;a href=&#34;#ref-MerowEtAl_2013&#34;&gt;2013&lt;/a&gt;)&lt;/span&gt; provide a shorter discussion, and emphasize matching the
study extent to the biological question of interest. Prioritizing sites for
protection within the range of a species should constrain the extent to the
existing range of the species; evaluating invasion potential should use an
extent large enough to encompass the areas of concern (i.e., global, or
continental scale for novel invasives).&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;background-points-number-and-bias-grids&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Background Points: Number and Bias Grids&lt;/h1&gt;
&lt;blockquote&gt;
&lt;p&gt;The number of background points should be large enough to comprehensively
sample (and hence represent) all environments in the region of interest.
&lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-ValaviEtAl_2022&#34;&gt;Valavi et al., 2022&lt;/a&gt;)&lt;/span&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If computational resources aren’t limiting, the ‘gold standard’ would be to
use every cell in the background extent. When that’s not feasible (i.e.,
most of the time!), &lt;span class=&#34;citation&#34;&gt;Valavi et al. (&lt;a href=&#34;#ref-ValaviEtAl_2022&#34;&gt;2022&lt;/a&gt;)&lt;/span&gt; recommend using 50,000 points. This is
based on empirical tests on a large number of species which showed that model
AUC&lt;sub&gt;ROC&lt;/sub&gt; approaches that of the ‘gold standard’ when 50K background points
are used. This comes at a cost of doubling the computation time compared to
sampling 10K background locations. They also note that choosing 50K, rather
than 35K or 70K, is somewhat arbitrary.&lt;/p&gt;
&lt;p&gt;This updates previous work that recommended sampling 10K background points
&lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-PhillipsDudik_2008&#34;&gt;Phillips and Dudík, 2008&lt;/a&gt;; &lt;a href=&#34;#ref-Barbet-MassinEtAl_2012&#34;&gt;Barbet-Massin et al., 2012&lt;/a&gt;)&lt;/span&gt;.&lt;/p&gt;
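&lt;p&gt;The rule of thumb above can be sketched in base R. This is only an illustration: a toy matrix stands in for a real environmental raster, and the object names (&lt;code&gt;env&lt;/code&gt;, &lt;code&gt;n_bg&lt;/code&gt;, &lt;code&gt;bg_cells&lt;/code&gt;) are made up for the example. It uses every available cell when there are fewer than the target number, and a random sample otherwise.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;set.seed(42)

## Toy stand-in for a raster layer: NA marks cells outside the extent
env &amp;lt;- matrix(rnorm(300 * 400), nrow = 300)
env[sample(length(env), 20000)] &amp;lt;- NA

n_bg &amp;lt;- 50000                  ## target suggested by Valavi et al. (2022)
valid &amp;lt;- which(!is.na(env))    ## indices of cells inside the extent

## Use every valid cell if there are few enough, otherwise sample
bg_cells &amp;lt;- if (length(valid) &amp;lt;= n_bg) valid else sample(valid, n_bg)

length(bg_cells)&lt;/code&gt;&lt;/pre&gt;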
&lt;p&gt;Note that some reviewers hold the view that the number of background points
should be ‘balanced’ (or weighted) against the number of presence records
in the model. This idea comes from considering Maxent models to be a form
of logistic regression, where the background points take the role of
absences. This is flawed, because we don’t know if our species is present
or absent at any given background point.&lt;/p&gt;
&lt;p&gt;This problem is resolved when we recognize that Maxent is more accurately
seen as an approximation of a Poisson Point Process Model. In this
framework, the background points serve to estimate the distribution of
environmental conditions in the full study area. That is, they aren’t tied
to presences or absences at all. The consequence is that, in general, more
background points are better, and efforts to match background numbers to
presence records are misguided.&lt;/p&gt;
&lt;p&gt;The theory underpinning this is explained by &lt;span class=&#34;citation&#34;&gt;Renner et al. (&lt;a href=&#34;#ref-RennerEtAl_2015&#34;&gt;2015&lt;/a&gt;)&lt;/span&gt;, and
practical application to distribution models is covered in
&lt;span class=&#34;citation&#34;&gt;Valavi et al. (&lt;a href=&#34;#ref-ValaviEtAl_2022&#34;&gt;2022&lt;/a&gt;)&lt;/span&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;variable-selection&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Variable Selection&lt;/h1&gt;
&lt;p&gt;Variables == predictors, the spatial layers used as the
environmental/independent variables in the model.&lt;/p&gt;
&lt;p&gt;Interesting discussion in &lt;span class=&#34;citation&#34;&gt;Guisan et al. (&lt;a href=&#34;#ref-GuisanEtAl_2017&#34;&gt;2017&lt;/a&gt;)&lt;/span&gt; (section 6.4, page 102+):
variables that are measured most accurately often/usually are only
indirectly related to a species’ niche; e.g., elevation, slope, aspect.
Very precise and accurate spatial layers are available for these.&lt;/p&gt;
&lt;p&gt;Variables with a direct relationship to a species’ niche are referred to as
“proximal”. These include temperature, moisture, soil type, etc., and are
usually created through interpolation from sparse reference points (weather
stations). This involves unavoidable error propagation and imprecision.&lt;/p&gt;
&lt;p&gt;Over small extents, it may be preferable to use indirect variables, as they
offer greater precision in quantifying the local environment. However, as
extent increases, the relative value of direct variables increases. The
indirect variables are likely not stationary on large scales: a species’
relationship to slope and elevation is likely different in the southern US vs
northern Canada, for instance. On the other hand, a species’ relationship to
temperature, however coarsely it is mapped, is likely similar across its
geographic range.&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Elith et al. (&lt;a href=&#34;#ref-ElithEtAl_2011&#34;&gt;2011&lt;/a&gt;)&lt;/span&gt; point out that Maxent’s built-in variable selection (via
L1-regularization) is reliable and relatively insensitive to correlation among
variables, and that model performance may actually be degraded by imposing
additional model selection procedures prior to running Maxent! They do
suggest that sticking to proximal variables is preferable when projecting
models to novel contexts.&lt;/p&gt;
&lt;div id=&#34;worldclim&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;WorldClim&lt;/h2&gt;
&lt;p&gt;One of the most common sources of climate data for SDMs is
&lt;a href=&#34;https://worldclim.org/data/cmip6/cmip6climate.html&#34;&gt;WorldClim&lt;/a&gt;. There are
actually three distinct data sets available from WorldClim:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Historical climate data
&lt;ul&gt;
&lt;li&gt;compiled from a combination of weather stations and databases,
and further refined with satellite imagery &lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-FickHijmans_2017&#34;&gt;Fick and Hijmans, 2017&lt;/a&gt;)&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;climate normals (30 year averages for the period 1970-2000) including
monthly minimum, maximum, average temperature; monthly total
precipitation; 19 bioclim variables (biologically meaningful values
calculated from monthly temperature and precipitation values).&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;Historical monthly weather data (1950-2024)
&lt;ul&gt;
&lt;li&gt;downscaled from CRU &lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-HarrisEtAl_2020&#34;&gt;Harris et al., 2020&lt;/a&gt;)&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;10-year averages for monthly minimum, maximum and average
temperatures; monthly total precipitation.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;Future climate projections
&lt;ul&gt;
&lt;li&gt;downscaled &amp;amp; bias corrected data from
&lt;a href=&#34;https://wcrp-cmip.org/cmip-phases/cmip6/&#34;&gt;CMIP6&lt;/a&gt;, including a variety
of global climate models (GCM) and shared socio-economic pathways
(SSP).&lt;/li&gt;
&lt;li&gt;20-year averages for monthly minimum, maximum and average
temperatures; monthly total precipitation; bioclim variables.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
&lt;div id=&#34;transferability&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Transferability&lt;/h2&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Petitpierre et al. (&lt;a href=&#34;#ref-PetitpierreEtAl_2017&#34;&gt;2017&lt;/a&gt;)&lt;/span&gt; explicitly tested different approaches to model
selection for use in projecting models in space and time. They recommend
that modelers use a small number of proximal variables, or,
particularly when very few observations are available for training, the
first few PCA axes of a larger set of environmental variables. PCA axes are
orthogonal (i.e., not collinear) by construction, but interpretation may be
tricky if they incorporate a large number of variables.&lt;/p&gt;
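&lt;p&gt;A minimal sketch of the PCA option, using base R’s &lt;code&gt;prcomp&lt;/code&gt; on simulated values. The variable names and the correlation structure are invented for illustration; in practice the rows would be the environmental values extracted at background (or all) cells. It shows that the retained axes are uncorrelated, and that the loadings are what you would need to inspect to interpret them.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;set.seed(1)
n &amp;lt;- 500
temp &amp;lt;- rnorm(n)
precip &amp;lt;- rnorm(n)

## Simulated, partly redundant climate variables (hypothetical values)
env &amp;lt;- data.frame(
  bio1  = temp,
  bio10 = temp + rnorm(n, sd = 0.3),
  bio11 = temp + rnorm(n, sd = 0.3),
  bio12 = precip,
  bio16 = precip + rnorm(n, sd = 0.3)
)

## Centre and scale so that no variable dominates because of its units
pca &amp;lt;- prcomp(env, center = TRUE, scale. = TRUE)

## Proportion of variance captured by the first two axes
summary(pca)$importance[, 1:2]

## The axes to use as predictors, and their (zero) correlation
axes &amp;lt;- pca$x[, 1:2]
round(cor(axes), 3)

## Loadings: how each original variable contributes to each axis
round(pca$rotation[, 1:2], 2)&lt;/code&gt;&lt;/pre&gt;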
&lt;p&gt;Since we rarely have much information on proximal variables (beyond
understanding that temperature and moisture are generally important),
&lt;span class=&#34;citation&#34;&gt;Petitpierre et al. (&lt;a href=&#34;#ref-PetitpierreEtAl_2017&#34;&gt;2017&lt;/a&gt;)&lt;/span&gt; recommend a suite of “State of the Art” (SOA)
variables that have been shown to be useful for plants. These include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;BIO 01: annual mean temperature&lt;/li&gt;
&lt;li&gt;BIO 04: temperature seasonality&lt;/li&gt;
&lt;li&gt;BIO 11: mean temperature of coldest quarter&lt;/li&gt;
&lt;li&gt;BIO 10: mean temperature of warmest quarter&lt;/li&gt;
&lt;li&gt;BIO 15: precipitation seasonality&lt;/li&gt;
&lt;li&gt;BIO 16: precipitation of wettest quarter&lt;/li&gt;
&lt;li&gt;BIO 23: moisture index seasonality&lt;/li&gt;
&lt;li&gt;BIO 28: annual mean moisture index&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Note the last two aren’t included in the
&lt;a href=&#34;https://worldclim.org/&#34;&gt;WorldClim/Bioclim&lt;/a&gt; archives, and the numbering
above BIO 19 isn’t necessarily consistent (at least between different
sources). These variables are available from &lt;a href=&#34;https://www.climond.org&#34; class=&#34;uri&#34;&gt;https://www.climond.org&lt;/a&gt;, but
that appears to be an outdated source relative to WorldClim? WorldClim
currently offers more, and more recent, climate projections, but only for the
first 19 variables. The Climond data is also currently only available at
10’ and 30’ resolution, while WorldClim provides data (without the moisture
indices) down to 30” resolution.&lt;/p&gt;
&lt;p&gt;At a global scale, annual mean temperature and temperature of the warmest
quarter are (not surprisingly) highly correlated, and the moisture indices
are moderately correlated with precipitation seasonality and precipitation
of the wettest quarter (again, not surprising). So it might be reasonable to
use a reduced set of the SOA
variables in studies that aim to generate transferable models: BIO 01,
BIO 11, BIO 15, BIO 16 (and maybe BIO 23).&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;beware-temperatureprecipitation-ratios&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Beware temperature/precipitation ratios!&lt;/h2&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Booth (&lt;a href=&#34;#ref-Booth_2022&#34;&gt;2022&lt;/a&gt;)&lt;/span&gt; found some very interesting problems with climate variables
that mix temperature and precipitation (e.g., temperature of wettest
quarter, precipitation of warmest quarter). In some cases these can
generate abrupt gradients that reflect numerical oddities of the ratios
rather than any biologically meaningful pattern. If you use the SOA
variables this isn’t a concern, but if you are using combination ratios, be
sure to check that paper for details.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;filtering-collinear-variables&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Filtering Collinear Variables&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;This section is out of date and needs to be updated and trimmed!&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I will need to reconsider whether evaluating collinearity is something I
should continue to do. In the meantime, I leave the following notes on how
to do it:&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Merow et al. (&lt;a href=&#34;#ref-MerowEtAl_2013&#34;&gt;2013&lt;/a&gt;)&lt;/span&gt; identify two general approaches to selecting variables.
The machine learning approach is based on the understanding that the
Maxent algorithm will, by design, select the most useful variables and
features, so we can include all reasonable variables.&lt;/p&gt;
&lt;p&gt;However, this probably only applies when the objective is to provide
accurate predictions of occurrences in the same context in which the model
is built. Efforts to understand the environmental constraints on that
distribution, or to project it to a new context, may be
confounded when the model includes correlated variables.&lt;/p&gt;
&lt;p&gt;To minimize this problem, &lt;span class=&#34;citation&#34;&gt;Merow et al. (&lt;a href=&#34;#ref-MerowEtAl_2013&#34;&gt;2013&lt;/a&gt;)&lt;/span&gt; recommend taking a statistical
approach (i.e., treating a Maxent model as a ‘conventional’ statistical
model). In this case, they recommend prescreening variables to limit
collinearity, and emphasizing biologically relevant variables. This should
produce more parsimonious and interpretable models.&lt;/p&gt;
&lt;p&gt;Pairwise correlations can be used to identify pairs or groups of variables
that are highly correlated. ENMTools &lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-WarrenEtAl_2019&#34;&gt;Warren et al., 2019&lt;/a&gt;)&lt;/span&gt; provides several
helper functions for this, including &lt;code&gt;raster.cor.matrix&lt;/code&gt;,
&lt;code&gt;raster.cor.plot&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;I prefer using &lt;code&gt;hclust&lt;/code&gt; based on &lt;code&gt;1 - abs(cor)&lt;/code&gt; to visualize correlated
groups:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ENMTools)  ## provides raster.cor.matrix

## &amp;quot;predictors&amp;quot; is a raster stack

## Calculate correlations:
cors &amp;lt;- raster.cor.matrix(predictors) # from ENMTools

threshold &amp;lt;- 0.7  ## the maximum permissible correlation

dists &amp;lt;- as.dist(1 - abs(cors))

clust &amp;lt;- hclust(dists, method = &amp;quot;single&amp;quot;)
groups &amp;lt;- cutree(clust, h = 1 - threshold)

## Visualize groups:
plot(clust, hang = -1)
rect.hclust(clust, h = 1 - threshold)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://plantarum.ca/notebooks/maxent_files/figure-html/hclust-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;## Print the groups:
groups&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##  bio1 bio12 bio16 bio17  bio5  bio6  bio7  bio8 biome 
##     1     1     1     1     1     1     1     1     2&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;NB&lt;/strong&gt; I set the &lt;code&gt;method&lt;/code&gt; argument to &lt;code&gt;&#34;single&#34;&lt;/code&gt;. This ensures that all the
variables with correlations above the threshold will be captured together
in a cluster. The default method, &lt;code&gt;&#34;complete&#34;&lt;/code&gt;, may split
variables with correlations above the specified threshold into different clusters. See
discussion of hierarchical agglomerative clustering (section 8.5) in
&lt;span class=&#34;citation&#34;&gt;Legendre and Legendre (&lt;a href=&#34;#ref-LegendreLegendre_2012&#34;&gt;2012&lt;/a&gt;)&lt;/span&gt; for details, if you want to know more.&lt;/p&gt;
&lt;p&gt;After running this, &lt;code&gt;groups&lt;/code&gt; will identify which cluster each variable
belongs to. Keep at most one variable from each group.&lt;/p&gt;
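&lt;p&gt;One way to do the “keep one per group” step, and to confirm the result, is sketched below. To keep it self-contained it uses a correlation matrix computed with base &lt;code&gt;cor()&lt;/code&gt; from simulated values rather than &lt;code&gt;raster.cor.matrix&lt;/code&gt;; the variable names are hypothetical, and keeping the first-listed variable in each group is an arbitrary choice (in practice you would pick the most biologically meaningful one).&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;set.seed(10)
n &amp;lt;- 1000
temp &amp;lt;- rnorm(n)
precip &amp;lt;- rnorm(n)

## Simulated variable values (hypothetical)
vals &amp;lt;- data.frame(
  bio1  = temp,
  bio5  = temp + rnorm(n, sd = 0.3),
  bio6  = temp + rnorm(n, sd = 0.3),
  bio12 = precip,
  bio16 = precip + rnorm(n, sd = 0.3)
)
cors &amp;lt;- cor(vals)

threshold &amp;lt;- 0.7
clust &amp;lt;- hclust(as.dist(1 - abs(cors)), method = &amp;quot;single&amp;quot;)
groups &amp;lt;- cutree(clust, h = 1 - threshold)

## Keep the first-listed variable from each group
keep &amp;lt;- names(groups)[!duplicated(groups)]
keep

## Check: the largest correlation among the retained variables
kept_cors &amp;lt;- abs(cors[keep, keep])
max(kept_cors[upper.tri(kept_cors)])&lt;/code&gt;&lt;/pre&gt;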
&lt;p&gt;&lt;del&gt;This doesn’t absolutely guarantee that variables in different groups will
have correlations less than &lt;code&gt;threshold&lt;/code&gt;, but in most cases this will be
true. When it isn’t, the highest inter-group correlation will still be very
close to the threshold. If you’re concerned, double check the correlations
of the variables after you’ve picked them.&lt;/del&gt;&lt;/p&gt;
&lt;p&gt;Alternatively, you can use &lt;code&gt;cutree&lt;/code&gt; to select the number of groups you
want, then pick a variable from each group. Since this doesn’t use a
threshold, you’ll have to make sure the number of groups you pick is equal
to or lower than the number of groups generated using the threshold
approach.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ENMTools)  ## provides raster.cor.matrix

## &amp;quot;predictors&amp;quot; is a raster stack

## Calculate correlations:
cors &amp;lt;- raster.cor.matrix(predictors) # from ENMTools

groupNum &amp;lt;- 3  ## the desired number of groups

dists &amp;lt;- as.dist(1 - abs(cors))

clust &amp;lt;- hclust(dists, method = &amp;quot;single&amp;quot;)
groups &amp;lt;- cutree(clust, k = groupNum)

## Visualize groups:
plot(clust, hang = -1)
rect.hclust(clust, k = groupNum)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://plantarum.ca/notebooks/maxent_files/figure-html/hclust%20groups-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;## Print the groups:
groups&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##  bio1 bio12 bio16 bio17  bio5  bio6  bio7  bio8 biome 
##     1     2     2     2     1     1     1     1     3&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;However, neither of these approaches will address multicollinearity among
three or more variables. &lt;span class=&#34;citation&#34;&gt;Guisan et al. (&lt;a href=&#34;#ref-GuisanEtAl_2017&#34;&gt;2017&lt;/a&gt;)&lt;/span&gt; suggest using the function
&lt;code&gt;usdm::vif&lt;/code&gt; instead, which calculates variance inflation factors (VIF). They recommend
keeping the VIF values under 10, but different authors will use cutoffs
from 5-20.&lt;/p&gt;
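&lt;p&gt;For intuition about what the VIF measures, here is a minimal base-R sketch. It is not the &lt;code&gt;usdm&lt;/code&gt; implementation: &lt;code&gt;vif_manual&lt;/code&gt; and the simulated variables are hypothetical. Each variable is regressed on all the others, and its VIF is 1 / (1 - R&lt;sup&gt;2&lt;/sup&gt;) of that regression. In the toy data, &lt;code&gt;x3&lt;/code&gt; is nearly the sum of &lt;code&gt;x1&lt;/code&gt; and &lt;code&gt;x2&lt;/code&gt;, so the pairwise correlations are only moderate while the VIFs are large.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;vif_manual &amp;lt;- function(df) {
  sapply(names(df), function(v) {
    others &amp;lt;- setdiff(names(df), v)
    ## Regress variable v on all of the other variables
    fit &amp;lt;- lm(reformulate(others, response = v), data = df)
    1 / (1 - summary(fit)$r.squared)
  })
}

set.seed(3)
n &amp;lt;- 500
x1 &amp;lt;- rnorm(n)
x2 &amp;lt;- rnorm(n)
vals &amp;lt;- data.frame(
  x1 = x1,
  x2 = x2,
  x3 = x1 + x2 + rnorm(n, sd = 0.2),  ## depends on x1 and x2 jointly
  x4 = rnorm(n)                       ## independent of the rest
)

round(cor(vals), 2)         ## pairwise correlations
round(vif_manual(vals), 1)  ## variance inflation factors&lt;/code&gt;&lt;/pre&gt;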
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;feature-selection&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Feature Selection&lt;/h1&gt;
&lt;p&gt;Features == the functions of the variables that the model fits to the
response variable (presences), i.e., linear, quadratic, product, hinge,
threshold, categorical.&lt;/p&gt;
&lt;p&gt;Note that &lt;code&gt;hinge&lt;/code&gt; is essentially a superset of linear and threshold
features, so if you have hinges, the other two are redundant
&lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-ElithEtAl_2011&#34;&gt;Elith et al., 2011&lt;/a&gt;)&lt;/span&gt;. As of Maxent version 3.4.0, threshold features are not included
by default; experience has shown that this improves model performance, and
produces simpler, more realistic models &lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-PhillipsEtAl_2017&#34;&gt;Phillips et al., 2017&lt;/a&gt;)&lt;/span&gt;. Similarly,
product features appear to contribute very little to model performance,
given the added complexity.&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Merow et al. (&lt;a href=&#34;#ref-MerowEtAl_2013&#34;&gt;2013&lt;/a&gt;)&lt;/span&gt; recommend selecting features on biological grounds. They
provide a short discussion, noting that the fundamental niche is likely
quadratic for most variables over a large enough extent, but may be better
approximated by a linear function if the study extent is truncated with
respect to the species’ tolerance for that variable (a la Whittaker).
Interesting ideas, but not much to go on unless you actually do know a fair
bit about your species.&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Warren and Seifert (&lt;a href=&#34;#ref-WarrenSeifert_2011&#34;&gt;2011&lt;/a&gt;)&lt;/span&gt; describe a process for selecting features to
keep/include in the model (linear, quadratic, product, hinge, threshold,
categorical). It uses the AIC to identify the optimal combination. Easy and
quick to do with the ENMeval package (note that many references cite
ENMTools for these tests, but they’ve been moved to ENMeval nowadays).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;NB&lt;/strong&gt; applying different spatial filtering/thinning to your data can
produce different ‘optimal’ models (i.e., different retained features and
regularization value), as determined by the AIC criterion.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;NB&lt;/strong&gt; The &lt;code&gt;ENMevaluate&lt;/code&gt; function (from ENMeval) only evaluates AIC for models with an
appropriate number of parameters. If you have a low number of observations,
a low regularization (i.e., permissive approach to including parameters),
and complex/high-parameter models (especially hinge features), the AIC
values will be reported as &lt;code&gt;NA&lt;/code&gt;. These models are overfit, and as such you
shouldn’t use them with your data. &lt;a href=&#34;https://groups.google.com/g/maxent/c/qtMgmZ3Tpz8/m/GHSC8XF2BQAJ&#34;&gt;Explanation on Maxent discussion
list.&lt;/a&gt;&lt;/p&gt;
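&lt;p&gt;To see why AIC can’t be reported for such models, consider the standard small-sample correction (AICc). The sketch below is the textbook formula, not ENMeval’s own code; the function name &lt;code&gt;aicc&lt;/code&gt;, the example values, and the exact cutoff for returning &lt;code&gt;NA&lt;/code&gt; are assumptions. The correction term divides by n - k - 1, so it stops being meaningful once the number of parameters (k) approaches the number of observations (n).&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;aicc &amp;lt;- function(log_lik, k, n) {
  ## Undefined (or negative) correction when k is too close to n
  if (k &amp;gt;= n - 1) return(NA_real_)
  2 * k - 2 * log_lik + (2 * k * (k + 1)) / (n - k - 1)
}

## 40 observations, 8 parameters: AICc can be calculated
aicc(log_lik = -250, k = 8, n = 40)

## 40 observations, 45 parameters: too many parameters for the data
aicc(log_lik = -240, k = 45, n = 40)&lt;/code&gt;&lt;/pre&gt;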
&lt;/div&gt;
&lt;div id=&#34;regularization&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Regularization&lt;/h1&gt;
&lt;p&gt;Regularization is used to penalize complexity. Low values will produce
models with many predictors and features, with 0 leading to all features
and variables being included. This can lead to problems with over-fitting
and interpretation. Higher regularization values will lead to ‘smoother’,
and hopefully more general and transferable models. There will be a
trade-off between over- and under-fitting.&lt;/p&gt;
&lt;p&gt;The default values in Maxent are based on empirical tests on a large number
of species. These are probably not unreasonable, but it’s pretty standard
to mention that they’re a compromise, and we improved them for the
needs of our particular species and context by doing X (for various values
of X).&lt;/p&gt;
&lt;p&gt;The approach of &lt;span class=&#34;citation&#34;&gt;Warren and Seifert (&lt;a href=&#34;#ref-WarrenSeifert_2011&#34;&gt;2011&lt;/a&gt;)&lt;/span&gt; (see previous) can be used here as
well, testing a range of regularization (aka beta) values, and selecting
the one that generates the lowest AIC. It may also be worth selecting the
simplest model that is within a certain similarity of the ‘best’ model?
That’s more to explain to reviewers though.&lt;/p&gt;
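&lt;p&gt;A sketch of the “simplest model close to the best” idea, applied to a made-up table of tuning results. The column names, the values, and the cutoff of 2 AIC units are all assumptions for illustration (2 is a common convention, not something taken from the references above).&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;## Hypothetical tuning results: feature classes, regularization
## multiplier, number of non-zero coefficients, and AICc
res &amp;lt;- data.frame(
  fc    = c(&amp;quot;L&amp;quot;, &amp;quot;LQ&amp;quot;, &amp;quot;LQ&amp;quot;, &amp;quot;H&amp;quot;, &amp;quot;LQH&amp;quot;),
  rm    = c(1, 1, 2, 2, 1),
  ncoef = c(4, 7, 6, 15, 22),
  AICc  = c(1012.4, 1001.3, 1002.1, 1000.9, NA)
)

res$delta &amp;lt;- res$AICc - min(res$AICc, na.rm = TRUE)

## The model with the lowest AICc
best &amp;lt;- res[which.min(res$AICc), ]

## Models within 2 units of the best (which() drops the NA row)
candidates &amp;lt;- res[which(res$delta &amp;lt;= 2), ]

## Among those, the one with the fewest coefficients
simplest &amp;lt;- candidates[which.min(candidates$ncoef), ]

best
simplest&lt;/code&gt;&lt;/pre&gt;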
&lt;p&gt;Warren and Seifert’s simulations demonstrate that models with
a similar number of parameters to the true model are more accurate
in terms of suitability, variable assessment, and ranking of
habitat suitability, both for the training extent and for models projected
in space/time. Furthermore, AIC and BIC are the most effective approaches
to model tuning to achieve the correct number of parameters.&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Radosavljevic and Anderson (&lt;a href=&#34;#ref-RadosavljevicAnderson_2014&#34;&gt;2014&lt;/a&gt;)&lt;/span&gt; also consider the impact of the regularization
parameter on over-fitting. They find that the default value often leads to
over-fitting, especially when spatial auto-correlation is not accounted for
in model fitting. They conclude that regularization should be set
deliberately for a study, following the results of experiments exploring a
range of potential values.&lt;/p&gt;
&lt;p&gt;Note that specifying the regularization is done via the &lt;code&gt;betamultiplier&lt;/code&gt;
argument, which applies to each of the different feature classes. That is,
the actual regularization value will be set by Maxent automatically for
each class, subject to the multiplier value specified by the user. We don’t
set the regularization values for each class directly. This is possible
via the options &lt;code&gt;beta_lqp&lt;/code&gt;, &lt;code&gt;beta_threshold&lt;/code&gt;, etc. &lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-Phillips_2017&#34;&gt;Phillips, 2017&lt;/a&gt;)&lt;/span&gt;,
and &lt;span class=&#34;citation&#34;&gt;Radosavljevic and Anderson (&lt;a href=&#34;#ref-RadosavljevicAnderson_2014&#34;&gt;2014&lt;/a&gt;)&lt;/span&gt; suggest that experiments to explore this
should be done.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;output-type&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Output type&lt;/h1&gt;
&lt;p&gt;&lt;strong&gt;Raw&lt;/strong&gt;: values are Relative Occurrence Rate (ROR) &lt;del&gt;which will sum to 1
over the extent of the study&lt;/del&gt;. &lt;span class=&#34;citation&#34;&gt;Merow et al. (&lt;a href=&#34;#ref-MerowEtAl_2013&#34;&gt;2013&lt;/a&gt;)&lt;/span&gt; consider this to be a
reasonable interpretation of the Maxent output, but the actual values can
be difficult to interpret; they produce maps that “do not often match
ecologists’ intuition about the distribution of their species”
&lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-PhillipsEtAl_2017&#34;&gt;Phillips et al., 2017&lt;/a&gt;)&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Logistic&lt;/strong&gt;: attempts to provide an accurate estimate of the probability
that the species is present, given the environment. Monotonically related
to raw values; site rank is identical for these measures. Based on the
assumption that there is a 50% probability that a species will be present
at a site with ‘average’ conditions for the species. This assumption is
problematic and unrealistic according to &lt;span class=&#34;citation&#34;&gt;Merow et al. (&lt;a href=&#34;#ref-MerowEtAl_2013&#34;&gt;2013&lt;/a&gt;)&lt;/span&gt;. However,
&lt;span class=&#34;citation&#34;&gt;Elith et al. (&lt;a href=&#34;#ref-ElithEtAl_2011&#34;&gt;2011&lt;/a&gt;)&lt;/span&gt; prefer logistic output, and discuss justification for
preferring it over raw values.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cumulative&lt;/strong&gt;: for each cell, the sum of the raw values of all cells with a
raw value &amp;lt;= that of the cell, rescaled to range from 0-100.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Complementary Log-Log&lt;/strong&gt; (aka CLOGLOG): the standard output from version
3.4.0 onwards. Generally similar to the logistic output, but tending to
give slightly higher values. There is a stronger theoretical justification
for CLOGLOG than logistic, as summarized in &lt;span class=&#34;citation&#34;&gt;Phillips et al. (&lt;a href=&#34;#ref-PhillipsEtAl_2017&#34;&gt;2017&lt;/a&gt;)&lt;/span&gt;. CLOGLOG
provides an estimate of the probability of presence, but with the caveat
that this probability is based on an arbitrary quadrat size (similar to the
prevalence assumption made with logistic).&lt;/p&gt;
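&lt;p&gt;The relationships among the four output types can be illustrated with a few lines of base R. This is a sketch on simulated raw values, not Maxent’s code. The cumulative calculation follows the definition above. The logistic and cloglog formulas (both scaled by the entropy of the raw distribution) are assumptions here, based on the descriptions in Phillips and Dudík (2008) and Phillips et al. (2017), and should be checked against those papers before relying on them.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;set.seed(7)

## Simulated raw output, normalized to sum to 1 over the toy cells
raw &amp;lt;- rexp(1000)
raw &amp;lt;- raw / sum(raw)

## Cumulative: summed raw value of all cells with a raw value no
## greater than the focal cell, rescaled to 0-100
ord &amp;lt;- order(raw)
cumulative &amp;lt;- numeric(length(raw))
cumulative[ord] &amp;lt;- 100 * cumsum(raw[ord])

## Entropy of the raw distribution, used to scale the other outputs
H &amp;lt;- -sum(raw * log(raw))

logistic &amp;lt;- (exp(H) * raw) / (1 + exp(H) * raw)
cloglog  &amp;lt;- 1 - exp(-exp(H) * raw)

## All transformations are monotonic, so site ranks are unchanged
all(order(raw) == order(cumulative))
all(order(raw) == order(cloglog))

## cloglog is never lower than logistic for the same raw value
all(cloglog &amp;gt;= logistic)&lt;/code&gt;&lt;/pre&gt;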
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Merow et al. (&lt;a href=&#34;#ref-MerowEtAl_2013&#34;&gt;2013&lt;/a&gt;)&lt;/span&gt; recommend sticking to Raw whenever possible, which means
using the same species in the same extent. Note that the raw values will
change for different extents, even for identical models, so they can’t be
compared across projections without additional post-processing.&lt;/p&gt;
&lt;p&gt;Cumulative is preferable when defining/describing range boundaries, or
otherwise dealing with omission rates.&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Elith et al. (&lt;a href=&#34;#ref-ElithEtAl_2011&#34;&gt;2011&lt;/a&gt;)&lt;/span&gt; prefer to use the logistic, which they present as a
biologically reasonable estimate. However, it will be a problem in
comparisons among species with different prevalence on the landscape, as it
assumes identical (and arbitrary) prevalence.&lt;/p&gt;
&lt;p&gt;Following &lt;span class=&#34;citation&#34;&gt;Phillips et al. (&lt;a href=&#34;#ref-PhillipsEtAl_2017&#34;&gt;2017&lt;/a&gt;)&lt;/span&gt;, CLOGLOG now seems to be the most intuitive
output to use in most cases.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;evaluation&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Evaluation&lt;/h1&gt;
&lt;div id=&#34;auc&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;AUC&lt;/h2&gt;
&lt;p&gt;AUC assesses the success of the model in correctly ranking a random
background point and a random presence point; that is, it should predict
the suitability of the presence point higher than the background point. It
is threshold-independent.&lt;/p&gt;
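&lt;p&gt;The ranking interpretation translates directly into code. The sketch below (the function name &lt;code&gt;auc&lt;/code&gt; and the simulated predictions are hypothetical) compares every presence prediction with every background prediction, and reports the proportion of pairs in which the presence is ranked higher, counting ties as half.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;auc &amp;lt;- function(pres, bg) {
  ## All pairwise comparisons of presence vs background predictions
  wins &amp;lt;- outer(pres, bg, &amp;quot;&amp;gt;&amp;quot;)
  ties &amp;lt;- outer(pres, bg, &amp;quot;==&amp;quot;)
  mean(wins + 0.5 * ties)
}

set.seed(11)
pres &amp;lt;- rbeta(50, 4, 2)    ## predictions at presences (simulated)
bg   &amp;lt;- rbeta(1000, 2, 4)  ## predictions at background points (simulated)

auc(pres, bg)

## Sanity checks: perfect ranking, and no discrimination at all
auc(c(0.9, 0.8), c(0.1, 0.2))
auc(c(0.5, 0.5), c(0.5, 0.5))&lt;/code&gt;&lt;/pre&gt;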
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Lobo et al. (&lt;a href=&#34;#ref-LoboEtAl_2008&#34;&gt;2008&lt;/a&gt;)&lt;/span&gt; identified five problems with AUC:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;it ignores the predicted probability values and the goodness-of-fit of the model;&lt;/li&gt;
&lt;li&gt;it summarises the test performance over regions of the ROC space in
which one would rarely operate;&lt;/li&gt;
&lt;li&gt;it weights omission and commission errors equally;&lt;/li&gt;
&lt;li&gt;it does not give information about the spatial distribution of model
errors; and, most importantly,&lt;/li&gt;
&lt;li&gt;the total extent to which models are carried out highly influences the
rate of well-predicted absences and the AUC scores.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Additionally, &lt;span class=&#34;citation&#34;&gt;Radosavljevic and Anderson (&lt;a href=&#34;#ref-RadosavljevicAnderson_2014&#34;&gt;2014&lt;/a&gt;)&lt;/span&gt; point out that AUC doesn’t
assess over-fitting or goodness-of-fit; rather, it is a measure of
discrimination capacity.&lt;/p&gt;
&lt;p&gt;However, comparing the difference in AUC for the training and testing data
does give an estimate of overfitting. If the model fit perfectly, without
overfitting, the training and testing AUC would be identical. They won’t be, and the difference
reflects the degree to which the model is over-fit on the training data. In
other words, the extent to which the model is fit to noise in the data, or to
environmental bias, if geographic masking is used in the k-fold partitions.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;boyce&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Boyce&lt;/h2&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Boyce et al. (&lt;a href=&#34;#ref-BoyceEtAl_2002&#34;&gt;2002&lt;/a&gt;)&lt;/span&gt; proposed an index that compares the predicted and expected
number of occupied sites with the suitability value of those sites:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;sites are first sorted from lowest to highest suitability&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;then they are binned into groups of equal frequency&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;the number of actual occurrences in each bin is tabulated&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;the Spearman-rank correlation between bin rank and occurrence number
is then calculated; if the model is good, we expect increasing numbers of
occurrences for higher-ranked bins&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
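&lt;p&gt;The steps above can be sketched in base R as follows. This is a simplified, binned version written only to illustrate the logic. It is not the &lt;code&gt;ecospat&lt;/code&gt; implementation, nor the continuous Boyce index, and the function name &lt;code&gt;boyce_binned&lt;/code&gt; and the simulated suitability values are hypothetical.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;boyce_binned &amp;lt;- function(site_suit, pres_suit, n_bins = 10) {
  ## Equal-frequency bin edges, from the suitability of all sites
  breaks &amp;lt;- quantile(site_suit, probs = seq(0, 1, length.out = n_bins + 1))
  breaks[1] &amp;lt;- -Inf
  breaks[n_bins + 1] &amp;lt;- Inf

  ## Number of presences falling in each bin (lowest to highest)
  counts &amp;lt;- table(cut(pres_suit, breaks = breaks))

  ## Rank correlation between bin order and presence count
  cor(seq_len(n_bins), as.numeric(counts), method = &amp;quot;spearman&amp;quot;)
}

set.seed(21)
site_suit &amp;lt;- runif(5000)  ## predicted suitability for all sites

## Presences more likely at high suitability: index should be high
pres_good &amp;lt;- sample(site_suit, 500, prob = site_suit)
b_good &amp;lt;- boyce_binned(site_suit, pres_good)
b_good

## Presences unrelated to suitability, for comparison
pres_rand &amp;lt;- sample(site_suit, 500)
boyce_binned(site_suit, pres_rand)&lt;/code&gt;&lt;/pre&gt;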
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Hirzel et al. (&lt;a href=&#34;#ref-HirzelEtAl_2006&#34;&gt;2006&lt;/a&gt;)&lt;/span&gt; evaluated a variety of SDM evaluation measures; on data
sets with more than 50 presences, most evaluators had &amp;gt; 0.70 correlation
with each other. Which is a little reassuring I suppose? They used AUC on
presence/absence data as the ‘gold standard’, and found that the continuous
Boyce index (which uses presence-only data) performed best.&lt;/p&gt;
&lt;p&gt;This can be calculated with the function &lt;code&gt;ecospat::ecospat.boyce()&lt;/code&gt;, which takes a
raster of suitability values and a matrix or data.frame containing the
coordinates of presences. Note that as of 2025-05-06 there is a &lt;a href=&#34;https://github.com/ecospat/ecospat/issues/99&#34;&gt;bug in
this function&lt;/a&gt;, and the
values it produces are not correct! Hopefully this will be fixed soon.&lt;/p&gt;
&lt;p&gt;Related discussion in &lt;span class=&#34;citation&#34;&gt;Phillips and Elith (&lt;a href=&#34;#ref-PhillipsElith_2010&#34;&gt;2010&lt;/a&gt;)&lt;/span&gt;, who note that the Boyce index is
an example of their presence-only calibration plot (POC plot).&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;thresholds&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Thresholds&lt;/h2&gt;
&lt;p&gt;Following &lt;span class=&#34;citation&#34;&gt;Radosavljevic and Anderson (&lt;a href=&#34;#ref-RadosavljevicAnderson_2014&#34;&gt;2014&lt;/a&gt;)&lt;/span&gt;: threshold-dependent evaluation requires
identifying a threshold in values predicted by the model to generate a
binary suitable/unsuitable map. Setting the threshold to the lowest
predicted value for a presence location may produce undesirable results if
the lowest value is associated with an observation from an extreme
outlier. More robust is setting the threshold to a particular quantile of
the predicted values at presence locations
(10%), to exclude weirdos from establishing what’s suitable.&lt;/p&gt;
&lt;p&gt;Again, if the model is perfectly fit, the omission rate in the testing data
should be the same as in the training data. That is, setting the threshold
at 10% to create the binary suitability map, we expect the omission rate in
the test data to be 10%. Higher omission in the testing data reflects
over-fitting (noise and/or bias).&lt;/p&gt;
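&lt;p&gt;A minimal sketch of this comparison, using simulated predictions in place of real model output (the object names are hypothetical). The threshold is set at the 10th percentile of the predictions at the training presences, so the training omission rate is 10% by construction; the testing omission rate is then compared against it.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;set.seed(5)

## Simulated predicted suitability at training and testing presences
train_pred &amp;lt;- rbeta(200, 5, 2)
test_pred  &amp;lt;- rbeta(50, 4, 2)

## Two candidate thresholds: minimum, and 10th percentile, of the
## predictions at training presences
thr_min &amp;lt;- min(train_pred)
thr_p10 &amp;lt;- quantile(train_pred, probs = 0.10, names = FALSE)

## Omission rate: proportion of presences predicted as unsuitable
omission_train &amp;lt;- mean(train_pred &amp;lt; thr_p10)
omission_test  &amp;lt;- mean(test_pred &amp;lt; thr_p10)

c(train = omission_train, test = omission_test)&lt;/code&gt;&lt;/pre&gt;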
&lt;p&gt;For presence-only data commission error is unknown/unknowable. Accordingly,
&lt;span class=&#34;citation&#34;&gt;Radosavljevic and Anderson (&lt;a href=&#34;#ref-RadosavljevicAnderson_2014&#34;&gt;2014&lt;/a&gt;)&lt;/span&gt; defined an optimal model as one that “(1)
reduced omission rates to the lowest observed value (or near it) and
minimized the difference between calibration and evaluation AUC [i.e.,
minimized over-fitting]; and (2) still led to maximal or near maximal
observed values for the evaluation AUC (which assesses discriminatory
ability). When more than one regularization multiplier fulfilled these
criteria equally well, we chose the lowest one, to promote discriminatory
ability (and hence, counter any tendency towards underfitting).”&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;cross-validation&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Cross-validation&lt;/h2&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Radosavljevic and Anderson (&lt;a href=&#34;#ref-RadosavljevicAnderson_2014&#34;&gt;2014&lt;/a&gt;)&lt;/span&gt; evaluated cross-validation using random k-fold
partitions, geographical structuring, and geographic masking of partitions.
Random partitions suffer because any biases in the training data are
preserved in the testing data.&lt;/p&gt;
&lt;p&gt;Geographic structuring, which uses occurrences from a pre-defined
geographic area (rather than a random sample) as the test set, introduces
additional spatial bias, and should be avoided. However, geographic
structuring combined with masking (which excludes both presences and
background in the held-out geographic region from the training set) may
substantially reduce overfitting, and yields more realistic models than
random partitions.&lt;/p&gt;
&lt;p&gt;Checkerboard partitions offer a nice compromise: they apply geographic
structuring and masking on a fine scale, and so should reduce spatial
correlation between training and testing data. A version was used by
&lt;span class=&#34;citation&#34;&gt;Pearson et al. (&lt;a href=&#34;#ref-PearsonEtAl_2013&#34;&gt;2013&lt;/a&gt;)&lt;/span&gt;, and functions to do
checkerboard cross-validation are provided by &lt;span class=&#34;citation&#34;&gt;Muscarella et al. (&lt;a href=&#34;#ref-MuscarellaEtAl_2014&#34;&gt;2014&lt;/a&gt;)&lt;/span&gt;, in
both cases without a lot of discussion. The cited references suggest this
might be intended more for species with limited occurrence data? Also, as
implemented it looks like they only allow for 2-fold and 4-fold
cross-validation. I’m not sure there’s any reason not to use checkerboards
to do 9- or 16-fold cross-validation?&lt;/p&gt;
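&lt;p&gt;As a rough illustration of the idea, the sketch below assigns points to folds on a simple repeating tile of squares. It is not the ENMeval implementation: the function &lt;code&gt;checkerboard_fold&lt;/code&gt;, its arguments and the example coordinates are all hypothetical. With &lt;code&gt;n = 2&lt;/code&gt; it gives four folds, and larger values of &lt;code&gt;n&lt;/code&gt; give 9 or 16 folds in the same way.&lt;/p&gt;

```python
import math

def checkerboard_fold(lon, lat, cell_size, n=2):
    """Assign a point to one of n*n folds on a repeating grid of squares.

    cell_size is the side length of one square, in the same units as the
    coordinates. All names here are hypothetical, for illustration only.
    """
    col = math.floor(lon / cell_size)
    row = math.floor(lat / cell_size)
    # The position within the repeating n-by-n tile determines the fold.
    return (row % n) * n + (col % n)

points = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5), (2.5, 0.5)]
folds = [checkerboard_fold(x, y, cell_size=1.0) for x, y in points]
print(folds)  # [0, 1, 2, 3, 0]
```

&lt;p&gt;Neighbouring squares always fall in different folds, so each test fold is interleaved with training data at the scale set by &lt;code&gt;cell_size&lt;/code&gt;.&lt;/p&gt;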
&lt;/div&gt;
&lt;div id=&#34;spatial-null-models&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Spatial Null Models&lt;/h2&gt;
&lt;p&gt;Not often used, but see &lt;span class=&#34;citation&#34;&gt;Rodríguez-Rey et al. (&lt;a href=&#34;#ref-Rodriguez-ReyEtAl_2013&#34;&gt;2013&lt;/a&gt;)&lt;/span&gt;, and discussion of
&lt;span class=&#34;citation&#34;&gt;Bahn and McGill (&lt;a href=&#34;#ref-BahnMcGill_2007&#34;&gt;2007&lt;/a&gt;)&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;dismo&lt;/code&gt; package &lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-HijmansEtAl_2017&#34;&gt;Hijmans et al., 2017&lt;/a&gt;)&lt;/span&gt; provides the function &lt;code&gt;geoDist&lt;/code&gt; to
serve as a spatial null model. Given a matrix of occurrence points, it
generates a simple model of occurrence based on the distribution of those
points (i.e., the likelihood of an occurrence at a location is inversely
proportional to the distance of that location from known occurrences).&lt;/p&gt;
&lt;p&gt;&lt;code&gt;geoDist&lt;/code&gt; returns a raster of ‘suitability’ values that can be evaluated
just as the output from an SDM model projection.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;projection&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Projection&lt;/h1&gt;
&lt;p&gt;Complex models, with large numbers of predictors, or that use complex
features, are more likely to be overfit. Overfitting sacrifices generality
(i.e., capacity to project to different scenarios) in favour of better fit
to training data. Consequently, for the purposes of projection, we may want
to emphasize smaller, simpler models. See also &lt;span class=&#34;citation&#34;&gt;Petitpierre et al. (&lt;a href=&#34;#ref-PetitpierreEtAl_2017&#34;&gt;2017&lt;/a&gt;)&lt;/span&gt; for
discussion of variable selection.&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Guisan et al. (&lt;a href=&#34;#ref-GuisanEtAl_2017&#34;&gt;2017&lt;/a&gt;)&lt;/span&gt;: two related issues to consider when comparing the
environment in the training region to that in the region into which the
model is projected:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;availability:
are environment ‘types’ similarly abundant and available?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;analog:
are the environment ‘types’ in the projected range also present in the training range?&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The act of projection implicitly assumes environments are fully analogous
with equal availability.&lt;/p&gt;
&lt;p&gt;Violating this assumption is to a large extent unavoidable, so we have to
take measures to reduce the mismatch in data preparation, and/or account
for it in the interpretation of results.&lt;/p&gt;
&lt;div id=&#34;clamping&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Clamping&lt;/h2&gt;
&lt;p&gt;Maxent includes the option of “clamping” projections. This constrains the
values of each environmental variable in the projected range to the limits
of that variable found in the training range. This has the effect of
setting the predicted value of all non-analog cells to the value for the
most extreme environments that are found in the training region.&lt;/p&gt;
&lt;p&gt;This will reduce the occurrence of unrealistic patterns emerging from the
extension of complex models beyond the range of values they were trained
on. It’s probably better than not clamping, but there is no reason to
expect it’s particularly realistic.&lt;/p&gt;
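&lt;p&gt;The clamping step itself is simple enough to sketch directly. The example below is a minimal illustration, not Maxent’s own code: the function &lt;code&gt;clamp&lt;/code&gt; and the temperature values are hypothetical.&lt;/p&gt;

```python
def clamp(value, train_min, train_max):
    """Limit a projected value to the range seen in the training region."""
    return max(train_min, min(value, train_max))

# Hypothetical training range for mean annual temperature (degrees C).
train_min, train_max = 2.0, 18.0

# Values from the projection region: too cold, in range, too hot.
projected = [-5.0, 10.0, 25.0]
clamped = [clamp(v, train_min, train_max) for v in projected]
print(clamped)  # [2.0, 10.0, 18.0]
```

&lt;p&gt;After clamping, the model never sees a value outside the training range, so a cell at 25 degrees is predicted as if it were at 18 degrees.&lt;/p&gt;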
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Guisan et al. (&lt;a href=&#34;#ref-GuisanEtAl_2017&#34;&gt;2017&lt;/a&gt;)&lt;/span&gt; don’t mention clamping at all. Instead, they recommend
explicitly identifying non-analog environments and excluding them from the
projection (see MESS and exDet below); or at least, identifying them
clearly and giving them due consideration in the interpretation of the
results.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;mess&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;MESS&lt;/h2&gt;
&lt;p&gt;Multivariate Environmental Similarity Surfaces, &lt;span class=&#34;citation&#34;&gt;Elith et al. (&lt;a href=&#34;#ref-ElithEtAl_2010&#34;&gt;2010&lt;/a&gt;)&lt;/span&gt;, provide a
way to identify non-analog environments, and to quantify the extent to
which they differ from environments in the training area. &lt;del&gt;The functions
&lt;code&gt;dismo::mess&lt;/code&gt; and &lt;code&gt;ecospat::ecospat.mess&lt;/code&gt;&lt;/del&gt; &lt;code&gt;predicts::mess&lt;/code&gt; (as of
2023-10-19) is the main R function for calculating MESS values.&lt;/p&gt;
&lt;p&gt;MESS is applied in geographic space, and calculates a single index value
for each cell. The value is based on the single environmental variable that
is most different from the conditions in the training region. If all
variables at a cell are within the range of the training data, the MESS
index will be between 0 and 100 (on the scale of the original formulation),
with 100 indicating maximum similarity with the training range. Values
below 0 indicate locations where at least one variable is outside the range
of the training data. The further the departure from the training range,
the lower the value will be. Note that this incorporates only the single
worst variable: whether one variable is out of range or all of them are,
the index reflects only the worst one.&lt;/p&gt;
&lt;p&gt;There is no formal test or strict rule for interpreting MESS maps, other
than that we ought to be skeptical of projected results into areas with
MESS values &amp;lt; 0.&lt;/p&gt;
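&lt;p&gt;To make the calculation concrete, here is a small sketch of the index for a single cell, following the formulation in &lt;span class=&#34;citation&#34;&gt;Elith et al. (&lt;a href=&#34;#ref-ElithEtAl_2010&#34;&gt;2010&lt;/a&gt;)&lt;/span&gt; as I understand it, where in-range values are scaled up to a maximum of 100. It is not the code used by any R package. The function names and the training values are hypothetical, and it assumes each variable takes more than one value in the training data.&lt;/p&gt;

```python
from bisect import bisect_left

def similarity(p, reference):
    """Similarity of value p to the training values of one variable."""
    ref = sorted(reference)
    lo, hi = ref[0], ref[-1]
    # Percentage of training values strictly smaller than p.
    f = 100.0 * bisect_left(ref, p) / len(ref)
    if f == 0:
        return 100.0 * (p - lo) / (hi - lo)  # at or below the training minimum
    if f == 100:
        return 100.0 * (hi - p) / (hi - lo)  # above the training maximum
    return min(2 * f, 2 * (100 - f))         # inside the training range

def mess(cell, training):
    """MESS for one cell: the minimum similarity across all variables."""
    return min(similarity(cell[name], training[name]) for name in cell)

# Hypothetical training data for two variables.
training = {"temp": [0, 5, 10, 15, 20],
            "precip": [200, 400, 600, 800, 1000]}

print(mess({"temp": 10, "precip": 600}, training))   # 80.0
print(mess({"temp": 25, "precip": 600}, training))   # -25.0
print(mess({"temp": 25, "precip": 1400}, training))  # -50.0
```

&lt;p&gt;The last two cells show the point made above: the index reports only the most dissimilar variable, so it cannot distinguish a cell with one novel variable from a cell with several.&lt;/p&gt;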
&lt;/div&gt;
&lt;div id=&#34;extrapolation-detection&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Extrapolation Detection&lt;/h2&gt;
&lt;p&gt;Extrapolation Detection, or &lt;code&gt;exDet&lt;/code&gt;, was proposed by &lt;span class=&#34;citation&#34;&gt;Mesgaran et al. (&lt;a href=&#34;#ref-MesgaranEtAl_2014&#34;&gt;2014&lt;/a&gt;)&lt;/span&gt;. It
uses the Mahalanobis distance as the basis for an index of novelty. It
accommodates covariation among variables, a major improvement over &lt;code&gt;MESS&lt;/code&gt;.
I’ve put together a tutorial for completing &lt;a href=&#34;https://plantarum.ca/2023/12/19/exdet&#34;&gt;exDet analysis in
R&lt;/a&gt; with more details.&lt;/p&gt;
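&lt;p&gt;The sketch below illustrates why covariation matters, using only the Mahalanobis distance itself for two variables. It is not the full exDet index, and the function name and data are hypothetical. Both test points are inside the training range of each variable taken alone, but only one of them respects the positive correlation between temperature and precipitation in the training data.&lt;/p&gt;

```python
from statistics import mean

def mahalanobis_sq(point, xs, ys):
    """Squared Mahalanobis distance of a 2-variable point from training data."""
    mx, my = mean(xs), mean(ys)
    n = len(xs)
    # Sample variances and covariance of the training data.
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # Quadratic form using the inverse of the 2-by-2 covariance matrix.
    return (syy * dx ** 2 - 2 * sxy * dx * dy + sxx * dy ** 2) / det

# Hypothetical training data: warm sites are also wet.
temp = [0, 5, 10, 15, 20]
precip = [210, 390, 620, 790, 1000]

warm_wet = mahalanobis_sq((18, 900), temp, precip)
cold_wet = mahalanobis_sq((2, 900), temp, precip)
print(warm_wet, cold_wet)
```

&lt;p&gt;The cold, wet point is far more distant than the warm, wet one, even though a range-based check on each variable separately would treat both as in range.&lt;/p&gt;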
&lt;/div&gt;
&lt;div id=&#34;coue&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;COUE&lt;/h2&gt;
&lt;p&gt;&lt;span class=&#34;citation&#34;&gt;Broennimann et al. (&lt;a href=&#34;#ref-BroennimannEtAl_2012&#34;&gt;2012&lt;/a&gt;)&lt;/span&gt; present an approach to contrast environmental
conditions between training and projection regions in E-space. E-space and
G-space provide complementary views of the niche. E-space shows the
environmental distribution of a species in the context of all possible
environmental combinations; G-space shows the geographic distribution of
the species, constrained to environmental conditions that actually exist in
the landscape. This analysis applies the &lt;code&gt;COUE&lt;/code&gt; framework (centroid
shift, overlap, unfilling and expansion).&lt;/p&gt;
&lt;p&gt;&lt;code&gt;ecospat&lt;/code&gt; &lt;span class=&#34;citation&#34;&gt;(&lt;a href=&#34;#ref-ColaEtAl_2017&#34;&gt;Cola et al., 2017&lt;/a&gt;)&lt;/span&gt; provides all the functions necessary to
implement these analyses.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;climate-models&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Climate Models&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;Climate projections are not predictions of future conditions—they are
model-derived descriptions of possible future climates under a given set
of plausible scenarios of climate forcings. The intention of simulating
future climate is not to make accurate predictions regarding the future
state of the climate system at any given point in time but to represent
the range of plausible futures and establish the envelope that the future
climate could conceivably occupy.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;– &lt;span class=&#34;citation&#34;&gt;Harris et al. (&lt;a href=&#34;#ref-HarrisEtAl_2014&#34;&gt;2014&lt;/a&gt;)&lt;/span&gt;&lt;/p&gt;
&lt;div id=&#34;recommendations&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Recommendations&lt;/h3&gt;
&lt;p&gt;From &lt;span class=&#34;citation&#34;&gt;Harris et al. (&lt;a href=&#34;#ref-HarrisEtAl_2014&#34;&gt;2014&lt;/a&gt;)&lt;/span&gt; (originally presented as 9 points):&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;&lt;p&gt;Include a high and a low emissions scenario, to capture ‘best-case’ and
‘worst-case’ scenarios.&lt;/p&gt;
&lt;p&gt;Note that as of 2014 we are trending close to the high emissions RCP8.5
scenario. RCP2.6 represents an increasingly unlikely aggressive
mitigation approach.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Time and resources may limit us to one or two emissions scenarios, but
more than one GCM should be used.&lt;/p&gt;
&lt;p&gt;Different GCMs have different strengths, weaknesses and biases. Some
models are known to be ‘wet’ or ‘dry’ or ‘hot’, compared to the mean of
all GCMs. These biases are also spatially variable: some GCMs may be
‘hot’ for Africa, and ‘cold’ for North America.&lt;/p&gt;
&lt;p&gt;Ostensibly different GCMs may share code and assumptions, and thus share
biases in their projections.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Consider the most appropriate way to present the output from multiple
climate models;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If the multi-model mean is presented, report the range or standard
deviation;&lt;/li&gt;
&lt;li&gt;best-case and worst-case scenarios can be presented using
envelopes/binary maps: envelopes based on locations where &lt;em&gt;any&lt;/em&gt; of the
models predicts suitable habitat show the worst case (i.e., all areas
identified as being suitable habitat in any model), while envelopes based
on locations identified by &lt;em&gt;all&lt;/em&gt; models show the best case (i.e., only
areas that all models agree are suitable are identified) [a good approach
for invasive alien species, IAS].&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Choose a baseline time period appropriate to the study&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;baselines are preferably defined on data amalgamated over 30 years, to
account for stochastic inter-annual variation. Shorter time periods
may be unacceptably influenced by noisy weather.&lt;/li&gt;
&lt;li&gt;baseline climate should correspond to the time period in which the
data (i.e., observations) were collected&lt;/li&gt;
&lt;li&gt;NB: different climate sources use different baseline periods!&lt;/li&gt;
&lt;li&gt;Further discussed in &lt;span class=&#34;citation&#34;&gt;Roubicek et al. (&lt;a href=&#34;#ref-RoubicekEtAl_2010&#34;&gt;2010&lt;/a&gt;)&lt;/span&gt;. They tested the sensitivity
of SDMs trained on a different baseline than the one used to simulate
the data, and found those models were significantly worse than ones
trained on the correct baseline.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Be aware of the real resolution of the climate data used;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;GCMs are coarse resolution, and need to be down-scaled for use with
SDMs.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Maintain a dialog with climate modelers, to keep up-to-date with
developments in climate models.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
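&lt;p&gt;The options for presenting multiple climate models in point 3 can be sketched as follows. This is an illustration only: the GCM names, suitability scores and binary maps are hypothetical, and each list stands in for a raster with four cells.&lt;/p&gt;

```python
from statistics import mean, stdev

# Hypothetical suitability projections from three GCMs for four cells.
suitability = {
    "gcm_a": [0.9, 0.6, 0.2, 0.1],
    "gcm_b": [0.8, 0.3, 0.7, 0.1],
    "gcm_c": [0.7, 0.6, 0.1, 0.2],
}
# The same projections after thresholding (1 = suitable, 0 = unsuitable).
binary = {
    "gcm_a": [1, 1, 0, 0],
    "gcm_b": [1, 0, 1, 0],
    "gcm_c": [1, 1, 0, 0],
}

# Multi-model mean, reported together with its standard deviation.
cells = list(zip(*suitability.values()))
ensemble_mean = [round(mean(c), 3) for c in cells]
ensemble_sd = [round(stdev(c), 3) for c in cells]

# Envelopes: suitable in any model, or suitable in all models.
binary_cells = list(zip(*binary.values()))
any_model = [int(any(c)) for c in binary_cells]
all_models = [int(all(c)) for c in binary_cells]

print(ensemble_mean, ensemble_sd)
print(any_model)   # [1, 1, 1, 0]
print(all_models)  # [1, 0, 0, 0]
```

&lt;p&gt;The second and third cells are where the models disagree: they are included in the &lt;em&gt;any&lt;/em&gt; envelope, excluded from the &lt;em&gt;all&lt;/em&gt; envelope, and have the largest standard deviations.&lt;/p&gt;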
&lt;/div&gt;
&lt;div id=&#34;worldclim-1&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;WorldClim&lt;/h3&gt;
&lt;p&gt;&lt;a href=&#34;https://worldclim.org/data/cmip6/cmip6climate.html&#34;&gt;WorldClim 2.1&lt;/a&gt; has
CMIP6 data: 9 GCMs for four SSPs, at resolutions down to 2.5 minutes
(30-second data were due for release in March 2020, and are overdue),
projected to 2040, 2060, 2080 and 2100.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://worldclim.org/data/v1.4/cmip5.html&#34;&gt;WorldClim 1.4&lt;/a&gt; has CMIP5 data:
19 GCMs, four RCPs, projected to 2050 and 2070, at resolutions as fine as 30
seconds.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;definitions&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Definitions&lt;/h3&gt;
&lt;dl&gt;
&lt;dt&gt;AR&lt;/dt&gt;
&lt;dd&gt;
&lt;strong&gt;Assessment Report&lt;/strong&gt; for the IPCC. AR4 was released in 2007, AR5 in
2014, AR6 is scheduled for 2022
&lt;/dd&gt;
&lt;dt&gt;CMIP&lt;/dt&gt;
&lt;dd&gt;
&lt;strong&gt;Coupled Model Intercomparison Project&lt;/strong&gt;, IPCC collection of GCMs,
numbered to correspond to the AR (e.g., CMIP5 for AR5). “Models are only
admitted to the CMIP archive if they meet a suite of rigorous
requirements, including consistency with relevant observations, both past
and present, and with fundamental physical principles.” IPCC doesn’t
judge the models beyond deeming them fit for service. As such, the
collection of models represents a set of plausible future climates under a
given emissions scenario.
&lt;/dd&gt;
&lt;dt&gt;GCM&lt;/dt&gt;
&lt;dd&gt;
&lt;strong&gt;General Circulation Model&lt;/strong&gt;: 3D numerical representation of the climate
system. 50-250km cell size, 10-20 layers in the atmosphere. Includes
&lt;strong&gt;Atmosphere-Ocean GCM&lt;/strong&gt; (AOGCM), incorporating interactions between the
oceans and atmosphere, and &lt;strong&gt;Earth System Models&lt;/strong&gt; (ESM), GCMs that
incorporate biogeochemical cycles (e.g., carbon cycle); ESM may also
contain dynamic global vegetation models (DGVM)
&lt;/dd&gt;
&lt;dt&gt;SRES / RCP / SSP&lt;/dt&gt;
&lt;dd&gt;
&lt;strong&gt;Special Report on Emissions Scenarios&lt;/strong&gt;, socioeconomic analysis of
future climate emissions under varying conditions, used in AR4. Updated
to &lt;strong&gt;Representative Concentration Pathways&lt;/strong&gt;, in AR5. Updated to &lt;strong&gt;Shared
Socioeconomic Pathways&lt;/strong&gt;, in AR6. RCP2.6/SSP 126 is aggressive mitigation
(best case); RCP8.5/SSP 585 is our current trajectory (worst case)
&lt;/dd&gt;
&lt;/dl&gt;
&lt;/div&gt;
&lt;div id=&#34;downscaling&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Downscaling&lt;/h3&gt;
&lt;p&gt;GCMs have a native resolution in the range of 50-250km, much coarser
than what is used for SDMs. To use for model projections, they need to be
downscaled (i.e., converted to finer resolution) to the required scale. If
you are using
&lt;a href=&#34;https://worldclim.org/data/cmip6/cmip6climate.html&#34;&gt;WorldClim&lt;/a&gt; data, it
has already been downscaled for you.&lt;/p&gt;
&lt;p&gt;Three options:&lt;/p&gt;
&lt;dl&gt;
&lt;dt&gt;Dynamic Downscaling&lt;/dt&gt;
&lt;dd&gt;
uses the coarse-resolution GCMs as the input for similarly complex
fine-scale models for the region of interest. Lots of added value
compared to the GCMs, but computationally expensive, requiring expert
skill to produce and interpret. Thus availability is limited.
&lt;/dd&gt;
&lt;dt&gt;Statistical Downscaling&lt;/dt&gt;
&lt;dd&gt;
uses past coarse- and fine-scale data to establish statistical/numerical
models mapping regional data to local data. Then uses the resulting model
to create future local data from future projections. Easier to do than
Dynamic Downscaling, but less realistic, with more implicit assumptions.
&lt;/dd&gt;
&lt;dt&gt;Simple Scaling&lt;/dt&gt;
&lt;dd&gt;
break coarse pixels up into smaller pixels containing the same value as
the parent; doesn’t create any new values, or introduce any error.
&lt;/dd&gt;
&lt;/dl&gt;
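&lt;p&gt;Of the three options, simple scaling is the only one that fits in a few lines, so here is a minimal sketch. The function &lt;code&gt;simple_scale&lt;/code&gt; and the grid values are hypothetical, and a nested list stands in for a raster.&lt;/p&gt;

```python
def simple_scale(grid, factor):
    """Split each coarse cell into factor-by-factor fine cells with the same value."""
    fine = []
    for row in grid:
        # Repeat each value along the row, then repeat the whole row.
        fine_row = [value for value in row for _ in range(factor)]
        fine.extend([list(fine_row) for _ in range(factor)])
    return fine

# Hypothetical 2-by-2 coarse grid of temperatures.
coarse = [[10.0, 12.0],
          [14.0, 16.0]]
fine = simple_scale(coarse, 2)
for row in fine:
    print(row)
```

&lt;p&gt;The result is a 4-by-4 grid that contains only the four original values, which is why this approach adds no new information at the finer scale.&lt;/p&gt;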
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;ensembles&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Ensembles&lt;/h1&gt;
&lt;p&gt;TODO&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;to-read&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;TO-READ&lt;/h1&gt;
&lt;p&gt;Boria, R. A. et al. 2014. Spatial filtering to reduce sampling bias can
improve the performance of ecological niche models. – Ecol. Model. 275:
73–77.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;references&#34; class=&#34;section level1 unnumbered&#34;&gt;
&lt;h1&gt;References&lt;/h1&gt;
&lt;div id=&#34;refs&#34; class=&#34;references csl-bib-body hanging-indent&#34;&gt;
&lt;div id=&#34;ref-Aiello-LammensEtAl_2015&#34; class=&#34;csl-entry&#34;&gt;
Aiello‐Lammens, M. E., R. A. Boria, A. Radosavljevic, B. Vilela, and R. P. Anderson. 2015. &lt;a href=&#34;https://doi.org/10.1111/ecog.01132&#34;&gt;spThin: an R package for spatial thinning of species occurrence records for use in ecological niche models&lt;/a&gt;. &lt;em&gt;Ecography&lt;/em&gt; 38: 541–545.
&lt;/div&gt;
&lt;div id=&#34;ref-AtwaterEtAl_2018&#34; class=&#34;csl-entry&#34;&gt;
Atwater, D. Z., C. Ervine, and J. N. Barney. 2018. &lt;a href=&#34;https://doi.org/10.1038/s41559-017-0396-z&#34;&gt;Climatic niche shifts are common in introduced plants&lt;/a&gt;. &lt;em&gt;Nature Ecology &amp;amp; Evolution&lt;/em&gt; 2: 34–43.
&lt;/div&gt;
&lt;div id=&#34;ref-BahnMcGill_2007&#34; class=&#34;csl-entry&#34;&gt;
Bahn, V., and B. J. McGill. 2007. &lt;a href=&#34;https://doi.org/10.1111/j.1466-8238.2007.00331.x&#34;&gt;Can niche-based distribution models outperform spatial interpolation?&lt;/a&gt; &lt;em&gt;Global Ecology and Biogeography&lt;/em&gt; 16: 733–742.
&lt;/div&gt;
&lt;div id=&#34;ref-Barbet-MassinEtAl_2012&#34; class=&#34;csl-entry&#34;&gt;
Barbet-Massin, M., F. Jiguet, C. H. Albert, and W. Thuiller. 2012. &lt;a href=&#34;https://doi.org/10.1111/j.2041-210X.2011.00172.x&#34;&gt;Selecting pseudo-absences for species distribution models: how, where and how many?&lt;/a&gt; &lt;em&gt;Methods in Ecology and Evolution&lt;/em&gt; 3: 327–338.
&lt;/div&gt;
&lt;div id=&#34;ref-BarveEtAl_2011&#34; class=&#34;csl-entry&#34;&gt;
Barve, N., V. Barve, A. Jiménez-Valverde, A. Lira-Noriega, S. P. Maher, A. T. Peterson, J. Soberón, and F. Villalobos. 2011. &lt;a href=&#34;https://doi.org/10.1016/j.ecolmodel.2011.02.011&#34;&gt;The crucial role of the accessible area in ecological niche modeling and species distribution modeling&lt;/a&gt;. &lt;em&gt;Ecological Modelling&lt;/em&gt; 222: 1810–1819.
&lt;/div&gt;
&lt;div id=&#34;ref-Booth_2022&#34; class=&#34;csl-entry&#34;&gt;
Booth, T. H. 2022. &lt;a href=&#34;https://doi.org/10.1111/aec.13234&#34;&gt;Checking bioclimatic variables that combine temperature and precipitation data before their use in species distribution models&lt;/a&gt;. &lt;em&gt;Austral Ecology&lt;/em&gt; 47: 1506–1514.
&lt;/div&gt;
&lt;div id=&#34;ref-BoyceEtAl_2002&#34; class=&#34;csl-entry&#34;&gt;
Boyce, M. S., P. R. Vernier, S. E. Nielsen, and F. K. A. Schmiegelow. 2002. &lt;a href=&#34;https://doi.org/10.1016/S0304-3800(02)00200-4&#34;&gt;Evaluating resource selection functions&lt;/a&gt;. &lt;em&gt;Ecological Modelling&lt;/em&gt; 157: 281–300.
&lt;/div&gt;
&lt;div id=&#34;ref-BroennimannEtAl_2012&#34; class=&#34;csl-entry&#34;&gt;
Broennimann, O., M. C. Fitzpatrick, P. B. Pearman, B. Petitpierre, L. Pellissier, N. G. Yoccoz, W. Thuiller, et al. 2012. &lt;a href=&#34;https://doi.org/10.1111/j.1466-8238.2011.00698.x&#34;&gt;Measuring ecological niche overlap from occurrence and spatial environmental data&lt;/a&gt;. &lt;em&gt;Global Ecology and Biogeography&lt;/em&gt; 21: 481–497.
&lt;/div&gt;
&lt;div id=&#34;ref-ColaEtAl_2017&#34; class=&#34;csl-entry&#34;&gt;
Cola, V. D., O. Broennimann, B. Petitpierre, F. T. Breiner, M. D’Amen, C. Randin, R. Engler, et al. 2017. &lt;a href=&#34;https://doi.org/10.1111/ecog.02671&#34;&gt;ecospat: an R package to support spatial analyses and modeling of species niches and distributions&lt;/a&gt;. &lt;em&gt;Ecography&lt;/em&gt; 40: 774–787.
&lt;/div&gt;
&lt;div id=&#34;ref-ElithEtAl_2006&#34; class=&#34;csl-entry&#34;&gt;
Elith, J., C. H. Graham, R. P. Anderson, M. Dudík, S. Ferrier, A. Guisan, R. J. Hijmans, et al. 2006. &lt;a href=&#34;https://doi.org/10.1111/j.2006.0906-7590.04596.x&#34;&gt;Novel methods improve prediction of species’ distributions from occurrence data&lt;/a&gt;. &lt;em&gt;Ecography&lt;/em&gt; 29: 129–151.
&lt;/div&gt;
&lt;div id=&#34;ref-ElithEtAl_2010&#34; class=&#34;csl-entry&#34;&gt;
Elith, J., M. Kearney, and S. Phillips. 2010. &lt;a href=&#34;https://doi.org/10.1111/j.2041-210X.2010.00036.x&#34;&gt;The art of modelling range-shifting species&lt;/a&gt;. &lt;em&gt;Methods in Ecology and Evolution&lt;/em&gt; 1: 330–342.
&lt;/div&gt;
&lt;div id=&#34;ref-ElithEtAl_2011&#34; class=&#34;csl-entry&#34;&gt;
Elith, J., S. J. Phillips, T. Hastie, M. Dudík, Y. E. Chee, and C. J. Yates. 2011. &lt;a href=&#34;https://doi.org/10.1111/j.1472-4642.2010.00725.x&#34;&gt;A statistical explanation of MaxEnt for ecologists: Statistical explanation of MaxEnt&lt;/a&gt;. &lt;em&gt;Diversity and Distributions&lt;/em&gt; 17: 43–57.
&lt;/div&gt;
&lt;div id=&#34;ref-FickHijmans_2017&#34; class=&#34;csl-entry&#34;&gt;
Fick, S. E., and R. J. Hijmans. 2017. &lt;a href=&#34;https://doi.org/10.1002/joc.5086&#34;&gt;WorldClim 2: new 1-km spatial resolution climate surfaces for global land areas&lt;/a&gt;. &lt;em&gt;International Journal of Climatology&lt;/em&gt; 37: 4302–4315.
&lt;/div&gt;
&lt;div id=&#34;ref-GuisanEtAl_2017&#34; class=&#34;csl-entry&#34;&gt;
Guisan, A., W. Thuiller, and N. E. Zimmermann. 2017. Habitat Suitability and Distribution Models: with Applications in R. Cambridge University Press.
&lt;/div&gt;
&lt;div id=&#34;ref-HarrisEtAl_2020&#34; class=&#34;csl-entry&#34;&gt;
Harris, I., T. J. Osborn, P. Jones, and D. Lister. 2020. &lt;a href=&#34;https://doi.org/10.1038/s41597-020-0453-3&#34;&gt;Version 4 of the CRU TS monthly high-resolution gridded multivariate climate dataset&lt;/a&gt;. &lt;em&gt;Scientific Data&lt;/em&gt; 7: 109.
&lt;/div&gt;
&lt;div id=&#34;ref-HarrisEtAl_2014&#34; class=&#34;csl-entry&#34;&gt;
Harris, R. M. B., M. R. Grose, G. Lee, N. L. Bindoff, L. L. Porfirio, and P. Fox‐Hughes. 2014. &lt;a href=&#34;https://doi.org/10.1002/wcc.291&#34;&gt;Climate projections for ecologists&lt;/a&gt;. &lt;em&gt;WIREs Climate Change&lt;/em&gt; 5: 621–637.
&lt;/div&gt;
&lt;div id=&#34;ref-HijmansEtAl_2017&#34; class=&#34;csl-entry&#34;&gt;
Hijmans, R. J., S. Phillips, J. Leathwick, and J. Elith. 2017. &lt;a href=&#34;https://CRAN.R-project.org/package=dismo&#34;&gt;dismo R package Version 1.1-4&lt;/a&gt;.
&lt;/div&gt;
&lt;div id=&#34;ref-HirzelEtAl_2006&#34; class=&#34;csl-entry&#34;&gt;
Hirzel, A. H., G. Le Lay, V. Helfer, C. Randin, and A. Guisan. 2006. &lt;a href=&#34;https://doi.org/10.1016/j.ecolmodel.2006.05.017&#34;&gt;Evaluating the ability of habitat suitability models to predict species presences&lt;/a&gt;. &lt;em&gt;Ecological Modelling&lt;/em&gt; 199: 142–152.
&lt;/div&gt;
&lt;div id=&#34;ref-Lee-YawEtAl_2018&#34; class=&#34;csl-entry&#34;&gt;
Lee‐Yaw, J. A., M. Fracassetti, and Y. Willi. 2018. &lt;a href=&#34;https://doi.org/10.1111/ecog.02869&#34;&gt;Environmental marginality and geographic range limits: a case study with &lt;em&gt;Arabidopsis lyrata&lt;/em&gt; ssp. &lt;em&gt;lyrata&lt;/em&gt;&lt;/a&gt;. &lt;em&gt;Ecography&lt;/em&gt; 41: 622–634.
&lt;/div&gt;
&lt;div id=&#34;ref-LegendreLegendre_2012&#34; class=&#34;csl-entry&#34;&gt;
Legendre, P., and L. Legendre. 2012. Numerical ecology. 3rd Edition. Elsevier, New York.
&lt;/div&gt;
&lt;div id=&#34;ref-LoboEtAl_2008&#34; class=&#34;csl-entry&#34;&gt;
Lobo, J. M., A. Jiménez‐Valverde, and R. Real. 2008. &lt;a href=&#34;https://doi.org/10.1111/j.1466-8238.2007.00358.x&#34;&gt;AUC: a misleading measure of the performance of predictive distribution models&lt;/a&gt;. &lt;em&gt;Global Ecology and Biogeography&lt;/em&gt; 17: 145–151.
&lt;/div&gt;
&lt;div id=&#34;ref-MerowEtAl_2013&#34; class=&#34;csl-entry&#34;&gt;
Merow, C., M. J. Smith, and J. A. Silander. 2013. &lt;a href=&#34;https://doi.org/10.1111/j.1600-0587.2013.07872.x&#34;&gt;A practical guide to MaxEnt for modeling species’ distributions: what it does, and why inputs and settings matter&lt;/a&gt;. &lt;em&gt;Ecography&lt;/em&gt; 36: 1058–1069.
&lt;/div&gt;
&lt;div id=&#34;ref-MesgaranEtAl_2014&#34; class=&#34;csl-entry&#34;&gt;
Mesgaran, M. B., R. D. Cousens, and B. L. Webber. 2014. &lt;a href=&#34;https://doi.org/10.1111/ddi.12209&#34;&gt;Here be dragons: a tool for quantifying novelty due to covariate range and correlation change when projecting species distribution models&lt;/a&gt;. &lt;em&gt;Diversity and Distributions&lt;/em&gt; 20: 1147–1159.
&lt;/div&gt;
&lt;div id=&#34;ref-MuscarellaEtAl_2014&#34; class=&#34;csl-entry&#34;&gt;
Muscarella, R., P. J. Galante, M. Soley‐Guardia, R. A. Boria, J. M. Kass, M. Uriarte, and R. P. Anderson. 2014. &lt;a href=&#34;https://doi.org/10.1111/2041-210X.12261&#34;&gt;ENMeval: An R package for conducting spatially independent evaluations and estimating optimal model complexity for Maxent ecological niche models&lt;/a&gt;. &lt;em&gt;Methods in Ecology and Evolution&lt;/em&gt; 5: 1198–1205.
&lt;/div&gt;
&lt;div id=&#34;ref-PearsonEtAl_2013&#34; class=&#34;csl-entry&#34;&gt;
Pearson, R. G., S. J. Phillips, M. M. Loranty, P. S. A. Beck, T. Damoulas, S. J. Knight, and S. J. Goetz. 2013. &lt;a href=&#34;https://doi.org/10.1038/nclimate1858&#34;&gt;Shifts in Arctic vegetation and associated feedbacks under climate change&lt;/a&gt;. &lt;em&gt;Nature Climate Change&lt;/em&gt; 3: 673–677.
&lt;/div&gt;
&lt;div id=&#34;ref-PetitpierreEtAl_2017&#34; class=&#34;csl-entry&#34;&gt;
Petitpierre, B., O. Broennimann, C. Kueffer, C. Daehler, and A. Guisan. 2017. &lt;a href=&#34;https://doi.org/10.1111/geb.12530&#34;&gt;Selecting predictors to maximize the transferability of species distribution models: lessons from cross-continental plant invasions&lt;/a&gt;. &lt;em&gt;Global Ecology and Biogeography&lt;/em&gt; 26: 275–287.
&lt;/div&gt;
&lt;div id=&#34;ref-Phillips_2017&#34; class=&#34;csl-entry&#34;&gt;
Phillips, S. J. 2017. A Brief Tutorial on Maxent. Website &lt;a href=&#34;http://biodiversityinformatics.amnh.org/open_source/maxent/&#34;&gt;http://biodiversityinformatics.amnh.org/open_source/maxent/&lt;/a&gt; [accessed 19 May 2020].
&lt;/div&gt;
&lt;div id=&#34;ref-PhillipsEtAl_2017&#34; class=&#34;csl-entry&#34;&gt;
Phillips, S. J., R. P. Anderson, M. Dudík, R. E. Schapire, and M. E. Blair. 2017. &lt;a href=&#34;https://doi.org/10.1111/ecog.03049&#34;&gt;Opening the black box: an open-source release of Maxent&lt;/a&gt;. &lt;em&gt;Ecography&lt;/em&gt; 40: 887–893.
&lt;/div&gt;
&lt;div id=&#34;ref-PhillipsDudik_2008&#34; class=&#34;csl-entry&#34;&gt;
Phillips, S. J., and M. Dudík. 2008. &lt;a href=&#34;https://doi.org/10.1111/j.0906-7590.2008.5203.x&#34;&gt;Modeling of species distributions with Maxent: new extensions and a comprehensive evaluation&lt;/a&gt;. &lt;em&gt;Ecography&lt;/em&gt; 31: 161–175.
&lt;/div&gt;
&lt;div id=&#34;ref-PhillipsElith_2010&#34; class=&#34;csl-entry&#34;&gt;
Phillips, S. J., and J. Elith. 2010. &lt;a href=&#34;https://doi.org/10.1890/09-0760.1&#34;&gt;POC plots: calibrating species distribution models with presence-only data&lt;/a&gt;. &lt;em&gt;Ecology&lt;/em&gt; 91: 2476–2484.
&lt;/div&gt;
&lt;div id=&#34;ref-RadosavljevicAnderson_2014&#34; class=&#34;csl-entry&#34;&gt;
Radosavljevic, A., and R. P. Anderson. 2014. &lt;a href=&#34;https://doi.org/10.1111/jbi.12227&#34;&gt;Making better Maxent models of species distributions: complexity, overfitting and evaluation&lt;/a&gt;. &lt;em&gt;Journal of Biogeography&lt;/em&gt; 41: 629–643.
&lt;/div&gt;
&lt;div id=&#34;ref-RennerEtAl_2015&#34; class=&#34;csl-entry&#34;&gt;
Renner, I. W., J. Elith, A. Baddeley, W. Fithian, T. Hastie, S. J. Phillips, G. Popovic, and D. I. Warton. 2015. &lt;a href=&#34;https://doi.org/10.1111/2041-210X.12352&#34;&gt;Point process models for presence-only analysis&lt;/a&gt;. &lt;em&gt;Methods in Ecology and Evolution&lt;/em&gt; 6: 366–379.
&lt;/div&gt;
&lt;div id=&#34;ref-Rodriguez-ReyEtAl_2013&#34; class=&#34;csl-entry&#34;&gt;
Rodríguez-Rey, M., A. Jiménez-Valverde, and P. Acevedo. 2013. &lt;a href=&#34;https://doi.org/10.1016/j.ecolmodel.2013.01.024&#34;&gt;Species distribution models predict range expansion better than chance but not better than a simple dispersal model&lt;/a&gt;. &lt;em&gt;Ecological Modelling&lt;/em&gt; 256: 1–5.
&lt;/div&gt;
&lt;div id=&#34;ref-RoubicekEtAl_2010&#34; class=&#34;csl-entry&#34;&gt;
Roubicek, A. J., J. VanDerWal, L. J. Beaumont, A. J. Pitman, P. Wilson, and L. Hughes. 2010. &lt;a href=&#34;https://doi.org/10.1016/j.ecolmodel.2010.06.021&#34;&gt;Does the choice of climate baseline matter in ecological niche modelling?&lt;/a&gt; &lt;em&gt;Ecological Modelling&lt;/em&gt; 221: 2280–2286.
&lt;/div&gt;
&lt;div id=&#34;ref-Soberon_2010&#34; class=&#34;csl-entry&#34;&gt;
Soberón, J. M. 2010. &lt;a href=&#34;https://doi.org/10.1111/j.1600-0587.2009.06074.x&#34;&gt;Niche and area of distribution modeling: a population ecology perspective&lt;/a&gt;. &lt;em&gt;Ecography&lt;/em&gt; 33: 159–167.
&lt;/div&gt;
&lt;div id=&#34;ref-ValaviEtAl_2022&#34; class=&#34;csl-entry&#34;&gt;
Valavi, R., G. Guillera-Arroita, J. J. Lahoz-Monfort, and J. Elith. 2022. &lt;a href=&#34;https://doi.org/10.1002/ecm.1486&#34;&gt;Predictive performance of presence-only species distribution models: a benchmark study with reproducible code&lt;/a&gt;. &lt;em&gt;Ecological Monographs&lt;/em&gt; 92: e01486.
&lt;/div&gt;
&lt;div id=&#34;ref-VarelaEtAl_2014&#34; class=&#34;csl-entry&#34;&gt;
Varela, S., R. P. Anderson, R. García-Valdés, and F. Fernández-González. 2014. &lt;a href=&#34;https://doi.org/10.1111/j.1600-0587.2013.00441.x&#34;&gt;Environmental filters reduce the effects of sampling bias and improve predictions of ecological niche models&lt;/a&gt;. &lt;em&gt;Ecography&lt;/em&gt; 37: 1084–1091.
&lt;/div&gt;
&lt;div id=&#34;ref-WarrenEtAl_2019&#34; class=&#34;csl-entry&#34;&gt;
Warren, D. L., N. Matzke, M. Cardillo, J. Baumgartner, L. Beaumont, N. Huron, M. Simões, et al. 2019. &lt;a href=&#34;https://doi.org/10.5281/zenodo.3268814&#34;&gt;ENMTools R Package&lt;/a&gt;.
&lt;/div&gt;
&lt;div id=&#34;ref-WarrenSeifert_2011&#34; class=&#34;csl-entry&#34;&gt;
Warren, D. L., and S. N. Seifert. 2011. &lt;a href=&#34;https://doi.org/10.1890/10-1171.1&#34;&gt;Ecological niche modeling in Maxent: the importance of model complexity and the performance of model selection criteria&lt;/a&gt;. &lt;em&gt;Ecological Applications&lt;/em&gt; 21: 335–342.
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
</description>
    </item>
    
  </channel>
</rss>
