Reference
This is an update of my previous mapping
tutorial. Spatial analysis in R is shifting to
terra
and sf
as the primary packages, so I’ve translated my old,
raster
-based tutorial to the new workflow. See the RSpatial
tutorial for a more detailed
introduction/overview of using terra
for GIS/spatial analysis.
The following tutorial walks through some common plotting tasks I use for distribution models.
Basemaps
The geodata
package provides several convenient functions for downloading
raster and vector maps for use as basemaps and spatial analysis. The first
time you use these functions, they will download the requested maps from
the internet. It will save the data in your working directory, or in a
location specified with the path
argument. The next time you request the
same data, if it finds them in the local directory (or the specified
path
), they will be loaded from there, saving the time and bandwith
necessary to download them.
library(geodata)
## Loading required package: terra
## terra 1.7.29
us <- gadm(country = "USA", level = 1, resolution = 2,
path = "../data/maps/")
canada <- gadm(country = "CAN", level = 1, resolution = 2,
path = "../data/maps")
mexico <- gadm(country = "MX", level = 1, resolution = 2,
path = "../data/maps")
Countries are specified by their ISO code, which you can find by calling
the function country_codes()
. The by default, country_codes()
returns a
table of countries and the various ISO codes they have, as well as the
continents they are in. The query
argument lets you filter this table on
any value. For example, you can get a table of North American countries
with:
country_codes("North America")
or, just their ISO codes with:
country_codes("North America")$ISO3
The other arguments allow you to select, national, subnational, or
subprovincial borders (level
1-3), the resolution
(high: 1, low: 2),
and the path
to the directory where you want the maps stored.
These maps can be plotted directly with the plot
command. If you want to
combine them, use the add = TRUE
argument to the second plot
call:
plot(us, lwd = 2)
plot(canada, add = TRUE, col = 'red')
plot(mexico, add = TRUE, border = "green")
col
sets the fill colour, border
sets the outline color.
You can also request multiple countries as a single set of polygons, by
passing a character vector to the country
argument.
NorthAmerica <- gadm(country = country_codes("North America")$ISO3,
level = 0, resolution = 2,
path = "../data/maps/")
plot(NorthAmerica, xlim = c(-180, -50))
Note that xlim
and ylim
work as you would expect for plotting.
If you want to combine multiple polygons into a single object, use rbind
:
CanUS <- rbind(us, canada)
plot(CanUS, xlim = c(-180, -50))
These maps are ‘unprojected’, meaning they are plotted in latitude/longitude degrees. That makes it easy to set the plot boundaries:
plot(CanUS, xlim = c(-100, -50), ylim = c(30, 60))
plot(NorthAmerica, lwd = 2, add = TRUE)
NB: The size of your plot canvas is fixed, but a map can’t stretch. The x and y dimensions have to maintain the same aspect. That means zooming in one dimension (i.e. latitude only) won’t necessarily change the zoom of your map, if the other dimension fills the canvas. You’ll have to play around with the plot size, and both x and y dimensions together, to tweak your zoom.
It’s handy to have a shapefile of the Great Lakes, for making prettier maps. I created this one in QGIS and use it for plotting:
greatlakes <- vect("../data/maps/greatlakes.shp")
Adding Data
Vectors
You can add points to the plot like a regular scatter plot:
library(scales) ## for the alpha function below
##
## Attaching package: 'scales'
## The following object is masked from 'package:terra':
##
## rescale
gbif <- read.table("../data/trich-gbif.csv")
## Set the line color to gray to focus on the data points:
plot(CanUS, xlim = c(-100, -50), ylim = c(30, 60),
border = "gray")
points(gbif$X, gbif$Y, pch = 16,
col = alpha("green", 0.2))
You can also convert your points to a spatial vector object, in which case R will know which columns to use for plotting. This is also necessary before we can project our data (see below).
gbif <- vect(gbif, geom = c("X", "Y"))
## plot(CanUS, xlim = c(-100, -50), ylim = c(30, 60),
## border = "gray")
## points(gbif, pch = 16, col = alpha("green", 0.2))
Subsetting vectors
You often need to select polygons that contain points, or select only those
points that occur within a particular polygon. You can do this with R’s [
subsetting syntax. For instance, <polygons>[<points>, ]
will select
polygons
that contain one or more points
:
plot(CanUS, xlim = c(-100, -50), ylim = c(30, 60),
border = "gray")
## Select states and provinces where Trichophorum is
## present:
plot(CanUS[gbif, ], col = 'red', add = TRUE)
points(gbif, pch = 21, col = 'white', bg = 'black')
Conversely, <points>[<polygons>, ]
will select points that are within
polygons. In this example, we’ll use the fact that our CanUS
SpatVector
object has a column named NAME_1
that holds the name of the state or
province of each polygon:
plot(CanUS, xlim = c(-100, -50), ylim = c(30, 60),
border = "gray")
plot(CanUS[gbif, ], col = 'red', add = TRUE)
## Select only points in NY state:
points(gbif[CanUS[CanUS$NAME_1 == "New York", ], ],
pch = 21, col = 'white', bg = 'black')
Rasters
Similarly, you can plot rasters with plot:
trichPreds <- rast("../data/trichPreds.grd")
plot(trichPreds, xlim = c(-100, -50), ylim = c(30, 60))
plot(CanUS, border = "gray", lwd = 0.5, add = TRUE)
Cells with NA
values are transparent. In this case, a species
distribution model, low values are displayed in gray. This may be useful
for visualizing the extent of the model. However, it looks a bit odd, and
makes it hard to see limits of the high-suitability areas. You can tweak
this by playing with the color ramp, but it’s also handy to ‘turn off’ the
low values entirely (for visualization, not for analysis!!)
trichPredsTrim <- trichPreds
trichPredsTrim[trichPredsTrim <
quantile(values(trichPreds),
probs = 0.75, na.rm = TRUE)] <- NA
plot(trichPredsTrim, xlim = c(-100, -50), ylim = c(30, 60))
plot(CanUS, border = "grey", add = TRUE)
The test I used here, trichPredsTrim < quantile(getValues(trichPreds), probs = 0.75, na.rm = TRUE)
identifies all cells in the lower 75% of the
suitability scores, which I then set to NA
to make them invisible. I
decided on 75% after experimenting with different values. In this case, 75%
drops most of the grey background (the very lowest values), without eating
into the areas that the prediction indicates are suitable.
You could also use an absolute value here, but then you’d need to know the
actual distribution of the suitability scores. quantile
is easier to
tweak.
Alternatively, you can assign colours to the prediction raster based on the value of each pixel, after splitting them into categories:
suitability <- extract(trichPreds, gbif, ID = FALSE)[, 1]
predCat <- (trichPreds > quantile(suitability, 0.25)) +
(trichPreds > quantile(suitability, 0.15)) +
(trichPreds > quantile(suitability, 0.05)) +
(trichPreds > quantile(suitability, 0.025))
predCols <- c("white", "grey95", "yellow3", "orange", "red")
plot(predCat, xlim = c(-100, -50), colNA = 'lightblue',
col = predCols, ylim = c(30, 60), legend = FALSE)
plot(CanUS, border = "grey", add = TRUE)
I find using a small number of categories makes it easier to read the suitability map than trying to interpret the colour gradients usually used.
Projections
Lat/Lon maps look a bit square; we’re more used to seeing maps projected. A common projection for Canada is Lambert Conformal Conic. We can transform our data to this projection to make nicer maps:
## define the projection
canlam <- "+proj=lcc +lat_1=49 +lat_2=77 +lat_0=49 +lon_0=-95 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs"
## project our vector data:
CanUS.lcc <- project(CanUS, canlam)
gl.lcc <- project(greatlakes, canlam)
## Now we to set the projection of our points:
crs(gbif) <- "+proj=longlat +datum=WGS84"
## Finally, we can project our points to LCC:
gbif.lcc <- project(gbif, canlam)
Note that our data needs to be in an object of class Spat*
, and it must
have a defined coordinate reference system (CRS) before we can project it
to a new CRS. The crs
function allows us to explicitly set the
projection. See the RSpatial
tutorial for more
details on specifying CRS with terra
.
The same function works for rasters (sort of):
rasterLCC <- project(trichPredsTrim, canlam)
This will reproject my raster from lat/lon to Lambert Conformal Conic, but we will unavoidably lose some precision when we do this. This is fine for visualization, but you should avoid projecting raster data used in analysis. For more details, see the terra tutorial.
plot(rasterLCC)
plot(CanUS.lcc, border = "grey", add = TRUE)
The units are no longer Lat/Lon, but meters. We can read them off the plot to improve the zoom:
plot(rasterLCC, xlim = c(0, 2500000),
ylim = c(-1500000, -400000))
plot(CanUS.lcc, border = "grey", add = TRUE)
Formatting
With the data plotted, we can then turn to making the map a little
prettier. Note that when working with Spat*
objects, we can’t set the
plot margins with par
as we usually do. Instead, we use the mar
argument within the plot
function:
## Make a panel with two plots
par(mfrow = c(1, 2))
## store the plot limits:
my_xlims <- c(0, 2500000)
my_ylims <- c(-1300000, -200000)
## Plot the points:
plot(CanUS.lcc, xlim = my_xlims , ylim = my_ylims,
border = "grey", background = "lightblue",
col = "white", axes = FALSE, mar = c(0.1,0.1,0.1,0))
plot(gl.lcc, add = TRUE, border = "grey", col = "lightblue")
points(gbif.lcc, pch = 16, col = alpha("grey30", 0.2),
cex = 0.7)
box()
plot(CanUS.lcc, xlim = my_xlims , ylim = my_ylims,
border = "grey", background = "lightblue",
col = "white", mar = c(0.1,0,0.1,0.1), axes = FALSE)
plot(gl.lcc, add = TRUE, border = "grey", col = "lightblue")
plot(rasterLCC, add = TRUE, legend = FALSE, axes = FALSE)
## plotted again to put the border lines on top:
plot(CanUS.lcc, border = "grey", add = TRUE)
box()
If you want to plot the state/provincial borders on top of the raster, you need to add those layers last. But you can’t set the background colour of the raster layer to “lightblue” (or at least I haven’t figured that out), so the ocean stays white. I get around that by plotting the boundaries twice, first to set the background colour, and then to put the state lines on top of the raster.