This is an update of my previous mapping
tutorial. Spatial analysis in R is shifting to
sf as the primary packages, so I’ve translated my old,
raster-based tutorial to the new workflow. See the RSpatial
tutorial for a more detailed
introduction/overview of using
terra for GIS/spatial analysis.
The following tutorial walks through some common plotting tasks I use for distribution models.
geodata package provides several convenient functions for downloading
raster and vector maps for use as basemaps and spatial analysis. The first
time you use these functions, they will download the requested maps from
the internet. It will save the data in your working directory, or in a
location specified with the
path argument. The next time you request the
same data, if it finds them in the local directory (or the specified
path), they will be loaded from there, saving the time and bandwith
necessary to download them.
## Loading required package: terra
## terra 1.7.29
us <- gadm(country = "USA", level = 1, resolution = 2, path = "../data/maps/") canada <- gadm(country = "CAN", level = 1, resolution = 2, path = "../data/maps") mexico <- gadm(country = "MX", level = 1, resolution = 2, path = "../data/maps")
Countries are specified by their ISO code, which you can find by calling
country_codes(). The by default,
country_codes() returns a
table of countries and the various ISO codes they have, as well as the
continents they are in. The
query argument lets you filter this table on
any value. For example, you can get a table of North American countries
or, just their ISO codes with:
The other arguments allow you to select, national, subnational, or
subprovincial borders (
level 1-3), the
resolution (high: 1, low: 2),
path to the directory where you want the maps stored.
These maps can be plotted directly with the
plot command. If you want to
combine them, use the
add = TRUE argument to the second
plot(us, lwd = 2) plot(canada, add = TRUE, col = 'red') plot(mexico, add = TRUE, border = "green")
col sets the fill colour,
border sets the outline color.
You can also request multiple countries as a single set of polygons, by
passing a character vector to the
NorthAmerica <- gadm(country = country_codes("North America")$ISO3, level = 0, resolution = 2, path = "../data/maps/") plot(NorthAmerica, xlim = c(-180, -50))
ylim work as you would expect for plotting.
If you want to combine multiple polygons into a single object, use
CanUS <- rbind(us, canada) plot(CanUS, xlim = c(-180, -50))
These maps are ‘unprojected’, meaning they are plotted in latitude/longitude degrees. That makes it easy to set the plot boundaries:
plot(CanUS, xlim = c(-100, -50), ylim = c(30, 60)) plot(NorthAmerica, lwd = 2, add = TRUE)
NB: The size of your plot canvas is fixed, but a map can’t stretch. The x and y dimensions have to maintain the same aspect. That means zooming in one dimension (i.e. latitude only) won’t necessarily change the zoom of your map, if the other dimension fills the canvas. You’ll have to play around with the plot size, and both x and y dimensions together, to tweak your zoom.
It’s handy to have a shapefile of the Great Lakes, for making prettier maps. I created this one in QGIS and use it for plotting:
greatlakes <- vect("../data/maps/greatlakes.shp")
You can add points to the plot like a regular scatter plot:
library(scales) ## for the alpha function below
## ## Attaching package: 'scales'
## The following object is masked from 'package:terra': ## ## rescale
gbif <- read.table("../data/trich-gbif.csv") ## Set the line color to gray to focus on the data points: plot(CanUS, xlim = c(-100, -50), ylim = c(30, 60), border = "gray") points(gbif$X, gbif$Y, pch = 16, col = alpha("green", 0.2))
You can also convert your points to a spatial vector object, in which case R will know which columns to use for plotting. This is also necessary before we can project our data (see below).
gbif <- vect(gbif, geom = c("X", "Y")) ## plot(CanUS, xlim = c(-100, -50), ylim = c(30, 60), ## border = "gray") ## points(gbif, pch = 16, col = alpha("green", 0.2))
You often need to select polygons that contain points, or select only those
points that occur within a particular polygon. You can do this with R’s
subsetting syntax. For instance,
<polygons>[<points>, ] will select
polygons that contain one or more
plot(CanUS, xlim = c(-100, -50), ylim = c(30, 60), border = "gray") ## Select states and provinces where Trichophorum is ## present: plot(CanUS[gbif, ], col = 'red', add = TRUE) points(gbif, pch = 21, col = 'white', bg = 'black')
<points>[<polygons>, ] will select points that are within
polygons. In this example, we’ll use the fact that our
object has a column named
NAME_1 that holds the name of the state or
province of each polygon:
plot(CanUS, xlim = c(-100, -50), ylim = c(30, 60), border = "gray") plot(CanUS[gbif, ], col = 'red', add = TRUE) ## Select only points in NY state: points(gbif[CanUS[CanUS$NAME_1 == "New York", ], ], pch = 21, col = 'white', bg = 'black')
Similarly, you can plot rasters with plot:
trichPreds <- rast("../data/trichPreds.grd") plot(trichPreds, xlim = c(-100, -50), ylim = c(30, 60)) plot(CanUS, border = "gray", lwd = 0.5, add = TRUE)
NA values are transparent. In this case, a species
distribution model, low values are displayed in gray. This may be useful
for visualizing the extent of the model. However, it looks a bit odd, and
makes it hard to see limits of the high-suitability areas. You can tweak
this by playing with the color ramp, but it’s also handy to ‘turn off’ the
low values entirely (for visualization, not for analysis!!)
trichPredsTrim <- trichPreds trichPredsTrim[trichPredsTrim < quantile(values(trichPreds), probs = 0.75, na.rm = TRUE)] <- NA plot(trichPredsTrim, xlim = c(-100, -50), ylim = c(30, 60)) plot(CanUS, border = "grey", add = TRUE)
The test I used here,
trichPredsTrim < quantile(getValues(trichPreds), probs = 0.75, na.rm = TRUE) identifies all cells in the lower 75% of the
suitability scores, which I then set to
NA to make them invisible. I
decided on 75% after experimenting with different values. In this case, 75%
drops most of the grey background (the very lowest values), without eating
into the areas that the prediction indicates are suitable.
You could also use an absolute value here, but then you’d need to know the
actual distribution of the suitability scores.
quantile is easier to
Alternatively, you can assign colours to the prediction raster based on the value of each pixel, after splitting them into categories:
suitability <- extract(trichPreds, gbif, ID = FALSE)[, 1] predCat <- (trichPreds > quantile(suitability, 0.25)) + (trichPreds > quantile(suitability, 0.15)) + (trichPreds > quantile(suitability, 0.05)) + (trichPreds > quantile(suitability, 0.025)) predCols <- c("white", "grey95", "yellow3", "orange", "red") plot(predCat, xlim = c(-100, -50), colNA = 'lightblue', col = predCols, ylim = c(30, 60), legend = FALSE) plot(CanUS, border = "grey", add = TRUE)
I find using a small number of categories makes it easier to read the suitability map than trying to interpret the colour gradients usually used.
Lat/Lon maps look a bit square; we’re more used to seeing maps projected. A common projection for Canada is Lambert Conformal Conic. We can transform our data to this projection to make nicer maps:
## define the projection canlam <- "+proj=lcc +lat_1=49 +lat_2=77 +lat_0=49 +lon_0=-95 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs" ## project our vector data: CanUS.lcc <- project(CanUS, canlam) gl.lcc <- project(greatlakes, canlam) ## Now we to set the projection of our points: crs(gbif) <- "+proj=longlat +datum=WGS84" ## Finally, we can project our points to LCC: gbif.lcc <- project(gbif, canlam)
Note that our data needs to be in an object of class
Spat*, and it must
have a defined coordinate reference system (CRS) before we can project it
to a new CRS. The
crs function allows us to explicitly set the
projection. See the RSpatial
tutorial for more
details on specifying CRS with
The same function works for rasters (sort of):
rasterLCC <- project(trichPredsTrim, canlam)
This will reproject my raster from lat/lon to Lambert Conformal Conic, but we will unavoidably lose some precision when we do this. This is fine for visualization, but you should avoid projecting raster data used in analysis. For more details, see the terra tutorial.
plot(rasterLCC) plot(CanUS.lcc, border = "grey", add = TRUE)
The units are no longer Lat/Lon, but meters. We can read them off the plot to improve the zoom:
plot(rasterLCC, xlim = c(0, 2500000), ylim = c(-1500000, -400000)) plot(CanUS.lcc, border = "grey", add = TRUE)
With the data plotted, we can then turn to making the map a little
prettier. Note that when working with
Spat* objects, we can’t set the
plot margins with
par as we usually do. Instead, we use the
argument within the
## Make a panel with two plots par(mfrow = c(1, 2)) ## store the plot limits: my_xlims <- c(0, 2500000) my_ylims <- c(-1300000, -200000) ## Plot the points: plot(CanUS.lcc, xlim = my_xlims , ylim = my_ylims, border = "grey", background = "lightblue", col = "white", axes = FALSE, mar = c(0.1,0.1,0.1,0)) plot(gl.lcc, add = TRUE, border = "grey", col = "lightblue") points(gbif.lcc, pch = 16, col = alpha("grey30", 0.2), cex = 0.7) box() plot(CanUS.lcc, xlim = my_xlims , ylim = my_ylims, border = "grey", background = "lightblue", col = "white", mar = c(0.1,0,0.1,0.1), axes = FALSE) plot(gl.lcc, add = TRUE, border = "grey", col = "lightblue") plot(rasterLCC, add = TRUE, legend = FALSE, axes = FALSE) ## plotted again to put the border lines on top: plot(CanUS.lcc, border = "grey", add = TRUE) box()
If you want to plot the state/provincial borders on top of the raster, you need to add those layers last. But you can’t set the background colour of the raster layer to “lightblue” (or at least I haven’t figured that out), so the ocean stays white. I get around that by plotting the boundaries twice, first to set the background colour, and then to put the state lines on top of the raster.