Tutorials on plantarum.ca

https://plantarum.ca/tutorials/
Recent content in Tutorials on plantarum.ca
Fri, 10 May 2024 00:00:00 +0000

Spatial Tutorials Update
https://plantarum.ca/2024/05/10/terra-time/
Fri, 10 May 2024 00:00:00 +0000
A quick update. The spatial analysis libraries in the R Project have undergone a substantial change in the past couple of years. The details are laid out in the R spatial blog, but the crux of the issue is that legacy packages rgdal and rgeos have been retired, and packages that depend on them (such as raster and sp) will have been modified to use new dependencies, or replaced entirely.

Preparing GBIF records for distribution modeling
https://plantarum.ca/2024/04/04/record-cleaning/
Thu, 04 Apr 2024 00:00:00 +0000
GBIF.org
The Global Biodiversity Information Facility (GBIF.org) has become the standard open-access online database of occurrence records for all manner of biological organisms. It was initially a clearinghouse for museum records (such as herbarium specimens), but now includes iNaturalist observations (those that are rated ‘research’ grade), survey data, and a growing variety of taxonomic and checklist sources. While GBIF’s expansion increases the overall value of the database, it also means we need to be more circumspect in how we use the data.

Introduction to Org mode for cluster computing
https://plantarum.ca/2024/03/28/org-cluster/
Thu, 28 Mar 2024 00:00:00 +0000
Getting Started
This tutorial builds on my previous Emacs posts (i.e., introduction, Orgmode, R and ESS). If you’re not familiar with Emacs, you might want to look through those first, particularly Orgmode. In this post, I will show you how to set up an Org mode (or Orgmode, org-mode) file on your local machine (laptop or office desktop) to manage and run a cluster computing project.
While not absolutely necessary, working this way is much easier if you’ve configured your machine to provide passwordless access to the server.

Extrapolation Detection (exDet) for SDMs
https://plantarum.ca/2023/12/19/exdet/
Tue, 19 Dec 2023 00:00:00 +0000
Identifying Non-Analogous Climate Conditions
A major concern when projecting species distribution models to new contexts (e.g., invaded ranges, or future climates) is establishing whether (and where) the environments in the new context are analogous to those in the training region (see my notes). A common approach is to compare each variable in isolation, and construct a “Multivariate Environmental Similarity Surface”, or MESS (Elith et al., 2010). Areas in the new context that are outside the range of any variable from the training context will have values below 0, with lower values indicating greater departures.

Niche Quantification with Ecospat and Terra
https://plantarum.ca/2023/07/28/ecospat-terra/
Fri, 28 Jul 2023 00:00:00 +0000
Introduction
This is an update of my previous ecospat tutorial. Spatial analysis in R is shifting to terra and sf as the primary packages, so I’ve translated my old, raster-based tutorial to the new workflow. I also took this opportunity to clean up and extend the original tutorial. See the RSpatial tutorial for a more detailed introduction/overview of using terra for GIS/spatial analysis. Note this analysis depends on the ecospat package, and as of 2023-07-28 ecospat doesn’t support the spatial objects produced by terra.

Managing Absolute Paths in Reproducible Analyses
https://plantarum.ca/2023/02/14/path_switching/
Tue, 14 Feb 2023 00:00:00 +0000
In a previous post on reproducible analysis, I explained the importance of using relative paths in your scripts, and organizing your data in a single directory, in order to maintain portability.
You want to be able to pack up your analysis in a zip file, or upload it as a single directory to GitHub or Dropbox, in order to share it with colleagues, or transfer it to a new computer.

Simple Maps in R with Terra
https://plantarum.ca/2023/02/13/terra-maps/
Mon, 13 Feb 2023 00:00:00 +0000
Reference
This is an update of my previous mapping tutorial. Spatial analysis in R is shifting to terra and sf as the primary packages, so I’ve translated my old, raster-based tutorial to the new workflow. See the RSpatial tutorial for a more detailed introduction/overview of using terra for GIS/spatial analysis. The following tutorial walks through some common plotting tasks I use for distribution models.
Basemaps
The geodata package provides several convenient functions for downloading raster and vector maps for use as basemaps and spatial analysis.

Data Management for Reproducible Science
https://plantarum.ca/2022/10/17/data_management/
Mon, 17 Oct 2022 00:00:00 +0000
Introduction
Research is reproducible when others can reproduce the results of a scientific study given only the original data, code, and documentation (Alston and Rick, 2021).
Benefits to the Author:
- Clear and complete documentation of your work makes it easier to share, write up and extend in future work, including responding to reviewers and developing new projects
- Conscientious documentation of your work involves a great deal of error-checking, which is reassuring to you – that you haven’t missed anything, or mis-remembered what you did; and to your readers – that you have conducted your work in a rigorous manner
- Reproducible work gets cited more, and developing a data archive creates a new citable product from your research.

Georeferencing Notes
https://plantarum.ca/2022/03/23/georeferencing/
Wed, 23 Mar 2022 00:00:00 +0000
GBIF Data
GBIF is the main online clearing house for occurrence data in the world. It includes most (but not all) online herbarium databases. It also includes iNaturalist records, as well as a number of other survey repositories. I’m not familiar with all of the sources included. The main page presents a search bar. Enter your species name (e.g., Rubus chamaemorus) in the box, and select the ‘occurrences’ tab. You’ll be taken to a list of results that match your species (2.

Schoener's D and Study Extent
https://plantarum.ca/2021/12/02/schoenersd/
Thu, 02 Dec 2021 00:00:00 +0000
Background
Schoener’s D was created by Schoener (1968). He was studying the feeding niche of anoles, and needed a way to quantify the overlap in prey items for different species. This is what he came up with:
\[D(p_X, p_Y) = 1 - \frac{1}{2} \sum_i \vert p_{X,i} - p_{Y,i} \vert\]
Here, \(p_{X,i}\) and \(p_{Y,i}\) are the frequencies for species \(X\) and \(Y\), respectively, for the \(i^{th}\) category. For Schoener, the categories were prey sizes.

Thinning Occurrence Records in R
https://plantarum.ca/2021/10/26/r-gridsample/
Tue, 26 Oct 2021 00:00:00 +0000
Note that this tutorial refers to the thinning method used in the old version of the rspatial.org tutorial, which used the raster package (along with dismo) for the GIS computations. The terra package will shortly be replacing raster, and all new code should use this instead. The details of spatial thinning with terra are presented in my new ecospat tutorial.
A common approach to reducing spatial bias in occurrence records is to randomly select one (or a small number) of samples present in each cell in the landscape.
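The one-record-per-cell idea described for the thinning tutorial can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the method from the tutorial itself (which relied on raster and dismo): the function name thin_by_grid, the column names lon and lat, and the one-degree cell size are all hypothetical, and the grid is a plain lat/lon lattice rather than a raster object.

```r
# Grid-based thinning sketch: keep up to `n` randomly chosen records per cell.
# Base R only; `thin_by_grid`, `lon`, `lat`, and `res` are illustrative names.
thin_by_grid <- function(occ, res = 1, n = 1) {
  # Assign each record to a grid cell by flooring its coordinates
  cell <- paste(floor(occ$lon / res), floor(occ$lat / res), sep = "_")
  # Within each cell, sample up to `n` row indices at random
  keep <- unlist(lapply(split(seq_len(nrow(occ)), cell), function(idx) {
    idx[sample.int(length(idx), min(n, length(idx)))]
  }))
  # Return the retained records in their original order
  occ[sort(keep), , drop = FALSE]
}

set.seed(42)
# Six made-up records falling in three one-degree cells
occ <- data.frame(
  lon = c(-75.1, -75.2, -75.3, -73.5, -73.6, -70.9),
  lat = c(45.1, 45.2, 45.4, 45.5, 45.6, 44.2)
)
thinned <- thin_by_grid(occ, res = 1)
nrow(thinned)  # one record per occupied cell, so 3 rows remain
```

Which record survives in each cell depends on the random seed, so fixing the seed (as above) keeps the thinned data set reproducible. For real analyses, cells would normally be defined by the climate raster being used rather than by an arbitrary lattice.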

Emacs for Bioinformatics #4: RMarkdown
https://plantarum.ca/2021/10/03/emacs-tutorial-rmarkdown/
Sun, 03 Oct 2021 00:00:00 +0000
This is part four in my series of Emacs tutorials aimed at bioinformatics (and other scientific analysis) workflows. See the rest on my tutorials page. Emacs provides full support for editing RMarkdown documents. RMarkdown has extensive documentation, both at the previous RStudio link, and several free online books by Xie et al. (notably R Markdown: The Definitive Guide, but also several others listed on Yihui Xie’s Bookdown page). Most of these references assume you are using the RStudio development environment.

Evaluating Invasion Stage with SDMs
https://plantarum.ca/2021/08/11/invasion-stage/
Wed, 11 Aug 2021 00:00:00 +0000
My attempt to recreate the invasion stage analysis developed by Gallien et al. (2012), inspired by seeing it applied by Eckert et al. (2020). We’ll continue with the Lythrum salicaria data from my tutorial on niche quantification analysis. Specifically, I’ll model how the niche space this species occupies in its invaded range in North America relates to its global niche.
library(ecospat)
library(raster)
library(rgbif)
library(maptools)
library(magrittr)
library(dismo)
Gallien et al. (2012) used an ensemble of SDMs, which is (should be) more robust than applying a single approach.

Niche Quantification with Ecospat
https://plantarum.ca/2021/07/29/ecospat/
Thu, 29 Jul 2021 00:00:00 +0000
The ecospat package (Cola et al. 2017) provides code to quantify and compare the environmental and geographic niche of two species, or of the same species in different contexts (e.g., in its native and invaded ranges). The included vignette explains how to do such analyses.
However, the vignette assumes you already have a matrix of occurrence records, along with the climate data for each of those records. In our work, we typically have to construct those matrices from observation data (herbarium records, iNaturalist observations, etc) and climate rasters (e.

GBS Admixture Analysis Workflow
https://plantarum.ca/2021/06/01/admixture/
Tue, 01 Jun 2021 00:00:00 +0000
Admixture is a program for completing STRUCTURE-style analyses of large SNP datasets, such as we get with GBS (Elshire et al. 2011). This short tutorial covers getting our SNP data from STACKS (Rochette, Rivera‐Colón, and Catchen 2019) into a format that Admixture will understand, running the analysis, and importing the results into R for further investigation & plotting.
Converting Stacks Output to Admixture Input
Both Stacks and Admixture can process PLINK data.

Adding Lat/Lon Grids to Maps in R
https://plantarum.ca/2021/02/22/graticules-r/
Mon, 22 Feb 2021 00:00:00 +0000
In a previous post, I outlined my workflow for preparing maps in R. Today I had to add a graticule, a grid of latitude and longitude lines, to my maps. That’s easy enough to do with unprojected maps, as the plot coordinates are latitude and longitude, so your X and Y axes are already graticules. But if you’ve projected your data, the plot coordinates are on a different scale, so you need to do a bit of tuning.

Emacs for Bioinformatics #3: R and ESS
https://plantarum.ca/2020/12/30/emacs-tutorial-03/
Wed, 30 Dec 2020 00:00:00 +0000
This is part three in my series of Emacs tutorials aimed at bioinformatics (and other scientific analysis) workflows. See the rest on my tutorials page. Emacs support for the R programming language is provided by the ESS package (AKA, “Emacs Speaks Statistics”).
ESS has been around since at least 1994, and is supported by a very active development team. It provides most or all of the features of the more widely-known RStudio, as well as a great many more.

Plotting Simple Maps in R
https://plantarum.ca/2020/10/30/simple-maps-r/
Fri, 30 Oct 2020 00:00:00 +0000
NOTE: This tutorial uses older R packages that are scheduled to be deprecated at the end of 2023. I have updated this tutorial using the new packages. Unless you need to use older code, you should use the new Terra-based approach instead of this!
Reference
See the RSpatial tutorial for a more detailed introduction/overview of using R for GIS/spatial analysis. The following tutorial walks through some common plotting tasks I use for distribution models.

Emacs for Bioinformatics #2: Orgmode
https://plantarum.ca/2020/06/17/emacs-tutorial-02/
Wed, 17 Jun 2020 00:00:00 +0000
In the previous post we took a first look at Emacs, including creating and editing a script file, and passing commands from the file to the shell terminal. At the end of that post, I recommended you check out the built-in tutorial (accessible via C-h t from within Emacs). In this post I assume you’ve done so, although I won’t expect you’ve understood everything you found there.
Orgmode
Last time, I promised a better way to integrate scripts, output, and notes in a single file.

Emacs for Bioinformatics: Getting Started
https://plantarum.ca/2020/06/16/emacs-tutorial-01/
Tue, 16 Jun 2020 00:00:00 +0000
Emacs for Bioinformatics
GNU Emacs is likely one of the oldest pieces of software still in active development. It is also one of the most powerful systems for editing code, built by and for hackers. However, it does have a reputation for unwieldy complexity. I think this is largely undeserved.
While it would take years of study to understand all its nooks and crannies, if you focus on just those features that you actually need, you can get going fairly quickly.

Publication Quality R Figures
https://plantarum.ca/2014/02/19/r-graphics/
Wed, 19 Feb 2014 00:00:00 +0000
Contents: Introduction; Learning Objectives; Pre-requisites; Motivation; Building Our Plot; Size; Content; Plot Symbols; Margins; Axes; The finished plot; Exercise 1: adding a legend; Additional Customization; Selecting Plot Symbols; Panels; Exercise 2: Completing the Panel; Image Formats; Raster Images; Vector Images
Figure 1: A. Iris Sepal Size by Species. B. Iris Petal Width
Introduction
Learning Objectives
At the end of this lesson, you should be able to:
- Customize plots produced with the R base graphics system
- Design multi-panel plots
- Design plots to suit the publication requirements of a journal
- Save your plots as high-resolution raster or vector image files as required by your publisher
Pre-requisites
You will need:

Preparing Rubus samples for herbarium study
https://plantarum.ca/2013/08/25/rubus-herbarium/
Sun, 25 Aug 2013 00:00:00 +0000
Collecting blackberries and their relatives (Rubus spp.) for herbarium study is particularly challenging, and even experienced field-botanists may not appreciate everything that is involved. More than in other vascular plant groups, to make a good Rubus specimen, you need to understand a bit about their life-cycle.
A single Rubus allegheniensis specimen, with the first-year primocane on the right, the second-year floricane on the left, and my expert Rubus presser Charlotte in the middle.