Shared Services Canada maintains the General Purpose Science Cluster (GPSC), a
high-performance computing cluster. These are my notes for accessing it
from the AAFC network. For more general notes on using orgmode to manage
cluster work, see this post.
Get the hostnames from the administrators
Multi-hop access from inside your department network requires logging into
your ‘local’ cluster (<LOCAL HOSTNAME>), and then logging into GPSC
(<GPSC HOSTNAME>) from there. You’ll need to get the actual hostnames for
these from the cluster administrator.
ssh into your local cluster using your GPSC username (not the same as your
AAFC network user ID) and password. From your local cluster, ssh into the
GPSC with the same credentials.
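In practice, the two hops look something like this (a sketch only; substitute your GPSC username and the hostnames from your administrator for the placeholders):

# first hop: your workstation to the local cluster
ssh <USERNAME>@<LOCAL HOSTNAME>
# second hop: the local cluster to the GPSC
ssh <USERNAME>@<GPSC HOSTNAME>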
Configure keys and addresses
Use an RSA key for password-free logins, as I described in my tutorial on Digital Ocean droplets.
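If you don’t already have a key pair, a minimal sketch looks like this (the key type and the placeholders are assumptions; adjust for your setup):

# on your laptop: generate a key pair, then install the public key on the local cluster
ssh-keygen -t rsa
ssh-copy-id <USERNAME>@<LOCAL HOSTNAME>

Repeating the ssh-copy-id step from <LOCAL HOSTNAME> to the GPSC host makes the second hop password-free as well.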
I’ll never remember the hostnames, so I’ve added the following to
~/.ssh/config
on my laptop (Linux; it should work the same on a Mac, and
Windows users will have to configure their ssh
client as needed):
Host gpsc
Hostname <GPSC HOSTNAME>
User <USERNAME>
Host aafc-gpsc
Hostname <LOCAL HOSTNAME>
User <USERNAME>
I also added the first stanza to ~/.ssh/config
on <LOCAL HOSTNAME>.
Together with the RSA key, this allows me to sign in to <LOCAL HOSTNAME>
via ssh aafc-gpsc
, and from there into the GPSC via ssh gpsc
,
without remembering the actual hostnames or password.
To further streamline this, I’ve added the following alias on my laptop (in
~/.bash_aliases
):
alias gpsc='ssh -t aafc-gpsc ssh gpsc'
With this set, from a terminal I can log in to GPSC directly via gpsc
.
We have similar issues with rsync
: to transfer files to and from gpsc
, we
need to pass them through aafc-gpsc
. We can accomplish this in a single
step via:
rsync -av -e "ssh -A -t USERNAME@aafc-gpsc ssh -A -t USERNAME@gpsc" SOURCE :DEST
This will sync SOURCE
from the local machine to DEST
on gpsc
, passing
through aafc-gpsc
(and with the archive and verbose flags). Note
that we need to use the :
to indicate which file is on the remote
machine.
Another alias will make this easier:
alias rs2gpsc='rsync -av -e "ssh -A -t USERNAME@aafc-gpsc ssh -A -t USERNAME@gpsc"'
With this installed, we can send to gpsc
with:
rs2gpsc LOCALFILE :REMOTE-PATH
and retrieve files with:
rs2gpsc :REMOTE-FILE LOCAL-PATH
Note the :
in the second example indicating we’re transferring from
gpsc
to our local machine.
Prepare slurm submission scripts
The template for submitting jobs is:
#!/bin/bash -l
#SBATCH --job-name=JOB_NAME
#SBATCH --open-mode=append
#SBATCH --partition=standard
#SBATCH --time=1:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1G
#SBATCH --comment="<SUBMIT_COMMENT>"
#SBATCH --account=<ACCOUNT_NAME>
echo hello
sleep 45
echo goodbye
Note that you’ll need to include your actual submit comment and account name in place of the <SUBMIT_COMMENT> and <ACCOUNT_NAME> placeholders.
With the above template saved to a file named slurm_script.sh
, you can
run it from the cluster via sbatch slurm_script.sh
.
Be sure to update the job-name
option to something informative, and of
course increase the time
, ntasks
, cpus-per-task
and mem-per-cpu
to
something appropriate for your job.
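As a quick sanity check after submitting, you can confirm the job was accepted and watch it in the queue (standard slurm commands; the job ID is printed by sbatch):

sbatch slurm_script.sh   # prints: Submitted batch job <JOBID>
squeue -u $USER          # list your pending and running jobs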
Configure Emacs
Multi-hops with TRAMP
For Emacs users, we can streamline our workflow by configuring
TRAMP to use multi-hops. Add the
following to your config (usually .emacs
, or .emacs.d/init.el
; TRAMP is
installed with Emacs, no additional packages required):
(add-to-list 'tramp-default-proxies-alist
'("gpsc"
nil
"/ssh:USERNAME@<LOCAL HOSTNAME>:"))
That allows me to open files on GPSC in my local Emacs via: C-x C-f /ssh:gpsc:FILENAME
.
Execute code blocks with Org and Babel
You can use orgmode to run code blocks from a file
on your local machine on GPSC
. To do this, use a header like this in your
local .org
file (which assumes you’ve set up your .ssh/config
files as
described above):
#+PROPERTY: header-args:bash :results output :dir /ssh:gpsc:./path/to/working/directory
I ran into transient permission issues executing code blocks. I’ve fixed
this by configuring org-babel
to use my home directory for temp files. It
seems to have problems writing to /tmp
on GPSC, although I should have
permission to do so? Here’s the code from my emacs init.el
:
(setq org-babel-remote-temporary-directory "~/")
So far the temp files babel
generates are cleaned up automatically, so
this hasn’t created any detritus. You could use a subdirectory under ~/
,
but the directory must exist (i.e., you need to create it yourself) before
org
can find it - it won’t make it for you.
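For example, to keep the temp files in their own folder (the directory name here is just an illustration), create it on GPSC first:

mkdir -p ~/babel-tmp

and then point org-babel-remote-temporary-directory at ~/babel-tmp/ in your init.el.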
With this done, you can submit code blocks from your org file directly to
slurm
on GPSC
. Here’s my template:
#+BEGIN_SRC bash :results output
sbatch <<SUBMITSCRIPT
#!/bin/bash
#SBATCH --job-name=slurm_test_emacs
#SBATCH --output=slurm_test_emacs.log
#SBATCH --open-mode=append
#SBATCH --partition=standard
#SBATCH --time=0:01:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1G
#SBATCH --comment="<SUBMIT_COMMENT>"
#SBATCH --account=<ACCOUNT_NAME>
date
echo
echo hello
echo
SUBMITSCRIPT
#+END_SRC
Everything between <<SUBMITSCRIPT
and SUBMITSCRIPT
will get passed to
sbatch
as if it were a separate script file.
If you do use this approach, note that you must escape any variables! This won’t work:
#+BEGIN_SRC bash :results output
sbatch <<SUBMITSCRIPT
#!/bin/bash
#SBATCH --job-name=bad_HEREDOC
<...>
NAME=TYLER
echo $NAME
SUBMITSCRIPT
#+END_SRC
But this will (note the backslash in front of the $
):
#+BEGIN_SRC bash :results output
sbatch <<SUBMITSCRIPT
#!/bin/bash
#SBATCH --job-name=good_HEREDOC
<...>
NAME=TYLER
echo \$NAME
SUBMITSCRIPT
#+END_SRC
A further wrinkle: this uses a
heredoc, and heredocs need to
write a temp file. By default this goes in /tmp
, and in some cases /tmp
may be full. If that happens, you’ll get an error:
cannot create temp file for here-document: No space left on device
Since /tmp
is shared by all users, you can’t fix this yourself when it fills up. You can
change the location where your heredoc gets written, though:
#+BEGIN_SRC bash :results output
export TMPDIR=/path/to/your/personal/tmp/directory
sbatch <<SUBMITSCRIPT
#!/bin/bash
#SBATCH --job-name=good_HEREDOC
<...>
NAME=TYLER
echo \$NAME
SUBMITSCRIPT
#+END_SRC
You may set this permanently by adding the export TMPDIR=...
line to your
.bashrc
on GPSC.
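For example, something like this on GPSC would do it (the path is just a placeholder, and the directory has to exist):

mkdir -p $HOME/tmp
echo 'export TMPDIR=$HOME/tmp' >> ~/.bashrc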
Once a job is submitted, the ID will appear in your local .org
file,
e.g.:
#+RESULTS:
: Submitted batch job 1385545
You can retrieve info on the run with:
#+BEGIN_SRC bash :results output
sacct --jobs=1385545 --format=jobid,jobname,state,elapsed,MaxRSS
#+END_SRC
In my case, this generated:
#+RESULTS:
: JobID JobName State Elapsed MaxRSS
: ------------ ---------- ---------- ---------- ----------
: 1385545 slurm_tes+ COMPLETED 00:00:07
: 1385545.bat+ batch COMPLETED 00:00:07 2096K
: 1385545.ext+ extern COMPLETED 00:00:07 2088K
Use the format
argument to request which details to report. MaxRSS
is
the maximum amount of RAM used during the run. Other options are available
via sacct --helpformat
, with more details in the slurm
manual.
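For example, to also report the memory you requested and the CPUs you were allocated, something like this should work (ReqMem and AllocCPUS are standard sacct fields, but confirm with sacct --helpformat that they’re available on GPSC):

sacct --jobs=<JOBID> --format=jobid,jobname,state,elapsed,MaxRSS,ReqMem,AllocCPUS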
Note that GPSC doesn’t provide the seff command, which you may have used on other clusters.
Linking to files on the remote server
It may be more convenient to manage more complex scripts as stand-alone
.sh
files on the GPSC. If the configuration above works for you, you can
insert a link to remote files in your org
file with the syntax:
[[file:/ssh:gpsc:~/path/to/script.sh][My Script]]
This will create a ‘hotlink’ in your file that you can click to open the
remote file in your local Emacs. The default keyboard shortcut for entering links is
C-c C-l
.
That will allow you to edit the remote file as if it were local. When
you’re ready to submit it, you can do that from your local .org
file:
#+BEGIN_SRC bash :results output
sbatch script.sh
#+END_SRC
Note that the location of your script will be relative to the directory you listed in your file header.
I also use org links for data files I might need to refer to or edit (i.e., sample indices), and files generated by the jobs themselves.