Notes for connecting to GW's high-performance computing cluster, Pegasus.
Consult this Getting Started guide for more info.
Generate a new SSH public / private key pair by following this guide.
The public key can be shared with anyone, without concern. The private key should never be shared.
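For reference, a minimal sketch of generating the key pair from the command line. The linked guide may recommend different options; the RSA key type, the 4096-bit size, and the helper name "generate_ssh_key" are assumptions chosen so that the resulting files match the "id_rsa" and "id_rsa.pub" names checked in the next step.

```shell
# Sketch: create an RSA key pair unless a key already exists at the target path.
# Assumptions: RSA / 4096 bits, default path ~/.ssh/id_rsa, hypothetical helper name.
generate_ssh_key() {
    key_path="${1:-$HOME/.ssh/id_rsa}"
    key_comment="${2:-your_email@example.com}"
    if [ -e "$key_path" ]; then
        # Refuse to overwrite an existing private key.
        echo "Key already exists at $key_path; not overwriting." >&2
        return 1
    fi
    mkdir -p "$(dirname "$key_path")"
    # Prompts for an optional passphrase, then writes the private key
    # to $key_path and the public key to $key_path.pub.
    ssh-keygen -t rsa -b 4096 -C "$key_comment" -f "$key_path"
}
```

Usage: generate_ssh_key "$HOME/.ssh/id_rsa" "example@gwu.edu"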
Verify you see some files ("id_rsa" and "id_rsa.pub"):
ls -l ~/.ssh  # "ll" is a common alias for this, but it is not defined on every system
Print the contents of the public key:
cat ~/.ssh/id_rsa.pub
Fill out this access request form, with info like the following:
- PI: Michael Rossetti
- Research Group: "Data Science Research Group" (rossettigrp)
- Clusters: "Pegasus" only should be fine for now
Paste your public key contents, or upload the public key file directly.
You must be connected to GWireless on campus, or remotely through the GW VPN.
For either option, you will need a GW email account. Researchers and assistants at other universities can use this form to request a GW account.
Logging in works when connected to GWireless!
In terms of downloading the VPN, here are some notes from the site:
Palo Alto GlobalProtect 6.0.7 for macOS
Download PaloAltoGlobalProtect-6.0.7-Mac.pkg (76 MB). Palo Alto GlobalProtect allows remote access to GW resources through an encrypted connection to GW. Portal Address: gwvpn.gwu.edu
Compatible with macOS 11 and higher. Note: During installation, also install System Extensions.
After you have downloaded and installed the VPN, you may need to grant it permission in the Security and Privacy settings.
To use the VPN, launch the "GlobalProtect" program, and enter the portal address. Then sign in with your GW Microsoft account ("example@gwu.edu").
Log in using the SSH credentials you submitted via the access request form:
ssh <username>@pegasus.arc.gwu.edu
# to specify a key explicitly, pass the private key (not the ".pub" file):
# ssh <username>@pegasus.arc.gwu.edu -i ~/.ssh/id_rsa
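To avoid retyping the hostname and key path, one option is a host alias in your local SSH config. This is a sketch only: the alias "pegasus", the helper name "add_pegasus_host", and the "id_rsa" key path are assumptions, so adjust them to match the key you actually submitted.

```shell
# Sketch: append a "Host pegasus" entry to an SSH config file, once.
# Assumptions: alias name "pegasus", key at ~/.ssh/id_rsa, hypothetical helper name.
add_pegasus_host() {
    config_path="$1"
    cluster_user="$2"
    mkdir -p "$(dirname "$config_path")"
    # Skip if the alias is already configured.
    if grep -qs '^Host pegasus$' "$config_path"; then
        return 0
    fi
    cat >> "$config_path" <<EOF
Host pegasus
    HostName pegasus.arc.gwu.edu
    User ${cluster_user}
    IdentityFile ~/.ssh/id_rsa
EOF
    # SSH expects the config file to be writable only by its owner.
    chmod 600 "$config_path"
}
```

Usage: add_pegasus_host "$HOME/.ssh/config" "your-username", after which "ssh pegasus" should be equivalent to the full command above.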
If you see a "Permission denied" error, email support. They may reply with something like "You will need to use a one-time multifactor code to log in. Please use one of these codes when prompted..." and provide you with some codes. Try to log in again, and supply one of the codes when prompted. It works. Great!
Every time you log in, you will be prompted for a 2FA code, so store those OTP codes somewhere safe, and set up multifactor auth as soon as possible (see section below).
After using one of the OTP codes to log in for the first time, run google-authenticator to configure multifactor auth via your Authenticator app. Answer "y" for all the questions. Scan the QR code with your Authenticator app. In the future you will use a code generated by the Authenticator app instead of the OTP / scratch codes.
Your home folder is /SEAS/home/<username>.
Be aware of your SSH keys and SSH config:
ll ~/.ssh
#> authorized_keys
#> cluster
#> cluster.pub
#> config
#> id_ecdsa
#> id_ecdsa.pub
#> known_hosts
There are many installable "modules" available, including a module for Python 3.10. However, we want project-specific virtual environments, so let's use the miniconda module:
# module load python3/3.10.11
module load miniconda/miniconda3
You will need to run this every time you log in to the server, because "module load" only affects the current shell session.
Once loaded, we should have access to the conda command line tool.
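One way to avoid reloading the module by hand is to add the command to a shell startup file. The sketch below assumes that "~/.bashrc" is read at login on the cluster and that the "module" command is already defined at that point; neither has been verified here. The helper name "ensure_line" is hypothetical.

```shell
# Sketch: append a line to a file only if it is not already there,
# so that running the helper repeatedly does not create duplicates.
ensure_line() {
    target_file="$1"
    wanted_line="$2"
    # -x matches whole lines only, -F treats the text as a fixed string.
    grep -qxF -- "$wanted_line" "$target_file" 2>/dev/null \
        || printf '%s\n' "$wanted_line" >> "$target_file"
}
```

Usage (on the server): ensure_line "$HOME/.bashrc" "module load miniconda/miniconda3"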
Listing environments:
conda info --envs
Creating and activating an environment:
conda create -n my-first-env python=3.10
#conda activate my-first-env
# conda command may require some bashrc setup, they want us to use source instead:
source activate my-first-env
Verify the environment is set up properly:
python --version
pip --version
python -i # enter into python shell, test things out
# exit()
NOTE: "the python virtual environment can be built in your home directory (/SEAS/home/), or group directory (/SEAS/groups/) to share with others in your group, or on the lustre (/lustre/groups/), and similar to any packages you need."
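To build an environment in one of those shared locations rather than the default one, conda can create it at an explicit path with "--prefix". A minimal sketch follows; the helper name "create_shared_env" and the example directory layout are hypothetical, and the actual group directory name needs to be filled in.

```shell
# Sketch: create a conda environment at an explicit directory (for example
# under a group or lustre folder) instead of the default envs location.
# Assumption: the miniconda module is already loaded in this session.
create_shared_env() {
    env_prefix="$1"
    python_version="${2:-3.10}"
    # --prefix selects the target directory, --yes skips the confirmation prompt.
    conda create --yes --prefix "$env_prefix" "python=$python_version"
}
```

Usage: create_shared_env "/lustre/groups/your-group/envs/my-first-env" 3.10, where "your-group" is a placeholder. An environment created this way is activated by its path rather than by name, e.g. source activate /lustre/groups/your-group/envs/my-first-env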
We need to clone repositories from GitHub. We'll use git. It looks like git is pre-installed on the server:
which git
#> /usr/bin/git
git --version
#> git version 2.39.3
Attempting to clone a repo from GitHub:
git clone git@github.com:s2t2/pegasus-notes.git
You may run into permissions issues the first time, in which case you'll need to configure an SSH connection from the server to GitHub (see section below).
Generating a new SSH key (run this on the server):
ssh-keygen -t ed25519 -C "your_email@example.com"
This creates a new key pair ("id_ed25519" and "id_ed25519.pub"). Upload the resulting public key via your GitHub account's SSH settings.
cat ~/.ssh/id_ed25519.pub
#> copy then paste in GitHub
Also set up the SSH agent:
eval "$(ssh-agent -s)"
#> Agent pid 123456
ssh-add ~/.ssh/id_ed25519
Afterwards, when you try to clone again, it should work.
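Before cloning, the connection can be checked with "ssh -T git@github.com". GitHub does not offer shell access, so this command exits with a non-zero status even when authentication succeeds; the sketch below therefore inspects the message instead of the exit code. The helper name "check_github_ssh" is hypothetical.

```shell
# Sketch: report whether key-based authentication to GitHub works.
check_github_ssh() {
    # Capture the greeting; ignore the exit status, which is non-zero
    # even on success because GitHub closes the session immediately.
    github_reply="$(ssh -T git@github.com 2>&1)" || true
    case "$github_reply" in
        *"successfully authenticated"*)
            echo "GitHub SSH OK"
            return 0
            ;;
        *)
            echo "GitHub SSH failed: $github_reply" >&2
            return 1
            ;;
    esac
}
```

Usage: run check_github_ssh on the server. On the very first connection, SSH may ask you to confirm GitHub's host key before the check can succeed.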
We can use this example Python application, a game of tic tac toe. Verify you can set up the app, install requirements, and play a game:
Repo setup:
git clone git@github.com:s2t2/tic-tac-toe-py.git
cd tic-tac-toe-py/
conda create -n tictactoe-env python=3.8
source activate tictactoe-env
pip install -r requirements.txt
NOTE: environment creation and package installation can take a long... long ... long ... long ... time :-/
Usage:
python -m app.game
X_STRATEGY="COMPUTER-HARD" O_STRATEGY="COMPUTER-EASY" GAME_COUNT=100 python -m app.jobs.play_games
Use scp to upload or download files to/from the server. Note: you may be asked for your multifactor code whenever you initiate a file transfer.
scp <username>@pegasus.arc.gwu.edu:/SEAS/home/<username>/projects/tic-tac-toe-py/data/games/x_minimax_vs_o_random_100.csv ~/Downloads
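The command above downloads from the server. Uploading uses the same syntax with the arguments reversed (local path first, remote path second). A small sketch, where the helper name "upload_to_pegasus" and the example destination are hypothetical:

```shell
# Sketch: copy a local file to a path on the cluster.
PEGASUS_HOST="pegasus.arc.gwu.edu"

upload_to_pegasus() {
    cluster_user="$1"
    local_path="$2"
    remote_path="$3"
    # scp takes the source first and the destination second.
    scp "$local_path" "${cluster_user}@${PEGASUS_HOST}:${remote_path}"
}
```

Usage: upload_to_pegasus "your-username" ./data/input.csv "/SEAS/home/your-username/projects/", with "your-username" and the file path as placeholders. For whole directories, scp needs the "-r" option.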
NOTE: "please do not run jobs against the NFS shares (/SEAS/home/) and groups (/SEAS/groups) but use /lustre instead, i.e. files should be read from and written to /lustre/groups/ directory."
NOTE: "we use slurm for job scheduling, which can run in interactive mode (salloc) or batch mode (sbatch). The batch mode requires a shell script as a wrapper to call your python code."
Example shell script:
#!/bin/bash
#SBATCH --time 00:10:00
#SBATCH -p nano
#SBATCH -o temperature.out
#SBATCH -e temperature.err
#SBATCH -N 1
. ~/miniconda3/etc/profile.d/conda.sh
python3 ~/temperature.py | sort -n
TBD - verify this when we need to schedule some jobs
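As an untested starting point for that verification, here is a sketch that writes a batch script for the tic-tac-toe job and shows how it might be submitted. It reuses the "nano" partition and time limit from the example above. The project path under /lustre, the script name "play_games.sh", and the use of "module load" plus "source activate" inside a batch job are all assumptions.

```shell
# Sketch: generate a batch script for the tic-tac-toe job (not yet verified on Pegasus).
cat > play_games.sh <<'EOF'
#!/bin/bash
#SBATCH --time 00:10:00
#SBATCH -p nano
#SBATCH -o play_games.out
#SBATCH -e play_games.err
#SBATCH -N 1

# Assumption: the repo was cloned under /lustre so the job reads and writes there.
# "your-group" is a placeholder for the real group directory.
PROJECT_DIR="/lustre/groups/your-group/tic-tac-toe-py"

# Same environment setup as in the interactive session.
module load miniconda/miniconda3
source activate tictactoe-env

cd "$PROJECT_DIR" || exit 1
X_STRATEGY="COMPUTER-HARD" O_STRATEGY="COMPUTER-EASY" GAME_COUNT=100 python -m app.jobs.play_games
EOF

# Then, on the cluster:
#   sbatch play_games.sh    # submit the script in batch mode
#   squeue -u "$USER"       # check the status of your jobs
```

Standard output and errors from the job would land in "play_games.out" and "play_games.err" in the directory the job was submitted from.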