jackjackk / notes-ics-aci

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

System Basics [2017-09-14 Thu] (Slides)

ICS & HPC

  • what is High Performance Computing? Large…
    • # of processors
    • # of runs
    • memory requirements (finance, bio)
    • storage reqs (ML, big data)
    • runtime (optimization, poor coding)
  • HPC does not improve single processes!
  • resources
    • internal
      • ICS-ACI
    • external
      • NSF XSEDE
      • Blue Waters through GLCPC
      • DoE
  • people
    • engagement team (Lavely, Blanton, Pavloski)
    • iAsk center (help answering questions, mostly on ACI)
  • goal: enable research

ICS-ACI

  • 24000 cores
  • very capable tool
    • ACI-B for batch (high/std/basic memory)
    • ACI-I for interactive (Rstudio)
    • Gateways, GPUs (coming soon)
    • ACI Partitions (CyberLAMP)
    • Hosted resources (LIGO)
  • ACI-B
    • access via ssh only
    • log into an aci-lgn-* node to submit jobs
    • process runs have dedicated resources when they start running
    • what you request is specified in the submission script
    • anybody can access the open allocation w/ some limitations
    • slow X session launchable via -Y
    • used for: production runs, high mem/proc/time
  • ACI-I
    • access via ssh or EoD
    • placed on a node and run process there
    • limited to 4 procs, 12h, 48GB mem
    • does not guarantee exclusive resources like a compute node
    • used for: debugging, visualization, interaction

File systems

  • three storage location options
    • home
      • /storage/home/userid
      • small (10GB) but well protected
        • not shared w/ anybody else
        • backed up frequently
    • work
      • /gpfs/work/userid
      • 128GB cap
      • you can share data
      • backed up every 3 days or so
    • scratch
      • /gpfs/scratch/userid
      • no file size limit
      • 1M files cap per user
      • files older than 30 days are deleted
  • work & scratch are linked from home
  • research groups can purchase
    • allocations
    • storage space
      • for
        • group (like work but 5TB)
        • archive
      • datamgr.aci.ics.psu
        • allow for faster data transfers

Connecting

  • ways to connect
    • ssh in terminal
      ssh userid@aci-b.aci.ics.psu.edu
      # or
      ssh userid@aci-i.aci.ics.psu.edu
              
      • password doesn’t show up!
    • PuTTy on Windows
    • Exceed onDemand to ACI-I tutorial
      • config
        • Xconfig: Desktop_mode_1280x1024.cfg
        • Xstart: Gnome_Desktop.xs
      • only 1 session
      • you can always come back (don’t log out, just quit)
      • still gonna have to use cmdline
        • Apps -> System Tools -> Terminal

Terminal

  • things to know
    • GIYF
    • man
    • banana
  • find total and available allocation hours
    mam-list-funds -h
        
  • manual for commands
    man cmd
        
  • see list of options using an improper flag
    mam-list-funds --banana
        
  • 4 most basic commands
    ls                            # list contents of curr dir
    pwd                           # print current dir
    cd scratch                    # change dir
    cp logFile logFile_13Sept2017 # copy file
        
  • other useful commands
    history # past commands
    mv # move files
    rm # remove files
    mkdir # make
    find # find files
    grep # filter files
    awk # text manipulator
    id
    du # disk space
    clear # clear screen
    env
    ssh
    more
        
  • special characters
    cd ~ # move to home
    cd . # move to here (stay here)
    cd .. # move one dir up
    ls *.png # list all png
    ls -1 | grep png # pipe output of ls to other commands
    ls > log.ls # put output in a file
        

Modules

  • wrapper for individual program
    • e.g. in order to use Matlab you need to first load its module
  • show modules currently available
    module avail
        
  • search for modules
    module spider vasp
        
  • load a module optionally w/ specific version (otherwise will you the default)
    module load ansys/18.1
        

    better to specify version.

  • module families
    module avail
    module load gcc/5.3.1
    module avail # new modules (compiled with hence conditional on gcc/5.3.1) will show up
        
  • other cmds
    module list # list loaded modules
    module purge # clean up loaded modules
    module show modName # where libs of the module are, which env vars are setup
        
    • e.g. w/ boost module you can use show to help you set up lib paths when compiling w/ it

Transfer data

  • cmdline
    scp lFile userid@datamgr.aci.ics.psu.edu:~/work/
    rsync ...
    sftp ...
        
  • programs: WinSCP, Filezilla
  • via Box, Dropbox w/ EoD’s Firefox interface (no syncing)
    • (and no sound => no youtube!)
  • specific programs: Globus, Aspera

Submitting a Job

  • submission scripts
    • two sections
      1. PBS directives
        #PBS directive ...
                    
        • used for requesting resources
        • only at beginning
      2. commands
    • to submit
      qsub submitScript.pbs
              
    • e.g.
      #!/bin/bash
      #PBS -l nodes=1:ppn=1
      #PBS -l walltime=5:00
      #PBS -A open
      
      echo "Job started on $(hostname) at $(date)"
      
      module purge
      module load matlab/R2016a
      
      # goto dir where the script lives
      cd $PBS_O_WORKDIR
      
      matlab-bin -nodisplay -nosplash < runThis.m > log.matlabRun
      
      echo "Job ended at $(date)"
              

Info

  • ICS docs ics.psu.edu
  • iAsk
    • iAsk at ics.psu.edu, 54275
  • check doc of other batch systems: TACC, OSC
  • seminar 2
    • submitting jobs
    • compiling simple codes
    • allocation usage
    • intro to parallelization
      • distributed vs shared memory
    • data moving
      • globus, rsync
  • seminar 3 (Feb 2018)
    • optimization techniques

About

License:MIT License