Installing on Mox
These instructions assume that you have access to the psicenter Slurm account on Mox. On Mox, only the build nodes have access to the internet. Internet access is required for the cmake step, so one must build WARPXM on a build node.
From the login node, run
salloc -p build --time=4:00:00 --mem=100G
ssh ${allocated_node} #e.g. n2232
The following list is one working combination of modules for WARPXM compilation. Add the following to your ~/.bashrc:
module load contrib/petsc/3.9.4/icc_19
module load contrib/cmake_3.12.3
module load contrib/gcc/6.3.0 icc_19
module load icc_19-impi_2019
Before attempting to build WARPXM, ensure that these modules are loaded by running module list.
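For example, a quick check after logging in to the build node (this assumes the module load lines above are already in your ~/.bashrc):
source ~/.bashrc   # pick up the module load lines added above
module list        # should show the PETSc, CMake, GCC, and Intel MPI modules listed above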
Make sure you clone and build WARPXM in the /gscratch partition, NOT the home partition!
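A minimal sketch of cloning into /gscratch; the group directory and the repository URL below are placeholders, so use your group's /gscratch directory and the actual WARPXM repository:
cd /gscratch/psicenter/your_uw_netID       # placeholder: your group's directory under /gscratch
git clone <warpxm_repository_url> warpxm   # placeholder URL for the WARPXM repository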
Installing on Klone
Pre-install setup for Hyak Klone
- Before logging in to Hyak Klone, check (from your local machine) the list of modules required to build WARPXM:
cat ~/code/warpxm/warpxm/tools/warpy/clusters/hyak_klone.py
- Log in to Hyak Klone:
ssh your_uw_netID@klone.hyak.uw.edu
- Set a path:
Follow the instructions on Pre-install setup for Linux HPC machines
- Create the code directory:
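For example, a sketch assuming the code lives under ~/code, matching the hyak_klone.py path above; the repository URL is a placeholder:
mkdir -p ~/code
cd ~/code
git clone <warpxm_repository_url> warpxm   # placeholder URL for the WARPXM repository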
- Add the list of modules that you checked in step 1 to your ~/.bashrc:
The following example was created on 4/13/2023. Make sure to add the list given in the latest version of ~/code/warpxm/warpxm/tools/warpy/clusters/hyak_klone.py.
module load cmake/3.21.1
module load gcc/10.2.0
module load stf/gdb/10.2
module load ucx/1.10.0
module load stf/mpich/4.0a2
module load stf/hdf5/mpich/1.12.0
module load stf/petsc/mpich/3.15.1
On a Hyak Klone login node, the module command is not supported, so you need to build WARPXM on an interactive compute node. If you reload ~/.bashrc on the login node, or if you log out and back in to Hyak Klone, you will see a warning that the module command is no longer supported on Klone login nodes. If you'd prefer, you can disable those warning messages by adding the following to your ~/.bashrc:
export LOGIN_SILENCE_MODULE_WARNING=true
- NOTE for developers only: If the list of modules given in hyak_klone.py is no longer supported on Hyak Klone, or if WARPXM requires newer versions, you need to update the list.
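To see what is currently installed, you can query the module system from an interactive compute node (a sketch; the module command is unavailable on login nodes):
module avail cmake
module avail gcc
module avail petsc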
Building WARPXM
- Get into an interactive compute node:
Make sure you have access to the aaplasma account. Specify the compute partition and the aaplasma account for your interactive jobs:
salloc -A aaplasma -p compute -N 1 -c 4 --mem=10G --time=2:30:00
If you don't have access to the aaplasma account, replace aaplasma with stf.
- Build WARPXM:
Follow the instructions on Basic Building.
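For reference, a minimal out-of-source build sketch; the exact cmake options are given on the Basic Building page, and the paths below are assumptions based on the steps above:
cd ~/code/warpxm           # assumed code location from the pre-install steps
mkdir -p build && cd build
cmake ..                   # configure; see Basic Building for the recommended options
make -j4                   # build with 4 cores, matching the salloc request above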
- Set a path to your run directory:
Edit the last argument in ~/.config/warpy_user_config.py (using Vim or Emacs on Hyak Klone) to use /gscratch/aaplasma/your_uw_netID/name_of_your_run_directory or /gscratch/stf/your_uw_netID/name_of_your_run_directory as your primary storage. For example:
def test_rundir(category, name):
    '''
    @brief Function for computing the directory to use for running the simulation.
    @param category category of the simulation
    @param name test case name
    @return The path where the simulation should be run. This is relative to the current working directory (if a relative path is returned).
    '''
    return os.path.join('/gscratch/aaplasma/your_uw_netID/warpxm_run', category, name)
Do NOT use your $HOME directory (/mmfs1/home/your_uw_netID) as your primary storage because its quota is only 10GB. You can check your storage usage by running Klone's storage reporting command (e.g., hyakstorage).
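A minimal sketch for creating a run directory on /gscratch (warpxm_run is just an example name; substitute stf for aaplasma if that is your account):
mkdir -p /gscratch/aaplasma/your_uw_netID/warpxm_run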
Schedule a batch job via slurm
You can submit a batch job via slurm automatically by using ~/.config/warpy_user_config.py from the login node. Make sure to edit the following arguments manually for your jobs:
- num_procs: Number of processors you want to use (40 processors per node).
- queue: Name of the partition to submit your jobs to. This should be compute on Hyak Klone.
- account: Name of the account to charge jobs against. This should be either aaplasma or stf.
- walltime: Walltime limit. Set a walltime limit and make sure your calculations will end before a scheduled monthly maintenance.
- test_rundir: Path to your run directory. This should point to /gscratch/aaplasma/your_uw_netID/name_of_your_run_directory or /gscratch/stf/your_uw_netID/name_of_your_run_directory.
Optional:
- memory_limit: Total amount of memory you need for your calculations, with units M, G, or T for megabyte, gigabyte, and terabyte respectively. By default on Hyak Klone, each processor has 4GB of memory. You can change the total memory limit if necessary. Do NOT take the entire memory allotted to the account.
Example of ~/.config/warpy_user_config.py:
'''
@brief User configuration file for WARPy.
'''
'''
@brief Configuration file for WARPy
'''
import os
'''
@brief True to delete simulation directory before running a new sim. Note
that this has no effect if trying to run an existing input file.
'''
'''
@brief True to copy warpy input file to output directory. By default is True.
'''
'''
@brief True to write git commit no. and branch status to output directory.
Requires the GitPython package. By default is False.
'''
'''
@brief Number of instances of WARPXM to run or 0 to not run under MPI.
'''
num_procs = 40*2
'''
@brief WARPXM executable
'''
'''
@brief mpirun executable. Note: for some reason specifying an absolute path for mpirun
seems to segfault OpenCL...
'''
'''
@brief Base directory to look for meshes in.
Paths are relative to the current working directory (if given a relative path).
'''
'''
@brief environment variables to set only for running WARPXM
right now used for disabling multi-threaded blas (clashes with MPI)
'''
'''
@brief Extra arguments to pass to mpirun
'''
'''
@brief Name of queue to submit job to
'''
queue = 'compute'
'''
@brief Name of account to charge jobs against.
If account is an empty string, then account will be set equal to queue (default)
If account is set to None, then account is ignored
'''
account = 'aaplasma'
'''
@brief Walltime
'''
walltime = 3600*24*3
'''
@brief Memory limit allowed for this run. Note: this only takes effect if using the SLURM scheduler.
'''
'''
@brief template pbs file
'''
'''
@brief template slurm file
'''
def test_rundir(category, name):
    '''
    @brief Function for computing the directory to use for running the simulation.
    @param category category of the simulation
    @param name test case name
    @return The path where the simulation should be run. This is relative to the current working directory (if a relative path is returned).
    '''
    return os.path.join('/gscratch/aaplasma/your_uw_netID/warpxm_run', category, name)