nekRS
Overview
nekRS is a fast and scalable computational fluid dynamics (CFD) software package targeting massively parallel computers. It is based on the high-order spectral element method and is capable of solving incompressible and low-Mach-number fluid flow problems. nekRS uses the OCCA portability layer to offload compute kernels to GPU devices.
For details about the code and its usage, see the nekRS home page. This page provides information specific to running on Polaris at the ALCF.
Using nekRS at ALCF
ALCF provides assistance with build instructions, compiling executables, and submitting jobs, and can provide prebuilt binaries upon request. For questions, contact us at support@alcf.anl.gov.
How to Obtain the Code
nekRS is an open-source code and can be downloaded from the website. Alternatively, users can clone the nekRS GitHub repository. We recommend the next branch, as it is the most actively developed branch and includes some of the latest features, such as the in-situ visualization capability.
The rest of this documentation is based on building and running with the next branch. Users interested in running the default master branch can contact support@alcf.anl.gov for additional support.
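For example, to clone the next branch directly (the upstream repository is assumed to be github.com/Nek5000/nekRS):
git clone -b next https://github.com/Nek5000/nekRS.git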
Building on Polaris
nekRS uses CMake to build and install the software package. After nekRS has been downloaded or cloned on an ALCF filesystem, users should see a directory named nekRS. Inside this directory is a file named nrsconfig that can be used to configure and customize the CMake build options.
The user should remove previous build and installation directories whenever there is an update.
Building using GNU compilers
The following modules are to be loaded for this particular build.
module restore
module load craype-accel-nvidia80
module swap PrgEnv-nvhpc PrgEnv-gnu
module use /soft/modulefiles
module load cudatoolkit-standalone/12.4.0
module load spack-pe-base cmake
To build and install the code run:
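A representative invocation is sketched below; it assumes nrsconfig honors the standard CC/CXX/FC variables and that the Cray compiler wrappers should be used (check the script itself for the options it supports):
CC=cc CXX=CC FC=ftn ./nrsconfig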
During the installation process, you will be prompted to verify the configuration options. If everything was done correctly, you should see the correct compilers and Default backend: CUDA in the Summary section of the output. If you see this, press Enter to continue with the build and installation process.
After installation, execute the following commands to set up the environment.
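The exact lines depend on your install location; the sketch below assumes the default prefix $HOME/.local/nekrs, the same default used by the run script later on this page:
export NEKRS_HOME=$HOME/.local/nekrs
export PATH=$NEKRS_HOME/bin:$PATH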
Alternatively, you may add the above lines to your $HOME/.bashrc and type source $HOME/.bashrc in the current terminal window.
Building using NVHPC compilers
The following modules are to be loaded for this particular build. The initial module restore simply resets the default environment as the starting point.
module restore
module load craype-accel-nvidia80
module use /soft/modulefiles
module load spack-pe-base cmake
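The configure and install steps mirror the GNU build; a sketch, assuming the same nrsconfig workflow with the Cray compiler wrappers (which map to the NVHPC compilers under PrgEnv-nvhpc):
CC=cc CXX=CC FC=ftn ./nrsconfig
As before, verify the compilers and Default backend: CUDA in the Summary section before confirming.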
Running Jobs on Polaris
An example submission script for running a 2-node nekRS job is shown below. Additional information on nekRS input files and application setup options is described here. The script is executed as follows:
NEKRS_HOME=<install path> PROJ_ID=<project> QUEUE=<queue> ./run.sh <casename> <number of compute nodes> <hh:mm:ss>
Users can copy the script below into a file run.sh and execute it using the command above.
#!/bin/bash
: ${PROJ_ID:=""}
: ${QUEUE:="prod"} # debug, debug-scaling, prod https://docs.alcf.anl.gov/running-jobs/job-and-queue-scheduling/
: ${NEKRS_HOME:="$HOME/.local/nekrs"}
: ${NEKRS_CACHE_BCAST:=1}
: ${NEKRS_SKIP_BUILD_ONLY:=0}
if [ $# -ne 3 ]; then
echo "usage: [PROJ_ID] [QUEUE] $0 <casename> <number of compute nodes> <hh:mm:ss>"
exit 0
fi
if [ -z "$PROJ_ID" ]; then
echo "ERROR: PROJ_ID is empty"
exit 1
fi
if [ -z "$QUEUE" ]; then
echo "ERROR: QUEUE is empty"
exit 1
fi
bin=${NEKRS_HOME}/bin/nekrs
case=$1
nodes=$2
gpu_per_node=4
cores_per_numa=8
let nn=$nodes*$gpu_per_node
let ntasks=nn
time=$3
backend=CUDA
NEKRS_GPU_MPI=1
if [ ! -f $bin ]; then
echo "Cannot find" $bin
exit 1
fi
if [ ! -f $case.par ]; then
echo "Cannot find" $case.par
exit 1
fi
if [ ! -f $case.udf ]; then
echo "Cannot find" $case.udf
exit 1
fi
if [ ! -f $case.re2 ]; then
echo "Cannot find" $case.re2
exit 1
fi
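# Lustre striping hints for large checkpoint I/O: 16 MiB stripe size and
# roughly one stripe target per two nodes, capped at 128 (passed to MPI-IO
# via MPICH_MPIIO_HINTS below)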
striping_unit=16777216
max_striping_factor=128
let striping_factor=$nodes/2
if [ $striping_factor -gt $max_striping_factor ]; then
striping_factor=$max_striping_factor
fi
if [ $striping_factor -lt 1 ]; then
striping_factor=1
fi
MPICH_MPIIO_HINTS="*:striping_unit=${striping_unit}:striping_factor=${striping_factor}:romio_cb_write=enable:romio_ds_write=disable:romio_no_indep_rw=true"
# generate the PBS batch script
SFILE=s.bin
echo "#!/bin/bash" > $SFILE
echo "#PBS -A $PROJ_ID" >>$SFILE
echo "#PBS -N nekRS_$case" >>$SFILE
echo "#PBS -q $QUEUE" >>$SFILE
echo "#PBS -l walltime=$time" >>$SFILE
echo "#PBS -l filesystems=home:eagle" >>$SFILE
echo "#PBS -l select=$nodes:system=polaris" >>$SFILE
echo "#PBS -l place=scatter" >>$SFILE
echo "#PBS -k doe" >>$SFILE #write directly to the destination, doe=direct, output, error
echo "#PBS -j eo" >>$SFILE #oe=merge stdout/stderr to stdout
# job to "run" from your submission directory
echo "cd \$PBS_O_WORKDIR" >> $SFILE
echo "module use /soft/modulefiles" >> $SFILE
echo "module use /opt/cray/pe/lmod/modulefiles/mix_compilers" >> $SFILE
echo "module load libfabric" >> $SFILE
echo "module load cpe-cuda" >> $SFILE
echo "module load PrgEnv-gnu" >> $SFILE
echo "module load nvhpc-mixed" >> $SFILE
echo "module load cmake" >> $SFILE
echo "module list" >> $SFILE
echo "nvidia-smi" >> $SFILE
echo "ulimit -s unlimited " >>$SFILE
echo "export NEKRS_HOME=$NEKRS_HOME" >>$SFILE
echo "export NEKRS_GPU_MPI=$NEKRS_GPU_MPI" >>$SFILE
echo "export MPICH_MPIIO_HINTS=$MPICH_MPIIO_HINTS" >>$SFILE
echo "export MPICH_MPIIO_STATS=1" >>$SFILE
echo "export NEKRS_CACHE_BCAST=$NEKRS_CACHE_BCAST" >>$SFILE
echo "export NEKRS_LOCAL_TMP_DIR=/local/scratch" >>$SFILE
echo "export MPICH_GPU_SUPPORT_ENABLED=1" >> $SFILE
echo "export MPICH_OFI_NIC_POLICY=NUMA" >> $SFILE
# https://github.com/Nek5000/Nek5000/issues/759
echo "export FI_OFI_RXM_RX_SIZE=32768" >> $SFILE # >=lelt, large mpi-messsage for restart
if [ $NEKRS_SKIP_BUILD_ONLY -eq 0 ]; then
echo "mpiexec -n 1 $bin --setup ${case} --backend ${backend} --device-id 0 --build-only $nn" >>$SFILE
fi
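# generate a small helper script that pins each MPI rank to a single GPU;
# local ranks are mapped to GPUs in reverse order (rank 0 -> GPU 3, etc.)
# to align with the CPU-GPU NUMA affinity on Polaris nodes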
CMD=.lhelper
echo "#!/bin/bash" >$CMD
echo "gpu_id=\$((${gpu_per_node} - 1 - \${PMI_LOCAL_RANK} % ${gpu_per_node}))" >>$CMD
echo "export CUDA_VISIBLE_DEVICES=\$gpu_id" >>$CMD
echo "$bin --setup ${case} --backend ${backend} --device-id 0" >>$CMD
chmod 755 $CMD
echo "mpiexec -n $nn -ppn $gpu_per_node -d $cores_per_numa --cpu-bind depth ./$CMD" >>$SFILE
qsub -q $QUEUE $SFILE
# clean-up
#rm -rf $SFILE .lhelper
Just-in-time (JIT) compilation
nekRS uses the OCCA library to translate, compile, and run GPU-targeted functions and kernels. Some useful notes on the cached object files can be found here.
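The script above uses the --build-only flag to trigger this JIT compilation ahead of the production run; a minimal sketch of a standalone precompile step, assuming a case named mycase to be run on 8 MPI ranks:
mpiexec -n 1 $NEKRS_HOME/bin/nekrs --setup mycase --backend CUDA --device-id 0 --build-only 8
With NEKRS_CACHE_BCAST=1 (also set in the script above), the resulting cache is broadcast to node-local storage (NEKRS_LOCAL_TMP_DIR) at startup.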
Discussion Group
Users can visit the GitHub Discussions page to seek help, find solutions, share ideas, and follow discussions on several application-specific topics.