Getting Started on Aurora
Logging Into Aurora:
To log into Aurora:
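A minimal login sketch is below; the hostname `aurora.alcf.anl.gov` is an assumption to verify against your ALCF account details:

```shell
# Log in via SSH with your ALCF credentials
# (hostname assumed; replace <username> with your ALCF username).
ssh <username>@aurora.alcf.anl.gov
```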
Hardware Overview
An overview of the Aurora system including details on the compute node architecture is available on the Machine Overview page.
Compiling Applications
Users are encouraged to read through the Compiling and Linking Overview page and corresponding pages depending on the target compiler and programming model.
Autotools and CMake are available in the default Aurora Programming Environment (PE) and can be loaded via Lmod modules:
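As a sketch, the build tools can be loaded with Lmod; the exact module names and versions are assumptions to check with `module avail`:

```shell
# Load build tools from the default Aurora PE via Lmod
# (module names/versions assumed; confirm with `module avail`).
module load cmake
module load autoconf automake libtool   # autotools components
```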
Submitting and Running Jobs
Users are encouraged to read through the Running Jobs with PBS at the ALCF page for information on using the PBS scheduler and preparing job submission scripts. For Aurora-specific job documentation, refer to Running Jobs on Aurora.
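A minimal PBS job script sketch is shown below; the queue name, project name, filesystem list, and ranks-per-node values are assumptions to adjust for your allocation:

```shell
#!/bin/bash -l
# Minimal PBS job script sketch for Aurora
# (queue, project, filesystems, and rank counts are assumptions).
#PBS -l select=1:system=aurora
#PBS -l walltime=00:30:00
#PBS -q debug
#PBS -A YourProject
#PBS -l filesystems=home:flare

# Run from the directory the job was submitted from
cd ${PBS_O_WORKDIR}
mpiexec -n 12 --ppn 12 ./my_app
```

Submit with `qsub jobscript.sh` and monitor with `qstat -u $USER`.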
Early User Notes and Known Issues
- Hardware instabilities - possible frequent downtime
- Software instabilities - non-optimized compilers, libraries, and tools; frequent software updates
- Non-final configurations (storage, OS versions, etc.)
- Short notice for downtimes (scheduled downtimes are announced with 4-hour notice, but downtimes may sometimes occur with only an email notice). Notices go to the aurora-notify@alcf.anl.gov email list; all users with access are added to the list by default.
See Early User Notes and Known Issues for details.
Python on Aurora
Frameworks on Aurora can be loaded into a user's environment by loading the frameworks module as follows. The conda environment loaded with this module makes TensorFlow, Horovod, and PyTorch available with Intel extensions and optimizations. Note that there is a separate Python installation in spack-pe-gcc, which is used as a dependency of a number of Spack PE packages. Users will need to exercise caution when loading both frameworks and python from the Spack PE. For more details, please review Python on Aurora.
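As a sketch, the frameworks module named above can be loaded and the resulting Python inspected (paths and versions will vary by deployment):

```shell
# Load the AI/ML frameworks conda environment
module load frameworks
# Confirm which Python is now active (path and version will vary)
which python
python --version
```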
Software Environment
The Aurora Programming Environment (Aurora PE) provides the oneAPI SDK, MPICH, runtime libraries, and a suite of additional tools and libraries. The Aurora PE is available in the default environment and is accessible through modules. For example, tools and libraries like cmake, boost, and hdf5 are available in the default environment.

Additional software, such as kokkos, is installed in /soft and can be accessed by adding /soft/modulefiles to the module search path.
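A sketch of extending the module search path to pick up the /soft installs (the kokkos module name comes from the text above; exact names may differ):

```shell
# Add the /soft module tree to the search path, then load from it
module use /soft/modulefiles
module load kokkos
```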
Proxy
If the node you are on doesn’t have outbound network connectivity, add the following to your ~/.bash_profile
file to access the proxy host:
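A proxy configuration along these lines is commonly used on ALCF systems; the proxy hostname and port below are assumptions to verify against current ALCF documentation before use:

```shell
# Route outbound HTTP/HTTPS traffic through the ALCF proxy host
# (hostname and port are assumptions; confirm with ALCF docs).
export HTTP_PROXY="http://proxy.alcf.anl.gov:3128"
export HTTPS_PROXY="http://proxy.alcf.anl.gov:3128"
export http_proxy="http://proxy.alcf.anl.gov:3128"
export https_proxy="http://proxy.alcf.anl.gov:3128"
```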
File Systems and DAOS
Home and Project Directories
Home directories on Aurora are /home/username, available on login and compute nodes. This is provided from /lus/gecko/home. The default quota is 50 GB. Note that the bastions have a different /home, where the default quota is 500 MB.
Lustre project directories are under /lus/flare/projects. ALCF staff should use the /lus/flare/projects/Aurora_deployment project directory. ESP and ECP project members should use their corresponding project directories. The project name is similar to the name on Polaris with an _CNDA suffix (e.g., projectA_aesp_CNDA, CSC250ADABC_CNDA). The default quota is 1 TB. The project PI should email [email protected] if their project requires additional storage.
Note: The Project Lustre File system has changed from Gecko to Flare.
DAOS
The primary storage system on Aurora is not a file system, but rather an object store called the Distributed Asynchronous Object Store. This is a key-array based system embedded directly in the Slingshot fabric, which provides much faster I/O than conventional block-based parallel file systems such as Lustre (even those using non-spinning disk and/or burst buffers). Project PIs will have requested a storage pool on DAOS via INCITE/ALCC/DD allocation proposals.
Preproduction ESP and ECP Aurora project PIs should email [email protected] to request DAOS storage with the following information:
- Project name (e.g., FOO_aesp_CNDA)
- Storage capacity (for ESP projects, if this differs from the amount in the ESP proposal, please give a brief justification)
See DAOS Overview for more on using DAOS for I/O.
Lustre File Striping
In addition to the content above, see the Lustre File Striping Basics documentation.
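As a sketch of striping in practice, the standard `lfs` client commands can set and inspect a directory's stripe layout; the stripe count, stripe size, and directory path below are illustrative assumptions:

```shell
# Stripe new files in this directory across 8 OSTs with a 16 MiB
# stripe size (values illustrative; tune for your I/O pattern).
lfs setstripe -c 8 -S 16m /lus/flare/projects/MyProject/data
# Inspect the resulting layout
lfs getstripe /lus/flare/projects/MyProject/data
```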
Getting Assistance
Please direct all questions, requests, and feedback to [email protected].