The Adaptable Input/Output (I/O) System (ADIOS2) is a framework for I/O and streaming of scientific data developed as part of the U.S. DOE Exascale Computing Project. ADIOS2 conveniently provides C, C++, Fortran, and Python APIs for traditional file system I/O, as well as APIs for transporting data between applications running concurrently on HPC systems. Data transport with ADIOS2 can be performed via the file system, wide-area networks (WAN), remote direct memory access (RDMA), or MPI to construct a variety of workflows, such as in-situ (or in-transit) visualization, data analysis, and ML training and inference from ongoing simulations.
More information about ADIOS2 is available on the ADIOS2 GitHub page and in the ADIOS2 documentation.
Pre-built modules are available to all users, enabling access to the latest version of ADIOS2 (v2.10). These modules can be displayed with module avail adios2 and comprise a CPU-only build and a SYCL build of the library, with the SYCL build being the default. Note that both ADIOS2 modules also load a Spack installation of Python 3.10 with the numpy, mpi4py, and adios2 packages. For instance, the default SYCL build can be loaded by executing
module load adios2
A custom build of ADIOS2 is also possible on Aurora. In this case, we recommend users start with the following install script for a base build:
The Python bindings for ADIOS2 can be built by setting ADIOS2_USE_Python=ON; however, this requires a Python 3 installation to be available when ADIOS2 is configured. We recommend that users load the Python AI/ML module with module load frameworks and build ADIOS2 in that environment. To use the resulting adios2 package, users must then extend their Python path with export PYTHONPATH=$PYTHONPATH:/path/to/adios2-build/install/lib/python3.10/site-packages. Alternatively, users can use a custom Python installation, but note that ADIOS2 also requires numpy and mpi4py.
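After extending PYTHONPATH, one quick way to confirm that the interpreter can locate the required packages is to ask the standard library where it would import them from. The sketch below is a generic importlib check, not an ADIOS2 utility; the helper name locate_packages is made up for illustration.

```python
import importlib.util


def locate_packages(names):
    """Map each package name to the location it would be imported from, or None if missing."""
    found = {}
    for name in names:
        spec = importlib.util.find_spec(name)
        if spec is None:
            found[name] = None
        else:
            # Namespace packages have no single origin file.
            found[name] = spec.origin or "<no origin>"
    return found


if __name__ == "__main__":
    for name, origin in locate_packages(["adios2", "numpy", "mpi4py"]).items():
        print(f"{name}: {origin or 'NOT FOUND on the current Python path'}")
```

If adios2 is reported as not found, the site-packages directory added to PYTHONPATH most likely does not match the install prefix or Python version of the build.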
A full list of CMake options is available in the documentation.
Here we show a basic example of using ADIOS2 to stream data between a C++ data producer (e.g., a simulation) and a Python data consumer (e.g., a data analysis or ML component). Both applications are MPI programs. In this simple workflow, each application iterates over the same number of workflow steps; at each step, the producer writes data to the stream and the consumer reads it. The ADIOS2 IO engine is set to SST for data streaming, and the engine parameters are set so that the producer pauses execution until the consumer has read the data for a given step. This blocking behavior is not a requirement and can be changed with the RendezvousReaderCount, QueueFullPolicy, and QueueLimit parameters. More information on the SST engine can be found in the documentation as well as in the provided examples.
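The blocking behavior described above (a queue limit of one step with a "Block" policy) can be pictured as a bounded queue between two workers. The sketch below is only an analogy built on Python's standard queue and threading modules; it does not use ADIOS2 or SST, and the function name run_blocking_stream is illustrative.

```python
import queue
import threading


def run_blocking_stream(num_steps, queue_limit=1):
    """Mimic a producer that waits whenever `queue_limit` steps are still unread."""
    steps = queue.Queue(maxsize=queue_limit)  # bounded, playing the role of QueueLimit
    received = []

    def producer():
        for step in range(num_steps):
            steps.put(step)  # blocks while the queue is full (the "Block" policy)
        steps.put(None)  # sentinel marking the end of the stream

    def consumer():
        while True:
            step = steps.get()
            if step is None:
                break
            received.append(step)

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return received


if __name__ == "__main__":
    print(run_blocking_stream(3))
```

Because the producer waits instead of dropping data, the consumer sees every step in order. A "Discard" policy would instead let the producer continue at the cost of losing steps the consumer has not yet read.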
#include <iostream>
#include <vector>
#include <adios2.h>
#include <mpi.h>
#include <unistd.h>

template <class T>
void PrintData(const std::vector<T> &data, const int rank, const size_t step)
{
    std::cout << "\tProducer Rank[" << rank << "]: send data [";
    for (size_t i = 0; i < data.size(); ++i)
    {
        std::cout << data[i] << " ";
    }
    std::cout << "]" << std::endl;
}

int main(int argc, char *argv[])
{
    // MPI_THREAD_MULTIPLE is only required if you enable the SST MPI_DP
    int rank, size, provide;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provide);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Create a new communicator
    int color = 3130, arank, asize;
    MPI_Comm app_comm;
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &app_comm);
    MPI_Comm_rank(app_comm, &arank);
    MPI_Comm_size(app_comm, &asize);

    // ADIOS IO Setup
    adios2::ADIOS adios(app_comm);
    adios2::IO sstIO = adios.DeclareIO("myIO");
    sstIO.SetEngine("Sst");
    adios2::Params params;
    params["RendezvousReaderCount"] = "1";
    params["QueueFullPolicy"] = "Block";
    params["QueueLimit"] = "1";
    params["DataTransport"] = "RDMA";
    params["OpenTimeoutSecs"] = "600";
    sstIO.SetParameters(params);

    // Setup the data to send
    std::vector<float> myArray = {0.0, 1.0, 2.0, 3.0, 4.0};
    const std::size_t Nx = myArray.size();
    for (size_t k = 0; k < myArray.size(); ++k)
    {
        myArray[k] = myArray[k] + static_cast<float>(Nx * arank);
    }
    const float increment = (float)(Nx * asize * 1.0);

    // Define variable and local size
    auto bpFloats = sstIO.DefineVariable<float>("y", {asize * Nx}, {arank * Nx}, {Nx});

    int workflow_steps = 2;
    adios2::Engine sstWriter = sstIO.Open("data_stream", adios2::Mode::Write);
    for (size_t i = 0; i < workflow_steps; ++i)
    {
        sleep(3);
        if (arank == 0)
            std::cout << "\n Iteration " << i << std::endl;
        sstWriter.BeginStep();
        sstWriter.Put<float>(bpFloats, myArray.data());
        PrintData(myArray, rank, i);
        sstWriter.EndStep();
        for (size_t k = 0; k < myArray.size(); ++k)
        {
            myArray[k] += increment;
        }
    }
    sstWriter.Close();
    MPI_Finalize();
    return 0;
}
from mpi4py import MPI
import numpy as np
from adios2 import Stream, Adios, bindings

# MPI Init
COMM = MPI.COMM_WORLD
RANK = COMM.Get_rank()
SIZE = COMM.Get_size()

if __name__ == '__main__':
    # Create new communicator (needed for launch on MPMD mode)
    color = 3230
    app_comm = COMM.Split(color, RANK)
    asize = app_comm.Get_size()
    arank = app_comm.Get_rank()
    adios = Adios(app_comm)

    # ADIOS IO Setup
    io = adios.declare_io("myIO")
    io.set_engine("SST")
    parameters = {
        'RendezvousReaderCount': '1',  # options: 1 for sync, 0 for async
        'QueueFullPolicy': 'Block',    # options: Block, Discard
        'QueueLimit': '1',             # options: 0 for no limit
        'DataTransport': 'RDMA',       # options: MPI, WAN, UCX, RDMA
        'OpenTimeoutSecs': '600',      # number of seconds SST is to wait for a peer connection on Open()
    }
    io.set_parameters(parameters)

    # Loop over workflow steps and read data at each step
    workflow_steps = 2
    with Stream(io, "data_stream", "r", app_comm) as stream:
        for istep in range(workflow_steps):
            stream.begin_step()
            var = stream.inquire_variable("y")
            shape = var.shape()
            count = int(shape[0] / asize)
            start = count * arank
            if arank == asize - 1:
                count += shape[0] % asize
            data = stream.read("y", [start], [count])
            print(f"\tConsumer [{arank}]: received data {data}", flush=True)
            stream.end_step()
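To make the data layout in the two programs concrete, the following sketch reproduces their index arithmetic in plain Python, without MPI or ADIOS2. It mirrors how each producer rank fills its five values at a given step and how each consumer rank computes its start and count over the global array. The function names are illustrative and do not appear in the examples above.

```python
def producer_chunk(rank, nranks, step, nx=5):
    """Values written by one producer rank at a given step (same arithmetic as the C++ code)."""
    increment = nx * nranks  # added to every element after each step
    return [float(k + nx * rank + step * increment) for k in range(nx)]


def consumer_selection(global_size, rank, nranks):
    """Return (start, count) for one consumer rank; the last rank absorbs the remainder."""
    count = global_size // nranks
    start = count * rank
    if rank == nranks - 1:
        count += global_size % nranks
    return start, count


def simulate_step(step, producer_ranks, consumer_ranks, nx=5):
    """Assemble the global array for one step and slice it per consumer rank."""
    global_array = []
    for rank in range(producer_ranks):
        global_array.extend(producer_chunk(rank, producer_ranks, step, nx))
    slices = []
    for rank in range(consumer_ranks):
        start, count = consumer_selection(len(global_array), rank, consumer_ranks)
        slices.append(global_array[start:start + count])
    return slices


if __name__ == "__main__":
    for step in range(2):
        print(step, simulate_step(step, producer_ranks=2, consumer_ranks=2))
```

With two producer ranks and two consumer ranks, each consumer rank receives exactly the five values written by the producer rank with the same index. With an uneven split, such as ten values over three consumer ranks, the first two ranks read three values each and the last rank reads four.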
To build the C++ producer, use the following CMake file:
cmake_minimum_required(VERSION 3.12)
project(ADIOS2HelloExample)

if(NOT TARGET adios2_core)
  set(_components CXX)

  find_package(MPI COMPONENTS C)
  if(MPI_FOUND)
    # Workaround for various MPI implementations forcing the link of C++ bindings
    add_definitions(-DOMPI_SKIP_MPICXX -DMPICH_SKIP_MPICXX)

    list(APPEND _components MPI)
  endif()

  find_package(ADIOS2 REQUIRED COMPONENTS ${_components})
endif()

add_executable(producer producer.cpp)
target_link_libraries(producer adios2::cxx11_mpi MPI::MPI_CXX)
install(TARGETS producer RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR})
and execute the following commands:
module load adios2
module load cmake
cmake ./
make
The example can be run from an interactive session with the following script, which runs the producer and consumer with two ranks per node and places the producer on socket 0 and the consumer on socket 1 of each node. The producer and consumer can also be run on separate nodes by specifying the --hostfile or --hostlist in the mpiexec commands.
#!/bin/bash

module load adios2

export OMP_PROC_BIND=spread
export OMP_PLACES=threads

NODES=$(cat $PBS_NODEFILE | wc -l)
PROCS_PER_NODE=2
PROCS=$((NODES * PROCS_PER_NODE))

# Run the C++ producer in the background and the Python consumer in the foreground
mpiexec -n $PROCS --ppn $PROCS_PER_NODE --cpu-bind list:1:2 ./producer &
mpiexec -n $PROCS --ppn $PROCS_PER_NODE --cpu-bind list:53:54 python consumer.py
wait
Selecting the SST Data Transport Plane
The SST data transport plane can be selected with the parameter DataTransport. We recommend using RDMA; however, note that it requires running the applications on more than one node. The WAN data plane can also be used, but it may result in slower data transfer performance at scale. The MPI data plane is currently not available, but we are working on resolving the issue with the ADIOS2 team.
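One way to keep the data plane choice in a single place is to build the SST parameter dictionary from a small helper, as in the sketch below. The helper sst_parameters is hypothetical and not part of ADIOS2; the accepted values are simply those listed in the comment of the consumer example, and the remaining parameters repeat the values used there.

```python
VALID_TRANSPORTS = ("MPI", "WAN", "UCX", "RDMA")  # options listed in the consumer example


def sst_parameters(data_transport="RDMA", open_timeout_secs=600):
    """Build the SST parameter dictionary from the examples for a chosen data plane."""
    if data_transport not in VALID_TRANSPORTS:
        raise ValueError(
            f"unknown DataTransport {data_transport!r}; expected one of {VALID_TRANSPORTS}"
        )
    return {
        "RendezvousReaderCount": "1",
        "QueueFullPolicy": "Block",
        "QueueLimit": "1",
        "DataTransport": data_transport,
        "OpenTimeoutSecs": str(open_timeout_secs),
    }


if __name__ == "__main__":
    # Example: switch to the WAN data plane, e.g. for a single-node test
    print(sst_parameters("WAN"))
```

In the Python consumer, the returned dictionary could be passed directly to io.set_parameters(...). The examples above set the same DataTransport value in the producer and the consumer.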