Advisor
References
Intel Advisor Performance Optimization Cookbook
Introduction
Intel® Advisor is a design and analysis tool for developing performant code. The tool supports C, C++, Fortran, SYCL, OpenMP, OpenCL™ code, and Python. It helps with the following:
- Performant CPU Code: Design your application for efficient threading, vectorization, and memory use.
- Efficient GPU Offload: Identify parts of the code that can be profitably offloaded. Optimize the code for compute and memory.
- Flow Graph Design and Analysis: Create, visualize, and analyze task and dependency computation for heterogeneous algorithms.
Roofline and Performance Insights for GPUs
Get actionable advice for performant GPU code. In addition to the Roofline Analysis for kernels, you can:
- Get specific, actionable recommendations to design code that runs optimally on GPUs.
- See the CPU and GPU code performance side-by-side with a unified dashboard.
- Discover GPU application performance characterization, such as bandwidth sensitivity, instruction mix, and cache-line use.
Offload Modeling
Understand if your code benefits from GPU porting or how much performance acceleration your GPU code can get from moving to a next-generation GPU. You can:
- Pinpoint offload opportunities where it pays off the most.
- Project the performance on a GPU.
- Identify bottlenecks and potential performance gains.
- Get guidance for optimizing a data transfer between host and target devices.
A quick instruction for Advisor roofline analysis on Intel GPUs
Step1: Setting the environments
Step 2-a: Collecting the GPU Roofline data on a single GPU (Survey analysis and Trip Count with FLOP analysis)
Step 2-b: Collecting the GPU Roofline data on one of MPI ranks(Survey analysis and Trip Count with FLOP analysis)
$ mpirun -n 1 gpu_tile_compact.sh advisor --collect=survey --profile-gpu --project-dir=$PRJ -- <your_executable> <your_arguments> : -n 1 gpu_tile_compact.sh <your_executable> <your_arguments>
$ mpirun -n 1 gpu_tile_compact.sh advisor --collect=tripcounts --profile-gpu --flop --no-trip-counts -- project-dir=$PRJ -- <your_executable> <your_arguments> : -n 1 gpu_tile_compact.sh <your_executable> <your_arguments>
Step 3-a: Generate a GPU Roofline report, and then review the HTML report
Step 3-b: Download the project folder to your local system and open it with the stand-alone Advisor Client
Simple examples
Advisor roofline analysis for one MPI rank out of 12 MPI ranks
$ mpiexec -n 1 gpu_tile_compact.sh advisor --collect=survey --profile-gpu --project-dir=Advisor_results -- ./Comp_GeoSeries_omp_mpicxx_DP 2048 1000 : -n 11 gpu_tile_compact.sh ./Comp_GeoSeries_omp_mpicxx_DP 2048 1000
$ mpiexec -n 1 gpu_tile_compact.sh advisor --collect=tripcounts --profile-gpu --flop --no-trip-counts --project- dir=Advisor_results -- ./Comp_GeoSeries_omp_mpicxx_DP 2048 1000 : -n 11 gpu_tile_compact.sh ./Comp_GeoSeries_omp_mpicxx_DP 2048 1000
$ advisor --report=all --project-dir=Advisor_results --report-output=Advisor_results/roofline_all.html