The ISCA 2020 paper describes the goals of Accel-Sim and introduces the tool.
This README provides tutorial-like details on how to use the Accel-Sim framework.
If you use any component of Accel-Sim, please cite:
Mahmoud Khairy, Zhensheng Shen, Tor M. Aamodt, Timothy G. Rogers,
Accel-Sim: An Extensible Simulation Framework for Validated GPU Modeling,
in 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA)
This repository also includes AccelWattch: A Power Modeling Framework for Modern GPUs. The MICRO 2021 paper introduces AccelWattch. For detailed information on the various AccelWattch components, see our AccelWattch MICRO’21 Artifact Manual. For information on just running AccelWattch, see the AccelWattch Overview section in this README.
If you use any component of AccelWattch, please cite:
Vijay Kandiah, Scott Peverelle, Mahmoud Khairy, Amogh Manjunath, Junrui Pan, Timothy G. Rogers, Tor Aamodt, Nikos Hardavellas,
AccelWattch: A Power Modeling Framework for Modern GPUs,
in 2021 IEEE/ACM International Symposium on Microarchitecture (MICRO)
This package is meant to be run on a modern Linux distribution.
A Docker image that works with this repo can be found here.
The Dockerfile used to build this image can be found here; it is built on top of nvidia/cuda:12.8.0-cudnn-devel-ubuntu24.04.
To build on a local machine, install the following packages along with the CUDA toolkit:
# Assuming running on Ubuntu 24.04 and installing CUDA 12.8
sudo apt-get install -y wget build-essential xutils-dev bison zlib1g-dev flex \
    libglu1-mesa-dev git g++ libssl-dev libxml2-dev libboost-all-dev \
    vim python-setuptools python3-pip
pip3 install pyyaml plotly psutil
wget https://developer.download.nvidia.com/compute/cuda/12.8.1/local_installers/cuda_12.8.1_570.124.06_linux.run
sh cuda_12.8.1_570.124.06_linux.run --silent --toolkit
rm cuda_12.8.1_570.124.06_linux.run
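Before building, it can help to confirm that the tools installed above are actually reachable on PATH. The following is a minimal Python sketch and is not part of Accel-Sim: the helper name `missing_tools` and the list of commands it checks are assumptions chosen for illustration, not an official requirements list.

```python
import shutil

# Illustrative subset of commands provided by the packages and toolkit above.
# This list is an assumption, not an official Accel-Sim requirement list.
DEFAULT_TOOLS = ("gcc", "g++", "make", "bison", "flex", "git", "nvcc")


def missing_tools(tools=DEFAULT_TOOLS, which=shutil.which):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All checked tools were found on PATH.")
```

If `nvcc` is reported as missing right after the toolkit install, the CUDA toolkit's bin directory may simply not have been added to PATH yet.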
The code for the Accel-Sim and AccelWattch frameworks is in this repo. Accel-Sim 1.0 uses the
GPGPU-Sim 4.0 performance model, which was released as part of the original
Accel-Sim paper. Building the trace-based Accel-Sim will pull the right version of
GPGPU-Sim 4.0 and the AccelWattch power model to use in Accel-Sim. AccelWattch replaces the GPUWattch power model in GPGPU-Sim 4.0.
There is an additional repo where we have collected a set of common GPU applications and a common infrastructure for building
them with different versions of CUDA. If you use or extend this app framework, Accel-Sim becomes easily usable
with a few simple command lines. The instructions in this README will take you through how to use Accel-Sim with
the apps from this collection as well as on its own, with your own apps.
AccelWattch microbenchmarks and AccelWattch validation set benchmarks are also included. For more information on these benchmarks, please look at our MICRO 2021 paper and AccelWattch MICRO’21 Artifact Manual.
Note that all the Python scripts in the following sections give more detailed explanations of their options when run with --help.
The tracer is an NVBit tool for generating SASS traces from CUDA applications. Code for the tool lives in ./util/tracer_nvbit/. To make the tool:
A simple example traces the short-running apps that get run in our Travis regressions:
To extend the tracer, use other apps, and understand exactly what is going on, read this.
For convenience, we have included a repository of pre-traced applications. To get all those traces, simply run:
./get-accel-sim-traces.py
and follow the instructions.
The Accel-Sim front-end consumes SASS traces and feeds them into a performance model. The initial release of Accel-Sim coincides with the release of GPGPU-Sim 4.0, which acts as the detailed performance model. To build the Accel-Sim simulator that uses the traces, do the following:
pip3 install -r requirements.txt
source ./gpu-simulator/setup_environment.sh
# Build with make
make -j -C ./gpu-simulator/
# Build with CMake
cmake -S ./gpu-simulator/ -B ./gpu-simulator/build
cmake --build ./gpu-simulator/build -j8
cmake --install ./gpu-simulator/build
This will produce an executable in:
./gpu-simulator/bin/release/accel-sim.out
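A quick way to confirm that the build produced the executable is to check for it at the path above. The following is a minimal Python sketch under that assumption; the helper name `simulator_binary` is hypothetical and not part of the Accel-Sim scripts.

```python
import os

# Relative location of the release binary, taken from the path shown above.
RELEASE_BINARY = os.path.join("gpu-simulator", "bin", "release", "accel-sim.out")


def simulator_binary(repo_root="."):
    """Return the path of the built simulator, or None if it is missing
    or not marked executable."""
    path = os.path.join(repo_root, RELEASE_BINARY)
    if os.path.isfile(path) and os.access(path, os.X_OK):
        return path
    return None


if __name__ == "__main__":
    found = simulator_binary()
    print(found if found else "accel-sim.out not found; the build may have failed")
```

The check only confirms that the file exists and is executable; it does not verify that the simulator runs.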
Running the simple example in the tracer section:
./util/job_launching/run_simulations.py -B rodinia_2.0-ft -C QV100-PTX -N myTest-PTX
You can monitor the tests using the same run name (-N) that you launched them with:
./util/job_launching/monitor_func_test.py -v -N myTest-PTX
After the jobs finish, you can collect all the stats using:
./util/job_launching/get_stats.py -N myTest-PTX | tee stats.csv
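The launch, monitor, and stats steps are tied together by the run name: the value given to -N at launch is the one to reuse when monitoring and collecting stats. The following Python sketch builds all three command lines from a single name so they cannot drift apart. The helper `workflow_commands` is hypothetical; only the scripts and flags shown above are used, and the sketch prints the commands rather than running them.

```python
import shlex


def workflow_commands(benchmark, config, run_name):
    """Build the launch, monitor, and stats commands for one named run."""
    launch = ["./util/job_launching/run_simulations.py",
              "-B", benchmark, "-C", config, "-N", run_name]
    monitor = ["./util/job_launching/monitor_func_test.py",
               "-v", "-N", run_name]
    stats = ["./util/job_launching/get_stats.py", "-N", run_name]
    return launch, monitor, stats


if __name__ == "__main__":
    # Print shell-ready command lines for the running example.
    for cmd in workflow_commands("rodinia_2.0-ft", "QV100-PTX", "myTest-PTX"):
        print(shlex.join(cmd))
```

Redirecting the stats output to a file (for example with `tee stats.csv`, as above) is left to the shell.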
If you want to run the accel-sim.out executable directly for a specific workload, you can use:
./gpu-simulator/bin/release/accel-sim.out -trace ./hw_run/rodinia_2.0-ft/9.1/backprop-rodinia-2.0-ft/4096___data_result_4096_txt/traces/kernelslist.g -config ./gpu-simulator/gpgpu-sim/configs/tested-cfgs/SM7_QV100/gpgpusim.config -config ./gpu-simulator/configs/tested-cfgs/SM7_QV100/trace.config
However, we encourage you to use our workload launch manager ‘run_simulations’ script as shown above, which will greatly simplify the simulation process and increase productivity.
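For cases where the direct invocation is still needed, its shape is fixed: one -trace argument pointing at a kernelslist.g file, followed by two -config arguments (the GPGPU-Sim config, then the trace config). The following Python sketch assembles that argument list. The function `accel_sim_argv` and its parameter names are hypothetical, and the directory layout is inferred only from the example paths above, so treat it as an illustration rather than a supported interface.

```python
import posixpath


def accel_sim_argv(trace_dir, gpu_config_name, repo_root="."):
    """Assemble the accel-sim.out argument list for one traced workload.

    trace_dir:       directory that contains kernelslist.g
    gpu_config_name: folder name under tested-cfgs, e.g. "SM7_QV100"
    repo_root:       checkout location of the Accel-Sim repo (assumed layout)
    """
    sim = posixpath.join(repo_root, "gpu-simulator")
    return [
        posixpath.join(sim, "bin", "release", "accel-sim.out"),
        "-trace", posixpath.join(trace_dir, "kernelslist.g"),
        # GPGPU-Sim performance-model configuration
        "-config", posixpath.join(sim, "gpgpu-sim", "configs", "tested-cfgs",
                                  gpu_config_name, "gpgpusim.config"),
        # Trace-driven front-end configuration
        "-config", posixpath.join(sim, "configs", "tested-cfgs",
                                  gpu_config_name, "trace.config"),
    ]
```

The returned list can be passed to `subprocess.run` from a shell where `setup_environment.sh` has already been sourced; whether the simulator requires that environment at run time is an assumption here.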
To understand what is going on and how to just run the simulator in isolation without the framework, read this.
To better understand the Accel-Sim front-end and the interface with GPGPU-Sim, read this.
The correlator is a tool that matches, plots, and correlates statistics from the performance model with real hardware statistics generated by profiling tools. To use the correlator, you must first generate hardware output and simulation statistics. To generate output from the GPU, use the scripts in ./util/hw_stats.
For example, to generate the profiler numbers for the short-running apps in our running example, do the following:
Note: this step assumes you have already built the apps using the instructions from the simple example in the tracer section.
./util/hw_stats/run_hw.py -B rodinia_2.0-ft
Note: Different cards support different profilers. By default – this script will use nvprof. However, you can use nsight-cli instea