MGPUSim is a high-flexibility, high-performance, high-accuracy GPU simulator. It models GPUs that run the AMD GCN3 instruction set. A key feature of MGPUSim is its support for multi-GPU simulation, although you can still use it for single-GPU architecture research.
- Install the most recent version of Go from golang.org.
- Clone this repository. The steps below assume it is cloned to `[mgpusim_home]`.
- Change your current directory to `[mgpusim_home]/samples/fir`.
- Compile the simulator together with the benchmark by running `go build`. The compiler will generate an executable called `fir` (on Linux or macOS) or `fir.exe` (on Windows).
- Run the simulation with `./fir -timing --report-all`.
- Check the generated `metrics.csv` file for high-level metrics output.
If a modification to Akita is required, you can clone Akita next to the MGPUSim directory on your system. Then, modify the `go.mod` file to include the following line:

    replace github.com/sarchlab/akita/v3 => ../akita

This line directs the Go compiler to use your local version of Akita rather than the official release.
| AMD APP SDK | DNN Mark | HeteroMark | Polybench | Rodinia | SHOC |
|---|---|---|---|---|---|
| Bitonic Sort | MaxPooling | AES | ATAX | Needleman-Wunsch | BFS |
| Fast Walsh Transform | ReLU | FIR | BICG | | FFT |
| Floyd-Warshall | | KMeans | | | SPMV |
| Matrix Multiplication | | PageRank | | | Stencil2D |
| Matrix Transpose | | | | | |
| NBody | | | | | |
| Simple Convolution | | | | | |
You can run a simulation with the `--report-all` argument to enable all the performance metrics. The reported metrics are:
- Total execution time
- Total kernel time
- Per-GPU kernel time
- Instruction count on each Compute Unit
- Average request latency on all the cache components
- Number of read-misses, read-mshr-hits, read-hits, write-misses, write-mshr-hits, and write-hits on all the cache components
- Number of incoming transactions and outgoing transactions on all the RDMA components
- Number of transactions on each DRAM controller
- Create a new Git repository. Typically, we create one repository for each project, which may contain multiple experiments.
- Create a folder in your repository for each experiment. Run `go mod init [git repo path]/[directory_name]` to initialize the folder as a new Go module. For example, if your Git repository is hosted at `https://github.com/syifan/fancy_project` and your experiment folder is named `exp1`, your module path should be `github.com/syifan/fancy_project/exp1`.
- Copy all the files under the directory `samples/experiment` to your experiment folder. In the `main.go` file, change the benchmark and the problem size to run, or use a command-line argument to select which benchmark to run. The files `runner.go`, `platform.go`, `r9nano.go`, and `shaderarray.go` serve as configuration files, so change them according to your needs.
- It is also possible to modify an existing component or add a new one. First, copy the folder that contains the component you want to modify to your repository. Then, modify the configuration scripts to link the system with your new component. You can add some print statements to check that your local copy of the component is used. Finally, start modifying the component code.
- If you find any bug related to the simulator (e.g., simulator is not accurately modeling some behavior or the simulator is not getting the correct emulation result), please raise an issue in the issue tab.
- If you want a new feature (e.g., you need to implement some new instructions or you want to model some new components), please also raise an issue.
- If you want to add a feature or fix a bug, create a pull request.
- There is no particular style requirement other than the default Go style. Please run `gofmt`, `goimports`, or `goreturns` before marking your pull request as ready. Also, running `golangci-lint run` in the root directory will point out most of the styling errors.
If you use MGPUSim in your research, please cite our ISCA '19 paper.
@inproceedings{sun19mgpusim,
author = {Sun, Yifan and Baruah, Trinayan and Mojumder, Saiful A. and Dong, Shi and Gong, Xiang and Treadway, Shane and Bao, Yuhui and Hance, Spencer and McCardwell, Carter and Zhao, Vincent and Barclay, Harrison and Ziabari, Amir Kavyan and Chen, Zhongliang and Ubal, Rafael and Abell\'{a}n, Jos\'{e} L. and Kim, John and Joshi, Ajay and Kaeli, David},
title = {MGPUSim: Enabling Multi-GPU Performance Modeling and Optimization},
year = {2019},
isbn = {9781450366694},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3307650.3322230},
doi = {10.1145/3307650.3322230},
booktitle = {Proceedings of the 46th International Symposium on Computer Architecture},
pages = {197–209},
numpages = {13},
keywords = {simulation, multi-GPU systems, memory management},
location = {Phoenix, Arizona},
series = {ISCA '19}
}
Papers that use MGPUSim:
- Dynamic GMMU Bypass for Address Translation in Multi-GPU Systems
- Valkyrie: Leveraging Inter-TLB Locality to Enhance GPU Performance
- MGPU-TSM: A Multi-GPU System with Truly Shared Memory
- Griffin: Hardware-Software Support for Efficient Page Migration in Multi-GPU Systems
- HALCONE: A Hardware-Level Timestamp-based Cache Coherence Scheme for Multi-GPU systems
- Priority-Based PCIe Scheduling for Multi-Tenant Multi-GPU Systems
- Exploiting Adaptive Data Compression to Improve Performance and Energy-efficiency of Compute Workloads in Multi-GPU Systems
MIT © Project Akita Developers.