Quantum software benchmarking
Benchpress is an open-source tool for benchmarking quantum software.
The Benchpress open-source benchmarking suite comprises over 1,000 different tests. These are standardized benchmarking tests designed by other members of the quantum community. For example, Benchpress compares SDKs' abilities to generate QASMBench circuits, Feynman circuits, and Hamiltonian circuits (https://portal.nersc.gov/cfs/m888/dcamps/hamlib/). It also includes tests that measure an SDK's ability to transpile circuits for specific hardware, including the heavy-hex architecture of IBM quantum processors and other generic qubit layouts.
If you find an issue with the testing or how we completed it, we encourage you to make a pull request.
Benchpress currently supports the following SDKs:
- BQSKit (https://github.com/BQSKit/bqskit)
- Braket (https://github.com/amazon-braket/amazon-braket-sdk-python)
- Cirq (https://github.com/quantumlib/Cirq)
- Qiskit (https://github.com/Qiskit/qiskit)
- Qiskit IBM transpiler (https://github.com/Qiskit/qiskit-ibm-transpiler)
- Staq (https://github.com/softwareQinc/staq)
- Tket (https://github.com/CQCL/tket)
Running Benchpress is resource intensive. Although the exact requirements depend on the SDK in question, a full execution of all the SDKs requires a system with 96+ GB of memory and, in some cases, will consume as many CPU resources as are available or assigned. In addition, each suite of tests takes a non-negligible amount of time, typically several hours or more depending on the machine and the timeout specified.
Benchpress itself requires no installation. However, running it requires the tools in requirements.txt. In addition, each framework has its own dependencies, listed in the corresponding *-requirements.txt file.
With the parameter --timeout-skip-list=<SECs>, a skiplist (a list of tests to skip because they take too long) is created.
For example, the following line runs the tests in benchpress/tket_gym/construct with a 1 hour timeout:
python -m pytest --timeout-skip-list=3600 benchpress/tket_gym/construct
This will create a skipfile.txt file. The mere existence of this file causes the tests listed in it to be skipped in subsequent executions. No additional option is needed.
To run the benchmarks in the default configuration, run the following from inside the environment in which you want to perform the tests:
python -m pytest benchpress/*_gym
where * is the name of the framework that you want to test, which must match the environment you are in.
To run the benchmarks and save the results to JSON, run:
python -m pytest --benchmark-save=SAVED_NAME benchpress/*_gym
which will save the file in the .benchmarks folder of the current working directory.
Further details on using pytest-benchmark can be found here: https://pytest-benchmark.readthedocs.io/en/latest/usage.html
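Once a run has been saved, the JSON file can be inspected with a few lines of Python. The following is a minimal sketch, not part of Benchpress: the helper names summarize and load_report are hypothetical, and it assumes the usual pytest-benchmark layout in which the report has a top-level "benchmarks" list whose entries carry a "fullname" and a "stats" dictionary with a "mean" in seconds.

```python
import json
from pathlib import Path


def load_report(path):
    """Read one saved pytest-benchmark JSON report into a dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize(report):
    """Return (test name, mean seconds) pairs, slowest test first.

    Assumes each entry of report["benchmarks"] has "fullname" and
    stats["mean"]; a report without benchmarks yields an empty list.
    """
    rows = [
        (entry["fullname"], entry["stats"]["mean"])
        for entry in report.get("benchmarks", [])
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Assumption: saved runs sit one directory below .benchmarks.
    for path in sorted(Path(".benchmarks").glob("*/*.json")):
        print(path)
        for name, mean_s in summarize(load_report(path))[:10]:
            print(f"  {mean_s:12.6f} s  {name}")
```

Run from the directory that contains the .benchmarks folder, this prints the ten slowest tests of each saved report. If the saved files are laid out differently on your system, adjust the glob pattern.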
Benchmarking the amount of memory a test uses can be very costly in terms of time and memory. Here we use the pytest-memray plugin. Calling the memory benchmark looks like:
python -m pytest --memray --trace-python-allocators --native --most-allocations=100 --benchmark-disable benchpress/*_gym
Here --memray turns on the memory profiler, --trace-python-allocators tracks all the memory allocations from Python, --native tracks C/C++/Rust memory, --most-allocations=N shows only the top N tests in terms of memory consumption, and finally --benchmark-disable turns off the timing benchmarks.
The pytest-memray plugin will sometimes raise an error while building the histogram included in the report by default. Currently the only way around this error, which does not affect the tests, is to manually comment out L322 and L323 of the plugin.py file:
#histogram_txt = cli_hist(sizes, bins=min(len(sizes), N_HISTOGRAM_BINS))
#writeln(f"\t 📊 Histogram of allocation sizes: |{histogram_txt}|")
We have designed Benchpress so that all tests can be executed on each SDK, regardless of whether that functionality is supported or not. This is facilitated by the use of "workouts": abstract base classes that define each set of tests. This design choice has the advantage of explicitly measuring the breadth of SDK functionality.
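The skip-by-default pattern can be illustrated with a small, self-contained sketch. It is not Benchpress code: Benchpress's workouts are written for pytest, while this example uses the standard library's unittest so that it runs without extra dependencies, and the class names, test names, and the run_gym helper are all hypothetical.

```python
import unittest


class WorkoutConstruct(unittest.TestCase):
    """Hypothetical workout: declares the tests, all skipped by default."""

    @unittest.skip("not implemented by this SDK")
    def test_build_ghz(self):
        pass

    @unittest.skip("not implemented by this SDK")
    def test_build_qft(self):
        pass


class ToySdkConstruct(WorkoutConstruct):
    """Hypothetical gym for a toy SDK that supports only one test."""

    def test_build_ghz(self):
        # Overriding the method replaces the skipped version, so it runs.
        circuit = ["h 0", "cx 0 1", "cx 1 2"]
        self.assertEqual(len(circuit), 3)


def run_gym(case):
    """Run every test of one gym class and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    outcome = run_gym(ToySdkConstruct)
    print("skipped:", [test.id() for test, _reason in outcome.skipped])
    print("failures:", len(outcome.failures), "errors:", len(outcome.errors))
```

Because the toy gym overrides only test_build_ghz, the inherited test_build_qft is reported as skipped rather than silently missing, which is what makes gaps in SDK functionality visible in the results.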
In Benchpress each test status has a well-defined meaning:
- PASSED - The SDK has the functionality required to run the test, and the test completed without error and within the desired time limit.
- SKIPPED - The SDK does not have the required functionality to execute the test. This is the default for all tests defined in the workouts.
- FAILED - The SDK has the necessary functionality, but the test failed or did not complete within the set time limit.
- XFAIL - The test fails in an irrecoverable manner, and is therefore tagged as failed rather than being executed. For example, the test tries to use more memory than is available.
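The four statuses above can be read as a small decision procedure. The sketch below is only an illustration of those definitions, not how Benchpress or pytest computes outcomes; the Status enum, the classify function, and its parameters are hypothetical names introduced for this example.

```python
from enum import Enum


class Status(Enum):
    PASSED = "PASSED"
    SKIPPED = "SKIPPED"
    FAILED = "FAILED"
    XFAIL = "XFAIL"


def classify(implemented, known_irrecoverable=False, raised_error=False,
             elapsed_s=0.0, timeout_s=None):
    """Map the facts about one test to a status, following the list above."""
    if not implemented:
        # The SDK lacks the functionality: the workout default applies.
        return Status.SKIPPED
    if known_irrecoverable:
        # Tagged as failed up front instead of being executed.
        return Status.XFAIL
    if raised_error:
        return Status.FAILED
    if timeout_s is not None and elapsed_s > timeout_s:
        # Completed too slowly: counts as a failure, not a pass.
        return Status.FAILED
    return Status.PASSED
```

For instance, classify(True, elapsed_s=4000, timeout_s=3600) yields Status.FAILED, while classify(False) yields Status.SKIPPED regardless of the other arguments.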
Running the full suite of tests will easily take a week or more if executed in serial, for example so that memory bandwidth or multiprocessing usage does not skew results. Users can always select a subset of tests to reduce this overall time.
Benchpress makes use of files from the following open-source packages under the terms of their licenses. License files are included in the corresponding directories.