Simulation performance improvements #359
Merged
While investigating why the Qulacs backend does not seem to use multiple threads in my code, I noticed that very little time is spent in the actual simulation.
For testing, I used the Quantum Fourier Transform. By running either just a QFT, or a QFT and then an inverse QFT, we can make the resulting wavefunction dense or sparse, which is important for testing the performance.
Code
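A rough stand-in for the test script, for illustration only: the qubit count and the `add_cphase`/`add_qft` helpers are assumptions, and the original script ran through the framework's `simulate` wrapper (where the overhead discussed below lives) rather than driving qulacs directly.

```python
import time
import numpy as np
from qulacs import QuantumCircuit, QuantumState
from qulacs.gate import DenseMatrix

def add_cphase(circuit, control, target, angle):
    # controlled phase rotation: a single-qubit phase gate plus a control qubit
    gate = DenseMatrix(target, [[1, 0], [0, np.exp(1j * angle)]])
    gate.add_control_qubit(control, 1)
    circuit.add_gate(gate)

def add_qft(circuit, n, inverse=False):
    # textbook QFT (final swaps omitted, they only relabel qubits); the
    # inverse reverses the gate order and negates the angles
    ops = []
    for i in range(n):
        ops.append(("H", i, i, 0.0))
        for j in range(i + 1, n):
            ops.append(("CP", j, i, np.pi / 2 ** (j - i)))
    if inverse:
        ops = [(g, c, t, -a) for (g, c, t, a) in reversed(ops)]
    for gate, control, target, angle in ops:
        if gate == "H":
            circuit.add_H_gate(target)
        else:
            add_cphase(circuit, control, target, angle)

n = 24  # assumed qubit count
circuit = QuantumCircuit(n)
add_qft(circuit, n)                    # QFT of |0...0>: a dense, uniform wavefunction
# add_qft(circuit, n, inverse=True)    # QFT + inverse: identity, a sparse result

state = QuantumState(n)
start = time.perf_counter()
circuit.update_quantum_state(state)
print(f"{time.perf_counter() - start:.3f}s")
```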
Running this code on the `devel` branch takes ~160 seconds for the plain QFT and ~20 seconds for the combined QFT and inverse QFT on my machine. Note that the second run is much faster even though the simulated circuit is twice as large; this is because the result is a sparse wavefunction.

When measuring the time of `self.circuit.update_quantum_state(state)` in `do_simulate` in `simulator_qulacs.py` (which I believe is the actual simulation), it is only ~0.17s / ~0.28s (QFT / QFT + inverse), so there is a lot of room for speedups.

All the times here are simple measurements of single runs, not proper benchmarks, but the differences are so large that it shouldn't matter if the numbers are somewhat inaccurate.
I made four commits which speed this up:
For a dense wavefunction, a lot of time is spent in the `apply_keymap` function in `simulate`. This doesn't seem necessary when all qubits are active, since the mapping then has no effect. So I added a check that skips the keymap entirely in that case, reducing the runtime to ~24s / ~20s.
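Schematically, the added guard looks like this (plain dicts stand in for the project's wavefunction and keymap classes; the names here are illustrative, not the actual code):

```python
def apply_keymap(wfn: dict, keymap: dict) -> dict:
    # rebuilds every (basis index, amplitude) pair: O(2^n) work for a dense state
    return {keymap[k]: v for k, v in wfn.items()}

def map_result(wfn: dict, keymap: dict, n_active: int, n_qubits: int) -> dict:
    # the added guard: when every qubit is active the keymap is the identity
    # permutation, so the remapping pass can be skipped outright
    if n_active == n_qubits:
        return wfn
    return apply_keymap(wfn, keymap)
```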
The `numpy.isclose` call in `from_array` in `qubit_wavefunction.py` is expensive, likely because it checks for edge cases (infinity, NaN) and because it can handle arrays. Assuming we simply want to skip values close to zero, this can be done much faster using a simple comparison. Also, we only need to create the index bitstring when we actually want to set the value, so moving it inside the if clause saves a lot of time for sparse wavefunctions. These two changes reduce the runtime to ~6s / ~0.7s.
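A simplified version of the hot loop with both changes applied (the real `from_array` builds bitstring keys rather than plain integers, and the threshold value is an assumption):

```python
import numpy as np

def from_array(arr: np.ndarray, threshold: float = 1e-6) -> dict:
    wfn = {}
    for i, v in enumerate(arr):
        # plain scalar comparison instead of numpy.isclose(v, 0.0), which
        # pays for nan/inf handling and array broadcasting on every call
        if abs(v) > threshold:
            # the key is only constructed for kept entries, so for a sparse
            # wavefunction almost every iteration now does no extra work
            wfn[i] = v
    return wfn
```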
Still in the `from_array` function, if the numbering of the array doesn't match the numbering of the wavefunction, a keymap is applied to adjust it. This should not be necessary, because the `initialize_bitstring` function should already return the correct type of bitstring. However, the returned type currently depends on the `numbering_in` parameter; changing it to the `numbering_out` parameter allows removing the keymap and reduces the time to ~3.7s / ~0.6s.
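A minimal sketch of that one-word change (the classes below are stubs; the project's real `BitString`/`BitStringLSB` carry more machinery, and the numbering conversion itself is omitted):

```python
from enum import Enum

class BitNumbering(Enum):
    MSB = 0
    LSB = 1

class BitString:                 # stub for the MSB-numbered bitstring class
    @classmethod
    def from_int(cls, integer, nbits):
        obj = cls()
        obj.integer, obj.nbits = int(integer), nbits
        return obj

class BitStringLSB(BitString):   # stub for the LSB-numbered variant
    pass

def initialize_bitstring(integer, nbits,
                         numbering_in=BitNumbering.MSB,
                         numbering_out=BitNumbering.MSB):
    # was: `if numbering_in == BitNumbering.LSB` -- typing the result by the
    # *input* numbering forced from_array to apply a keymap afterwards
    if numbering_out == BitNumbering.LSB:
        return BitStringLSB.from_int(integer, nbits)
    return BitString.from_int(integer, nbits)
```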
Reversing the bits in a bitstring is currently done by converting the integer to a binary string, reversing the characters, and converting them back to an integer. This is inefficient and can be replaced with a function that uses bit operations (see the sketch at the end of this post). One way to read that function is that each of its lines flips one bit of every bit's position, so once all position bits are flipped, position p has moved to 31 - p. This assumes the integer has exactly 32 bits; if it has fewer, the reversal leaves padding zeros, which we can get rid of with a bitshift. Another small optimization is to calculate a logarithm to get the bitstring length, instead of converting to a binary string and counting (the integer `bit_length()` method can't be used because some code passes NumPy ints). Together, these changes reduce the runtime to ~2.5s / ~0.6s.

In total, the speedup is ~60x / ~30x for the script above, but it is likely much smaller for other code. I have not tested whether these speedups transfer to other backends. Further improvements are definitely possible, but I think I have covered a good amount of the low-hanging fruit.
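The sketch referenced above: one standard formulation of the 32-bit bit reversal, plus the log-based length, as standalone functions (the integration into the bitstring class is omitted):

```python
import math

def reverse_int_bits(x: int, nbits: int) -> int:
    # Swap adjacent bits, then 2-bit pairs, nibbles, bytes, and 16-bit
    # halves: each step flips one bit of every position, so bit p ends up
    # at bit 31 - p and the whole 32-bit word is reversed.
    x = ((x & 0x55555555) << 1) | ((x & 0xAAAAAAAA) >> 1)
    x = ((x & 0x33333333) << 2) | ((x & 0xCCCCCCCC) >> 2)
    x = ((x & 0x0F0F0F0F) << 4) | ((x & 0xF0F0F0F0) >> 4)
    x = ((x & 0x00FF00FF) << 8) | ((x & 0xFF00FF00) >> 8)
    x = ((x & 0x0000FFFF) << 16) | ((x & 0xFFFF0000) >> 16)
    # an nbits-wide value was padded to 32 bits, so its reversed bits sit at
    # the top of the word; shift the padding zeros away
    return x >> (32 - nbits)

def bit_length(x) -> int:
    # replaces len(bin(x)) - 2; math.log2 accepts NumPy ints, whose missing
    # .bit_length() ruled out the plain int method
    return int(math.log2(x)) + 1 if x else 0

assert reverse_int_bits(0b0011, 4) == 0b1100
assert bit_length(0b1000) == 4
```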