JIT optimization: value re-numbering in CSE doesn't help span loops as much as array loops #8177
Labels
area-CodeGen-coreclr
CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
enhancement
Product code improvement that does NOT require public API changes/additions
optimization
tenet-performance
Performance related issue
Milestone
CSE updates the value numbers, when a value being CSEd has a single definition, so that all uses have that same conservative value number. This improves subsequent range check elimination, because assertions generated for those nodes will have the same value number. This works just as well for spans as it does for arrays.
CSE also updates the value numbers of comparisons against checked bounds, since the code in range check opt that computes ranges for operands will consult the VNs on the compares, and only benefit if the VNs for the compares are functions of the same VNs for the bounds. This part needs to be able to find the compares whose VNs need updating (which are compares that are functions of the expressions that become CSE uses). For efficiency, rather than searching the trees at this point, CSE builds a map eariler, during
optValnumCSE_Locate
, mapping checked bound uses to the compares that are functions of them. Only compares that get inserted as values of this map get updated. So ideally this map would include as keys all nodes that are occurrences of any CSE candidate which has a checked bound VN. But computing "does this candidate have a checked bound VN?" requires visiting all the occurrences of the candidate, andoptValnumCSE_Locate
runs while we are building the candidate lists, before we can enumerate them. So this code would be a bit more effective if it were re-worked to update the compares that are functions of all the occurrences of relevant CSE candidates.This optimization currently works better for arrays than for spans, since this code proactively reports all array length VNs as checked bound VNs, and so they all get included in the map built by
optValnumCSE_Locate
. This special case for arrays could be removed if the update gets reworked.I'm creating this issue so it doesn't get lost, but I'm putting it in the "Future" mile stone because it's not at all clear that it's worth pursuing; even adding this update for compares of arrays in the first place was debatable, and the "motivating" case here is a test in Span/IndexerBench not getting optimized as well as the equivalent array case, but with the subsequent change to promote implicit byrefs (#10453), we're already optimizing that "motivating" case at array parity without this change.
category:cq
theme:cse
skill-level:expert
cost:medium
impact:small
The text was updated successfully, but these errors were encountered: