[GR-58118] Adapt saturation to open world. #9962
Merged
In its original design, saturation is an analysis optimization that avoids doing work during analysis that would not create any potential for optimization during graph strengthening. In summary, when a type state's cardinality reaches a threshold, we stop tracking individual types and fall back to its declared type information, i.e., to the stamp information from Graal graphs. Then, during graph strengthening, we skip optimizing when a flow corresponding to a Graal node is saturated.
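The threshold behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual analysis code: the class `SaturatingFlowSketch`, its fields, and the threshold value are all hypothetical names chosen for the example.

```java
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

/** Hypothetical, simplified model of a type flow; not the real analysis class. */
final class SaturatingFlowSketch {
    private final String declaredType;  // stands in for the stamp of the Graal node
    private final int threshold;        // hypothetical cardinality limit
    private final Set<String> types = new HashSet<>();
    private boolean saturated;

    SaturatingFlowSketch(String declaredType, int threshold) {
        this.declaredType = declaredType;
        this.threshold = threshold;
    }

    /** Adds a type; once the cardinality reaches the threshold, individual types are dropped. */
    void addType(String type) {
        if (saturated) {
            return; // nothing is tracked any more
        }
        types.add(type);
        if (types.size() >= threshold) {
            saturated = true;
            types.clear();
        }
    }

    boolean isSaturated() {
        return saturated;
    }

    String declaredType() {
        return declaredType;
    }

    /** Precise types usable for strengthening, or empty when only the declared type is known. */
    Optional<Set<String>> typesForStrengthening() {
        return saturated ? Optional.empty() : Optional.of(Set.copyOf(types));
    }
}
```

In this sketch, a strengthening pass would consult `typesForStrengthening()` and leave the node's stamp untouched whenever the result is empty.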
For layered images, and for open world analysis in general, we wanted to reuse the saturation stamp to signal that some values tracked by the analysis are incomplete, i.e., they contain types that could be extended in subsequent layers, and they shouldn't be used to optimize graphs. To achieve this, we force-saturate parameters of entry point methods and returns from virtual invokes. However, saturation was not designed for this use case, and it needs some modifications to support open world analysis.
In a closed world we stop the propagation of saturation if we have a safe approximation since we try to retain some level of precision even when using saturation. For example:
These kinds of optimizations are problematic for open world analysis when we use "saturated" to also mean "incomplete". They can prematurely stop the propagation of saturation injected at entry points, so code deeper in the call graph may be wrongly optimized. This PR makes saturation more aggressive: it allows saturation to propagate freely through many places where we previously restricted it. This will lead to fewer stamps being strengthened based on analysis results, but the remaining ones should be correct. Still, we try to regain some precision where possible. For example, we use the idea of "closed types" to refer to types whose hierarchy is complete in the observable universe, such as leaf types or array types. Since we know all the sub-types of a closed type, we can again restrict saturation, for example when filtering with a closed type. Future work will further explore improving the precision of open world analysis while preserving soundness.
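The closed-type idea can be illustrated with a small sketch. It assumes a hypothetical `TypeInfoSketch` descriptor carrying a `closed` flag and the set of known sub-types; none of these names come from the PR, and the real filter logic is likely more involved.

```java
import java.util.Set;

/** Hypothetical type descriptor; the names are illustrative only. */
final class TypeInfoSketch {
    final String name;
    final boolean closed;            // hierarchy complete in the observable universe
    final Set<String> knownSubtypes; // includes the type itself when instantiable

    TypeInfoSketch(String name, boolean closed, Set<String> knownSubtypes) {
        this.name = name;
        this.closed = closed;
        this.knownSubtypes = knownSubtypes;
    }
}

/** Hypothetical result of a filter flow: either saturated or a precise type set. */
final class FilterResultSketch {
    final boolean saturated;
    final Set<String> types;

    FilterResultSketch(boolean saturated, Set<String> types) {
        this.saturated = saturated;
        this.types = types;
    }
}

final class ClosedTypeFilterSketch {
    /**
     * Filters a saturated (possibly incomplete) input with the given type.
     * A closed type bounds the result by its known sub-types, so saturation can stop here.
     * An open type may gain sub-types later, so the result has to stay saturated.
     */
    static FilterResultSketch filterSaturatedInput(TypeInfoSketch filterType) {
        if (filterType.closed) {
            return new FilterResultSketch(false, filterType.knownSubtypes);
        }
        return new FilterResultSketch(true, Set.of());
    }
}
```

Under these assumptions, filtering with a final class such as `java.lang.String` yields a precise result, while filtering with an interface such as `java.util.List` keeps the flow saturated.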
An alternative idea was to implement a taint analysis on top of the existing points-to analysis to propagate the incomplete stamp from entry points. While this is theoretically possible in general, in our case the problem is that a taint analysis doesn't compose well with saturation. Propagating saturation has side effects on the type flow graphs: a saturated flow de-registers its uses and observers. Therefore, there wouldn't be any edges left to propagate the taint if flows in the graph were already saturated. We don't want to disable saturation since it has significant performance benefits. We could preserve the original graph just for taint propagation, but the implementation effort for that would be higher. Instead, we chose to make saturation more general and adapt it to open world constraints. In essence, the idea of saturation borrows a lot from taint analysis to begin with.
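The composition problem can be seen in a toy model. The sketch below uses a hypothetical `FlowNodeSketch` class in which saturating a flow drops its outgoing edges; it deliberately does not model how saturation itself is forwarded to uses, only the loss of edges that a separate taint pass would need.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical flow node; a toy model, not the real type flow graph. */
final class FlowNodeSketch {
    final String name;
    final List<FlowNodeSketch> uses = new ArrayList<>();
    boolean saturated;
    boolean tainted;

    FlowNodeSketch(String name) {
        this.name = name;
    }

    void addUse(FlowNodeSketch use) {
        uses.add(use);
    }

    /** Saturating a flow de-registers its uses, as described in the text. */
    void saturate() {
        saturated = true;
        uses.clear();
    }

    /** Naive taint propagation along whatever use edges are still registered. */
    void taint() {
        if (tainted) {
            return;
        }
        tainted = true;
        for (FlowNodeSketch use : uses) {
            use.taint();
        }
    }
}
```

With a chain `source -> middle -> sink`, saturating `middle` before tainting `source` leaves `sink` untainted in this model, because the edge out of `middle` no longer exists.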