-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
concrete-eval: Use concrete eval information even if the result is big #47283
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider for example a lookup of a small struct inside a large, multi-level constant structure. The final result might be quite tiny, but if we refuse to propagate the intermediate levels, we might end up doing an (expensive) full-constprop when propagating the constant information could have given us a much cheaper concrete evaluation.
Maybe we can have a benchmark target for this kind of case in BaseBenchmarks.jl?
This PR itself LGTM.
Yes, should be reasonably to do. Just build a trie or a heap out of tuples or something. |
I will try to come up with one. For the meanwhile, let's try the other benchmarks and merge this if there are no apparent regressions. @nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. |
…o large When we originally implemented concrete-eval, we decided not to create `Const` lattice elements for constants that are too large, on the fear that these would end up in the IR and blowing up the size. Now that we have some experience with this, I think that decision was incorrect for several reasons: 1. We've already performed the concrete evaluation (and allocated the big object), so we're just throwing away precision here that we could have otherwise used (Although if the call result is unused, we probably shouldn't do concrete eval at all - see #46774). 2. There's a number of other places in the compiler where we read large values into `Const`. Unless we add these kinds of check there too, we need to have appropriate guards in the optimizer and the cache anyway, to prevent the IR size blowup. 3. It turns out that throwing away this precision actually causes significant performance problems for code that is just over the line. Consider for example a lookup of a small struct inside a large, multi-level constant structure. The final result might be quite tiny, but if we refuse to propagate the intermediate levels, we might end up doing an (expensive) full-constprop when propagating the constant information could have given us a much cheaper concrete evaluation. This commit simply removes that check. If we see any regressions as a result of this, we should see if there are additional guards needed in the optimizer.
cbebf22
to
209d943
Compare
@nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. |
Nanosolider is fine here right, since "allinference" went down it actually got faster, |
Sounds reasonable. Let's get this merged. |
It is possible for concrete-eval to refine effects, but be unable to inline (because the constant is too large, c.f. #47283). However, in that case, we would still like to use the extra effect information to make sure that the optimizer can delete the statement if it turns out to be unused. There are two cases: the first is where the call is not inlineable at all. This one is simple, because we just apply the effects on the :invoke as we usually would. The second is trickier: If we do end up inlining the call, we need to apply the overriden effects to every inlined statement, because we lose the identity of the function as a whole. This is a bit nasty and I don't really like it, but I'm not sure what a better alternative would be. We could always refuse to inline calls with large-constant results (since we currently pessimize what is being inlined anyway), but I'm not sure that would be better. This is a simple solution and works for the case I have in practice, but we may want to revisit it in the future.
It is possible for concrete-eval to refine effects, but be unable to inline (because the constant is too large, c.f. #47283). However, in that case, we would still like to use the extra effect information to make sure that the optimizer can delete the statement if it turns out to be unused. There are two cases: the first is where the call is not inlineable at all. This one is simple, because we just apply the effects on the :invoke as we usually would. The second is trickier: If we do end up inlining the call, we need to apply the overriden effects to every inlined statement, because we lose the identity of the function as a whole. This is a bit nasty and I don't really like it, but I'm not sure what a better alternative would be. We could always refuse to inline calls with large-constant results (since we currently pessimize what is being inlined anyway), but I'm not sure that would be better. This is a simple solution and works for the case I have in practice, but we may want to revisit it in the future.
@@ -1483,15 +1487,12 @@ function handle_concrete_result!(cases::Vector{InliningCase}, result::ConcreteRe | |||
return true | |||
end | |||
|
|||
may_inline_concrete_result(result::ConcreteResult) = | |||
isdefined(result, :result) && is_inlineable_constant(result.result) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am hoping someday we can get rid of this arbitrary size cutoff entirely, and just do something simple like:
is_inlineable_constant(x) = (!ismutable(x) && datatype_pointerfree(typeof(x))) || x isa Type || x isa Symbol
this avoids embedding pointers to random bits of mutable memory, but otherwise drops the random size cut-off
It is possible for concrete-eval to refine effects, but be unable to inline (because the constant is too large, c.f. #47283). However, in that case, we would still like to use the extra effect information to make sure that the optimizer can delete the statement if it turns out to be unused. There are two cases: the first is where the call is not inlineable at all. This one is simple, because we just apply the effects on the :invoke as we usually would. The second is trickier: If we do end up inlining the call, we need to apply the overriden effects to every inlined statement, because we lose the identity of the function as a whole. This is a bit nasty and I don't really like it, but I'm not sure what a better alternative would be. We could always refuse to inline calls with large-constant results (since we currently pessimize what is being inlined anyway), but I'm not sure that would be better. This is a simple solution and works for the case I have in practice, but we may want to revisit it in the future.
It is possible for concrete-eval to refine effects, but be unable to inline (because the constant is too large, c.f. #47283). However, in that case, we would still like to use the extra effect information to make sure that the optimizer can delete the statement if it turns out to be unused. There are two cases: the first is where the call is not inlineable at all. This one is simple, because we just apply the effects on the :invoke as we usually would. The second is trickier: If we do end up inlining the call, we need to apply the overriden effects to every inlined statement, because we lose the identity of the function as a whole. This is a bit nasty and I don't really like it, but I'm not sure what a better alternative would be. We could always refuse to inline calls with large-constant results (since we currently pessimize what is being inlined anyway), but I'm not sure that would be better. This is a simple solution and works for the case I have in practice, but we may want to revisit it in the future.
This undoes some of the choices made in #47283 and #47305. As Shuhei correctly pointed out, even with the restriction to `nothrow`, adding the extra flags on the inlined statements results in incorrect IR. Also, my bigger motivating test case turns out to be insufficiently optimized without the effect_free flags (which I removed in the final revision of #47305). I think for the time being, the best course of action here is to just stop inlining concrete-eval'ed calls entirely. The const result is available for inference, so in most cases the call will get deleted. If there's an important case we care about where this does not happen, we should take a look at that separately.
This undoes some of the choices made in #47283 and #47305. As Shuhei correctly pointed out, even with the restriction to `nothrow`, adding the extra flags on the inlined statements results in incorrect IR. Also, my bigger motivating test case turns out to be insufficiently optimized without the effect_free flags (which I removed in the final revision of #47305). I think for the time being, the best course of action here is to just stop inlining concrete-eval'ed calls entirely. The const result is available for inference, so in most cases the call will get deleted. If there's an important case we care about where this does not happen, we should take a look at that separately.
…rge (#47371) This undoes some of the choices made in #47283 and #47305. As Shuhei correctly pointed out, even with the restriction to `nothrow`, adding the extra flags on the inlined statements results in incorrect IR. Also, my bigger motivating test case turns out to be insufficiently optimized without the effect_free flags (which I removed in the final revision of #47305). I think for the time being, the best course of action here is to just stop inlining concrete-eval'ed calls entirely. The const result is available for inference, so in most cases the call will get deleted. If there's an important case we care about where this does not happen, we should take a look at that separately.
When we originally implemented concrete-eval, we decided not to create
Const
lattice elements for constants that are too large, on the fear that these would end up in the IR and blowing up the size. Now that we have some experience with this, I think that decision was incorrect for several reasons:Const
. Unless we add these kinds of check there too, we need to have appropriate guards in the optimizer and the cache anyway, to prevent the IR size blowup.This commit simply removes that check. If we see any regressions as a result of this, we should see if there are additional guards needed in the optimizer.