Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

union-types: use insertion (stable) sort instead of qsort #45896

Merged
merged 1 commit into from
Jul 6, 2022

Conversation

vtjnash
Copy link
Member

@vtjnash vtjnash commented Jul 1, 2022

Different platforms implement qsort differently, leading to
platform-specific errors. This is a quick port of the ml_matches
algorithm for use instead. For small unions (almost always), this should
also be slightly faster, though insignificant.

Refs #45874

@vtjnash vtjnash added backport 1.6 Change should be backported to release-1.6 backport 1.7 backport 1.8 Change should be backported to release-1.8 labels Jul 1, 2022
Copy link
Member

@quinnj quinnj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally forgot about this code. Silly qsort on windows.

Different platforms implement qsort differently, leading to
platform-specific errors. This is a quick port of the ml_matches
algorithm for use instead. For small unions (almost always), this should
also be slightly faster, though insignificant.

Refs #45874
@vtjnash vtjnash force-pushed the jn/isort_unions branch from 4d90a04 to cfe770a Compare July 1, 2022 19:46
while j > lo && lt(o, x, v[j-1])
v[j] = v[j-1]
while j > lo
y = v[j-1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any case where the compiler doesn't already do this optimization?
The only case I can think of would be stateful getindex

function getindex(x::LogAtReadArray, i::Int)
    println(i)
    getindex(x.data, i)
end

which isn't particularly compelling.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any moderately complex or non-inlined getindex might not be eliminated

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess those likely exist, I've just never seen a subtype of AbstractVector with a complex getindex.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any non-inlined lt user-method will also disallow this from occurring as a direct optimization (it would be semantically invalid)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how that would be semantically invalid, but I'm not familiar with the relevant compiler semantics.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could throw for example.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not seeing it. Even if lt throws, y = v[j-1]; lt(o, x, y) is indistinguishable from lt(o, x, v[j-1]).

As a sanity check,

using Primes

@noinline function lt(x, y)
    isprime(x) && error()
    x < y
end

@code_llvm sort!([1,2,3], 1, 3, Lt(lt))

Looks the same before and after.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks different before/after to me. There is an excess load in the before case.

Copy link
Member

@LilithHafner LilithHafner Jul 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now! lt could mutate v. That would make this optimization change behavior. That behavior change is okay with me, but not okay with the compiler. The compiler cannot store the result of v[j - 1] unless v[j - 1] is pure with respect to the interleaving calls (including lt) which it can only know if both getindex and lt are simple and inlined.

I must have made a mistake in diffing. You're right, there is an extra load in the before case.

I'd be hard-pressed to find a case where this makes a measurable runtime difference, but I now see why it is an improvement in theory.

Thanks, @KristofferC and @vtjnash for explaining this to me!

@KristofferC KristofferC merged commit 8cc5445 into master Jul 6, 2022
@KristofferC KristofferC deleted the jn/isort_unions branch July 6, 2022 12:34
KristofferC pushed a commit that referenced this pull request Jul 7, 2022
Different platforms implement qsort differently, leading to
platform-specific errors. This is a quick port of the ml_matches
algorithm for use instead. For small unions (almost always), this should
also be slightly faster, though insignificant.

Refs #45874

(cherry picked from commit 8cc5445)
KristofferC pushed a commit that referenced this pull request Jul 8, 2022
Different platforms implement qsort differently, leading to
platform-specific errors. This is a quick port of the ml_matches
algorithm for use instead. For small unions (almost always), this should
also be slightly faster, though insignificant.

Refs #45874

(cherry picked from commit 8cc5445)
@KristofferC KristofferC removed the backport 1.8 Change should be backported to release-1.8 label Jul 8, 2022
@LilithHafner LilithHafner added the sorting Put things in order label Jul 19, 2022
ffucci pushed a commit to ffucci/julia that referenced this pull request Aug 11, 2022
…45896)

Different platforms implement qsort differently, leading to
platform-specific errors. This is a quick port of the ml_matches
algorithm for use instead. For small unions (almost always), this should
also be slightly faster, though insignificant.

Refs JuliaLang#45874
pcjentsch pushed a commit to pcjentsch/julia that referenced this pull request Aug 18, 2022
…45896)

Different platforms implement qsort differently, leading to
platform-specific errors. This is a quick port of the ml_matches
algorithm for use instead. For small unions (almost always), this should
also be slightly faster, though insignificant.

Refs JuliaLang#45874
KristofferC pushed a commit that referenced this pull request Dec 21, 2022
Different platforms implement qsort differently, leading to
platform-specific errors. This is a quick port of the ml_matches
algorithm for use instead. For small unions (almost always), this should
also be slightly faster, though insignificant.

Refs #45874

(cherry picked from commit 8cc5445)
KristofferC pushed a commit that referenced this pull request Dec 21, 2022
Different platforms implement qsort differently, leading to
platform-specific errors. This is a quick port of the ml_matches
algorithm for use instead. For small unions (almost always), this should
also be slightly faster, though insignificant.

Refs #45874

(cherry picked from commit 8cc5445)
KristofferC pushed a commit that referenced this pull request Dec 21, 2022
Different platforms implement qsort differently, leading to
platform-specific errors. This is a quick port of the ml_matches
algorithm for use instead. For small unions (almost always), this should
also be slightly faster, though insignificant.

Refs #45874

(cherry picked from commit 8cc5445)
staticfloat pushed a commit that referenced this pull request Dec 23, 2022
Different platforms implement qsort differently, leading to
platform-specific errors. This is a quick port of the ml_matches
algorithm for use instead. For small unions (almost always), this should
also be slightly faster, though insignificant.

Refs #45874

(cherry picked from commit 8cc5445)
KristofferC pushed a commit that referenced this pull request Oct 10, 2023
Different platforms implement qsort differently, leading to
platform-specific errors. This is a quick port of the ml_matches
algorithm for use instead. For small unions (almost always), this should
also be slightly faster, though insignificant.

Refs #45874

(cherry picked from commit 8cc5445)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 1.6 Change should be backported to release-1.6 sorting Put things in order
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants