RFC: robust division for Complex{Float64} #5112
Conversation
Awesome. I'll let @JeffBezanson take a look and pull the trigger.
This looks really thorough!
How does the performance compare to the existing version?
It seems like this would work for Float32 as well, at least. I realize the …
I'll work up some benchmarks later; in the paper they benchmark the robust algorithm as roughly 1/3 slower than Smith's algorithm (which I believe is the current algorithm). As for Float32: in the paper, their reference "correct" answers come from using Smith's algorithm with …
There is no magic to do Float32 operations as Float64, but you are right that we should do Float32 complex division by converting to Float64, using the old algorithm, and converting back.
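To make that concrete, a minimal sketch of the widening approach (the name cdiv32 is made up for illustration; this is not the code in this PR):

function cdiv32(z::Complex{Float32}, w::Complex{Float32})
    # Every Float32 value is exactly representable as a Float64, and products and
    # sums of widened Float32 values stay far inside the Float64 range, so even the
    # textbook formula cannot spuriously overflow or underflow on this path.
    q = convert(Complex{Float64}, z) / convert(Complex{Float64}, w)
    return convert(Complex{Float32}, q)
end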
I'm not too worried about performance here; I think the extra accuracy is well worth it.
OK, I think this is ready to merge. But I'd like to ask that the link to the Scilab file be removed, since you did not copy the code from there and I don't want to create confusion about whether we are bound by its license.
Ok. That was fun. Now I need to find another easy issue.
bump
Great to have this merged.
Isn't the main point of the old (Smith) algorithm to avoid spurious overflow (in …)? Speaking of spurious overflow, the definition …
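For context, the overflow in question comes from the textbook formula, which divides by c^2 + d^2 and therefore overflows once |c| or |d| is roughly 1e154, even when the quotient itself is representable; for example (1e300 + 1e300im)/(1e300 + 1e300im) comes out as NaN + NaN*im. Smith's 1962 rearrangement, which seems to be what the pre-PR code implements, scales by the ratio of the denominator parts instead. A rough sketch, not the exact code being replaced:

function smith_cdiv(a::Float64, b::Float64, c::Float64, d::Float64)
    # Divide through by whichever of c, d is larger in magnitude, so the ratio r
    # has magnitude at most 1 and c^2 + d^2 is never formed.
    if abs(d) <= abs(c)
        r = d / c
        t = 1.0 / (c + d * r)
        return Complex((a + b * r) * t, (b - a * r) * t)
    else
        r = c / d
        t = 1.0 / (c * r + d)
        return Complex((a * r + b) * t, (b * r - a) * t)
    end
end

Smith's method can still lose accuracy or underflow in extreme cases, which is what the robust algorithm from arXiv:1210.4539 is meant to address.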
# a + i*b
# p + i*q = ---------
# c + i*d
function robust_cdiv{T<:Float64}(a::T,b::T,c::T,d::T)
What is T<:Float64 for? Since Float64 is a concrete type, how is it possible for T to be anything but Float64?
(It would be nice to see our own benchmarks; I've learned the hard way that one should always be cautious about performance claims in papers.)
I used …. I'll put benchmarks back on my todo.
There is no performance penalty, but it would be cleaner to remove the type parameter.
Aside: it can actually be useful to use a concrete type as an upper type bound if the parameter could be …, e.g. foo{T<:Float64}(v::Vector{T}) = ... This …
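A sketch of one situation where that pattern matters, in the 0.2-era syntax used in this thread (this is a guess at the point being made, not a quote from it): the bound is useful when the parameter appears inside an invariant container type, because dispatching on Vector{Float64} alone only matches that exact element type.

bar(v::Vector{Float64}) = length(v)        # matches only Vector{Float64}
foo{T<:Float64}(v::Vector{T}) = length(v)  # also matches Vector{S} for any S <: Float64,
                                           # e.g. the Array{None,1} produced by [] in
                                           # Julia of this era, so foo([]) works while
                                           # bar([]) is a no-method error

On current Julia this would be spelled foo(v::Vector{T}) where {T<:Float64}, and [] is a Vector{Any}, so the empty-array example no longer applies.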
I put together a simple benchmark, and the speed difference seems to be greater than the paper suggested. The paper benchmarked the robust algorithm at about 4/3 the time of the Smith algorithm, if I'm remembering correctly. Here I'm seeing closer to a factor of 5. No idea why so much memory is being allocated. https://gist.github.com/ggggggggg/8046958
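The shape of such a micro-benchmark is simply timing many divisions with each candidate routine; a sketch along those lines (not the gist's exact contents):

function bench(divfn, n)
    z = [complex(randn(), randn()) for i = 1:n]
    w = [complex(randn(), randn()) for i = 1:n]
    s = zero(Complex{Float64})
    @time for i = 1:n
        s += divfn(z[i], w[i])   # accumulate so the loop cannot be optimized away
    end
    return s
end

# e.g. compare bench((z, w) -> z / w, 10^6) against a bench call on the candidate routine.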
Are you running that in Julia 0.2? It should be much faster with current master.
I did a simple benchmark with master:
…
and got
…
I haven't had a chance to try the old (Smith) code. However, it looks like:
…
I ran my benchmark on 0.2 on my work machine, so I built the latest master at Jeff's suggestion and reran it. I get much closer performance now.
Also, the simple check for potential overflow doesn't work. At the least I think it misses cases with underflow. I ran stevegj's …
I'm not surprised that my test was too simple (e.g. I just noticed that …). My point is that it is worth using the naive algorithm if we can devise a simple test for when it is valid, and the robust algorithm otherwise, because in the overwhelming majority of real-world cases (no underflow/overflow) the naive algorithm is fine. Modifying your benchmark code to look at the naive algorithm and a "checked" version (the naive algorithm or the robust one, depending on a simple test), I find that the results are even better than my benchmark from above:
…
The robust version is 50% slower than Smith. The naive version is 20x faster than Smith, and the naive version plus a simple check is 10x faster than Smith. For a factor of 10 improvement, it is probably worth spending some time to figure out a reliable test for whether the naive algorithm is sufficient.
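One possible shape for such a checked routine, as a sketch only (robust_cdiv stands for the routine from this PR, and a test this crude on the denominator alone is exactly the kind of check noted above to miss numerator underflow, so the thresholds are illustrative, not a vetted choice):

function checked_cdiv(a::Float64, b::Float64, c::Float64, d::Float64)
    s = abs(c) + abs(d)
    # Fast path: denominator scale comfortably inside the range where c*c + d*d
    # can neither overflow nor underflow, so use the textbook formula.
    if 1e-154 < s < 1e154
        den = c*c + d*d
        return Complex((a*c + b*d)/den, (b*c - a*d)/den)
    else
        return robust_cdiv(a, b, c, d)   # slow path: fall back to the robust routine
    end
end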
The following routine passes all of the harddiv tests and is about 50% faster than the old Smith method in your benchmark:
…
If I manually inline … @JeffBezanson, I don't suppose much can be done about the function-call overhead? Is this another case for #1106?
PS. The whole …
Benchmark routines:
…
I see. You're fast. Here is another routine that passes all the harddiv tests; I think it is slightly slower than yours, but I can only test on 0.2 right now, which we know is very slow with …
…
Updated my gist with this benchmark and the harddiv tests.
Here is another routine that is about 2.25x faster than Smith, just by inlining the complex operations in the naive division:
…
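As an illustration of what inlining the complex operations in the naive division means (this is a generic sketch, not the routine posted in that comment), the real and imaginary parts are computed directly instead of going through the generic complex * and / methods:

function naive_cdiv(z::Complex{Float64}, w::Complex{Float64})
    a, b = real(z), imag(z)
    c, d = real(w), imag(w)
    den = c*c + d*d                      # can still overflow or underflow spuriously
    return Complex((a*c + b*d)/den, (b*c - a*d)/den)
end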
I benchmarked your routine, along with an optimized version of your routine that inlines …
…
Inline …?
The problem is that it is manually inlined (directly into the benchmark function), so this is not possible as a way to implement division.
Robust complex division for Complex{Float64} based on arXiv:1210.4539. Includes tests based on 10 examples of hard complex division.
Closes #5072