gk.jl example code slower than matlab #950
Comments
Thanks for trying out the profiler! It looks like all the time is spent on line 79. Two comments:
will do the same thing much faster. Of course, this is what cumsum does, i.e., you could replace the whole loop with just
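The code snippets from this comment did not survive scraping. A plausible reconstruction of the suggestion, assuming the loop on line 79 re-summed a growing prefix (the names `p`, `c`, and `n` match identifiers used later in the thread; `c` is an assumption):

```julia
# Hypothetical sketch. The original loop presumably did something like
#     for i = 1:n
#         c[i] = sum(p[1:i])    # O(n^2) total work, plus a slice copy per iteration
#     end
# Keeping a running total does the same thing in O(n) with no copies:
s = 0.0
for i = 1:n
    s += p[i]
    c[i] = s
end

# ...which is exactly what cumsum computes in one call:
c = cumsum(p)
```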
If you care, there also seem to be a number of other aspects to this that are definitely not performance-optimal. For example, "exp" is expensive, so you don't want to call it more often than necessary. So rather than
you probably want
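The before/after snippets for this point were also lost. A minimal sketch of the idea of hoisting an expensive, loop-invariant `exp` call out of a loop; the names `a`, `x`, `y`, and `n` are assumptions, not the actual gk.jl code:

```julia
# Hypothetical sketch. Rather than calling exp on every iteration:
for i = 1:n
    y[i] = exp(a) * x[i]    # exp(a) recomputed n times
end

# you probably want to compute it once, outside the loop:
ea = exp(a)
for i = 1:n
    y[i] = ea * x[i]
end
```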
I should add that calling cumsum will allocate a new vector, while the explicit version I wrote out re-uses the same memory on each iteration. So for the ultimate in speed it's better to avoid cumsum in this case.
So far, I have not tuned the julia code, because it is identical to the matlab code. It would be good for us to be at least as fast as anything out there, without all these improvements. Once we get there, we should update both the julia and matlab codes with the suggested improvements, and see where we stand. The fact of the matter is that people will write code in a style they are comfortable with, and the compiler ought to do the best it can. Wouldn't it be cool if julia could make such suggestions to the user. :-)
Care to try manually to see if using memcpy fixes the performance gap? The arrays branch is still quite some ways away from merging, and it might be nice to know whether this really will address the problem. Basically, add a function
and then in line 79 replace the sum(p[1:i]) with sum(myref(p, 1:i)).
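The definition of the `myref` helper was lost in scraping. A hedged reconstruction of its intent, namely summing a range of an array without allocating the temporary that `p[1:i]` creates (in today's Julia the non-copying equivalent would be `view`; the 2012 version likely copied via pointers/memcpy as the comment suggests):

```julia
# Hypothetical sketch of the myref helper referenced above.
# A non-copying wrapper over a range of an array:
myref(A, r) = view(A, r)

# so that on line 79,
#     sum(p[1:i])            # allocates a temporary copy of p[1:i]
# becomes
#     sum(myref(p, 1:i))     # sums the range without copying
```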
As you surely know, Matlab's MLint is quite good for "low-level" optimizations, but choice of algorithm is beyond its capabilities. Still, it would be a useful thing to have. It may require an IDE, unless it could be done when functions are loaded into memory.
I used myref for p[1:i] and p[1:n] a few lines below. For 1000x1000, it is slightly faster than matlab. However, for 10000x10000 matrices, it is still twice as slow as matlab. -viral On 20-Jun-2012, at 4:31 PM, Tim Holy wrote:
Is there any role for mmap here to improve performance on big arrays?
Perhaps Matlab is avoiding the temporary altogether? I added the following functions:
and then wrote the loop this way:
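The functions and loop from this comment were also lost in scraping. A hedged sketch of what "avoiding the temporary altogether" could look like, summing a prefix directly with neither a slice copy nor a wrapper object (the function name `sumrange` is an invention for illustration):

```julia
# Hypothetical sketch; the original functions did not survive.
# Sum the elements of A over the index range r with no allocation at all:
function sumrange(A, r)
    s = zero(eltype(A))
    for j in r
        s += A[j]
    end
    return s
end

# Used in the loop as sumrange(p, 1:i), this neither copies the slice
# (as p[1:i] would) nor allocates a view wrapper.
```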
Profiling the relevant lines shows that avoiding the copy makes it even faster, nearly 4x faster than the original on gk(1000,[0.1]). That would seem to beat Matlab. Of course, you have to delete the slower versions of those lines in the loop! (I put all three of them in there because the number of iterations is variable, so this seemed the safest way to compare them directly.) So this would seem to suggest that the biggest savings are to be had in avoiding the ref operation. Just one other thing to check: are you running Matlab single-threaded?
No, I don't think mmap will help. mmap's strength is efficient handling of data sets too big to store in RAM, but 10k x 10k is, by modern standards, still not very memory intensive.
I'll ensure I am using only one thread in matlab - that always gets me. In general, I do believe that matlab has optimized away a lot of temporary creations over the years. -viral On 20-Jun-2012, at 8:45 PM, Tim Holy wrote:
This code is no longer twice as slow as matlab. Matlab is 1.6 seconds, whereas julia is 2.1 seconds for me, on the example invocation described above.
This just got significantly faster for me, perhaps due to LLVM 3.2. Could you try it again? |
Yes, we are consistently faster than Matlab now.
The code in examples/gk/gk.jl is about twice as slow as the same code in matlab. The code is a mix of scalar loops and a few vectorized operations. For a quick run, do gk(1000,[0.1]). The profiler seems to suggest that much of the time is spent in the vectorized operations. Julia is faster than matlab on small problem sizes but slower on larger ones, suggesting that the vectorized operations may be the reason for the poor performance.