-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nsqd: optimize guid generation #663
Conversation
btw test failure is just go1.4.2 on i386, TestEphemeralTopicsAndChannels got non-zero number of topics at the end |
OK, I feel pretty silly now for not seemingly not reading idPump() - this won't make much of a difference if there isn't a constant stream of errors from NewGUID() (well, except for the int64 division trick) |
@ploxiln any of this worth landing? I'm a little concerned about micro-optimizations like bit-shift-for-division, but maybe that one? |
I'm not sure this should be merged as-is, I expect we'll want to tweak it. We may not need the subjective uglyness of the additional argument to NewGUID() - it seems that errors shouldn't be that common ... I like the shift optimization - it makes sense that it could be noticeably helpful on ARM, I'll check. |
Let's drop the time stuff and leave the shift optimization and land it. |
Thanks @ploxiln |
7050423
to
e57ab2e
Compare
I've reduced this PR to just the shift optimization. But I haven't tested it on arm yet, and I don't think there's any rush. I'm in the process of getting stuff set up on a raspberrypi 2 which has 4 Cortex A7 cores, which are lower-power and in-order rather than out-of-order but from the same generation as the Cortex A15. And it's also an interesting target in any case. |
It's probably worth mentioning that the time-part of the id will jump back 18577 hours. I don't think there's much chance of overlap; that's about as long as nsq has existed.
|
(maybe I should adjust |
Trying to use the ubuntu-packaged go1.3.3 on arm, the build fails with "undefined: atomic.Value". Accidental compatibility slip, or just an arm limitation? Anyway, continuing with testing with a binary build of go 1.4.2 for arm I got from http://dave.cheney.net/unofficial-arm-tarballs (seems legit, I even think I've heard of this guy...). I may be able to build and test go 1.5.1 but that won't be until late tomorrow if I do. |
Did have a noticeable improvement for pub:
Also: many ARM SoC have multiple cores, and GOMAXPROCS=2 almost eliminates the difference this branch makes, for this simple bench.sh (pretty consistent pub perf of 10.5 mb/s -> 10.8 mb/s) |
There were a few options for twepoch, I decided to keep it almost exactly the same but update the comment. I've done as much half-hearted ARMv7 benchmarking as I'm inclined to do. fwiw |
64-bit division operations appear to be slow on some ARM systems, so replace division by 1000000 with a bitshift equivalent to division by 1048576. adapt guid twepoch comment for revised guid timestamp scheme
0e329a5
to
8cf036a
Compare
@ploxiln correct me if I'm wrong, but won't this also change the theoretical number of unique IDs we're able to generate per second? |
Yes, there will be 5% fewer unique IDs per second. >>> (10**9 / 2**20) * 1.0 / 1e3
0.953 |
So, it reduces the theoretical max messages per workerID per second from 4,096,000 to 3,903,488 |
nsqd: optimize guid generation
Humble idea inspired by #658
Current-nanosecond-time and 64-bit division operations
appear to be slow on some ARM systems, so minimize them:
by making it nextErrorLog instead
equivalent to division by 1048576
My benchmarks on darwin / x86_64 laptop (which will depend on temperature and other factors of course) show a negligible improvement, but it will be interesting to see the results for an ARM system (I have a raspberry pi I could dig out and try, maybe later).
before:
after: