-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reuse a buffer in ACDSI.InputAdapter #561
Conversation
Generate changelog in
|
@ellisjoe I can't assign reviewers, would appreciate a look |
The allocation in .read was 30% of all allocations in codex-foundry during a period where we were allocating and GCing ~20GB of byte[]s per minute.
bce214b
to
64b9924
Compare
I considered checking if therefore the buffer will never have an array we can write to, and we have to pay for this copy |
readBuffer = new byte[newLength(size, required - size, size << 1)]; | ||
} | ||
|
||
// copied from jdk.internal.util.ArraysSupport |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't that GPL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bleh, yes it is. I'll rewrite & simplify a bit
byte[] bytes = new byte[dst.remaining()]; | ||
int read = input.read(bytes, 0, bytes.length); | ||
if (readBuffer.length < dst.remaining()) { | ||
resize(dst.remaining()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is resizing necessary? Can't we read from the source in a loop using the buffer we already have?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in practice the buffer should always fit, since we read into a byte buffer created with same size in
Line 45 in e42e049
super(new InputAdapter(input), Utils.getCipherInstance(ALGORITHM, PROPS), BUFFER_SIZE, |
so this covers the case that something changes in the calling code to use a larger buffer. looping intuitively feels more expensive, more os calls vs single allocation that gets reused from then on (assuming buffer size stays constant)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looping intuitively feels more expensive
Beyond 8kB (or in some tests 16kB) larger buffers don't tend to reduce overhead for network or local disk operations in my experience. I'd bias toward the loop unless there's a benchmark that proves larger values can work better (and in that case I'd update the buffer size while keeping the loop as a fallback).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I guess. The code gets more complicated but can do that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
private SeekableInput input; | ||
private byte[] readBuffer = new byte[BUFFER_SIZE]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
final, no need to null this out after close imo -- assuming we're not holding a reference to the closed stream for a long time.
private byte[] readBuffer = new byte[BUFFER_SIZE]; | |
private final byte[] readBuffer = new byte[BUFFER_SIZE]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, makes sense if we know the buffer is constant size
Can someone click the 'generate changelog' button, it's not available for me :P |
Hm, it timed out on the second push too, but I really don't see how that could be caused by this change. I'll try another run tomorrow, I guess, hoping that whatever's wrong atm will pass |
int read = input.read(readBuffer, 0, chunk); | ||
|
||
if (read == -1) { | ||
return totalRead; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to be:
if (read == -1) {
if (totalRead == 0) {
totalRead = -1;
}
}
there's a reference implementation here: https://github.com/apache/commons-crypto/blob/master/src/main/java/org/apache/commons/crypto/stream/input/StreamInput.java#L58-L75
I think this logic ends up being complex enough to warrant a few tests as well. When I was playing with this myself I pulled the InputAdapter up. I'm still trying to find some common util that we can use here but haven't found anything yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was under the impression that DecryptionTests
would cover this - but in fact am struggling to have those run locally (tests in intellij die with 'Process finished with exit code 134 (interrupted by signal 6: SIGABRT)') - is there a trick to it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it does look like the circleci hang was related to that bug - unclear how but clearly I don't understand the mechanics of these tests |
…eInput (#561) (#576) Co-authored-by: Jacek Lach <[email protected]>
Before this PR
The allocation in .read was 30% of all allocations in internal product
during a period where we were allocating and GCing ~20GB of byte[]s
per minute.
After this PR
==COMMIT_MSG==
Reduce byte[] allocations during reads via ApacheCtrDecryptingSeekableInput
==COMMIT_MSG==
Possible downsides?