-
Notifications
You must be signed in to change notification settings - Fork 657
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PutObject streams with unknown size use streaming V4. #750
Conversation
37f8101
to
60a9531
Compare
api.go
Outdated
@@ -700,7 +700,8 @@ func (c Client) newRequest(method string, metadata requestMetadata) (req *http.R | |||
case signerType.IsV2(): | |||
// Add signature version '2' authorization header. | |||
req = s3signer.SignV2(*req, accessKeyID, secretAccessKey) | |||
case signerType.IsStreamingV4() && method == "PUT": | |||
case metadata.objectName != "" && method == "PUT": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change has the following user-visible impact.
-
signature type set in configuration or overridden via overrideSignerType` is not honored, for all PUT requests on object resources. This may confuse existing users.
-
removes some performance advantages that were available to users previously.
E.g, user uploading an object of known size over HTTPS would like to benefit from not having to compute sha256 sum of the payload. It looks like this request will be signed using streaming signature forcing sha256 sum computation for every 'chunk'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct we just default to sha256 anyways.. This is faster than actually doing md5sum with https in the benchmarks.. Defaulting to streaming was already done for all PUT operations, this change just goes one step further and enables that also for situation where size of the object is not known.
Streaming signature is a subset of V4 not a different auth mechanism in itself . That is why the differentiation is removed and implemented in such a way that .. if V4 is configured and object is being uploaded they all default to streaming signature implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct we just default to sha256 anyways.. This is faster than actually doing md5sum with https in the benchmarks..
How does the performance of streaming signature compare with not computing sha256 sum of payload when object is being uploaded over a HTTPS connection using signature v4? Is streaming signature based upload (which computes sha256 sum of every chunk) faster than UNSIGNED PAYLOAD using signature v4 over HTTPS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even with unsigned payload we calculate content-md5 . That is what slows it down.. but you are right I can change that for secure connection can just skip this as well.
5f97bac
to
fdfe15c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code changes look good to me. I have a minor (yet important) comment w.r.t a comment.
api.go
Outdated
@@ -700,7 +700,11 @@ func (c Client) newRequest(method string, metadata requestMetadata) (req *http.R | |||
case signerType.IsV2(): | |||
// Add signature version '2' authorization header. | |||
req = s3signer.SignV2(*req, accessKeyID, secretAccessKey) | |||
case signerType.IsStreamingV4() && method == "PUT": | |||
case metadata.objectName != "" && method == "PUT" && metadata.customHeader.Get("X-Amz-Copy-Source") == "" && !c.secure: | |||
// Following condition is only invoked when objectName is set, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment describes the low-level details which are available in the code. We could describe the same at a higher-level as,
// Streaming signature is used by default for a PUT object request on a HTTP object storage server.
In a higher-level description the reader is required to make the connection between the goal and the case expression that represents it. This seems reasonable to me. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.. done.
fdfe15c
to
fd78ae2
Compare
@@ -700,7 +700,8 @@ func (c Client) newRequest(method string, metadata requestMetadata) (req *http.R | |||
case signerType.IsV2(): | |||
// Add signature version '2' authorization header. | |||
req = s3signer.SignV2(*req, accessKeyID, secretAccessKey) | |||
case signerType.IsStreamingV4() && method == "PUT": | |||
case metadata.objectName != "" && method == "PUT" && metadata.customHeader.Get("X-Amz-Copy-Source") == "" && !c.secure: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@harshavardhana, can you add a comment to explain why we check c.secure
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we don't want to do TLS sha256.
Looks like minio-go tests are not passing here:
Could be not related to this PR. |
This looks like some other problem. There shouldn't be an error here. |
fd78ae2
to
fa7151d
Compare
Currently we used to fall back and use non-streaming version but we can instead just use the streaming version instead. Does not fix any apparent bug, but it just allows for avoiding any checksum calculation for large buffers.
fa7151d
to
0c1d594
Compare
@vadmeste can you take a look again the PR is updated. |
if size < minPartSize { | ||
return c.putObjectNoChecksum(bucketName, objectName, reader, size, metadata, progress) | ||
} | ||
|
||
// For all sizes greater than 64MiB do multipart. | ||
return c.putObjectMultipartStream(bucketName, objectName, reader, size, metadata, progress) | ||
} | ||
|
||
func (c Client) putObjectMultipartStreamNoLength(bucketName, objectName string, reader io.Reader, size int64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay. I don't understand one thing: why are we passing size
parameter here though this function name is "putObjectMultipartStreamNoLength" (nolength). Besides, this function seems to be called only when size < 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we can change that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be done in a later PR which refactors PutObject and get this merged ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be done in a later PR which refactors PutObject and get this merged ?
Sure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Currently we used to fall back and use non-streaming
version but we can instead just use the streaming
version instead.
Does not fix any apparent bug, but it just allows
for avoiding any checksum calculation for large buffers.