Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix EventStream to properly handle utf-8 #216

Merged
merged 18 commits into from
Jan 9, 2024
Merged

Conversation

waahm7
Copy link
Contributor

@waahm7 waahm7 commented Nov 22, 2023

Issue #, if available:

Description of changes:

  • There was a bug where the length of the String was not properly passed down to the C layer, since .count returns the number of characters in the String, but one character can take up multiple bytes in UTF-8.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

switch header.value {
case .bool(let value):
return aws_event_stream_add_bool_header(
rawHeaders,
headerName,
UInt8(header.name.count),
headerNameLength,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is spooky. Please search the codebase for all other uses of withCString or .count to see if we're making this mistake anywhere else

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I have removed withCString since it is not needed and Swift will automatically do it for you under the hood. I have also verified the usage of .count.

@@ -54,36 +54,37 @@ extension EventStreamMessage {
}
let addCHeader: () throws -> Int32 = {
return try header.name.withCString { headerName in
let headerNameLength = UInt8(header.name.utf8.count)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if .utf8.count is really just O(N) under the hood...

If it is, I'd be tempted to just call strlen(cstring) because it seems harder to mess up, compared to checking that we always do swiftstring.utf8.count and never accidentally do swiftstring.count

really wish withCString() passed both pointer and length to the closure

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated and removed withCString. Yeah, I agree it's really easy to mess up, given that we need to pass .utf8.count to C APIs instead of .count.

Copy link
Contributor

@graebm graebm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix & ship

value)
case .int32(let value):
return aws_event_stream_add_int32_header(
let headerNameLength = UInt8(header.name.utf8.count)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial: move this up, so we can do if headerNameLength > EventStreamHeader.maxNameLength
instead of needing to type the easy-to-mess-up .utf8.count multiple times

header.name,
headerNameLength,
bytes,
UInt16($0.count),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial: similar, we get length in 2 mildly different ways here: value.count up at the length check, then $0.count down here. Can we use a temp variable, or do value.count in both places, or move the length-check down so we can do$0.count in both places?

@waahm7 waahm7 requested a review from graebm January 8, 2024 21:44
@waahm7 waahm7 merged commit 4aebaa6 into main Jan 9, 2024
9 checks passed
@waahm7 waahm7 deleted the fix-encoding-decoding-string branch January 9, 2024 19:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants