Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tweak diagnostics with invalid UTF-8 so they can pass over the wire #237

Merged
merged 2 commits into from
Nov 22, 2022

Conversation

apparentlymart
Copy link
Contributor

@apparentlymart apparentlymart commented Nov 18, 2022

A correct provider should only ever return valid UTF-8 strings as the diagnostic Summary or Detail, but since diagnostics tend to be describing unexpected situations and are often derived from errors in downstream libraries it's possible that a provider might incorrectly return incorrect garbage as part of a diagnostic message.

The protobuf serializer rejects non-UTF8 strings with a generic message that is unhelpful to end-users:

string field contains invalid UTF-8

Here we make the compromise that it's better to make a best effort to return a diagnostic that is probably only partially invalid so that the end user has a chance of still getting some clue about what problem occurred. The new helper functions here achieve that by replacing any invalid bytes with a correctly-encoded version of the Unicode Replacement Character, which will then allow the string to pass over the wire protocol successfully and hopefully end up as an obviously-invalid character in the CLI output or web UI that's rendering the diagnostics.

This does introduce some slight additional overhead when returning responses, but it should be immaterial for any response that doesn't include any diagnostics, relatively minor for responses that include valid diagnostics, and only markedly expensive for a diagnostic string with invalid bytes that will therefore need to be re-encoded on a rune-by-rune basis.


My motivation for doing this was in the hope of exposing whatever is the root cause of the error over in hashicorp/terraform-provider-azurerm#17563. I've not yet proven that it's the diagnostics which are causing that misbehavior but it seems likely to me because I don't see any other strings in the ConfigureProvider response and the problem seems to be fixed by logging in and therefore presumably fixing whatever this error is trying to complain about.

I don't expect that this will actually correct that other issue, but I hope that upgrading the hashicorp/azurerm provider to include these changes will allow the provider to describe the real underlying problem and therefore to determine whether there's a real bug to fix there or whether it's just a cosmetic issue of encoding the error message improperly.

@apparentlymart apparentlymart requested a review from a team as a code owner November 18, 2022 02:24
@bflad bflad self-assigned this Nov 18, 2022
@bflad bflad added this to the v0.15.0 milestone Nov 18, 2022
@bflad bflad added the bug Something isn't working label Nov 18, 2022
@bflad
Copy link
Contributor

bflad commented Nov 18, 2022

Hey @apparentlymart 👋 On the surface, this change seems reasonable for inclusion. I'm going to label this as a bug fix, since we should certainly provide a better experience than raising a relatively meaningless error to practitioners in this situation.

Would you mind adding a .changelog/237.txt file with changelog entries? e.g.

```release-note:bug
tfprotov5: Prevented diagnostic "string field contains invalid UTF-8" errors by rewriting with valid replacement characters
```

```release-note:bug
tfprotov6: Prevented diagnostic "string field contains invalid UTF-8" errors by rewriting with valid replacement characters
```

It may also be worth noting that more recent versions of terraform-plugin-go (v0.10.0), terraform-plugin-sdk (v2.18.0), terraform-plugin-framework (v0.10.0), and terraform-provider-azurerm (v3.14.0) should also have additional diagnostics logging that occurs before the response is sent: #203

Maybe those newer logs can provide some additional insight while we go through the motions to getting this through the various releases to get it in front of practitioners.

Personally, I'm also wondering if this is a good time as any to introduce native Go fuzz testing for a few code paths in this Go module so we can try to tease out any additional oddities or panics. I'll create a placeholder issue since I thought I had already done that in the past.

Thanks!

@apparentlymart
Copy link
Contributor Author

Oh yes... sorry I forgot about the changelog automation in this repository. I've added a changelog file to my commit.

It does look like those additional log lines would help unmask whatever is going on in hashicorp/terraform-provider-azurerm#17563, by printing the same diagnostic information in the log stream before trying to send it over the wire.

My intent with this PR was to try to get at least a partial error message to the user in the UI even if the provider's message formatting is buggy, and so I think this change is still valuable in that sense. Now that we have the logging you mentioned I could also see the argument that it might be better to let this fail and have the user report a bug -- otherwise the encoding errors are unlikely to ever get reported and fixed -- but I think I personally would err on the side of trying to give the end-user as much information as possible to debug with, even if that does mean that some mostly-cosmetic encoding bugs in providers will go unfixed.

If we were to rule that it's better to raise an error for that case then I would still suggest swapping for a more specific error which explicitly declares it as a provider bug, so that the end-user knows to just report it and not waste time trying to debug it locally.

Ultimately I'll leave this call up to you! I'm happy to take whichever path you think best.

Copy link
Contributor

@bflad bflad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! Thanks, @apparentlymart 🚀 I'm not sure if the adjusted unit test cases are valuable enough, but thought they may be useful given the current logic.

A correct provider should only ever return valid UTF-8 strings as the
diagnostic Summary or Detail, but since diagnostics tend to be describing
unexpected situations and are often derived from errors in downstream
libraries it's possible that a provider might incorrectly return incorrect
garbage as part of a diagnostic message.

The protobuf serializer rejects non-UTF8 strings with a generic message
that is unhelpful to end-users:
    string field contains invalid UTF-8

Here we make the compromise that it's better to make a best effort to
return a diagnostic that is probably only partially invalid so that the
end user has a chance of still getting some clue about what problem
occurred. The new helper functions here achieve that by replacing any
invalid bytes with a correctly-encoded version of the Unicode Replacement
Character, which will then allow the string to pass over the wire protocol
successfully and hopefully end up as an obviously-invalid character in
the CLI output or web UI that's rendering the diagnostics.

This does introduce some slight additional overhead when returning
responses, but it should be immaterial for any response that doesn't
include any diagnostics, relatively minor for responses that include
valid diagnostics, and only markedly expensive for a diagnostic string
with invalid bytes that will therefore need to be re-encoded on a
rune-by-rune basis.
@apparentlymart
Copy link
Contributor Author

Good call on those extra test cases! I needed to add them to both copies of the logic (across both protocol versions) so I did it locally in my editor rather than by accepting your suggestions, and so I decided to add new test cases rather than replacing the existing ones since it was easier to do than it would be in the suggestions UI.

Note that I don't have write access to this repository, so if this looks good I'll have to ask you to merge it for me. 😀

@bflad bflad merged commit 0488e08 into hashicorp:main Nov 22, 2022
@bflad bflad modified the milestones: v0.15.0, v0.14.2 Nov 22, 2022
@bflad
Copy link
Contributor

bflad commented Nov 22, 2022

Releasing this now in terraform-plugin-go version 0.14.2.

@apparentlymart
Copy link
Contributor Author

Somehow when I was first working on this I didn't notice that there's a stdlib function strings.ToValidUTF8 which does something similar (but not quite the same) as the function I added here. I think my mistake is that I was only looking at the unicode/utf8 package, expecting to find functionality specifically related to UTF-8 encoding in there.

I'm not going to open a follow-up PR to change that now because it seems like unnecessary churn just to get almost-the-same functionality a different way, but I did want to note it here in case someone looks back at this in future and wonders why we have a hand-written "to valid UTF8" function. 😀

(The subtle difference between what I wrote and what the stdlib does is that my function turns each invalid byte into a separate unicode replacement character, whereas stdlib collects many consecutive invalid bytes and replaces them all together with a single unicode replacement character. Both of those are consistent with the goal of this PR, but if we did swap this out we'd probably need to tweak a couple tests to account for this difference.)

Copy link

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 25, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants