Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BE-615] AER: use a static identifier for GraphQL validation failures #3465

Conversation

zionts
Copy link
Contributor

@zionts zionts commented Nov 2, 2019

This PR changes the apollo usage reporting library to use static identifiers for operation documents that are not able to be executed.

When users of this module receive many un-executable operation documents, such as a non parse-able operation documents, invalid operation documents, or invalid operation names, every operation document is sent to Apollo Studio. This results in a cardinality explosion within Graph Manager. After a few thousand of these invalid operation names / documents are reported, the UI for the customer is borderline unusable due to the cardinality explosion & schema validation reaches a capacity of operations as well.

In general, we want to avoid storing & exposing personal information in Studio, and in current reporting agents, this is also problematic for operations that fail to execute. Because we currently report these operations with a signature matching the entire operation body, this is an easy trap for users to accidentally send user information through our system, when argument literals exist in the document.

The static identifiers are:

  • ## GraphQLParseFailure for documents that don't parse as valid graphql
  • ## GraphQLValidationFailure for documents that aren't valid given the schema running on the server
  • ## GraphQLUnknownOperationName for operation documents which don't have an operation name in the document

Additionally, it allows users to optionally include the body of the operations that fail validation with sendUnexecutableOperationDocuments. This will send the operation body as part of the trace so they can be viewed alongside the trace.

@zionts
Copy link
Contributor Author

zionts commented Nov 2, 2019

This needs an upstream PR on monorepo, too, but ready for review otherwise :)

glasser
glasser previously requested changes Nov 7, 2019
packages/apollo-engine-reporting/src/agent.ts Outdated Show resolved Hide resolved
packages/apollo-engine-reporting/src/agent.ts Outdated Show resolved Hide resolved
packages/apollo-engine-reporting/src/agent.ts Outdated Show resolved Hide resolved
packages/apollo-engine-reporting/src/agent.ts Outdated Show resolved Hide resolved
package-lock.json Outdated Show resolved Hide resolved
@zionts zionts force-pushed the adam/19/10/avoid-cardinality-explosions-and-pii-in-invalid-graphql branch from 8b8814b to aa8d823 Compare November 14, 2019 02:19
@zionts zionts dismissed glasser’s stale review November 14, 2019 02:20

Addressed — please re-review when possible :)

Copy link
Member

@glasser glasser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(can't respond to one thread, putting it here)

No more prettier in this project, but I fixed it manually.

prettier is still installed at a consistent version in apollo-server; it just isn't automatically enforced via a linter. However, apollo-engine-reporting still is perfectly formatted with prettier, and I'd like to keep it this way. @abernix and I talked this week. I understand why he thinks Prettier specifically isn't a good fit for our open source projects but he agrees that that doesn't need to mean that we should regard our files as having no style rules at all, or that we shouldn't add in other style enforcement via eslint (just not the line length rules from prettier specifically). Let's keep this package prettier-clean, manually for now and maybe automatically later.

packages/apollo-engine-reporting/src/agent.ts Outdated Show resolved Hide resolved
// and the operation name and signature will always be reported with a static
// identifier. Whether the operation was a parse failure or a validation
// failure will be embedded within the stats report key itself
sendOperationDocumentsOnValidationFailure?: boolean;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not separate flags for validation and parsing? Or a name that is explicit? I really think it's misleading to use the word "validation" to mean "parse or validation". Words have meanings.

} else {
const signature = await this.getTraceSignature({
queryHash,
documentAST,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get how this works: this variable is nullable but the parameter it's being passed to isn't?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The null case is handled above in an else if branch, so here we know that documentAST is defined.

packages/apollo-engine-reporting/src/agent.ts Outdated Show resolved Hide resolved
packages/apollo-engine-reporting/src/extension.ts Outdated Show resolved Hide resolved
packages/apollo-engine-reporting/src/extension.ts Outdated Show resolved Hide resolved
@@ -177,6 +193,7 @@ export class EngineReportingExtension<TContext = any>
// isn't actually in the document. We want to know the name in that case
// too, which is why it's important that we save the name now, and not just
// rely on requestContext.operationName (which will be null in this case).
this.gqlValidationSucceeded = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, at least have a comment that this would better belong in validationDidEnd. Though frankly you could also just change graphql-extensions to pass validationErrors to validationDidEnd. I made plenty of graphql-extensions changes as needed by this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seemed like making more extensions-specific changes would make this logic harder to port to the new plugin API in a way that was pretty certainly in alignment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would actually align the stack API with the new plugin API, which already passes validation errors to its version of validationDidEnd.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also graphql-extensions is a bit of a beast I just prefer not to touch 😅 Along with the fact that we already know there is overhead for each function call we're putting onto the stack, which is currently causing aer to spew tons of garbage, adding another function for validationDidEnd is not something I really really want to do, but if you want to push for it, I shall comply :P

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function already exists, it just doesn't receive the list of errors...

// Optional: when GraphQL parsing or validation against the GraphQL schema fails, these fields
// can include reference to the operation being sent for users to dig into the set of operations
// that are failing validation.
string operationBodyOnValidationFailure = 27;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly, these shouldn't use "validation" to mean "parsing or validation". Split the fields or make a better name.

Maybe unexecutedOperationBody / unexecutedOperationName?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it!

@zionts zionts force-pushed the adam/19/10/avoid-cardinality-explosions-and-pii-in-invalid-graphql branch 2 times, most recently from 2d9e00f to b0f2429 Compare December 9, 2019 23:10
@glasser
Copy link
Member

glasser commented Dec 14, 2019

sorry for the lag. will try to get to this early next week. looks like it needs a rebase.

packages/apollo-engine-reporting/src/agent.ts Outdated Show resolved Hide resolved
packages/apollo-engine-reporting/src/extension.ts Outdated Show resolved Hide resolved
@@ -177,6 +193,7 @@ export class EngineReportingExtension<TContext = any>
// isn't actually in the document. We want to know the name in that case
// too, which is why it's important that we save the name now, and not just
// rely on requestContext.operationName (which will be null in this case).
this.gqlValidationSucceeded = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would actually align the stack API with the new plugin API, which already passes validation errors to its version of validationDidEnd.

packages/apollo-engine-reporting/src/agent.ts Outdated Show resolved Hide resolved
packages/apollo-engine-reporting/src/agent.ts Outdated Show resolved Hide resolved
@zionts zionts force-pushed the adam/19/10/avoid-cardinality-explosions-and-pii-in-invalid-graphql branch from 539d2da to 815b562 Compare December 17, 2019 02:20
@zionts zionts dismissed glasser’s stale review December 17, 2019 02:45

le stale again

@zionts zionts removed the request for review from pcarrier December 17, 2019 02:45
@@ -65,11 +65,17 @@ test('trace construction', async () => {
addTrace,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about end-to-end tests! Search for eg sets the trace key to operationName when it is defined

@zionts zionts changed the title AER: use a static identifier for GraphQL validation failures [BE-615] AER: use a static identifier for GraphQL validation failures Jan 24, 2020
abernix added a commit that referenced this pull request May 8, 2020
… failure.

Addresses feedback in below referenced [[Comment]].

If operation resolution (parsing and validating the document followed by
selecting the correct operation) resulted in the population of the
`operationName`, we'll use that. (For anonymous operations,
`requestContext.operationName` is null, which we represent here as the empty
string.)

If the user explicitly specified an `operationName` in their request but
operation resolution failed (due to parse or validation errors or because
there is no operation with that name in the document), we still put _that_
user-supplied `operationName` in the trace. This allows the error to be
better understood in Graph Manager. (We are considering changing the
behavior of `operationName` in these three error cases; see [[#3465]] below for
details.)

[Comment]: #3998 (comment)
[#3465]: #3465
Base automatically changed from master to main June 24, 2020 18:16
@jsegaran jsegaran force-pushed the adam/19/10/avoid-cardinality-explosions-and-pii-in-invalid-graphql branch from 8c3488a to d04a273 Compare August 3, 2020 00:05
@jsegaran jsegaran requested a review from glasser August 3, 2020 00:25
@jsegaran jsegaran force-pushed the adam/19/10/avoid-cardinality-explosions-and-pii-in-invalid-graphql branch from 0afac33 to 786bdcb Compare August 4, 2020 04:30
@@ -163,6 +163,18 @@ export const plugin = <TContext>(
}
}

/**
* This is set to true at the beginning of the pipeline. If we resole
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor spelling/capitalization/punctuation notes: "resolve", "GraphQL", "can start as true", "operation, the operation must"

packages/apollo-engine-reporting/src/plugin.ts Outdated Show resolved Hide resolved
packages/apollo-engine-reporting/src/agent.ts Outdated Show resolved Hide resolved
packages/apollo-engine-reporting/src/agent.ts Outdated Show resolved Hide resolved
packages/apollo-engine-reporting/src/agent.ts Outdated Show resolved Hide resolved
packages/apollo-engine-reporting/src/plugin.ts Outdated Show resolved Hide resolved
glasser added a commit that referenced this pull request Sep 23, 2020
Josh should update this commit message to be accurate :)
@glasser glasser force-pushed the adam/19/10/avoid-cardinality-explosions-and-pii-in-invalid-graphql branch from d66ef1e to 483ce9a Compare September 23, 2020 23:18
Josh should update this commit message to be accurate :)
@jsegaran jsegaran force-pushed the adam/19/10/avoid-cardinality-explosions-and-pii-in-invalid-graphql branch from 7fc3644 to 4f70449 Compare September 28, 2020 18:03
@jsegaran jsegaran force-pushed the adam/19/10/avoid-cardinality-explosions-and-pii-in-invalid-graphql branch from 4f70449 to e5e8dbb Compare September 28, 2020 18:05
@jsegaran jsegaran force-pushed the adam/19/10/avoid-cardinality-explosions-and-pii-in-invalid-graphql branch from 14e4218 to fd9c1d5 Compare September 28, 2020 22:05
@jsegaran jsegaran requested a review from glasser September 28, 2020 22:06
Copy link
Member

@glasser glasser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looking great

packages/apollo-server-core/src/utils/pluginTestHarness.ts Outdated Show resolved Hide resolved
</td>
<td>

Whether to include the entire document in the trace if the operation was a GraphQL parse or validation error (i.e. failed the GraphQL parse or validation phases). This will be included as a separate field on the trace and the operation name and signature will always be reported with a cosntant identifier. Whether the operation was a parse failure or a validation failure will be embedded within the stats report key itself.
Copy link
Member

@glasser glasser Sep 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cosntant -> constant

But also I think we can make this be a little more user-focused. Maybe something more along the lines of

Statistics about operations that your server cannot execute are not reported under each document separately to Apollo's servers, but are grouped together as "parse failure", "validation failure", or "unknown operation name". By default, the usage reporting plugin does not include the full operation document in reported traces, because it is challenging to strip potential private information (like string constants) from invalid operations. If you'd like the usage reporting plugin to include the full operation document so you can view it in Apollo Studio's trace view, set this to true.

The wording isn't ideal but I think this helps explain what's going on and why you might want to set it better?

@jsegaran jsegaran requested a review from glasser September 30, 2020 18:37
@jsegaran jsegaran force-pushed the adam/19/10/avoid-cardinality-explosions-and-pii-in-invalid-graphql branch from 7d322cf to d71abe9 Compare September 30, 2020 18:51
Copy link
Member

@glasser glasser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i have some changelog suggestions, and i do hope you remember to make the commit message up to date, but otherwise looks great!

@glasser
Copy link
Member

glasser commented Oct 2, 2020

for the PR description (final commit message when squashing) make sure you have double #s in the operation names

@abernix abernix added this to the Release 2.19.0 milestone Oct 5, 2020
@abernix abernix changed the base branch from main to release-2.19.0 October 5, 2020 12:22
@abernix
Copy link
Member

abernix commented Oct 5, 2020

I've retargeted this to release-2.19.0, which is where I expect this will land.

@jsegaran jsegaran merged commit b427e78 into release-2.19.0 Oct 16, 2020
@jsegaran jsegaran deleted the adam/19/10/avoid-cardinality-explosions-and-pii-in-invalid-graphql branch October 16, 2020 21:49
glasser added a commit that referenced this pull request Jun 9, 2021
We've already done #3465, so we no longer need a single const to
represent "either the actual resolved operation name that definitely
points to a real parsed and validated operation, or else what the user
wrote in the request". We can use the latter in the one case where we
report an unexecuted operation name, and the former otherwise.
glasser added a commit that referenced this pull request Jun 9, 2021
We've already done #3465, so we no longer need a single const to
represent "either the actual resolved operation name that definitely
points to a real parsed and validated operation, or else what the user
wrote in the request". We can use the latter in the one case where we
report an unexecuted operation name, and the former otherwise.
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 16, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants