Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Record details about replication action and associated document, if any, upon replication failure. #3630

Merged

Conversation

nirvdrum
Copy link
Contributor

@nirvdrum nirvdrum commented Jan 17, 2022

This PR contains:

  • A NEW FEATURE

Describe the problem you have without this PR

This PR is a proposed solution to #3622. Errors encountered during GraphQL replication are wrapped in a new ReplicationError object that records whether it was a pull or push replication that failed, and if a push failure, the associated document data in the failed request.

For the time being, I've only applied these changes to GraphQL replication, although I think it could be handled for all forms of replication. There were a variety of design decisions made in getting this far and I'm happy to adjust as necessary. I just wanted to get something functional that could then be evaluated. Some areas that would be helpful to get feedback on are:

  • Should the replication error payload be a discriminated union or just fields in the class? I went with the discriminated union because it made the constructor cleaner (less optional data), but I could also see that the replication type (pull or push) should be a field directly in the error.
  • Should the replication error wrap all errors upon replication or just logical errors? E.g., GraphQL can have an HTTP 200 response with server-side logic errors encoded in the GraphQL response and I think those should always be wrapped/recorded. But, should we also wrap network errors, authentication errors, HTTP errors and so on? This PR does do that for consistency and because I think it could make logging and error handling easier in a client, but it makes the code a little uglier.
  • If all errors are wrapped, I think we can safely indicate that the object supplied to error$ in this case is a ReplicationError. I have kept it as any for the time being to avoid any future issues with the type definition changing. When typed as any, a client should not be relying on anything about the object structure.
  • Should the replication error include all of the push document data or just the document ID?

To give an example of how I'm making use of this new feature, here's code from my client handling a uniqueness constraint violation from the GraphQL server:

replicationState.error$.subscribe((error: ReplicationError<Todo>) => {
  if (error.payload.type === 'push') {
    const { documentData } = error.payload;
    if (error.innerErrors && error.innerErrors[0].message === 'Uniqueness violation. duplicate key value violates unique constraint "todos_profile_id_rank_key"') {
      this.db.todos.findOne(documentData.id).exec().then((doc) => {
        doc?.update({
          $set: {
            rank: LexoRank.parse(documentData.rank).genNext().toString()
          }
        });
      })
    }
  }

Todos

  • Documentation
  • Changelog
  • Evaluate design decisions
  • Ensure testing is sufficient

@nirvdrum
Copy link
Contributor Author

I'm not positive I'm testing this properly either, so please that give a good look. It also took me a very long time to get the tests in a good state. Is there a way to easily run just a single test or a way to debug one? I'm not familiar at all with the test runner being used. Consequently, my solution was to delete a bunch of tests temporarily and then run all remaining tests everytime I wanted to make a change. That still took several minutes, making this a very slow development effort.

@pubkey
Copy link
Owner

pubkey commented Jan 17, 2022

Yes the test runner is pretty outdated. I also just commend out tests here
You can run npm run test:fast:lokijs or npm run test:fast:pouchdb to run all tests in parallel which speeds up things.

Public facing types are prefixed with Rx so it should be RxReplicationError. The rest of the code looks good to me.
Documentation must still be updated.

@nirvdrum
Copy link
Contributor Author

Thanks for the review. It'll probably be a few days before I can get everything updated.

@nirvdrum nirvdrum force-pushed the improved-graphql-replication-error-handling branch 3 times, most recently from dee8b9a to 903b92a Compare January 25, 2022 04:26
@nirvdrum nirvdrum force-pushed the improved-graphql-replication-error-handling branch from 903b92a to 50d1261 Compare January 27, 2022 04:36
@nirvdrum
Copy link
Contributor Author

@pubkey I've rebased the branch against master and updated the GraphQL Replication docs. I have two outstanding questions:

  1. Currently, only interactions with the GraphQL server are wrapped in RxReplicationError. I have seen non-RxReplicationError errors show up in my error handler because of issues related to the underlying database (e.g., schema changed without migration). I can wrap all of runPull and runPush and wrap all exceptions, but I don't know if that's desirable.
  2. Do you want this applied to custom and CouchDB replication as well? If so, I can give it a shot and add tests for it. I just don't have any real world application using either, so it's difficult for me to really exercise the functionality. I suspect CouchDB replication has fewer sources of issues than GraphQL replication does since it's synchronizing at the DB level.

I know there's a desire to base GraphQL replication on custom replication. I can help with that in a future PR. When they're merged, I'll have a much easier time testing the error handling since I'll be using it in my own application. But, if you want the error wrapping added for custom replication now, that's sensible, too.

When you get a chance, please take a look and let me know what you'd like to see done before approving the PR. Thanks!

@nirvdrum nirvdrum force-pushed the improved-graphql-replication-error-handling branch from 50d1261 to 88b1035 Compare January 29, 2022 05:05
GraphQL errors are wrapped in a `RxReplicationError`, which has a `payload` property with information to help you handle the underlying error.
The payload has a `type` field with a value of `"pull"` or `"push"`, corresponding to an error in either a GraphQL pull or push replication operation, respectively.
If the error occurs doing a _push_, the `payload` also contains a `documentData` property, which corresponds to the document data supplied to the push query builder.
NB: You may see errors in this observable that are not `RxReplicationError`.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "NB:" for "nota bene"?
We shouldn't use these abbreviation because most readers of the docs a from non-latin based mother languages.
In other parts I did it like here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a look. I hadn't considered the language barrier. Changing it is easy enough. Is it okay to use "e.g."? That's another Latin abbreviation.

@pubkey
Copy link
Owner

pubkey commented Jan 29, 2022

We do not need this for the couchdb replication for now.

I have no opinion about having non-RxReplicationError in the same stream. We can leave that for now. Maybe we should add an 'internal' error type in the future. Or some other way to determine if an error is expected or a problem in the code.

If you want to, you can add the improved error handling also to the replications plugin. But I am also ok with not having that for now.
Migrating the GrahphQL replication to use the replication-primitives will likely cause a breaking change which requires a new major version. This is why I did not do that for now because I want to first collect more breaking changes.

Code looks good for me, I added one comment to the docs.

@nirvdrum nirvdrum force-pushed the improved-graphql-replication-error-handling branch from 88b1035 to e3f8d1c Compare January 29, 2022 19:10
@nirvdrum
Copy link
Contributor Author

Thanks for the review. I've updated the documentation issue you highlighted.

@pubkey pubkey merged commit 54946a2 into pubkey:master Jan 29, 2022
pubkey added a commit that referenced this pull request Jan 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants