Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update peering state and RPC for deferred deletion #13430

Merged
merged 3 commits into from
Jun 13, 2022
Merged

Conversation

freddygv
Copy link
Contributor

Backport of ENT-2041

Description

When deleting a peering we do not want to delete the peering and all
imported data in a single operation, since deleting a large amount of
data at once could overload Consul.

Instead we defer deletion of peerings so that:

  1. When a peering deletion request is received via gRPC the peering is
    marked for deletion by setting the DeletedAt field.

  2. A leader routine will monitor for peerings that are marked for
    deletion and kick off a throttled deletion of all imported resources
    before deleting the peering itself.

This PR mostly addresses point #1 by modifying the peering service
to mark peerings for deletion. Another key change is to add a
PeeringListDeleted state store function which can return all peerings
marked for deletion. This function is what will be watched by the
deferred deletion leader routine.

PR Checklist

  • updated test coverage
  • external facing docs updated
  • not a security concern

freddygv added 2 commits June 13, 2022 12:10
When deleting a peering we do not want to delete the peering and all
imported data in a single operation, since deleting a large amount of
data at once could overload Consul.

Instead we defer deletion of peerings so that:

1. When a peering deletion request is received via gRPC the peering is
   marked for deletion by setting the DeletedAt field.

2. A leader routine will monitor for peerings that are marked for
   deletion and kick off a throttled deletion of all imported resources
   before deleting the peering itself.

This commit mostly addresses point #1 by modifying the peering service
to mark peerings for deletion. Another key change is to add a
PeeringListDeleted state store function which can return all peerings
marked for deletion. This function is what will be watched by the
deferred deletion leader routine.
1. Fix a bug where the peering leader routine would not track all active
   peerings in the "stored" reconciliation map. This could lead to
   tearing down streams where the token was generated, since the
   ConnectedStreams() method used for reconciliation returns all streams
   and not just the ones initiated by this leader routine.

2. Fix a race where stream contexts were being canceled before
   termination messages were being processed by a peer.

   Previously the leader routine would tear down streams by canceling
   their context right after the termination message was sent. This
   context cancelation could be propagated to the server side faster
   than the termination message. Now there is a change where the
   dialing peer uses CloseSend() to signal when no more messages will
   be sent. Eventually the server peer will read an EOF after receiving
   and processing the preceding termination message.

   Using CloseSend() is actually not enough to address the issue
   mentioned, since it doesn't wait for the server peer to finish
   processing messages. Because of this now the dialing peer also reads
   from the stream until an error signals that there are no more
   messages. Receiving an EOF from our peer indicates that they
   processed the termination message and have no additional work to do.

   Given that the stream is being closed, all the messages received by
   Recv are discarded. We only check for errors to avoid importing new
   data.
@freddygv freddygv added the pr/no-changelog PR does not need a corresponding .changelog entry label Jun 13, 2022
@github-actions github-actions bot added the theme/api Relating to the HTTP API interface label Jun 13, 2022
@freddygv freddygv merged commit 9890dfa into main Jun 13, 2022
@freddygv freddygv deleted the peering/deletion branch June 13, 2022 18:53
@kisunji kisunji added the theme/cluster-peering Related to Consul's cluster peering feature label Jul 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr/no-changelog PR does not need a corresponding .changelog entry theme/api Relating to the HTTP API interface theme/cluster-peering Related to Consul's cluster peering feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants