Skip to content
This repository has been archived by the owner on Nov 23, 2024. It is now read-only.

auto_increment_increment validation #63

Merged
merged 1 commit into from
Apr 22, 2020
Merged

Conversation

mriddle
Copy link
Contributor

@mriddle mriddle commented Mar 25, 2020

This patch adds auto_increment_increment validation to the following scenarios:

  • Connection initialization
  • Allocating one ID
  • Allocating multiple IDs

A lot of refactoring was done as part of this PR and has either been merged separately or moved to #70

These changes have been benchmarked, should have no noticeable impact to performance and continue to be thread safe.

Validation during initialization is done by comparing the client configuration (increment_by) with each alloc server (auto_increment_increment). If the server is misconfigured, an error is reported via the notifier (defaults to error logs) and it's removed from the pool. The connection will be retried every 10 minutes (configurable via the connection_retry) indefinitely.

Validation during allocation happens by storing the ID returned and comparing it with the last 4 identifiers, ensuring they've been incremented correctly by the configured amount. The alloc servers are shared and there's no guarantee that the process will get the next increment. As an example, one process (think unicorn worker or puma thread) may receive 5, 15, 25 while the other received 10 & 20. This is why we validate by checking the remained. This will not raise an error if the auto_increment_increment increases by a factor of 2.

If the auto_increment_increment changes while the clients are using the server, the alloc server is removed from the list of servers and an error is reported via the notifier. The server is retried after a minute, indefinitely.

An exception is raised and ID allocation ceases if all servers are misconfigured. This will result in the process raising a NoServersAvailableException exception and halting. New unicorns or puma threads will attempt to connect each time they are spawned.

To support valid updates to the auto_increment_increment on active alloc servers, the suppress_increment_exceptions option can be used to swallow the exceptions and call the notifier. This will allow those who have monitors configured to be notified of the change and also allow the servers to remain in use while the update is being performed.

Related:

@mriddle mriddle self-assigned this Mar 25, 2020
end
raise InvalidIncrementException unless valid_steps
end

def self.get_uid_for_class(klass, options = {})
with_connections do |connection|
Timeout.timeout(self.global_uid_options[:query_timeout], TimeoutException) do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate issue that we should come back for later: Timeout.timeout is bad mojo and we should do the query timeout some other way

Copy link
Contributor Author

@mriddle mriddle Apr 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ggrossman perhaps we should have the alloc server connections set the wait_timeout MySQL session variable to the configured self.global_uid_options[:query_timeout]

That will remove the need for us to manage a timeout here, and the bad mojo that comes with it.

E.g. in the database.yml

test_id_server_1:
  <<: *MYSQL
  database: global_uid_test_id_server_1
  variables:
    auto_increment_increment: 5
    auto_increment_offset: 1
    max_execution_time: 10

test_id_server_2:
  <<: *MYSQL
  database: global_uid_test_id_server_2
  variables:
    auto_increment_increment: 5
    auto_increment_offset: 2
    max_execution_time: 10

That avoids the need for this gem to mess around with connection variables too, giving the client complete control.

EDIT: Ah, nvm. max_execution_time is only supported in MySQL 5.7

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea. I'm in favor of that. We should document that the alloc server connections should definitely declare an appropriate wait_timeout.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it could wait for another PR but it's a good idea to get rid of Timeout.timeout. It can cause entirely unpredictable badness if it actually triggers.

Copy link
Contributor Author

@mriddle mriddle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback @ggrossman! I pushed the draft PR up early for visibility, sorry, I should have been more clear that it was a still a work in progress and I intended to make more changes :)

@mriddle mriddle force-pushed the mriddle-identifier-validation branch 6 times, most recently from 6a44c31 to 6915462 Compare March 27, 2020 16:47
@bquorning
Copy link
Member

I wonder if we should make separate PRs of each of the three “remove” commits:

  • Remove the concept of dry_run
  • Remove ServerVariables module
  • Remove unused and often misleading options

This was referenced Apr 1, 2020
@mriddle mriddle added this to the 4.0.0 milestone Apr 2, 2020
@mriddle mriddle force-pushed the mriddle-identifier-validation branch 2 times, most recently from 1455d1f to 862e9b1 Compare April 2, 2020 12:49
@mriddle mriddle mentioned this pull request Apr 2, 2020
@mriddle mriddle force-pushed the mriddle-identifier-validation branch 6 times, most recently from 213e7d7 to c2480db Compare April 7, 2020 12:53
@mriddle mriddle marked this pull request as ready for review April 7, 2020 14:46
@mriddle mriddle requested a review from a team as a code owner April 7, 2020 14:46
@mriddle
Copy link
Contributor Author

mriddle commented Apr 8, 2020

I forgot to mention that this work along with the work below is targeted for a 4.0.0 gem release (milestone) since the gem is undergoing significant change.

@mriddle mriddle changed the title ID validation auto_increment_increment validation Apr 14, 2020
Copy link
Member

@bquorning bquorning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So many comments. Sorry.

@mriddle mriddle force-pushed the mriddle-identifier-validation branch from beacf78 to 5f90b12 Compare April 20, 2020 10:27
@mriddle
Copy link
Contributor Author

mriddle commented Apr 20, 2020

Addressed feedback, squashed and force pushed, thanks for the solid review 🙏

@mriddle
Copy link
Contributor Author

mriddle commented Apr 20, 2020

I'll leave it open for another 48 hours in case others want to provide feedback before merging.

@mriddle mriddle force-pushed the mriddle-identifier-validation branch from e9e929a to 184a8f3 Compare April 21, 2020 07:27
@mriddle mriddle force-pushed the mriddle-identifier-validation branch from 184a8f3 to b68ec42 Compare April 21, 2020 08:41
@mriddle mriddle force-pushed the mriddle-identifier-validation branch from b68ec42 to 4b3bdbc Compare April 21, 2020 09:07
@mriddle
Copy link
Contributor Author

mriddle commented Apr 21, 2020

Addressed feedback, squashed and force pushed. Thanks

[Breaking change]

This patch adds `auto_increment_increment` validation to the following
scenarios:
 * Connection initalization
 * Generating one ID
 * Generating multiple IDs

These changes should have no noticeable impact on performance and are
thread safe.

The intent is to prevent data corruption when a server is misconfigured
and in the event that one is found, it's removed from the pool of
available alloc servers allowing creation to continue with the valid
alloc servers.

Validation during initialization is done by comparing the client
configuration and with what's set on each server. If the server is
misconfigured, an error is reported via the `notifier` and it's left out
of the pool.

Validation during creation happens by comparing each pair of identifiers
and ensuring they've been incremented correctly by the configured amount.

If the `auto_increment_increment` changes while the clients are using
the server, the alloc server is removed from the list of servers and
an error is reported via the `notifier`.

An InvalidIncrementException is raised if all servers are misconfigured.

The changeset is considered a breaking change as any clients upgrading
will see an increment exception if their `increment_by` was set
incorrectly, this previously would have gone unnoticed.
@mriddle mriddle force-pushed the mriddle-identifier-validation branch from 4b3bdbc to eb5fd15 Compare April 21, 2020 09:55
@mriddle
Copy link
Contributor Author

mriddle commented Apr 21, 2020

Updated the commit message to reflect the final state

@mriddle mriddle merged commit 81c1321 into master Apr 22, 2020
@mriddle mriddle deleted the mriddle-identifier-validation branch April 22, 2020 05:03
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants