Rename lhmn_ tables to lhma_ to avoid IBP stalls #41

Merged: bbuchalter merged 4 commits into master from rename_instead_of_drop_tables on Aug 1, 2018

bbuchalter

When an LHM worker fails, cleanup_current_run must be called to remove
the triggers and "new" tables (which start with lhmn_). The previous
behavior was to drop the table immediately. However, if the table is
active, the InnoDB buffer pool (IBP) can be full of pages belonging to this
lhmn_ table. Dropping it forces the IBP to clear those pages, which can
cause MySQL to become unresponsive.

By instead renaming the table with the archive prefix (lhma_), we can let
the buffer pool unload the relevant pages over time and then later safely
drop the archive tables as part of regularly scheduled maintenance.
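
As a rough illustration of the rename-instead-of-drop idea described above, the sketch below builds RENAME TABLE statements for leftover lhmn_ tables. The helper names (`archive_name_for`, `cleanup_statements`) and the exact archive-name format are assumptions made for illustration, not the gem's actual API; only the lhmn_ and lhma_ prefixes come from this PR.

```ruby
# Hypothetical helpers: names and timestamp format are assumptions,
# not taken from the LHM codebase.

# Build an archive name such as "lhma_2018_08_01_17_06_00_users"
# from a "new" table name such as "lhmn_users".
def archive_name_for(table_name, time = Time.now.utc)
  stamp = time.strftime("%Y_%m_%d_%H_%M_%S")
  base = table_name.sub(/\Alhmn_/, "")
  "lhma_#{stamp}_#{base}"
end

# Return one RENAME TABLE statement per leftover table.
# Nothing is executed here; the caller decides how to run the SQL.
def cleanup_statements(new_tables, time = Time.now.utc)
  new_tables.map do |table|
    "RENAME TABLE `#{table}` TO `#{archive_name_for(table, time)}`"
  end
end
```

Under this sketch, a scheduled maintenance job could later drop the lhma_ tables once their pages are no longer hot in the buffer pool.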

Brian Buchalter added 3 commits July 31, 2018 16:19
The amount of behavior implemented in the base Lhm module was excessive.
The code written there was intentionally made terse to try and limit
the amount of code written there. By extracting it to its own class,
we can be more expressive, which will make future refactoring easier.

We're about to change the behavior of the current cleanup
and I'd like to have more explicit tests about exactly what
will be executed.

In the next commit, we'll need to be able to generate timestamps,
so let's extract this logic first.
@bbuchalter changed the title from "Rename lhmn_ tables to lhmna_ to avoid IBP stalls" to "Rename lhmn_ tables to lhma_ to avoid IBP stalls" on Aug 1, 2018
When an LHM worker fails, `cleanup_current_run` must be called
to remove the triggers and "new" tables (which start with lhmn_).
The previous behavior was to drop the table
immediately. However, if the table is active, the InnoDB buffer pool (IBP)
can be full of pages belonging to this lhmn_ table. Dropping it
forces the IBP to clear those pages, which can cause MySQL to become
unresponsive.

By instead renaming the table with the archive prefix (lhma_),
we can let the buffer pool unload the relevant pages over time and then
later safely drop the archive tables as part of regularly scheduled
maintenance.
@bbuchalter force-pushed the rename_instead_of_drop_tables branch from 7326422 to a9a4349 on August 1, 2018 17:06

def all_triggers_for_origin
  @all_triggers_for_origin ||= connection.select_values("show triggers like '%#{origin_table_name}'").collect do |trigger|
    trigger.respond_to?(:trigger) ? trigger.trigger : trigger

I realize this isn't your code but wtf? I wonder if this is Mysql vs Mysql2 stuff.

Author

That's likely the cause.

that is exactly what it is, yes.

Author

@jordanwheeler do you happen to know which is the mysql2 syntax?
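
For readers puzzled by the `respond_to?(:trigger)` check discussed in this thread, here is a minimal sketch of what that line does: it accepts rows that are either objects exposing a `trigger` method or plain strings, and returns trigger names either way. The `TriggerRow` struct, the `trigger_names` helper, and the sample trigger names are hypothetical stand-ins; which adapter returns which shape is not stated in this thread and is not assumed here.

```ruby
# Hypothetical stand-in for a result row that exposes #trigger.
TriggerRow = Struct.new(:trigger)

# Normalize a mixed list of rows into plain trigger-name strings.
# Rows that respond to #trigger are unwrapped; anything else
# (for example a String) is passed through unchanged.
def trigger_names(rows)
  rows.map { |row| row.respond_to?(:trigger) ? row.trigger : row }
end

# Illustrative input mixing both shapes.
rows = [TriggerRow.new("lhmt_ins_users"), "lhmt_del_users"]
puts trigger_names(rows).inspect
```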

@insom left a comment

Nice to follow commit by commit so I can see how the factoring took shape. 🚢

@time = time
end

def to_s

pretty fancy there guy

@jordanwheeler left a comment

i'd feel better if the timestamp stuff was tested better, but it's the same code which you haven't actually changed, and testing it nicely there would likely require timecop or something, which is a lot of effort for such a small change.

i just thought i'd mention that. i like what you've done here 👍
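
One way the timestamp testing concern might be handled without Timecop is to pass the time in, as the `@time = time` excerpt above suggests the extracted class already does. The sketch below is an assumption-laden illustration: the class name `Timestamp` and the output format (including the millisecond suffix) are hypothetical and not confirmed by this PR.

```ruby
# Hypothetical value object; the class name and format are assumptions.
class Timestamp
  def initialize(time)
    @time = time
  end

  # Render as YYYY_MM_DD_HH_MM_SS_mmm, where mmm is milliseconds.
  def to_s
    millis = format("%03d", @time.usec / 1000)
    "#{@time.strftime('%Y_%m_%d_%H_%M_%S')}_#{millis}"
  end
end

# Because the time is injected, a test can use a fixed instant
# instead of freezing the clock.
fixed = Time.utc(2018, 8, 1, 17, 6, 0, 123_456)
puts Timestamp.new(fixed).to_s
```

With a fixed `Time` passed to the constructor, the formatted string is deterministic, so a plain equality assertion is enough.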

@bbuchalter merged commit 2bed67e into master on Aug 1, 2018
@bbuchalter deleted the rename_instead_of_drop_tables branch on August 1, 2018 19:57