Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Reduce MySQL binlog expiry from 30 days to 3 #926

Merged
merged 1 commit into from
Nov 23, 2023

Conversation

fghaas
Copy link
Contributor

@fghaas fghaas commented Oct 23, 2023

MySQL 8 defaults to a binlog expiry period of 2592000 seconds (30 days), which for Tutor/Open edX purposes can be considered
excessive.

On the one hand, it is unlikely that a MySQL server configured for Tutor uses MySQL replication at all (considering that up until Tutor 15 and MySQL 5.7, the binlog was disabled by default, rendering replication impossible). Even if it does, a replica lagging more than two days behind the primary server would be unacceptable.

Likewise, it is unlikely that an Open edX database is backed up less than once a day, thus is is unlikely that Open edX admins would benefit from the ability to do point-in-time restore over a 30-day period.

On the other hand, having a 30-day binlog expiry period can considerably increase the storage space requirements for the MySQL container, particularly on busy Open edX platforms. When left unchecked, this can even cause the MySQL container to run into "No space left on device" situations, disabling the MySQL database altogether. Thus, the MySQL default settings are likely to be a net disadvantage for Open edX admins.

Finally, all of the above considerations apply only if the Open edX administrator has chosen to run their own MySQL and not opted for a DBaaS solution like AWS RDS.

Thus, it should be acceptable to run with a reduced binlog expiry period of 3 days (rather than 30) by default.

Therefore, inject the --binlog-expire-logs-seconds=259200 argument into the Tutor-generated command to start mysqld.

Reference:
https://dev.mysql.com/doc/refman/8.0/en/replication-options-binary-log.html#sysvar_binlog_expire_logs_seconds

@regisb
Copy link
Contributor

regisb commented Oct 27, 2023

Your proposed change makes a lot of sense. Having a 2-day binlog that would allow users to restore their databases would be extremely useful. I would go in favour of this PR rather than #922 -- though I very much want to get to the bottom of our conversation on plugins vs settings.

@fghaas
Copy link
Contributor Author

fghaas commented Oct 27, 2023

Yes, at this stage I am also leaning more toward this change than toward the one proposed in #922, but we're not done testing yet which is why I haven't moved this out of the Draft stage. :)

@fghaas
Copy link
Contributor Author

fghaas commented Oct 27, 2023

Having a 2-day binlog that would allow users to restore their databases would be extremely useful.

Just to clarify this point: having a 2-day binlog does not allow users to restore their databases, at least not if the database has been lost. They still need a full database backup to do that. What the binlog then allows them to do, once they have restored the latest full backup, is restore to a certain state of the database at a selected time between the most recent backup and the last recorded state of the binlog.

(You'll notice that in an environment where MySQL runs in a container, possibly even in a hosted Kubernetes environment, losing the database while retaining the binlog data intact is a somewhat unlikely scenario.)

@regisb
Copy link
Contributor

regisb commented Oct 30, 2023

Right. Thanks for the clarification! I still think that it's reassuring to have a binlog in case of an unfortunate DROP TABLE on the production database.

@fghaas
Copy link
Contributor Author

fghaas commented Oct 30, 2023

Ah, okay. Yes, you have a point there.

@fghaas fghaas changed the title fix: Reduce MySQL binlog expiry from 30 days to 2 fix: Reduce MySQL binlog expiry from 30 days to 3 Oct 30, 2023
@fghaas
Copy link
Contributor Author

fghaas commented Oct 30, 2023

I have now updated this to set the binlog expiry period to 3 days, rather than 2. Rationale: in most countries, the weekend is two days. If, say, some problem arose on a Saturday at 2am local time, and the problem went unnoticed until Monday morning when someone comes to work, then with a 2-day binlog expiry period it would already be too late for point-in-time restore. Thus, 3 days is probably more prudent than 2, and still more reasonable than the default 30.

@fghaas fghaas marked this pull request as ready for review October 30, 2023 13:57
@fghaas
Copy link
Contributor Author

fghaas commented Oct 31, 2023

Just to add a comment since I don't know if GitHub sends notifications when a PR goes out of the Draft stage: I believe this one is now good to be reviewed.

@regisb
Copy link
Contributor

regisb commented Nov 7, 2023

@fghaas I assume that mysql does the "right thing" when we start it with --binlog-expire-logs-seconds=259200, after it has been running without it for some time? The right thing being: keep running and clear expired log entries.

MySQL 8 defaults to a binlog expiry period of 2592000 seconds
(30 days), which for Tutor/Open edX purposes can be considered
excessive.

On the one hand, it is unlikely that a MySQL server configured for
Tutor uses MySQL replication at all (considering that up until Tutor
15 and MySQL 5.7, the binlog was disabled by default, rendering
replication impossible). Even if it does, a replica lagging more than
two days behind the primary server would be unacceptable.

Likewise, it is unlikely that an Open edX database is backed up less
than once a day, thus is is unlikely that Open edX admins would
benefit from the ability to do point-in-time restore over a 30-day
period.

On the other hand, having a 30-day binlog expiry period can
considerably increase the storage space requirements for the MySQL
container, particularly on busy Open edX platforms. When left
unchecked, this can even cause the MySQL container to run into "No
space left on device" situations, disabling the MySQL database
altogether. Thus, the MySQL default settings are likely to be a net
disadvantage for Open edX admins.

Finally, all of the above considerations apply only if the Open edX
administrator has chosen to run their own MySQL and not opted for a
DBaaS solution like AWS RDS.

Thus, it should be acceptable to run with a reduced binlog expiry
period of 3 days (rather than 30) by default.

Therefore, inject the --binlog-expire-logs-seconds=259200 argument
into the Tutor-generated command to start mysqld.

Reference:
https://dev.mysql.com/doc/refman/8.0/en/replication-options-binary-log.html#sysvar_binlog_expire_logs_seconds
@fghaas
Copy link
Contributor Author

fghaas commented Nov 21, 2023

@fghaas I assume that mysql does the "right thing" when we start it with --binlog-expire-logs-seconds=259200, after it has been running without it for some time? The right thing being: keep running and clear expired log entries.

Yes, at least according to our tests that's exactly what happens. Just for the record though, note that we run on MySQL 8.0, and do not intend to move to 8.1 for the time being. So to be perfectly sure that this is safe you might still want to conduct some testing with 8.1.

Copy link
Contributor

@regisb regisb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again Florian!

@regisb regisb merged commit 8fdb6f5 into overhangio:master Nov 23, 2023
1 check passed
@regisb
Copy link
Contributor

regisb commented Nov 23, 2023

I can confirm that the binlog was automatically reduced to fewer entries after applying your patch, even with MySQL 8.1.

@fghaas
Copy link
Contributor Author

fghaas commented Nov 23, 2023

Excellent, thanks for confirming!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

2 participants