-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make acl-config more resilient #1188
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
jpwhite4
reviewed
Dec 12, 2019
jpwhite4
reviewed
Dec 12, 2019
Why do you create _bkup tables? Surely they are not necessary if you are using staging tables. |
plessbd
reviewed
Dec 13, 2019
We've run into cases where multiple instances of `acl-config` have been run at the same time. This bit of code should make it so there's only one instance of `acl-config` running at any one time.
- `acl-config` should only concern itself with synchronizing the contents of the configuration files w/ the database. The initial creation of the acl tables can / should be handled when setting up the database.
There have been a couple of instances now where `acl-config` has not completed successfully which has left the db in an inconsistent state. We previously had limited protection against this for the "non-managed" tables ( user_acls, user_acl_group_by_parameters). But really we should be protecting the entire process against this class of error. To that end `acl-config` has been modified to work as such: - we create a backup table for the "non-managed" tables. this will be used later in the process. - a `_staging` table is created for each acl related table. This new staging table is identical to the production table in all but name. - we process each module, hierarchy, and set of realms, group bys statistics etc. but instead of inserting the information into the production tables we populate the staging tables. - we populate the staging version of the "non-managed" tables from the backup tables. - drop the "non-managed" backup tables. - In a single DB transaction, we DELETE FROM all of the production tables and then INSERT INTO all the data from the corresponding staging tables. - finally, we drop the staging tables as they are no longer required. This new process means that none of the information in the production tables is modified until we reach the second to last step, and this entire step is protected by a DB transaction.
- So it turns out that the vast majority of what `acls-import` was doing is not needed anymore. Currently we only need to keep it around for making sure that the public user is created ( which may be able to be moved later ) and the population of `report_template_acls` which needs to occur *after* `acl-config` is run. - I've renamed `refreshAclTables` to `syncTableData`and added `$tables` as a function argument to make what it does a bit more explicit. - I've added a bit of "optimization" to `syncTableData` that takes advantage of how we have the tables laid out. Basically every table is linked to either modules or a table that's linked to modules so instead of deleting / populating each table we can just delete modules and that will cascade to the rest of the tables. Then we can populate everything. I'm still up in the air on this one, it's probably safer to just delete / populate each table, but I'll toss it up for comment.
Without this here, on a fresh install, the Users `user_acl_group_by_parameters` would not be populated.
- These should have been here to begin with but are being added now because of bugs encountered during development. Specifically, during an upgrade where more then one migration is executed ( i.e. 8.5.0 -> 8.5.1, 8.5.1 -> 9.0.0 ), even though we have the population of tables wrapped in a commit, we would end up with 3 duplicate sets of `realms` entries which in turn messed everything else up. Once these unique indexes were added that behavior went away ( but oddly enough, no errors were ever thrown before or after ). I checked the counts of each table before deletion, after deletion, and after population and they all checked out. I even directly queried the `realms` table after the commit and just before `acl-config` was done and it always came back `3`. This is something that I would like to pursue further, but it will have to be one of the things on my "Poke at in between finishing actual work" list.
After conducting some tests it was found that this piece of code is not needed. If `acl-config` is interrupted the lock file is unlocked automatically. **NOTE:** I did investigate the `@flock, @fclose, @unlink` blocks and whether or not they are needed. I ran into problems during the upgrade in which the `acl-config` runs that occur as part of the migration would complete successfully, but the one that runs on ingestion would fail to acquire a lock and bork the whole process. Also, during development when an exception would be thrown early in one of the migration `acl-config` runs the ingestion `acl-config` would also not be able to acquire a lock. Once I added the `@flock, @fclose, @unlink` blocks the ingestion `acl-config` worked correctly in both cases. I did also attempt to not include the `@unlink` but this brought back the problem with the ingestion `acl-config` not being able to acquire a lock again.
Updated the population sql queries for the non-managed tables to correctly work with the staging tables.
All of this code is no longer necessary w/ the way that `acl-config` now operates.
- A user should only ever have one instance of an acl, this index enforces that at the table level.
2a1e26a
to
bfcddd1
Compare
plessbd
approved these changes
Jan 31, 2020
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Category:ACL
Access Control Lists
enhancement
Enhancement of the functionality of an existing feature
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
acl-config
:acl-config
running simultaneouslyacl-config
is ctrl-c'd etc.acls-xdmod-management
multiple times withinacl-config
, we now runit once after the backup tables are handled.
acl-config
works, previously we wholesale Truncated / re-populated the "managed" tables ( any table that we can completely recreate from available
configuration files ) and then re-populated the "non-managed" tables i.e.
user_acls
anduser_acl_group_by_parameters
. This method did not allow us to utilize db transactions as bothTRUNCATE
and DDL ( drop / create / update table ) force an implicit commit on execution. Thisleft us open to having the database in an inconsistent state. Now, instead of truncating / populating
the tables directly we instead create / populate a set of
staging
tables from the configuration filesfirst. This leaves the production tables in tact in the event some problem arises. Then, wrapped in
a db transaction, we DELETE FROM && INSERT INTO each production table from it's associated
staging table.
Motivation and Context
We have run into a number of problems lately that we believe to be caused by multiple copies of
acl-config
running at the same time and or interrupted runs ofacl-config
. It's hoped that these changes will resolve these issues.Tests performed
Types of changes
Checklist: