-
Notifications
You must be signed in to change notification settings - Fork 449
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fate reservations moved out of memory #4524
Conversation
- Reservations for MetaFateStore were moved out of memory into ZooKeeper - Reservations for UserFateStore were moved out of memory into the Accumulo Fate table - Added test MultipleStoresIT This commit is one part needed for having Fate be distributed
Marking this as ready for review, since it may not be reviewed otherwise |
core/src/main/java/org/apache/accumulo/core/fate/user/UserFateStore.java
Outdated
Show resolved
Hide resolved
- Combined the LockID and reservation attempt UUID into one new object: FateReservation - Now cover edge case with tryReserve() where a write may make it to the server, but the server dies before a response is received - New FATE Thread: DeadReservationCleaner which deletes reservations held by Managers that have since died - Created new classes: ColumnValueMappingIterator and ReservationMappingIterator. ColumnValueMappingIterator abstracts out the common functionality of ReservationMappingIterator and StatusMappingIterator. ReservationMappingIterator is an iterator used for determining if the reservation column for a FateId has a FateReservation set or not - Expanded/improved/simplified MultipleStoresIT tests
I have made the following changes in e6671fb:
There are also some comments/questions labeled "TODO 4131" that I would like input on. I also have to resolve the merge conflicts which I will do now. (edit: resolved) |
hadoop version used in the github build (3.0.3), causing a build failure
hadoop version used in the github build (3.0.3), causing a build failure. This was not fixed with 0809fd3. Should be fixed now.
Changed MultipleStoresIT from using a shaded library in hadoop (Mockito) to using EasyMock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still looking at this PR. Posting the comments I have so far.
core/src/main/java/org/apache/accumulo/core/fate/MetaFateStore.java
Outdated
Show resolved
Hide resolved
byte[] serialize() { | ||
try (ByteArrayOutputStream baos = new ByteArrayOutputStream(); | ||
DataOutputStream dos = new DataOutputStream(baos)) { | ||
dos.writeUTF(status.name()); | ||
if (isReserved()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a change to make in this PR. Thinking if we push all of the Fate data into a single node it would be nice to use JSON. That would make the data stored in ZK more human readable and it would make it easier to serialize and de-serialize.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds like a good idea
@@ -502,4 +520,14 @@ protected Serializable deserializeTxInfo(TxInfo txInfo, byte[] data) { | |||
throw new IllegalStateException("Bad node data " + txInfo); | |||
} | |||
} | |||
|
|||
/** | |||
* TODO 4131 this is a temporary method used to create a dummy lock when using a FateStore outside |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be a follow on issue. We can have a path in ZK for utilities to get a ZK lock. Currently I think some of the fate admin utilities that change the fate store require that no manager process is running. These utilities could be changed to get a lock at this utility path in ZK and reserve fate txs they operate on instead of requiring no manager process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that sounds like a great change. And yes you are correct about some of the fate admin utils requiring the Manager to be down, so that would be helpful for those as well.
core/src/main/java/org/apache/accumulo/core/fate/user/UserFateStore.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/fate/user/UserFateStore.java
Outdated
Show resolved
Hide resolved
@Override | ||
public void run() { | ||
while (keepRunning.get()) { | ||
store.deleteDeadReservations(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not looked into it, but need to understand what happens if this thread dies/throws an exception. Accumulo servers processes can designate some threads as critical and that would essentially cause the manager process to die when that is done. May want to do that for this thread, if its not already done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After the DeadReservationCleaner dies, the Manager continues to run normally. I changed it to be a critical thread. When looking at this, I noticed that none of the other Fate threads are critical. It seems that maybe they should be? I can create a follow on issue if so. If the WorkFinder or any of the actual workers (TransactionRunner) die, Manager will continue to run normally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be good to open a follow on issue about making all of the fate threads critical.
test/src/main/java/org/apache/accumulo/test/fate/MultipleStoresIT.java
Outdated
Show resolved
Hide resolved
test/src/main/java/org/apache/accumulo/test/fate/MultipleStoresIT.java
Outdated
Show resolved
Hide resolved
- Simplified MetaFateStore.isReserved(FateId) (unnecessary error handling) - Added check in MetaFateStore.FateTxStoreImpl.setStatus() that was missing: needed to ensure that the reservation stored in the store and in ZK were equivalent - Added some logging to MetaFateStore.FateTxStoreImpl.unreserve() - No longer using a ConditionalMutation to check if a FateId is reserved for UserFateStore, just scan the table instead - Made the DeadReservationCleaner a critical thread to the Manager (if it dies, so does the Manager) - Stores now take a Predicate<ZooUtil.LockID> isLockHeld which is used to determine if the given LockID is live. This considerably simplified the new MultipleStoresIT and the stores themselves.
@keith-turner Thanks for the review. I addressed the suggested changes. This PR is ready for re-review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some small changes. Feel free to take or leave any
core/src/main/java/org/apache/accumulo/core/fate/user/UserFateStore.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still review this, posting the comments I have made so far. Hoping to circle back to this really soon like later today or tomorrow and finish reviewing.
I went through and resolved all the comments I made last time, all of those updates look good. The one I left unresolved were only because those made need follow on issues opened so I did not want to forget about those.
core/src/main/java/org/apache/accumulo/core/fate/MetaFateStore.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/fate/AbstractFateStore.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/fate/user/UserFateStore.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/fate/AbstractFateStore.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have reviewed everything but the test changes, plan on looking at those tomorrow and then I will be finished reviewing.
core/src/main/java/org/apache/accumulo/core/fate/AbstractFateStore.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/fate/FateStore.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/fate/MetaFateStore.java
Outdated
Show resolved
Hide resolved
@@ -55,7 +60,7 @@ | |||
|
|||
//TODO use zoocache? - ACCUMULO-1297 | |||
//TODO handle zookeeper being down gracefully - ACCUMULO-1297 | |||
|
|||
// TODO 4131 noticed this class is not in the fate.zookeeper package. Should it be? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it probably should be. That would be consistent with UserFateStore, which is in a sub package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's what I thought. This can be done in a follow on issue/PR. This can be kept unresolved until then
core/src/main/java/org/apache/accumulo/core/fate/user/FateMutatorImpl.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/fate/user/FateMutatorImpl.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/fate/user/UserFateStore.java
Outdated
Show resolved
Hide resolved
- Stricter check of the column value for fate reservations. If anything unexpected is seen, an error is now thrown. - Simplified MetaFateStore.getActiveReservations() to only read from ZooKeeper once. - Combined the two scans that were done in AbstractFateStore.runnable() into one. This meant adding FateReservation to FateIdStatus and refactoring Meta/UserFateStore.getTransactions(). - No longer use/store a string representation of the FateReservation (was used in UserFateStore). Now, only the serialized value is used. This keeps the usage of FateReservation consistent across Meta and UserFateStore. It was also unneccessary, so simplifies code. - Moved AbstractFateStore.createAndReserve() implementation into Meta and UserFateStore, and rewrote the impl for each to work with the new way reservations are stored. This also allowed me to delete methods that were only used by AFS.createAndReserve(): create(FateId, FateKey), getStatusAndKey(FateId), create(FateKey). - Fixed how concurrentStatusChangeCallers was decremented in AbstractFateStore.waitForStatusChange() - Small change to MetaFateStore.deleteDeadReservations() to avoid reading from ZK unnecessarily - Added isReservedBy() method to MetaFateStore.NodeValue to avoid code duplication and make the code more clear. - Since the FateIdStatus now has the FateReservation, realized Meta and UserFateStore.getActiveReservations() could now be simplified to just call list(). This also made the impls the same, so moved to AbstractFateStore. This also made me realize that I had put getActiveReservations() method signature in FateStore, but would be better suited for ReadOnlyFateStore, so moved it there. - Deleted FateStore.isReserved(FateId)... No longer needed/used - Moved UNKNOWN status check in AbstractFateStore.reserve() into waiting loop - Now log when a dead reservation is detected and deleted - Minor change to Fate: no longer create the executor for the dead reservation cleaner if it's not going to be used
@keith-turner, @DomGarguilo |
- New createAndReserve was not working as expected for UserFateStore, fixed the bug
I have found and addressed the bug in my most recent commit. This PR is now ready for full review. |
test/src/main/java/org/apache/accumulo/test/fate/MultipleStoresIT.java
Outdated
Show resolved
Hide resolved
testDeadReservationsCleanup(FateInstanceType.USER); | ||
} | ||
|
||
private void testDeadReservationsCleanup(FateInstanceType storeType) throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test looks good. Looking over this test realized we need another test in a follow on PR. Need a test that checks that when the reservation does not exists in the persisted store that every update fails. Test would look something like the following.
FateStore.FateTxStore<?> txStore = store1.reserve(fateId);
// TODO delete the reservation in persisted store
assertThrows(Exception.class, ()->txStore.push(repo));
assertThrows(Exception.class, ()->txStore.setStatus(ReadOnlyFateStore.TStatus.IN_PROGRESS));
assertThrows(Exception.class, ()->txStore.setTransactionInfo(t,v));
//etc test all methods that write
For now may only be able to make this work with the UserFateStore as the current zookeeper schema does not support the conditional checks of reservations on updates.
// TODO 4131 could potentially have separate classes for testing MetaFateStore and UserFateStore | ||
// similar to how FateTestRunner is used, however that interface doesn't work as nicely here | ||
// since we are using multiple stores instead of just one. Can do something similar to | ||
// FateTestRunner here if desired |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is interesting. Wonder if we could pass a factory to test like the following.
interface TestFateStoreFactory {
FateStore create(ZooUtil.LockID lockid);
}
Then test could use the factory to create as many stores as needed. We could run the test passing in an impl that creates MetaFateStore and another that creates UserFateStore.
@keith-turner I have compiled and wrote all the follow on issues for this to be merged. This covers all the issues I marked in this PR as Here is a synopsis of the follow on issues I will create (so far - still need some questions answered):
Here are some questions I still have:
I made the max wait = curr num callers in seconds. Does this address the first TODO? For the other two TODOs, I was not sure what was to be done, so I left them. Can these be safely deleted without creating follow on issue(s)? If follow on issues should be made, perhaps it would be best for you to write these or explain it here because I am not sure what should be done.
Once the above questions are answered, I can complete my full list of new follow-on issues. This will allow me to remove all the |
I resolved a few of the comments.
That type seems like the best existing type to use, so I do not think an issue is needed.
30 seconds seems too short to me. Would be doing a lot of work and usually never finding anything, should probably be longer maybe 2 or 3 minutes.
I was thinking of putting all data that is related to a single fate id into a single zk node. This will allow atomic updates. It will be slower (always have to rewrite all repos when adding/removing). We are only using it for the metadata table. So thinking it will make things more correct at the expense of speed but that tradeoff is good for metadata related fates.
Yeah you can delete those. I will look into it and open an issue if needed. |
@kevinrr888 it seems like the only unresolved conversations are covered by your list of issues to open. I just left those open for now. |
@keith-turner - The I will post a list of all the outstanding changes to be made below. The trivial ones, I will just make a PR for without creating an issue, the more involved follow-ons will be tracked in an issue. Here is the list of changes to be made: To be tracked by issue:
Will do in one or a few small PRs:
|
- AbstractFateStore still had a synchronization block for accessing `deferred` which was changed to a synchronized map in this PR; removed since there shouldn't be any more sync blocks in this class.
Resolved the merge conflicts with
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a look at the changes since my last review and those all look good.
This may still need to be looked into just posting this as a reminder just in case |
This commit makes several improvements/fixes: improvements to the `admin fate` util which were made possible from apache#4524, replaced incorrect use of `createDummyLockID()` in real code (now only used in tests), `<User/Meta>FateStore` now support a null lock id if they will be used as read-only stores: write ops will fail on a store with a null lock, and some other misc. changes. Full list of changes: - Removed the check for a dead Manager in the Admin fate util (AdminUtil) which was checked before `admin fate delete <tx>` or `admin fate fail <tx>` was able to run. This check is no longer needed with the changes from apache#4524. apache#4524 moved reservations out of Manager memory into the FATE table (for UserFateStore) and into ZK (for MetaFateStore). Prior to this, the Admin process would have no way of knowing if the Manager process had a transaction reserved, so the Manager had to be shutdown to ensure it was not. But now that reservations are visible to any process, we can try to reserve the transaction in Admin, and if it cannot be reserved and deleted/failed in a reasonable time, we let the user know that the Manager would need to be shutdown if deleting/failing the transaction is still desired. - This has several benefits: - It is one less thing to worry about when implementing multiple managers in the future since Admin assumes only one Manager for these commands. However, there is still the case where the Manager may keep a transaction reserved for a long period of time and the Admin can never reserve it. In this case, we inform the user that the transaction could not be deleted/failed and that if deleting/failing is still desired, the Manager may need to be shutdown. - It covers a potential issue in the previously existing code where there was nothing stopping or ensuring a Manager is not started after the check is already performed in Admin but before the delete/ fail was executed. - It also should make the commands easier to use now since the Manager is not required to be shutdown before use. - Changes and adds some tests for `admin fate fail` and `admin fate delete`: ensures the Manager is not required to be down to fail/delete a transaction, and ensures that if the Manager does have a transaction reserved, admin will fail to reserve and fail/delete the transaction. - Another change which was needed as a prerequisite for the above changes was creating a ZK lock for Admin so transactions can be properly reserved by the command. Added new constant `Constants.ZADMIN_LOCK = "/admin/lock"`, changed `ServiceLockPaths`, and added `Admin.createAdminLock()` to support this - New class `TestLock` (in test package) which is used by tests to create a real ZK lock, or a dummy one. Removed `createDummyLockID()` from `AbstractFateStore` (moved to TestLock), and `createDummyLock()` is now only used in test code. Added new constant `ZTEST_LOCK = "/test/lock"`, changed `ServiceLockPaths`, and added `createTestLock()` which is used to create a real lock id (one held in ZK) which is needed for some tests. - This fixes an unexpected failure that could have occurred for `ExternalCompaction_1_IT`. Was using a dummy lock for the store before and the fate data was being stored in the same locations that the Manager uses for it's fate data. The DeadReservationCleaner running in Manager would have cleaned up reservations created using this store if it ran when reservations were present. Now the test creates a real ZK lock so the DeadReservationCleaner won't clean these up unexpectedly. - Stores now support a null lock id for the situation where they will be used as read-only stores. A store with a null lock id will fail on write ops. Changed all existing uses of stores to only have a lock id if writes will occur (previously, all instances of the stores had a lock id). - Removed unused or unneccesary constructors for AbstractFateStore, MetaFateStore, UserFateStore - Ensured all tests changed, all FATE tests, and sunny day tests still pass closes apache#4904
…ed (#5028) * `admin fate` improvements, `LockID`s use for fate stores improved/fixed This commit makes several improvements/fixes: improvements to the `admin fate` util which were made possible from #4524, replaced incorrect use of `createDummyLockID()` in real code (now only used in tests), `<User/Meta>FateStore` now support a null lock id if they will be used as read-only stores: write ops will fail on a store with a null lock, and some other misc. changes. Full list of changes: - Removed the check for a dead Manager in the Admin fate util (AdminUtil) which was checked before `admin fate delete <tx>` or `admin fate fail <tx>` was able to run. This check is no longer needed with the changes from #4524. #4524 moved reservations out of Manager memory into the FATE table (for UserFateStore) and into ZK (for MetaFateStore). Prior to this, the Admin process would have no way of knowing if the Manager process had a transaction reserved, so the Manager had to be shutdown to ensure it was not. But now that reservations are visible to any process, we can try to reserve the transaction in Admin, and if it cannot be reserved and deleted/failed in a reasonable time, we let the user know that the Manager would need to be shutdown if deleting/failing the transaction is still desired. - This has several benefits: - It is one less thing to worry about when implementing multiple managers in the future since Admin assumes only one Manager for these commands. However, there is still the case where the Manager may keep a transaction reserved for a long period of time and the Admin can never reserve it. In this case, we inform the user that the transaction could not be deleted/failed and that if deleting/failing is still desired, the Manager may need to be shutdown. - It covers a potential issue in the previously existing code where there was nothing stopping or ensuring a Manager is not started after the check is already performed in Admin but before the delete/ fail was executed. - It also should make the commands easier to use now since the Manager is not required to be shutdown before use. - Changes and adds some tests for `admin fate fail` and `admin fate delete`: ensures the Manager is not required to be down to fail/delete a transaction, and ensures that if the Manager does have a transaction reserved, admin will fail to reserve and fail/delete the transaction. - Another change which was needed as a prerequisite for the above changes was creating a ZK lock for Admin so transactions can be properly reserved by the command. Added new constant `Constants.ZADMIN_LOCK = "/admin/lock"`, changed `ServiceLockPaths`, and added `Admin.createAdminLock()` to support this - New class `TestLock` (in test package) which is used by tests to create a real ZK lock, or a dummy one. Removed `createDummyLockID()` from `AbstractFateStore` (moved to TestLock), and `createDummyLock()` is now only used in test code. Added new constant `ZTEST_LOCK = "/test/lock"`, changed `ServiceLockPaths`, and added `createTestLock()` which is used to create a real lock id (one held in ZK) which is needed for some tests. - This fixes an unexpected failure that could have occurred for `ExternalCompaction_1_IT`. Was using a dummy lock for the store before and the fate data was being stored in the same locations that the Manager uses for it's fate data. The DeadReservationCleaner running in Manager would have cleaned up reservations created using this store if it ran when reservations were present. Now the test creates a real ZK lock so the DeadReservationCleaner won't clean these up unexpectedly. - Stores now support a null lock id for the situation where they will be used as read-only stores. A store with a null lock id will fail on write ops. Changed all existing uses of stores to only have a lock id if writes will occur (previously, all instances of the stores had a lock id). - Removed unused or unneccesary constructors for AbstractFateStore, MetaFateStore, UserFateStore - Ensured all tests changed, all FATE tests, and sunny day tests still pass closes #4904
closes #4131
Fate reservations were moved out of memory. This is a WIP PR* for one part that is needed for having Fate be distributed in the future.
*There are still a few things I have marked as TODO that I would like input/suggestions on. These are marked in the code as "TODO 4131".