Fate reservations moved out of memory #4524

kevinrr888 · 2024-05-03T20:19:48Z

closes #4131
Fate reservations were moved out of memory. This is a WIP PR* for one part that is needed for having Fate be distributed in the future.

Reservations for MetaFateStore were moved out of memory into ZooKeeper
Reservations for UserFateStore were moved out of memory into the Accumulo Fate table
- Is an additional column which just indicates whether the FateId of that row is reserved or not
Added test MultipleStoresIT
- Tests the new functionality of how reservations are stored
- I also included a couple tests in MultipleStoresIT that are testing the reserve/unreserve functionality and that it still works as intended. This can be moved from this class if desired.
Verified existing tests in the fate test package (FateIT, FateOpsCommandsIT, FateStoreIT, ZooMutatorIT, FateMutatorImplIT, UserFateStoreIT) still pass

*There are still a few things I have marked as TODO that I would like input/suggestions on. These are marked in the code as "TODO 4131".

- Reservations for MetaFateStore were moved out of memory into ZooKeeper - Reservations for UserFateStore were moved out of memory into the Accumulo Fate table - Added test MultipleStoresIT This commit is one part needed for having Fate be distributed

kevinrr888 · 2024-05-07T15:28:24Z

Marking this as ready for review, since it may not be reviewed otherwise

core/src/main/java/org/apache/accumulo/core/fate/user/UserFateStore.java

- Combined the LockID and reservation attempt UUID into one new object: FateReservation - Now cover edge case with tryReserve() where a write may make it to the server, but the server dies before a response is received - New FATE Thread: DeadReservationCleaner which deletes reservations held by Managers that have since died - Created new classes: ColumnValueMappingIterator and ReservationMappingIterator. ColumnValueMappingIterator abstracts out the common functionality of ReservationMappingIterator and StatusMappingIterator. ReservationMappingIterator is an iterator used for determining if the reservation column for a FateId has a FateReservation set or not - Expanded/improved/simplified MultipleStoresIT tests

kevinrr888 · 2024-06-03T14:15:25Z

I have made the following changes in e6671fb:

Combined the LockID and reservation attempt UUID into one new object: FateReservation
Now cover edge case with tryReserve() where a write may make it to the server, but the server dies before a response is received
New FATE Thread: DeadReservationCleaner which deletes reservations held by Managers that have since died
Created new classes: ColumnValueMappingIterator and ReservationMappingIterator. ColumnValueMappingIterator abstracts out the common functionality of ReservationMappingIterator and StatusMappingIterator. ReservationMappingIterator is an iterator used for determining if the reservation column for a FateId has a FateReservation set or not
Expanded/improved/simplified MultipleStoresIT tests

There are also some comments/questions labeled "TODO 4131" that I would like input on.

I also have to resolve the merge conflicts which I will do now. (edit: resolved)

Formatting, fixed a test failure occurring with a newly added test that was merged in (logic error in my existing code)

hadoop version used in the github build (3.0.3), causing a build failure

hadoop version used in the github build (3.0.3), causing a build failure. This was not fixed with 0809fd3. Should be fixed now.

Changed MultipleStoresIT from using a shaded library in hadoop (Mockito) to using EasyMock

keith-turner

Still looking at this PR. Posting the comments I have so far.

core/src/main/java/org/apache/accumulo/core/fate/MetaFateStore.java

keith-turner · 2024-06-07T19:38:58Z

core/src/main/java/org/apache/accumulo/core/fate/MetaFateStore.java

    byte[] serialize() {
      try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
          DataOutputStream dos = new DataOutputStream(baos)) {
        dos.writeUTF(status.name());
+        if (isReserved()) {


Not a change to make in this PR. Thinking if we push all of the Fate data into a single node it would be nice to use JSON. That would make the data stored in ZK more human readable and it would make it easier to serialize and de-serialize.

That sounds like a good idea

keith-turner · 2024-06-07T19:48:39Z

core/src/main/java/org/apache/accumulo/core/fate/AbstractFateStore.java

@@ -502,4 +520,14 @@ protected Serializable deserializeTxInfo(TxInfo txInfo, byte[] data) {
      throw new IllegalStateException("Bad node data " + txInfo);
    }
  }
+
+  /**
+   * TODO 4131 this is a temporary method used to create a dummy lock when using a FateStore outside


This would be a follow on issue. We can have a path in ZK for utilities to get a ZK lock. Currently I think some of the fate admin utilities that change the fate store require that no manager process is running. These utilities could be changed to get a lock at this utility path in ZK and reserve fate txs they operate on instead of requiring no manager process.

Yes, that sounds like a great change. And yes you are correct about some of the fate admin utils requiring the Manager to be down, so that would be helpful for those as well.

core/src/main/java/org/apache/accumulo/core/fate/user/UserFateStore.java

keith-turner · 2024-06-07T20:40:59Z

core/src/main/java/org/apache/accumulo/core/fate/Fate.java

+    @Override
+    public void run() {
+      while (keepRunning.get()) {
+        store.deleteDeadReservations();


I have not looked into it, but need to understand what happens if this thread dies/throws an exception. Accumulo servers processes can designate some threads as critical and that would essentially cause the manager process to die when that is done. May want to do that for this thread, if its not already done.

After the DeadReservationCleaner dies, the Manager continues to run normally. I changed it to be a critical thread. When looking at this, I noticed that none of the other Fate threads are critical. It seems that maybe they should be? I can create a follow on issue if so. If the WorkFinder or any of the actual workers (TransactionRunner) die, Manager will continue to run normally.

That would be good to open a follow on issue about making all of the fate threads critical.

test/src/main/java/org/apache/accumulo/test/fate/MultipleStoresIT.java

- Simplified MetaFateStore.isReserved(FateId) (unnecessary error handling) - Added check in MetaFateStore.FateTxStoreImpl.setStatus() that was missing: needed to ensure that the reservation stored in the store and in ZK were equivalent - Added some logging to MetaFateStore.FateTxStoreImpl.unreserve() - No longer using a ConditionalMutation to check if a FateId is reserved for UserFateStore, just scan the table instead - Made the DeadReservationCleaner a critical thread to the Manager (if it dies, so does the Manager) - Stores now take a Predicate<ZooUtil.LockID> isLockHeld which is used to determine if the given LockID is live. This considerably simplified the new MultipleStoresIT and the stores themselves.

kevinrr888 · 2024-07-02T14:22:40Z

@keith-turner Thanks for the review. I addressed the suggested changes. This PR is ready for re-review

DomGarguilo

Just some small changes. Feel free to take or leave any

core/src/main/java/org/apache/accumulo/core/fate/Fate.java

core/src/main/java/org/apache/accumulo/core/fate/FateStore.java

core/src/main/java/org/apache/accumulo/core/fate/user/UserFateStore.java

keith-turner

Still review this, posting the comments I have made so far. Hoping to circle back to this really soon like later today or tomorrow and finish reviewing.

I went through and resolved all the comments I made last time, all of those updates look good. The one I left unresolved were only because those made need follow on issues opened so I did not want to forget about those.

core/src/main/java/org/apache/accumulo/core/fate/MetaFateStore.java

core/src/main/java/org/apache/accumulo/core/fate/AbstractFateStore.java

core/src/main/java/org/apache/accumulo/core/fate/Fate.java

core/src/main/java/org/apache/accumulo/core/fate/user/UserFateStore.java

core/src/main/java/org/apache/accumulo/core/fate/AbstractFateStore.java

keith-turner

I have reviewed everything but the test changes, plan on looking at those tomorrow and then I will be finished reviewing.

core/src/main/java/org/apache/accumulo/core/fate/AbstractFateStore.java

core/src/main/java/org/apache/accumulo/core/fate/FateStore.java

core/src/main/java/org/apache/accumulo/core/fate/MetaFateStore.java

keith-turner · 2024-07-09T21:56:07Z

core/src/main/java/org/apache/accumulo/core/fate/MetaFateStore.java

@@ -55,7 +60,7 @@

 //TODO use zoocache? - ACCUMULO-1297
 //TODO handle zookeeper being down gracefully - ACCUMULO-1297
-
+// TODO 4131 noticed this class is not in the fate.zookeeper package. Should it be?


Yeah it probably should be. That would be consistent with UserFateStore, which is in a sub package.

Yeah that's what I thought. This can be done in a follow on issue/PR. This can be kept unresolved until then

core/src/main/java/org/apache/accumulo/core/fate/user/FateMutatorImpl.java

core/src/main/java/org/apache/accumulo/core/logging/FateLogger.java

core/src/main/java/org/apache/accumulo/core/fate/user/UserFateStore.java

- Stricter check of the column value for fate reservations. If anything unexpected is seen, an error is now thrown. - Simplified MetaFateStore.getActiveReservations() to only read from ZooKeeper once. - Combined the two scans that were done in AbstractFateStore.runnable() into one. This meant adding FateReservation to FateIdStatus and refactoring Meta/UserFateStore.getTransactions(). - No longer use/store a string representation of the FateReservation (was used in UserFateStore). Now, only the serialized value is used. This keeps the usage of FateReservation consistent across Meta and UserFateStore. It was also unneccessary, so simplifies code. - Moved AbstractFateStore.createAndReserve() implementation into Meta and UserFateStore, and rewrote the impl for each to work with the new way reservations are stored. This also allowed me to delete methods that were only used by AFS.createAndReserve(): create(FateId, FateKey), getStatusAndKey(FateId), create(FateKey). - Fixed how concurrentStatusChangeCallers was decremented in AbstractFateStore.waitForStatusChange() - Small change to MetaFateStore.deleteDeadReservations() to avoid reading from ZK unnecessarily - Added isReservedBy() method to MetaFateStore.NodeValue to avoid code duplication and make the code more clear. - Since the FateIdStatus now has the FateReservation, realized Meta and UserFateStore.getActiveReservations() could now be simplified to just call list(). This also made the impls the same, so moved to AbstractFateStore. This also made me realize that I had put getActiveReservations() method signature in FateStore, but would be better suited for ReadOnlyFateStore, so moved it there. - Deleted FateStore.isReserved(FateId)... No longer needed/used - Moved UNKNOWN status check in AbstractFateStore.reserve() into waiting loop - Now log when a dead reservation is detected and deleted - Minor change to Fate: no longer create the executor for the dead reservation cleaner if it's not going to be used

kevinrr888 · 2024-07-23T19:00:11Z

@keith-turner, @DomGarguilo
I have addressed all of the review changes, and this PR is ready for re-review. However, some tests are no longer passing (e.g., ComprehensiveIT). I believe this is an issue with my new implementation of createAndReserve. I believe the issue is with the UserFateStore implementation, but could be both. Posting this new commit now to get the other changes out, and to potentially receive feedback on what might be wrong with createAndReserve(). I will continue to look into what the issue might be in the meantime. Please take a look at my new review comments and my new commit message

- New createAndReserve was not working as expected for UserFateStore, fixed the bug

kevinrr888 · 2024-07-24T19:16:59Z

I have found and addressed the bug in my most recent commit. This PR is now ready for full review.
The tests verified to still pass: the sunny day tests, all fate tests (ZooMutatorIT, FateMutatorImplIT, UserFateStoreIT, FateInterleavingIT, FateIT, FateOpsCommandsIT, ComprehensiveFlakyFateIT, DeleteRowsFlakyFateIT, ManagerRepoIT, and MultipleStoresIT (new test added)) except FateStoreIT. FateStoreIT reflects a method that has been removed in this PR, and want to receive feedback about the removal of this method before changes are made to this test. Have a "TODO 4131" marking this so this is not forgotten.

test/src/main/java/org/apache/accumulo/test/fate/MultipleStoresIT.java

keith-turner · 2024-08-02T20:42:25Z

test/src/main/java/org/apache/accumulo/test/fate/MultipleStoresIT.java

+    testDeadReservationsCleanup(FateInstanceType.USER);
+  }
+
+  private void testDeadReservationsCleanup(FateInstanceType storeType) throws Exception {


This test looks good. Looking over this test realized we need another test in a follow on PR. Need a test that checks that when the reservation does not exists in the persisted store that every update fails. Test would look something like the following.

FateStore.FateTxStore<?> txStore = store1.reserve(fateId); // TODO delete the reservation in persisted store assertThrows(Exception.class, ()->txStore.push(repo)); assertThrows(Exception.class, ()->txStore.setStatus(ReadOnlyFateStore.TStatus.IN_PROGRESS)); assertThrows(Exception.class, ()->txStore.setTransactionInfo(t,v)); //etc test all methods that write

For now may only be able to make this work with the UserFateStore as the current zookeeper schema does not support the conditional checks of reservations on updates.

keith-turner · 2024-08-02T20:47:24Z

test/src/main/java/org/apache/accumulo/test/fate/MultipleStoresIT.java

+// TODO 4131 could potentially have separate classes for testing MetaFateStore and UserFateStore
+// similar to how FateTestRunner is used, however that interface doesn't work as nicely here
+// since we are using multiple stores instead of just one. Can do something similar to
+// FateTestRunner here if desired


This is interesting. Wonder if we could pass a factory to test like the following.

interface TestFateStoreFactory { FateStore create(ZooUtil.LockID lockid); }

Then test could use the factory to create as many stores as needed. We could run the test passing in an impl that creates MetaFateStore and another that creates UserFateStore.

kevinrr888 · 2024-08-06T13:56:39Z

@keith-turner I have compiled and wrote all the follow on issues for this to be merged. This covers all the issues I marked in this PR as TODO 4131 and all the unresolved review comments.

Here is a synopsis of the follow on issues I will create (so far - still need some questions answered):

Refactor MultipleStoresIT to function more similarly to other Fate tests like FateIT
Add FateKey toString() method
Move MetaFateStore to org.apache.accumulo.core.fate.zookeeper
Deprecate AbstractFateStore.createDummyLockID(): this includes creating a path in ZK for utilities to get a ZK lock, changing admin fate fail and admin fate delete commands to get a LockID at this path and no longer require the Manager to be down, and only use a LockID for a store if write operations are expected: the store should fail on write operations if writes are not expected.
Replace AFS.verifyReserved with a condition on the RESERVATION_COLUMN to verify that it is reserved
Make WorkFinder and the TransactionRunners critical to the Manager
Refactor how the RESERVATION_COLUMN works for UserFateStore: create/delete column on reserve/unreserve
Additional fate test case from Fate reservations moved out of memory #4524 (comment)

Here are some questions I still have:

This was an existing TODO that you had marked in your "distributed FATE" WIP PR:

// TODO 4131
// TODO make the max time a function of the number of concurrent callers, as the number of
// concurrent callers increases then increase the max wait time
// TODO could support signaling within this instance for known events
// TODO made the maxWait low so this would be responsive... that may put a lot of load in
// the case there are lots of things waiting...

I made the max wait = curr num callers in seconds. Does this address the first TODO? For the other two TODOs, I was not sure what was to be done, so I left them. Can these be safely deleted without creating follow on issue(s)? If follow on issues should be made, perhaps it would be best for you to write these or explain it here because I am not sure what should be done.

Re one of your review comments:
"Not a change to make in this PR. Thinking if we push all of the Fate data into a single node it would be nice to use JSON. That would make the data stored in ZK more human readable and it would make it easier to serialize and de-serialize."
Could you elaborate on what you mean by "all of the Fate data into a single node". By "all" do you mean the Repos, TxInfo, Status, Reservation, and FateKey would be in one ZK node? Writing the issue, I want to be sure I'm understanding correctly.
Can Fate reservations moved out of memory #4524 (comment) be resolved? Or should an issue be made?
There was a TODO regarding whether the ZooUtil.LockID was the best type for the lock. Should I make a follow on issue regarding this or is ZooUtil.LockID okay?
Can this Fate reservations moved out of memory #4524 (comment) be resolved or should a follow on issue be made?
Can these be resolved: Fate reservations moved out of memory #4524 (comment), Fate reservations moved out of memory #4524 (comment)
The DeadReservationCleaner runs every 30 seconds. Is this okay? Should this be longer? Shorter?

Once the above questions are answered, I can complete my full list of new follow-on issues. This will allow me to remove all the TODO 4131 in this PR and all the comments in this PR can be resolved as all of these will be addressed or have a follow on issue created for them.

keith-turner · 2024-08-06T21:41:47Z

I resolved a few of the comments.

There was a TODO regarding whether the ZooUtil.LockID was the best type for the lock. Should I make a follow on issue regarding this or is ZooUtil.LockID okay?

That type seems like the best existing type to use, so I do not think an issue is needed.

The DeadReservationCleaner runs every 30 seconds. Is this okay? Should this be longer? Shorter?

30 seconds seems too short to me. Would be doing a lot of work and usually never finding anything, should probably be longer maybe 2 or 3 minutes.

Could you elaborate on what you mean by "all of the Fate data into a single node". By "all" do you mean the Repos, TxInfo, Status, Reservation, and FateKey would be in one ZK node? Writing the issue, I want to be sure I'm understanding correctly.

I was thinking of putting all data that is related to a single fate id into a single zk node. This will allow atomic updates. It will be slower (always have to rewrite all repos when adding/removing). We are only using it for the metadata table. So thinking it will make things more correct at the expense of speed but that tradeoff is good for metadata related fates.

Can these be safely deleted without creating follow on issue(s)? If follow on issues should be made, perhaps it would be best for you to write these or explain it here because I am not sure what should be done.

Yeah you can delete those. I will look into it and open an issue if needed.

keith-turner · 2024-08-06T21:53:44Z

@kevinrr888 it seems like the only unresolved conversations are covered by your list of issues to open. I just left those open for now.

kevinrr888 · 2024-08-07T17:08:06Z

- AbstractFateStore still had a synchronization block for accessing `deferred` which was changed to a synchronized map in this PR; removed since there shouldn't be any more sync blocks in this class.

kevinrr888 · 2024-08-21T17:47:07Z

Resolved the merge conflicts with AbstractFateStore using the new CountDownTimer.
Also noticed a sync block that should have been removed so removed that.

AbstractFateStore should probably be looked at again before this is merged.

keith-turner

Took a look at the changes since my last review and those all look good.

kevinrr888 · 2024-09-19T17:07:13Z

@kevinrr888 -

This was an existing TODO that you had marked in your "distributed FATE" WIP PR:
// TODO 4131
// TODO make the max time a function of the number of concurrent callers, as the number of
// concurrent callers increases then increase the max wait time
// TODO could support signaling within this instance for known events
// TODO made the maxWait low so this would be responsive... that may put a lot of load in
// the case there are lots of things waiting...
I made the max wait = curr num callers in seconds. Does this address the first TODO? For the other two TODOs, I was not sure what was to be done, so I left them. Can these be safely deleted without creating follow on issue(s)? If follow on issues should be made, perhaps it would be best for you to write these or explain it here because I am not sure what should be done.

@keith-turner -

Yeah you can delete those. I will look into it and open an issue if needed.

This may still need to be looked into just posting this as a reminder just in case

This commit makes several improvements/fixes: improvements to the `admin fate` util which were made possible from apache#4524, replaced incorrect use of `createDummyLockID()` in real code (now only used in tests), `<User/Meta>FateStore` now support a null lock id if they will be used as read-only stores: write ops will fail on a store with a null lock, and some other misc. changes. Full list of changes: - Removed the check for a dead Manager in the Admin fate util (AdminUtil) which was checked before `admin fate delete <tx>` or `admin fate fail <tx>` was able to run. This check is no longer needed with the changes from apache#4524. apache#4524 moved reservations out of Manager memory into the FATE table (for UserFateStore) and into ZK (for MetaFateStore). Prior to this, the Admin process would have no way of knowing if the Manager process had a transaction reserved, so the Manager had to be shutdown to ensure it was not. But now that reservations are visible to any process, we can try to reserve the transaction in Admin, and if it cannot be reserved and deleted/failed in a reasonable time, we let the user know that the Manager would need to be shutdown if deleting/failing the transaction is still desired. - This has several benefits: - It is one less thing to worry about when implementing multiple managers in the future since Admin assumes only one Manager for these commands. However, there is still the case where the Manager may keep a transaction reserved for a long period of time and the Admin can never reserve it. In this case, we inform the user that the transaction could not be deleted/failed and that if deleting/failing is still desired, the Manager may need to be shutdown. - It covers a potential issue in the previously existing code where there was nothing stopping or ensuring a Manager is not started after the check is already performed in Admin but before the delete/ fail was executed. - It also should make the commands easier to use now since the Manager is not required to be shutdown before use. - Changes and adds some tests for `admin fate fail` and `admin fate delete`: ensures the Manager is not required to be down to fail/delete a transaction, and ensures that if the Manager does have a transaction reserved, admin will fail to reserve and fail/delete the transaction. - Another change which was needed as a prerequisite for the above changes was creating a ZK lock for Admin so transactions can be properly reserved by the command. Added new constant `Constants.ZADMIN_LOCK = "/admin/lock"`, changed `ServiceLockPaths`, and added `Admin.createAdminLock()` to support this - New class `TestLock` (in test package) which is used by tests to create a real ZK lock, or a dummy one. Removed `createDummyLockID()` from `AbstractFateStore` (moved to TestLock), and `createDummyLock()` is now only used in test code. Added new constant `ZTEST_LOCK = "/test/lock"`, changed `ServiceLockPaths`, and added `createTestLock()` which is used to create a real lock id (one held in ZK) which is needed for some tests. - This fixes an unexpected failure that could have occurred for `ExternalCompaction_1_IT`. Was using a dummy lock for the store before and the fate data was being stored in the same locations that the Manager uses for it's fate data. The DeadReservationCleaner running in Manager would have cleaned up reservations created using this store if it ran when reservations were present. Now the test creates a real ZK lock so the DeadReservationCleaner won't clean these up unexpectedly. - Stores now support a null lock id for the situation where they will be used as read-only stores. A store with a null lock id will fail on write ops. Changed all existing uses of stores to only have a lock id if writes will occur (previously, all instances of the stores had a lock id). - Removed unused or unneccesary constructors for AbstractFateStore, MetaFateStore, UserFateStore - Ensured all tests changed, all FATE tests, and sunny day tests still pass closes apache#4904

…ed (#5028) * `admin fate` improvements, `LockID`s use for fate stores improved/fixed This commit makes several improvements/fixes: improvements to the `admin fate` util which were made possible from #4524, replaced incorrect use of `createDummyLockID()` in real code (now only used in tests), `<User/Meta>FateStore` now support a null lock id if they will be used as read-only stores: write ops will fail on a store with a null lock, and some other misc. changes. Full list of changes: - Removed the check for a dead Manager in the Admin fate util (AdminUtil) which was checked before `admin fate delete <tx>` or `admin fate fail <tx>` was able to run. This check is no longer needed with the changes from #4524. #4524 moved reservations out of Manager memory into the FATE table (for UserFateStore) and into ZK (for MetaFateStore). Prior to this, the Admin process would have no way of knowing if the Manager process had a transaction reserved, so the Manager had to be shutdown to ensure it was not. But now that reservations are visible to any process, we can try to reserve the transaction in Admin, and if it cannot be reserved and deleted/failed in a reasonable time, we let the user know that the Manager would need to be shutdown if deleting/failing the transaction is still desired. - This has several benefits: - It is one less thing to worry about when implementing multiple managers in the future since Admin assumes only one Manager for these commands. However, there is still the case where the Manager may keep a transaction reserved for a long period of time and the Admin can never reserve it. In this case, we inform the user that the transaction could not be deleted/failed and that if deleting/failing is still desired, the Manager may need to be shutdown. - It covers a potential issue in the previously existing code where there was nothing stopping or ensuring a Manager is not started after the check is already performed in Admin but before the delete/ fail was executed. - It also should make the commands easier to use now since the Manager is not required to be shutdown before use. - Changes and adds some tests for `admin fate fail` and `admin fate delete`: ensures the Manager is not required to be down to fail/delete a transaction, and ensures that if the Manager does have a transaction reserved, admin will fail to reserve and fail/delete the transaction. - Another change which was needed as a prerequisite for the above changes was creating a ZK lock for Admin so transactions can be properly reserved by the command. Added new constant `Constants.ZADMIN_LOCK = "/admin/lock"`, changed `ServiceLockPaths`, and added `Admin.createAdminLock()` to support this - New class `TestLock` (in test package) which is used by tests to create a real ZK lock, or a dummy one. Removed `createDummyLockID()` from `AbstractFateStore` (moved to TestLock), and `createDummyLock()` is now only used in test code. Added new constant `ZTEST_LOCK = "/test/lock"`, changed `ServiceLockPaths`, and added `createTestLock()` which is used to create a real lock id (one held in ZK) which is needed for some tests. - This fixes an unexpected failure that could have occurred for `ExternalCompaction_1_IT`. Was using a dummy lock for the store before and the fate data was being stored in the same locations that the Manager uses for it's fate data. The DeadReservationCleaner running in Manager would have cleaned up reservations created using this store if it ran when reservations were present. Now the test creates a real ZK lock so the DeadReservationCleaner won't clean these up unexpectedly. - Stores now support a null lock id for the situation where they will be used as read-only stores. A store with a null lock id will fail on write ops. Changed all existing uses of stores to only have a lock id if writes will occur (previously, all instances of the stores had a lock id). - Removed unused or unneccesary constructors for AbstractFateStore, MetaFateStore, UserFateStore - Ensured all tests changed, all FATE tests, and sunny day tests still pass closes #4904

kevinrr888 marked this pull request as ready for review May 7, 2024 15:28

keith-turner reviewed May 11, 2024

View reviewed changes

core/src/main/java/org/apache/accumulo/core/fate/user/UserFateStore.java Outdated Show resolved Hide resolved

kevinrr888 mentioned this pull request May 29, 2024

FATE threads should not be started in the constructor #4609

Open

kevinrr888 added 6 commits June 3, 2024 11:02

Merge branch 'elasticity' into elasticity-feature-4131

8d72860

Changes:

47d16b0

Formatting, fixed a test failure occurring with a newly added test that was merged in (logic error in my existing code)

Changed a method call which is available in hadoop 3.3.6 but not in the

0809fd3

hadoop version used in the github build (3.0.3), causing a build failure

formatting

ee6bb88

Changed a method call which is available in hadoop 3.3.6 but not in the

a92095d

hadoop version used in the github build (3.0.3), causing a build failure. This was not fixed with 0809fd3. Should be fixed now.

Mockito -> EasyMock in MultipleStoresIT

be1ac10

Changed MultipleStoresIT from using a shaded library in hadoop (Mockito) to using EasyMock

keith-turner reviewed Jun 7, 2024

View reviewed changes

kevinrr888 added 4 commits June 25, 2024 09:33

Merge branch 'elasticity' into elasticity-feature-4131

bc388d5

Merge branch 'elasticity' into elasticity-feature-4131

d704555

Build fix

176b9c3

DomGarguilo reviewed Jul 2, 2024

View reviewed changes

keith-turner reviewed Jul 9, 2024

View reviewed changes

keith-turner reviewed Jul 10, 2024

View reviewed changes

core/src/main/java/org/apache/accumulo/core/fate/user/UserFateStore.java Outdated Show resolved Hide resolved

keith-turner mentioned this pull request Jul 11, 2024

Create prototype gRPC getCompactionJob() service for external compactions #4715

Closed

ctubbsii added this to the 4.0.0 milestone Jul 12, 2024

kevinrr888 added 2 commits July 24, 2024 12:12

Merge branch 'elasticity' into elasticity-feature-4131

0447ede

Fixed UserFateStore.createAndReserve()

d352678

- New createAndReserve was not working as expected for UserFateStore, fixed the bug

keith-turner approved these changes Aug 2, 2024

View reviewed changes

kevinrr888 mentioned this pull request Aug 5, 2024

Modified UserFateStore.create to behave like MetaFateStore #4787

Merged

Verify err msg contains fate id in MultipleStoresIT

71227de

Removed TODOs for this PR: 'TODO 4131'

9d46daa

kevinrr888 added 2 commits August 21, 2024 13:36

Merge branch 'elasticity' into elasticity-feature-4131

55f02d1

removed synchronization block:

3557252

- AbstractFateStore still had a synchronization block for accessing `deferred` which was changed to a synchronized map in this PR; removed since there shouldn't be any more sync blocks in this class.

dlmarion changed the base branch from elasticity to main August 26, 2024 12:16

kevinrr888 linked an issue Sep 11, 2024 that may be closed by this pull request

Move reservations in FATE from memory into the persisted store #4131

Closed

keith-turner approved these changes Sep 14, 2024

View reviewed changes

Merge branch 'main' into elasticity-feature-4131

fa73611

kevinrr888 merged commit d84a3b9 into apache:main Sep 19, 2024
8 checks passed

kevinrr888 mentioned this pull request Sep 19, 2024

Refactor MultipleStoresIT #4903

Closed

keith-turner mentioned this pull request Oct 12, 2024

Move metadata operations for compactions from coordinator to compactors #4978

Open

kevinrr888 mentioned this pull request Oct 17, 2024

Improvements to reservation column in FATE table and bug fix #4992

Merged

kevinrr888 mentioned this pull request Oct 31, 2024

admin fate improvements, LockIDs use for fate stores improved/fixed #5028

Merged

kevinrr888 deleted the elasticity-feature-4131 branch November 1, 2024 15:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fate reservations moved out of memory #4524

Fate reservations moved out of memory #4524

kevinrr888 commented May 3, 2024 •

edited

Loading

kevinrr888 commented May 7, 2024

kevinrr888 commented Jun 3, 2024 •

edited

Loading

keith-turner left a comment

keith-turner Jun 7, 2024

kevinrr888 Jul 2, 2024

keith-turner Jun 7, 2024

kevinrr888 Jul 2, 2024

keith-turner Jun 7, 2024

kevinrr888 Jul 2, 2024

keith-turner Jul 2, 2024

kevinrr888 commented Jul 2, 2024

DomGarguilo left a comment

keith-turner left a comment

keith-turner left a comment

keith-turner Jul 9, 2024

kevinrr888 Jul 23, 2024

kevinrr888 commented Jul 23, 2024 •

edited

Loading

kevinrr888 commented Jul 24, 2024

keith-turner Aug 2, 2024

keith-turner Aug 2, 2024

kevinrr888 commented Aug 6, 2024 •

edited

Loading

keith-turner commented Aug 6, 2024 •

edited

Loading

keith-turner commented Aug 6, 2024

kevinrr888 commented Aug 7, 2024 •

edited

Loading

kevinrr888 commented Aug 21, 2024

keith-turner left a comment

kevinrr888 commented Sep 19, 2024

Fate reservations moved out of memory #4524

Fate reservations moved out of memory #4524

Conversation

kevinrr888 commented May 3, 2024 • edited Loading

kevinrr888 commented May 7, 2024

kevinrr888 commented Jun 3, 2024 • edited Loading

keith-turner left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kevinrr888 commented Jul 2, 2024

DomGarguilo left a comment

Choose a reason for hiding this comment

keith-turner left a comment

Choose a reason for hiding this comment

keith-turner left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kevinrr888 commented Jul 23, 2024 • edited Loading

kevinrr888 commented Jul 24, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kevinrr888 commented Aug 6, 2024 • edited Loading

keith-turner commented Aug 6, 2024 • edited Loading

keith-turner commented Aug 6, 2024

kevinrr888 commented Aug 7, 2024 • edited Loading

kevinrr888 commented Aug 21, 2024

keith-turner left a comment

Choose a reason for hiding this comment

kevinrr888 commented Sep 19, 2024

kevinrr888 commented May 3, 2024 •

edited

Loading

kevinrr888 commented Jun 3, 2024 •

edited

Loading

kevinrr888 commented Jul 23, 2024 •

edited

Loading

kevinrr888 commented Aug 6, 2024 •

edited

Loading

keith-turner commented Aug 6, 2024 •

edited

Loading

kevinrr888 commented Aug 7, 2024 •

edited

Loading