-
Notifications
You must be signed in to change notification settings - Fork 537
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
concurrent array editing and iteration error hadling #21760
concurrent array editing and iteration error hadling #21760
Conversation
@Josmithr Since we will now error when concurrently iterating and editing the array, would this be considered a breaking change? Just wanted to double check with someone from the API council :) |
Strictly speaking, yes. If we allowed a usage pattern before, and no longer allow it, then it's breaking. That said, was the behavior well-defined before? Could users safely concurrently iterate and edit, or could that lead to unsafe behavior? If so, that might be a reasonable argument to make the change anyways. I would defer to @CraigMacomber or @taylorsw04 on that 🙂 Another way of putting it: if we are adding new safeguards to help identify potential existing issues in app authors' code, then this might be an okay change to make. |
This change may impact users (even if we consider those users unsupported) in a way they would like to know about, so it should get a changeset explaining its impact (even if we consider it non breaking: this is a useful feature we should document and get included in release notes) This impacts an API, so yes, API council review is appropriate. As for is this a breaking change and thus not allowed to be merged without a major version bump? We do not have official policy about how to make that call. The closest I know is my draft document. I think the most relevant case is this one about undocumented behaviour https://github.com/microsoft/FluidFramework/pull/14421/files#diff-c45b354bea51ccd8727afb1db811eb0764306f87f21982d9199cd04e633098a5R62 My opinion (and that's not what matters, but perhaps API council will agree), is that the documentation we inherit from ReadonlyArray (which is where we get the iteration related methods) is not truly clear about this. JavaScript arrays are clear on the behavior of their iteration methods. If someone relies on that behavior, they should be doing ReadonlyArray.values is only documented as:
This does not clarify how this behaves across edits. The other methods seem similar. I didn't check ReadonlyArray[Symbol.iterator] as my IDE provides no way to find definitions or docs for members which are indexed, but I suspect tis about the same as well. Therefor I think this case fall under undocumented behavior, and thus is allowed to change and be considered non breaking, for supported customer (which requires them to not rely on undocumented behavior), and thus can be merged. |
@@ -850,6 +854,23 @@ abstract class CustomArrayNodeBase<const T extends ImplicitAllowedTypes> | |||
destinationFieldPath, | |||
index, | |||
); | |||
this.#generationNumber += 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is unneeded. Performing an edit sends a change event.
this.#generationNumber += 1; |
this.#generationNumber += 1; | |
this.#generationNumber += 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated with suggested change
@@ -672,6 +672,10 @@ abstract class CustomArrayNodeBase<const T extends ImplicitAllowedTypes> | |||
|
|||
protected abstract get simpleSchema(): T; | |||
|
|||
#generationNumber: number = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This field should have a doc comment explaining why it exists and how its used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated field with doc comment
@@ -672,6 +672,10 @@ abstract class CustomArrayNodeBase<const T extends ImplicitAllowedTypes> | |||
|
|||
protected abstract get simpleSchema(): T; | |||
|
|||
#generationNumber: number = 0; | |||
#updateGenerationNumber = getFlexNode(this).on("nodeChanged", () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are two issues here:
#updateGenerationNumber
is undocumented, and its name is confusing. This is a function, but calling it does not update the generation number. Calling this unregisters generation number updates! Consider doing this in the constructor. If you really want to do it during property static initialization time, use a static initialization block https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Classes/Static_initialization_blocks , or save it to a property with a more clear name if you actually do need to unregister at some point (I don't think we do)
I think this might not handle unhydrated nodes correctly. Using Tree.on instead of the flex tree APis should fix that as I believe that workes correctly in this case. Ensure you have a test which creates a node thats unhydrated, then as a second step inserts it into the tree. Make sure that case still works correctly, even though the underlying flex tree dummy node gets replaced with a real one (which would break your events here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have tried to use a static initialization block, but unfortunately due to how scoping works with them, I couldn't get them to work. I have also tried to create a constructor within the class and initialize it there (after calling super()
), but it wasn't really obvious where the constructor of the inherited class was.
As for the use of flex tree APIs, I have tried using Tree.on
which led to a cyclic dependency, so I left it as is and added testing for unhydrated nodes and checked that they passed.
Would there be any other test cases where using the flex tree apis, may lead to errors that I should add test coverage for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to be initialized in the constructor without storing it into a variable
@@ -590,4 +590,28 @@ describe("ArrayNode", () => { | |||
); | |||
}); | |||
}); | |||
|
|||
describe("Iteration", () => { | |||
it("Concurrently iterating and editing should throw an error.", () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When testing that something throws, you want to put as little as possible in the assert.throws block so you can test that the correct code throws the assert, and not the constructor , hydrate etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed unnecessary code from assert.throws
block
describe("Iteration", () => { | ||
it("Concurrently iterating and editing should throw an error.", () => { | ||
assert.throws( | ||
// False positive |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused by this comment. To me this implies that the error detection here is a false positive, meaning its detecting an error when it should not. This case however should (and if this test passes) does error, so true positive for the detection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed incorrect comment
public *values(): Generator<TreeNodeFromImplicitAllowedTypes<T>> { | ||
const initialLastUpdatedStamp = this.#generationNumber; | ||
let index = 0; | ||
while (true) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can have a for-loop that operates over two variables rather than one, if you prefer (and if the linter prefers - in the past it has complained about while (true)
loops, though I'm not sure if that's still the case). IMO the for loop makes it more clear that the two variables are progressing together in lockstep.
while (true) { | |
for (let i = 0, value = this.at(0); i < this.length && value !== undefined; i++, value = this.at(i)) { |
Though, I am curious - under what circumstances is value
undefined? If we only iterate up to length
, and we know that length isn't going to fluctuate during the iteration, then aren't we guaranteed that value
is always defined? Since we don't allow undefined
in our arrays as legitimate elements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reasoning behind the while(true)
was that if we want to throw an error when we detect that the array has been edited during the iteration.
For example (one of the test cases in this PR), if we used the for-loop above, and deleted a bunch of the array elements during the iteration, the this.length
will have changed and value
could be undefined
. This unfortunately exits the for loop instead of making the check and throwing the error within the loop.
Would it be better to store the "original" array length into a local variable in the values
method, and using this length in the for-loop while checking value === undefined
and i >= this.length
(which could be different from the original array length) and throw the error there instead of using the while(true)
loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, interesting. Yes, I suppose you could save the array length. I think you could also just be a little more careful about where you do the assert. The only time that the concurrent modification can happen is at the yield
statement. So, instead of checking for the modification at the start of the loop, do it right after the yield
. Then, you can write your loop however you want (e.g. as a for-loop, even using this.length
) and you'll be fine.
One other thing this made me realize is that the generator code only runs lazily when the first item from the generator is requested. So, the baseline generation number that you save in initialLastUpdatedStamp
is not necessarily the generation number when values()
is called. Rather, it is the generation number when the iterator returned by values()
is first read. So, a user could e.g. do this:
const values = myArray.values();
myArray.removeRange();
for (const v of values) {
// ... No error
}
Basically, it allows modifications if the user has created, but not read from, the generator. Is that... okay? Seems maybe inconsistent to me? If you want to prevent that, you need to have values() not be a generator, but set initialLastUpdatedStamp
and then pass it to another iterator (which could be a generator).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to for-loop for readability, added separate *generateValues
method to pass in the correct generationNumber
, and added test case for calling values, editing the array, and iterating
⯅ @fluid-example/bundle-size-tests: +1.73 KB
Baseline commit: 381a32f |
* This function is only created to initialize the event listener to increment the generationNumber. | ||
* However, it should not be called anywhere else, as it will unregister generation number updates. | ||
*/ | ||
#initializeGenerationNumberEvent = getFlexNode(this).on("nodeChanged", () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand the doc comment correctly, this is an event registration that is never unregistered, right? Why have the initializeGenerationNumberEvent
variable at all, then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added constructor to CustomArrayNodeBase
and removed initializeGenerationNumberEvent
variable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Craig mentioned that we should double check with you and add some comments justifying that
- This would not cause issues because of how registration goes for raw to flex nodes (I did add some test cases with unhydrated nodes)
- Never unregistering won't cause any memory leaks
Do you happen to know if there would be anything we should mention in the comments for calling this in the constructor?
getFlexNode(this).on("nodeChanged", () => { this.#generationNumber += 1; });
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of a few weeks ago we will copy events from unhydrated nodes to hydrated upon insertion, so that should be okay. As for unregistering, I believe the registration will live so long as the anchornode stays alive. I don't think we have a way to dispose (in practice) flex nodes currently. If we did, you'd want to do the deregistration there. But, since we're about to do a lot of work to change the flex layer, I wouldn't worry about it here.
throw new UsageError(`Concurrent editing and iteration is not allowed.`); | ||
} | ||
for ( | ||
let i = 0, value = this.at(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it safe to call this.at(0)
if the array is empty? Does it throw, or does it just return undefined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think thats not safe. We should make sure we have and iteration test for the empty case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the this.at(0)
on an empty array returns undefined
. I also added an iteration test for the empty case
private *generateValues( | ||
initialLastUpdatedStamp: number, | ||
): Generator<TreeNodeFromImplicitAllowedTypes<T>> { | ||
if (initialLastUpdatedStamp < this.#generationNumber) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
if (initialLastUpdatedStamp < this.#generationNumber) { | |
if (initialLastUpdatedStamp !== this.#generationNumber) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated with suggested change
for ( | ||
let i = 0, value = this.at(0); | ||
i < this.length && value !== undefined; | ||
i++, value = this.at(i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i++, value = this.at(i) | |
value = this.at(++i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated with suggested change
} | ||
for ( | ||
let i = 0, value = this.at(0); | ||
i < this.length && value !== undefined; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i < this.length && value !== undefined; | |
i < this.length; |
value
will always be defined since we don't support sparse arrays, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did remove the && value !== undefined
, but typescript seems to be complaining in the for loop that we cannot yield undefined values (we can only return TreeNodeFromImplicitAllowedTypes<T>
because of our return type).
I updated to use null assertions, but wasn't sure if this was the best way to do it. Would it be better if I added an assert statement in the for loop instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline.
} | ||
for (let i = 0; i < this.length; i++) { | ||
yield this.at(i) ?? fail("Index is out of bounds"); | ||
if (initialLastUpdatedStamp < this.#generationNumber) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (initialLastUpdatedStamp < this.#generationNumber) { | |
if (initialLastUpdatedStamp !== this.#generationNumber) { |
To match the other one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated with suggested change
assert.deepEqual(result, []); | ||
}); | ||
|
||
it("Iterates through the values of two different iterators", () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it("Iterates through the values of two different iterators", () => { | |
it("Iterates through the values of two concurrent iterators", () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated with suggested change
values2.next(); | ||
}); | ||
|
||
it("Concurrently iterating and editing after inserting into an unhydrated array node errors.", () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that either of these "unhydrated" tests are actually testing anything interesting. Once the nodes are hydrated, they're hydrated. Whether they used to be unhydrated, or never existed in an unhydrated state, doesn't matter.
The interesting test would be to create an iterator for an unhydrated node, then hydrate it, and then check that the iterator keeps working. I expect that this will not work with the hydrate
function, because our hydrate
function doesn't actually hydrate unhydrated nodes :) I have a task to fix this but I've been unsuccessful so far.
In the meantime, you can use a real SharedTree from a TestTreeProvider to test this case. I'll go try to get hydration via hydrate
working in the meantime now that I know we have an actual use for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated test case to:
- create an unhydrated array node, create an iterator, initialize a tree with the unhydrated node (to hydrate it) and checks that the iterator works
- make an edit to the node, and checks that it throws a usage error during iteration
## Description This PR adds a custom `values()` function to `arrayNode` so that we throw an error when we concurrently iterate and edit the array.
Description
This PR adds a custom
values()
function toarrayNode
so that we throw an error when we concurrently iterate and edit the array.