Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tree-comparison demo #17735

Merged
merged 3 commits into from
Oct 12, 2023
Merged

Conversation

ChumpChief
Copy link
Contributor

@ChumpChief ChumpChief commented Oct 12, 2023

In #17568 I modified the inventory-app demo to explore usage of both legacy and new SharedTree API surface directly. @CraigMacomber suggested doing this separate from the existing demo - this PR is that separate demo.

The content is now closer to the demo that we use in the version-migration set (which will be nice, since ultimately this demo will be the starting point of the migration-shim demo to demo tree migration). This also lets it explore tree modifications like insert/remove (whereas the inventory-app is only modifying payload), and additional eventing scenarios as a result.

As before, I've marked with REV: comments where I'd particularly like feedback or hit interesting issues.

Screenshot 2023-10-11 at 7 02 48 PM

AB#5657

@ChumpChief ChumpChief requested review from msfluid-bot and a team as code owners October 12, 2023 02:10
@github-actions github-actions bot added area: examples Changes that focus on our examples dependencies Pull requests that update a dependency file base: main PRs targeted against main branch labels Oct 12, 2023
@ChumpChief ChumpChief merged commit 954adff into microsoft:main Oct 12, 2023
@ChumpChief ChumpChief deleted the TreeComparisonDemo branch October 12, 2023 18:15
Copy link
Contributor

@DLehenbauer DLehenbauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving incremental commit so the SharedTree team can help maintain the demo during API changes.

@ChumpChief
Copy link
Contributor Author

Per @DLehenbauer 's suggestion, getting a quick incremental checkin on this so it can keep pace with API changes - will still take feedback and iterate on the demo of course!

const factory = new TypedTreeFactory({
// REV: I'm not exactly sure why a validator should be passed here? Like what it's used for,
// so it's hard to know what a "correct" choice would be as a result.
jsonValidator: typeboxValidator,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc comment here:

   /**
 * JSON Validator which should be used to validated persisted data SharedTree handles.
 * See {@link noopValidator} and {@link typeboxValidator} for out-of-the-box implementations.
 *
 * SharedTree users are encouraged to use a non-trivial validator (i.e. not `noopValidator`) whenever
 * reasonable: it gives better fail-fast behavior when unexpected encoded data is found, which reduces
 * the risk of further data corruption.
 */

To be fair, that's missing the negatives and why it's not just auto-included: using a nontrivial validator comes with smallish but noticeable runtime overhead (10-20%) and bundle size (varies based on the one you use, but if your app already uses a JSON schema validator you can re-use that and avoid paying such a bundling cost)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw that but the problem is I don't know what it's validating - like what does "validate persisted data SharedTree handles" mean. Is it validating the snapshot, the object-type payloads in the tree, serialized IFluidHandles stored in the tree, etc. Is it JSON that I'm providing in some way (like in my stored data I would guess?) or is it JSON that is out of my control (e.g. some wire format used by SharedTree)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{
definition: "inventoryItem",
traits: {
// REV: I tried adding an explicit "ID" trait here and using that for tracking,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the end result of re-using node ID rather than explicitly creating a new field is probably fine here (haven't though about the details, just thinking birds-eye).

However, re: second part of the comment here: you should be able to retrieve and read information from the deleted nodes by querying the "before" TreeView (rather than "after", or your current view) passed to your viewChanged callback.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this some more and decided a more real-world scenario would have a known "inventory ID" for each item (unrelated to node id), so I now specify an ID here.

Good pointer on using "before", I hadn't even realized it was a similar type to currentView and had completely ignored both "before" and "after" after doing the .delta(). Latest revision uses this in the removal handling.

.get<IFluidHandle<LegacySharedTree>>(legacySharedTreeKey)!
.get();

// REV: Not exactly sure why a checkout is required to get "viewChange" events, seems like it should be able
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comes down to layering. In the end what checkout gives you architecturally is a bit half-baked in legacy shared-tree. At some point there were goals toward a more comprehensive branching story, and checkout meant to fit into that puzzle, but we never ended up investing in it too fully.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems not ideal, but I'm guessing we're not making further changes to legacy shared tree. I converted this to just a normal comment explaining why we're using a checkout in latest revision.

const { changed, added, removed } = before.delta(after);
for (const quantityNodeId of changed) {
const quantityNode = this.tree.currentView.getViewNode(quantityNodeId);
// REV: This is annoying to iterate over since I can't filter until I've retrieved the node object.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like a bit of a misplaced gripe IMO :). Getting node ids here means you can choose to view the node in the 'before' view OR the 'after' view relatively easily. I think the more annoying thing here is:

// This event handler fires for any change to the tree, so it needs to handle all possibilities (change, add, remove).

which tends to lead to pretty hard to understand inval/code (even for your simple app here, the inval logic here is not straightforward to read and I'm sure it was not obvious to write either)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, for example .delta() could return something more like:

interface NodeCentricDelta {
    added: TreeViewNode[];
    removed: TreeViewNode[];
    changed: { before: TreeViewNode; after: TreeViewNode }[];
}

And then I could do nice stuff like

const addedInventoryItemNodes = added.filter((node) => node.definition === "inventoryItem");

Other options that could help might include e.g. registering for specifically adds/removes to the "inventoryItems" trait, or only registering for changes to nodes of definition "quantity", etc. to filter pre-handler. Feels like kind of a similar problem to DOM eventing - you can technically just register all your handlers on the root and deduce/filter to your intended target, but it's easier to register the handlers more locally.

The inval code is definitely a weak point, but most of it is just dancing around the node filtering and nodeId->node->nodeId->node dereferencing. And ultimately I still cheated - adds/removes still cause the view to just re-retrieve the whole inventory list (didn't want to think through how to communicate the index of the added node). I think React might smooth this over enough since I'm keeping the individual InventoryItem references stable, though I haven't taken the time to profile/confirm. I do think this highlights why the other tree examples are built using a "whole view is invalid, re-retrieve everything" event rather than granular invalidation, but since one of the main selling points of the tree is handling large data it would be nice to do the right thing in an example. TBH my inval logic for the new SharedTree is worse (it requires more deduction/assumptions about how the tree is mutating) :(

Updated comments to explain rather than complain :)

const builder = new SchemaBuilder({ scope: "inventory app", libraries: [leaf.library] });

const inventoryItem = builder.struct("Contoso:InventoryItem-1.0.0", {
// REV: I added an ID here because I didn't find a unique identifier on the node.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unlike tree, tree2's editing system is not built on explicitly stamped ids on the nodes. This doesn't duplicate anything afaik. Including it in the schema as 'real life inventory items have ids' seems fine (though maybe write a note that that's the intent)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In March/April, there will actually be UUID identifiers for nodes, so this TODO will be true then :P

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep I removed this REV comment in the latest set of changes (in main now) for the "real life inventory items have ids" reason.

Even if UUID identifiers are added to nodes in the future, I'd still probably recommend the usage principle I mention in my other comment for legacy ST: "The NodeId is not part of your schema so don't act like it is".

ChumpChief added a commit that referenced this pull request Oct 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: examples Changes that focus on our examples base: main PRs targeted against main branch dependencies Pull requests that update a dependency file
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants