Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async mode in the data module #13056

Merged
merged 18 commits into from
Jan 15, 2019
Merged

Async mode in the data module #13056

merged 18 commits into from
Jan 15, 2019

Conversation

youknowriad
Copy link
Contributor

@youknowriad youknowriad commented Dec 21, 2018

This is a crazy experiment :)

The idea is to build an async mode into the data module. Allowing some part of the React Tree to declare them selves as "async" which basically means low priority.

If a component is in async mode, it won't react synchronously to store changes, instead it will wait for all the browser's high-priority work (typing, rendering...)

In its current state, this PR declares the whole block list (the content of the post) as low priority, which means this part won't rerender synchronously when you type for instance but the chrome (save buttons, sidebar...) will rerender in sync.

I think this may introduce some small bugs we need to track and see how to fix (slot/fill maybe others).

That said, the feeling of responsiveness is highly improved with this PR. For instance you can try with this post https://gist.github.com/florianbrinkmann/4e9e4d23eaefb8b23484badddd848196 (convert it to blocks first) and compare the feeling with master.

Refs #11782

@youknowriad youknowriad added the [Type] Performance Related to performance efforts label Dec 21, 2018
@youknowriad youknowriad self-assigned this Dec 21, 2018
@youknowriad youknowriad requested review from sgomes, gziolo, aduth and a team December 21, 2018 11:17
@youknowriad youknowriad force-pushed the try/data-module-async-mode branch from 7317fb3 to 8b9d156 Compare December 21, 2018 11:27
@gziolo
Copy link
Member

gziolo commented Dec 22, 2018

I think this proposal makes a lot of sense. In particular, the way this async mode is introduced seems to be reasonable (as far as I understood these changes). I read it as this async mode is applied to all unselected blocks when there is no multi-selection mode enabled.

I guess that the failing test/e2e/specs/multi-block-selection.test.js test might be a legitimate one given that we introduce the change when the check for multi-selection is applied to the blocks list component.

This makes me think, whether we should also explore the opposite approach at some point, make everything async by default and mark as sync only those branches of virtual DOM as sync where it is essential to provide a compelling UX.

By the way, communication in React Native between JavaScript and native code is async by design, but they are changing it to allow sync in cases where UI feedback needs to be immediate. It sounds familiar to what we have in case of Gutenberg :)

@youknowriad youknowriad force-pushed the try/data-module-async-mode branch 2 times, most recently from 0acd916 to 14b2cbe Compare January 7, 2019 13:31
@youknowriad youknowriad changed the title WIP: Try async mode in the data module Async mode in the data module Jan 7, 2019
Copy link
Contributor

@sgomes sgomes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great PR, @youknowriad! I like the approach, and it generally looks good, apart from a few minor comments I added.

Let me know when you feel this is ready for some perf measurements and I'll take some numbers!

@@ -40,6 +40,9 @@ describe( 'Block with a meta attribute', () => {
await insertBlock( 'Test Meta Attribute Block' );
await page.keyboard.type( 'Meta Value' );

// Wait for all async blocks to update
await new Promise( ( resolve ) => setTimeout( resolve, 500 ) );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this does what the comment says it does.

Does the guarantee that all blocks are updated come from the fact that we're waiting for the current cycle of the event loop to end and the next one to begin? If so, you can use a timeout of zero and skip waiting for 500ms. Otherwise, those 500ms offer no guarantee of completion and a different approach will be needed to ensure this always works.

I realise this is just a test, but I could see this turning into a hard-to-debug flaky test in the future :-/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In async mode, each single component is updated in its own cycle. which mean I don't know the exact number of cycles required here. But yeah, maybe there's a better way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we force components to update synchronously in a test environment?

This feels very concerning to me as is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could it but I personally prefer not to because it creates a difference in behavior for tests we may not be aware of. For example, I did find a lot of "bugs" when I originally implemented the async mode thanks to the e2e tests which we won't catch if we use sync mode in tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, maybe it's worth having a separate test case which could target and specifically verify some expected difference between synchronous and asynchronous modes, or run all / a subset of tests under each, or abstract the problematic interactions under a "safe" variant which handles the expected delay.

What I don't think we should start doing is arbitrarily littering some tests with randomly chosen timeouts which may or may not be sufficient time to complete the asynchronous task (fragile), and may or may not needlessly delay the total runtime of the tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can rewrite the tests slightly to avoid it.

Select the block and then check it's value because selecting a block triggers the sync mode.

const nextComponent = waitingList.shift();
componentsMap.get( nextComponent )();
componentsMap.delete( nextComponent );
window.requestAnimationFrame( runWaitingList );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure a rAF is what you want here; this doesn't look like it should be tied to the screen's refresh rate.

What's the goal behind this wait? If you're just looking to defer things until the next run of the event loop, I'd use a simple lodash defer or the equivalent setTimeout() with a timeout of zero.

Do you instead want to make sure it doesn't all happen in one go and only when the browser is otherwise free to process things? If so, I'd use requestIdleCallback instead, which is only executed when the browser has spare time, with a polyfill or fallback of some sort for browsers that don't support it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think React's async mode uses requestAnimationFrame as well. I'm not certain what's the best approach yet but the idea is that when you're typing or moving the mouse or resizing, I want to leave the priority to the browser to do the most critical things.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like a good use-case for requestIdleCallback, then. Which as far as I can tell is what React's async mode uses as well, but I may have been looking at old code.

The motivation against using requestAnimationFrame is that you'd be doing non-animation work when the browser is getting ready to create a frame. This could lead to one of two things:

  • If the work involved in each of these is complex, you could blow the 13ms budget and cause jank.
  • If the work involved in each of these is not complex, you could be wasting time by taking numberOfItemsInQueue * 13ms to process the whole queue, when you could potentially be processing it all in much less time.

requestIdleCallback instead schedules things to be done when the browser expects it will have at least 50ms (I believe), which gives you a bigger budget for complex changes and allows for running multiple chunks of work in succession without waiting for another cycle if they're small enough. This could help in keeping changes coherent, too, instead of potentially having one component update per frame.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like requestIdleCallback is not supported by the browsers we need, and I'm not certain it has polyfills.

React uses requestAnimationFrame to provide some kind of polyfill.

https://github.com/facebook/react/blob/653bc582f94e955b175d42836479ee8b9eeeeeaf/packages/scheduler/src/Scheduler.js

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be OK using rIC where supported and just having a polyfill as simple as rIC = ( ...args ) => requestAnimationFrame( ...args ) elsewhere, but I'll leave this up to you. As long as the numbers improve, that's what matters :)

Copy link
Contributor

@sgomes sgomes Jan 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While your code is good, I still have more confidence personally in using requestAnimationFrame for this reason:

It seems that in your code the queue will be cleared by batch which mean it will have two consequences:

  • While executing a single batch (I guess it's 50ms duration), we can't perform incoming higher-priority work (typing)
  • After each batch, we'll wait another 50ms for the next batch.

Anyway I updated using rAF.

No problem, I'll test as-is (using rAF) :) Again, if the improvements are there and it works correctly, it's already a win!

I just want to clarify that higher-priority work isn't an issue in the rIC case; that's the point of checking timeRemaining() after each item, and moving onto a new batch if there's no time left. We're trusting the browser to tell us there's no time left if more important work needs to be done.

As for waiting 50ms, yes, that's true, the wait is longer than the 13ms of a rAF. However, timing is often about waiting for the right time to do something, even if it takes a little longer. That's exactly what rIC tries to do, by using timing knowledge that's internal to the browser. Besides, in the rAF case, you'll incur the 13ms penalty between every item, rather than 50ms between batches in rIC. And since you expect each component's withSelect to be fast (which I assume to mean significantly less than 13ms), that likely means it'll be slower in aggregate, since the average batch would only need to have around 4 items for the rIC case to be faster.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarifications :) I'm convinced now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet, thank you! :D I should be able to get you some numbers today! :)

Copy link
Member

@jsnajdr jsnajdr Jan 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

React uses requestAnimationFrame to implement polyfill for requestIdleCallback, which would otherwise be exactly the function they want to call.

The goal is to run code as soon as possible after a frame is painted. That's achieved with:

  1. Scheduling a callback with requestAnimationFrame. But that alone schedules the callback right before a frame is painted, the opposite of what we want to achieve. That's why this callback just posts a message to the window.
  2. Then there is a window.onmessage handler that does the real work. If a message is posted inside requestAnimationFrame, it's guaranteed to be processed ASAP, but not before the frame is painted. It'll be the first thing the browser will do after painting! Mission accomplished.

requestAnimationFrame callback, i.e., running code before frame paint, is useful for writing to DOM effectively, without causing layout trashing.

Running code after paint is useful for effective reading from DOM, without causing layout trashing.

React Hooks introduce new APIs for these purposes. useEffect is ran before paint and is good for DOM writing, and useLayoutEffect is ran after paint and is good for DOM reading. Or is it the other way around? I'm not sure 😄

Calypso has an afterLayoutFlush utility function that follows very closely the React implementation to schedule stuff after paint. But it uses setTimeout(0) instead of postMessage, because I didn't know the hairy details a year ago. setTimeout(0) has the disadvantage that it can skip many frames before running, e.g., when scrolling.

packages/data/src/components/with-select/index.js Outdated Show resolved Hide resolved
@youknowriad
Copy link
Contributor Author

@sgomes I feel this is ready for performance measurements

@sgomes
Copy link
Contributor

sgomes commented Jan 8, 2019

I'm still figuring out how best to measure things here, but I'm seeing a definite improvement in terms of responsiveness. In the old code, there's a big monolithic set of changes that causes a huge 2.4 second frame, in my test case (typing in a large document with a 6x CPU throttling multiplier):

image

In the new code, the work doesn't take as long (1.7s), but more importantly it's split up into multiple frames, which allow for more visual feedback and thus higher responsiveness:

image

This evidence is mostly anecdotal, as I'd still like to find a good way of running repeated tests and getting some more accurate numbers out. Not sure how I'll do that yet, but it'll likely involve Puppeteer :)

@jsnajdr
Copy link
Member

jsnajdr commented Jan 9, 2019

When reading the patch, this is how I understand it: when the store changes and listeners are fired, the listeners registered by AsyncModeConsumer are not called synchronously, but are deferred to a later event loop tick. Is that right?

That's almost exactly what setState in React Concurrent Mode will do after it reaches a stable version and starts to be used in Gutenberg. That's both good news (it validates this approach) and also bad news: this PR will become obsolete once React starts doing the same thing internally.

@youknowriad
Copy link
Contributor Author

@jsnajdr Actually, the principles are the same but there's a small and subtle difference here that makes a big difference comparing to React Async Mode and won't be solved by it.

The main performance problem while the number of components grow in a React/Redux application is not the rerendering of all the components as most likely these components already have logic (pure and similar) to avoid rerendering them if the props/state don't change. That's exactly what we already have in Gutenberg. So if I type in a block, there's only the current block that rerenders with some chrome components but the other blocks don't.

The main issue comes from the fact that we're obliged to run the selectors (even if memoized and fast) for all the components and all the blocks to determine the next props and whether we should rerender or not. And the selectors calls happens before setState call which means the async rendering of React won't have any effect here.

@sgomes
Copy link
Contributor

sgomes commented Jan 10, 2019

This feels like a good solution, given the constraints, and I agree with @youknowriad that this solves issues at a different layer than what React's async mode does. I completely agree with the assertion that the issue in Gutenberg is not that components update too often, but rather that too many selectors are running too often, and that can only be solved at the data layer.

It's unfortunate, though. The more I learn about Redux (and particularly the way it interacts with React), the narrower the scope of use-cases that it seems to be an optimum solution for feels.

Gutenberg has already needed a number of non-trivial architectural workarounds, beyond the expected selector performance tweaking: multiple stores; a custom implementation of the React bindings that only runs selectors when the state tree mutates (withSelect); tweaks to the data structure to avoid performance issues; and now a new layer of "state" management on top of Redux, so that some components only get their selectors run some of the time.

And this is by no means a "write and forget" thing. It will take constant vigilance to make sure all these pieces keep functioning optimally :-/

@jsnajdr
Copy link
Member

jsnajdr commented Jan 11, 2019

@youknowriad Thanks for the clarification! I was wondering whether the performance sensitive part is running of the selectors, or the actual rerender. React will help only with the latter of course 😞

Do I understand correctly that the main goal of this change is to make typing more responsive?

Doing a little profiling (in the Calypso version) I see a lot of things is happening on every keystroke. The isEditedPostBeingScheduled selector creates moment date objects, the "Preview" button in Calypso integration even forces full content serialization on every keystroke, because of the getEditedPostAttribute( 'content' ) selector. (:wave: @Copons)

The Calypso Classic editor, the one that uses TinyMCE, isolates the content change on typing inside the TinyMCE component and synchronizes the content to Redux only later, when the user stops typing. Using debounce.

Would that be an option for typing inside a block, too? Dispatching an UPDATE_BLOCK_ATTRIBUTES action to the core/editor on every keystroke seems to be inefficient. Many things can happen in core/editor, many listeners listen for changes, but a single keystroke rarely changes anything the listeners are interested in. Can we have a separate reducer or store for every block? Or handle the typing in local state?

@sgomes
Copy link
Contributor

sgomes commented Jan 11, 2019

@jsnajdr: FYI, I submitted a trial PR adding debouncing to input: #12592. Or rather, throttling, although as I explain there, I'm open to changing that to debouncing.

<AsyncModeProvider
key={ 'block-' + clientId }
value={
selectedBlockClientId !== clientId &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this value representing? For clarity and legibility, it'd be deserving of being assigned a variable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It represents when we enable/disable the async mode, I can add a clarifying comments. The idea is that selected blocks are sync, the others are async.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, from how I interpret it, it represents whether the rendered block occurs within the current selection, something akin to a isBlockInSelection variable (or even selector, though I don't think this can help here as written).

{ map( blockClientIds, ( clientId, blockIndex ) => {
return (
<AsyncModeProvider
key={ 'block-' + clientId }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we had it previously, but I wonder why it matters to have the key be prefixed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't

packages/priority-queue/README.md Outdated Show resolved Hide resolved
packages/priority-queue/README.md Show resolved Hide resolved
packages/priority-queue/src/index.js Show resolved Hide resolved
packages/priority-queue/src/index.js Show resolved Hide resolved
@@ -40,6 +40,9 @@ describe( 'Block with a meta attribute', () => {
await insertBlock( 'Test Meta Attribute Block' );
await page.keyboard.type( 'Meta Value' );

// Wait for all async blocks to update
await new Promise( ( resolve ) => setTimeout( resolve, 500 ) );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we force components to update synchronously in a test environment?

This feels very concerning to me as is.

@youknowriad youknowriad force-pushed the try/data-module-async-mode branch from 57dc295 to 6b4a1b8 Compare January 15, 2019 17:16
packages/priority-queue/src/index.js Outdated Show resolved Hide resolved
packages/priority-queue/README.md Show resolved Hide resolved
<AsyncModeProvider
key={ 'block-' + clientId }
value={
selectedBlockClientId !== clientId &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, from how I interpret it, it represents whether the rendered block occurs within the current selection, something akin to a isBlockInSelection variable (or even selector, though I don't think this can help here as written).

@youknowriad youknowriad merged commit 3c5fd22 into master Jan 15, 2019
@youknowriad youknowriad deleted the try/data-module-async-mode branch January 15, 2019 18:45
@youknowriad
Copy link
Contributor Author

😮

}
};

const flush = ( element ) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why it didn't occur to me previously, but: In this context, wouldn't a flush method be expected to run the callback immediately? With this implementation, it acts more like a "delete" than a "flush".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the call was being done regardless but on the caller side. Having it here could simplify things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Type] Performance Related to performance efforts
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants