[Performance] Report Actions Storage Paging #4506

kidroca · 2021-08-08T15:34:47Z

If you haven’t already, check out our contributing guidelines for onboarding and email [email protected] to request to join our Slack channel!

Overview

The entire conversation of a single Report is stored under a single object that can grow infinitely
When a report is opened this "infinite" information is loaded at once from storage
When the conversation updates the whole conversation is rewritten to storage instead only the delta
Users with more data in storage will have degraded performance compared to users with less data in storage

What performance issue do we need to solve?

memory consumption
read/write times

What is the impact of this on end-users?

boot time
chat switch time
as locally stored data grows performance degrades

List any benchmarks that show the severity of the issue

Steps

Have a big chat (1000+ messages will be great)
Open it and scroll to the very beginning
Switch to a different chat
Place a debugger breakpoint in ReportActionView#render
Switch back to the big chat (debugger will pause)
Inspect props.reportActions - they will contain the conversation to the very beginning (1000+ messages)
- if you re-launch and init on this chat - all 1000+ messages will be read from storage to memory as well

Sample (Web)

442 report actions loaded
Crude estimate of 315kB of data

Conversation updates

Each time the reportActions update it will cause a massive object to be rewritten to storage. Example:

Use the same big chat from before
Add a breakpoint in Onyx#merge
Write a message in the compose box and send it
Follow the merge process with the debugger when the key is reportActions_{ReportID}
1. Onyx merge is called with a single action item
2. To merge this single item to the collection - it retrieves the entire collection (1000+ items)
3. Then merges the item
4. Then writes back the entire collection to storage. This includes converting it to JSON string
This actually happens twice
- the first time the action is written with a sequence number like 1628430535421518
- then this number gets normalised to something like 442
- this again causes a rewrite for the entire collection

Sample Web

Onyx#merge called with a single action item

Onyx#merge reads the entire collection, merges the item and writes the merged result back to storage

The same thing would happen when new messages arrive, edits are made, or as you scroll to the past
Though we have a check that skips a write to storage if the written collection is structurally the same as the merged changes
This does not skip the merge and comparing the 1000+ messages to find there is no change

Proposed solution (if any)

Instead of storing the entire conversation under a single key/object split the actions to predictable chunks based on their reportID and sequence number

Example using a chunk size of 50 messages

Simplified code

const chunkID = Math.floor(action.sequenceNumber / 50);
const chunk = get(`reportActions_${reportID}_${chunkID}`)

Sequence numbers should be normalised to indexes like 1,2,3 - this already happens

Now when a message is added/edited you can find the chunk that it belongs to and update only that chunk

This also means that when a report is loaded only the most recent chunk is committed to memory. The report general data should have enough information to find the most recent chunkID.
Then as the user scrolls to the past we attempt to load a chunk (or page) of data from network and from storage as well. Changes from network would now be applied to the specific chunk and not the entire collection

looks for calls like Onyx.merge(${ONYXKEYS.COLLECTION.REPORT_ACTIONS}${reportID}, indexedData);

List any benchmarks after implementing the changes to show impacts of the proposed solution (if any)

Note: These should be the same as the benchmarks collected before any changes.
TBD

Platform:

Where is this issue occurring?

Web
iOS
Android
Desktop App
Mobile Web

Version Number: 1.0.83-0
Logs: https://stackoverflow.com/c/expensify/questions/4856
Notes/Photos/Videos: Any additional supporting documentation
Expensify/Expensify Issue URL:

View all open jobs on Upwork

The text was updated successfully, but these errors were encountered:

MelvinBot · 2021-08-08T15:34:50Z

Triggered auto assignment to @MonilBhavsar (Engineering), see https://stackoverflow.com/c/expensify/questions/4319 for more details.

kidroca · 2021-08-08T15:40:56Z

ATM the problem is amplified by the logic described here: #4027 (comment)

Since we have a connection that loads all report action collections and they can grow as much as storage allows:

if we have 100 chats with 1000 messages stored locally for each
then we're loading 100,000 messages in memory during init

cc @marcaaron

marcaaron · 2021-08-09T18:30:02Z

Thanks for creating the issue and apologies as it might be some time before I properly review this.

Since we have a connection that loads all report action collections and they can grow as much as storage allows

If the problem is that we are loading too many report actions when the app inits we can try to defer that or look into ways to stop doing it entirely. I'd really like to explore some strategies around this before trying to fix this in Onyx.

kidroca · 2021-08-10T19:26:48Z

If the problem is that we are loading too many report actions when the app inits we can try to defer that or look into ways to stop doing it entirely

This is a secondary problem.

The primary problems:

When you open a chat. We're loading the entire chat history at once - let's say you have a year old conversation with 10,000+ messages in async storage - all of them are loaded when you switch to it (or init on it)
Then when you receive or send messages - to insert the message in the correct index all 10,000+ messages are read (this is the content of a single reportsActions_123 key), the message added, then all 10,000+ written back to storage

marcaaron · 2021-08-10T19:56:50Z

OK, I think we are talking about two different problems. "secondary" and "primary" might not be a good way to think about this as it implies one is related to the other and giving this proposal a closer look I see they are not the same.

Problem 1: Too Many Report Actions keys are loaded when the app inits

Loading too many report actions when the app inits is a problem because it takes a long time to read all of those keys. While, not perfect, I have done a basic test here to show this. So my suggestion is to look into ways to reduce the need to read all of these keys when the app inits.

Problem 2: Having a large report as the default report when the app inits causes performance issues

This problem seems important and the solution is interesting - but really only affects users who scroll back on a large report or remain logged in for a very long time. I'm having a hard time evaluating the severity or urgency of this. Usually when I test performance I'm not scrolling back on chats or staying logged in for a long time - so there are likely gains to be had before we even start looking into chunking the report actions.

kidroca · 2021-08-10T22:12:04Z

Correct they are related - if there's no other way but to read the entire chat history opening any chat, then reading all the chats during init makes the problem N times worse

You're skipping Problem 3: The entire chat is re-written for each new message I write or receive

if there's something else that updates report actions it would also do it in this rewrite all fashion. For example each time I scroll to the past and load a page from the backend it would rewrite my entire chat

The least we could do to address "Problem 3" is to use AsyncStorage.merge which would allow to merge a single action to the collection instead of rewriting the collection as Onyx.merge does. AsyncStorage.merge does not even need to read the collection while Onyx.merge does

Is it really too early for this?
I've been hearing Expensify would store an infinite amount of information locally and the only infinite keys at the moment are chat histories.
It's not really urgent, but since it would somewhat change architecture (not necessarily Onyx), if it's to be done, I think, we might as well do it early instead of too late

The chunks idea is something that will work with simple storages like AsyncStorage and MMKV
Such a storage might turn out to be insufficient down the road

If a fully featured database is use, with a traditional cursor and page size api, then yes implementing the current proposal might be a waste of time

marcaaron · 2021-08-11T00:37:11Z

Is it really too early for this?

It can be a hard question to answer, but one guiding Expensify principle that might help to remember is that we try to always think about how this will affect actual customers. Not every apparent problem requires an immediate solution and we're always thinking about the best place to put our limited energy.

With that idea in mind, can we continue to optimize android boot time and switching from one chat to another? Once that stuff is reasonably fast we can revisit the less urgent things. You've done some great work and these are solid insights. But we're so close to finishing what we've started and I think we just need to focus up on the critical stuff.

I think a good time to revisit this would be sometime after:

Chat switching happens 300ms faster on Android
Android app cold boots just as fast as iOS

Then we can worry about what happens when the initial report has 1000 messages instead of 50 ?

MelvinBot · 2021-09-03T19:00:28Z

This issue has not been updated in over 15 days. eroding to Monthly issue.

P.S. Is everyone reading this sure this is really a near-term priority? Be brave: if you disagree, go ahead and close it out. If someone disagrees, they'll reopen it, and if they don't: one less thing to do!

MelvinBot · 2021-11-16T19:00:07Z

@kidroca, this Monthly task hasn't been acted upon in 6 weeks; closing.

If you disagree, feel encouraged to reopen it -- but pick your least important issue to close instead.

kidroca added Daily KSv2 Engineering labels Aug 8, 2021

MelvinBot assigned MonilBhavsar Aug 8, 2021

marcaaron added the Planning Changes still in the thought process label Aug 9, 2021

marcaaron unassigned MonilBhavsar Aug 9, 2021

marcaaron added Weekly KSv2 and removed Daily KSv2 labels Aug 9, 2021

kidroca mentioned this issue Aug 27, 2021

Cap Software - External Contributors August 2021 #4850

Closed

9 tasks

MelvinBot removed the Weekly KSv2 label Sep 3, 2021

MelvinBot added the Monthly KSv2 label Sep 3, 2021

kadiealexander added the n6-hold label Sep 16, 2021

mallenexpensify removed the n6-hold label Nov 16, 2021

MelvinBot added the Overdue label Nov 16, 2021

MelvinBot closed this as completed Nov 16, 2021

MelvinBot added Not a priority and removed Overdue labels Nov 16, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Performance] Report Actions Storage Paging #4506

[Performance] Report Actions Storage Paging #4506

kidroca commented Aug 8, 2021

MelvinBot commented Aug 8, 2021

kidroca commented Aug 8, 2021

marcaaron commented Aug 9, 2021

kidroca commented Aug 10, 2021

marcaaron commented Aug 10, 2021

kidroca commented Aug 10, 2021 •

edited

Loading

marcaaron commented Aug 11, 2021

MelvinBot commented Sep 3, 2021

MelvinBot commented Nov 16, 2021

[Performance] Report Actions Storage Paging #4506

[Performance] Report Actions Storage Paging #4506

Comments

kidroca commented Aug 8, 2021

Overview

What performance issue do we need to solve?

What is the impact of this on end-users?

List any benchmarks that show the severity of the issue

Steps

Sample (Web)

Conversation updates

Sample Web

Onyx#merge called with a single action item

Onyx#merge reads the entire collection, merges the item and writes the merged result back to storage

Proposed solution (if any)

Example using a chunk size of 50 messages

Simplified code

List any benchmarks after implementing the changes to show impacts of the proposed solution (if any)

Platform:

MelvinBot commented Aug 8, 2021

kidroca commented Aug 8, 2021

marcaaron commented Aug 9, 2021

kidroca commented Aug 10, 2021

marcaaron commented Aug 10, 2021

kidroca commented Aug 10, 2021 • edited Loading

marcaaron commented Aug 11, 2021

MelvinBot commented Sep 3, 2021

MelvinBot commented Nov 16, 2021

kidroca commented Aug 10, 2021 •

edited

Loading