Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ValueObservation notifies duplicate initial value when started from a DatabasePool #937

Closed
pocketpixels opened this issue Mar 9, 2021 · 9 comments

Comments

@pocketpixels
Copy link
Contributor

What did you do?

I just updated from GRDB 4 + GRDBCombine to GRDB 5.
It was surprisingly painless and the only code change I had to make was to replace the fetchOnSubscription() calls with scheduling: .immediate when creating the Combine publishers.

What did you expect to happen?

I expected identical notification behavior as before. That is, one initial value notification right away, and then one notification for each observed value change after that.

What happened instead?

The initial value is being sent twice.

Dealing with the duplicate notification is trivial.
However the fact that the SQL query is executed twice can be a potentially significant performance regression.
(In my application this is an issue on the watch)

Environment

GRDB flavor(s): GRDB
GRDB version: 5.5.0
Installation method: CocoaPods
Xcode version: 12.5 beta 3
Swift version: 5
Platform(s) running GRDB: iOS and watchOS
macOS version running Xcode:

@groue
Copy link
Owner

groue commented Mar 9, 2021

Hello @pocketpixels,

This is a known behavior of the GRDB 5 DatabasePool that I'm still struggling to avoid.

It is not strictly a bug, because the documentation carefully says that ValueObservation can notify duplicates: apps should be ready for such stuttering.

But just as you say, this can create performance issues. And this is just weird. And once you notice it, you can't forget it. Meh.

There is a benefit, though. When you start a ValueObservation on a DatabasePool, you are now guaranteed to get the initial value shortly, even if there is a long write transaction in the background. With GRDB 4, observations could only start after all pending writes were completed, and this could create a really bad user experience. Slow writes happen, for example, in apps that sync a lot of data with a remote server, and those could not observe and quickly display data on screen until those long writes would complete (#601, addressed by #736).

So for one win (butter-smooth observations), we get one loss (double notification of the initial value).

There is already a fix, for custom SQLite builds compiled with the SQLITE_ENABLE_SNAPSHOT option. GRDB can then avoid the double initial notification. This option is unfortunately not available on the stock SQLite version that ships with iOS and watchOS. Well, it's there, but unusable, as cruelly revealed by #846.

The difficulty is that when you start a ValueObservation:

  1. An initial value is fetched as soon as possible from a reader SQLite connection, regardless of concurrent writes. In your case, this initial fetch is synchronous (.immediate).
  2. The observation then waits for a slot in the writer SQLite connection
  3. From the writer SQLite connection, it can hook on SQLite commits and spot changes that impact the observed region. From now on, no change will be missed: the observation has opened its eyes.
  4. Between 1 and 3, the database could have been modified, and nobody was aware of it.
  5. Hence the security double fetch, just in case we missed some changes.

On step (4), if we can prove for sure that between the initial read in a reader connection (1), and the setup of the commit hooks on the writer connection (3), no change was performed, then we can avoid the double notification.

It's possible with SQLITE_ENABLE_SNAPSHOT.

Without this option, I could never manage to do it. Some serious extra SQLite mojo is needed 😅

@groue
Copy link
Owner

groue commented Mar 9, 2021

(In my application this is an issue on the watch)

I currently have no real workaround.

You can try DatabaseQueue, which does not suffer from this problem. But since DatabaseQueue serializes all database accesses, your initial value won't be notified until all pending database accesses are completed (beware slow concurrent reads and writes).

Another solution is to keep a long living observation, so that the cost of the initial double fetch happens only once in the lifetime of your application.

YMMV

@pocketpixels
Copy link
Contributor Author

pocketpixels commented Mar 9, 2021

Hi Gwendal,
thank you very much for the detailed reply and discussion. Much appreciated!

Could it be possible to start with step 2, check if the writer connection is idle (with nothing queued up) and if yes, synchronously execute the initial fetch on that (doing the check and conditional scheduling as one atomic operation)? And perform step 1 only if the writer connection is not idle?

@pocketpixels
Copy link
Contributor Author

Switching to the DatabaseQueue is probably the way to go for my application.
I sync with CloudKit and the initial sync can be large and slow. However I am not keeping a GRDB write transaction open while downloading info from CloudKit, I only write a batch of changes to the database after completing the Cloudkit fetch for it. So the writes should be quick and reads being temporarily blocked should not be a major issue.

@groue
Copy link
Owner

groue commented Mar 11, 2021

Could it be possible to start with step 2, check if the writer connection is idle (with nothing queued up)

There's no such API in DispatchQueue.

Generally speaking, observations that perform their initial fetch from the writer queue are off the table. Some users would complain that such an observation postpones writes, in a noticeable fashion when the fetch is slow. The purpose of DatabasePool is to allow parallel reads and writes.

So I'm still looking after another technique.

Switching to the DatabaseQueue is probably the way to go for my application.

Yes, please give it a try.

Nevertheless, I hope we can make progress on the topic.

A few days ago I stumbled upon SQLITE_FCNTL_DATA_VERSION and had great hopes. Unfortunately, this would not give the expected result.

The great difficulty is to tell if the database is different when seen from one SQLite connection (the reader that performs the initial fetch), and another SQLite connection (the writer where observation really starts). The SQLITE_ENABLE_SNAPSHOT compile-time option enables sqlite3_snapshot_cmp, whose purpose is precisely to compare connection-independent databases snapshots. This is the perfect solution, at the SQLite level (as reliable as one can dream), and it works (when GRDB links to a custom SQLite build).

Maybe, as you suggest, we can acquire the same level of robustness with DispatchQueue juggling at the GRDB level. Good ideas are always welcome!

@groue
Copy link
Owner

groue commented Mar 11, 2021

I'll keep this issue open until we find a fix. It's not strictly a bug, but it's a real inconvenience.

@groue groue added the help wanted Extra attention is needed label Mar 11, 2021
@groue groue changed the title Duplicate initial value notifications with ValueObservation.publisher( .immediate) ValueObservation notifies duplicate initial value when started from a DatabasePool Mar 11, 2021
groue added a commit that referenced this issue Mar 13, 2021
groue added a commit that referenced this issue Mar 13, 2021
@groue
Copy link
Owner

groue commented Mar 13, 2021

I'll keep this issue open until we find a fix. It's not strictly a bug, but it's a real inconvenience.

I just can't leave issues open, especially during a lockdown: #940

@groue
Copy link
Owner

groue commented Jan 14, 2022

This issue can be solved with a custom SQLite build, with the SQLITE_ENABLE_SNAPSHOT flag.

I reported to Apple the need for these apis (FB9793771 - Expose SQLite APIs enabled by the SQLITE_ENABLE_SNAPSHOT compilation option).

I'm not aware of any other solution. When a ValueObservation fetches its initial value without waiting for a write access, we need a way to tell if the initial read was performed on the same database state than the one found when the transaction observer is installed. SQLITE_ENABLE_SNAPSHOT makes it possible to answer this question. Without this flag, ValueObservation must perform a second fetch, just in case database would have been changed.

I'm closing this issue with the "needs revisiting" flag, because there's not much that can be done at this point.

@groue groue closed this as completed Jan 14, 2022
@groue groue added enhancement needs revisiting This issue was closed but it has unfinished business and removed help wanted Extra attention is needed labels Jan 14, 2022
@groue
Copy link
Owner

groue commented Aug 25, 2022

#1248 is the final nail

@groue groue removed the needs revisiting This issue was closed but it has unfinished business label Aug 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants