Skip to content
This repository has been archived by the owner on Feb 16, 2023. It is now read-only.

Matching Observer creation, like IntersectionObserver #24

Closed
dmurph opened this issue May 12, 2016 · 20 comments
Closed

Matching Observer creation, like IntersectionObserver #24

dmurph opened this issue May 12, 2016 · 20 comments

Comments

@dmurph
Copy link
Collaborator

dmurph commented May 12, 2016

Observer creation should more closely resemble current observers in the web. We want to match something like IntersectionObserver.

This proposal only go over changes to the creation, we assume that the changes callback and behavior there is the same as in the Explainer.

Objectives

Example use cases:

  1. A UI element that's syncing it's state to the database.
  2. JS working synchronizing DB changes to a webserver.
  3. Using an observer to maintain an in-memory cache.
  4. Observing when object store changes to do custom refresh logic.

We need to make sure we can always do the following:

  1. Know exactly when our observing starts in the transaction chain. This means we should be able to read in data and start an observer, and not lose any changes in between the two that the observer doesn't see. This is important in sync services as outlined below.
  2. Listen to multiple object stores at the same time, so that a transaction with both object stores will generate just one change event.

Wants:

  1. Minimize possible web developer error cases and bugs.
  2. Minimize code amount & complexity.
  3. Defaults that are more performant. Examples: require the developer to always specify object stores (tables) that they are observering. Only send the key of the change by default to reduce memory used.

Proposal

I'll show examples first.

UI Element

Let's say we're a polymer or angular webapp, and we have databinding working. We rely on the guarantee that the observer will start immediately after the transaction it was created with.

// uiComponent contains logic for updating it's own data.
var uiComponent = this;
// This function updates our UI component when the database is changed.
var updateUICallback = function(changes) {
  changes.records.get('users').forEach(function(record) {
    switch(record.type) {
      case 'clear':
        uiComponent.clear();
        break;
      case 'add':
        uiComponent.addUser(change.key, change.value);
        break;
      case 'put':
        uiComponent.updateUser(change.key, change.value);
        break;
      case 'delete':
        uiComponent.removeUser(change.key);
        break;
    }
  });
}
// Observer creation. We want to include the values,
// as we'll always use them to populate the UI.
var observer = new IndexedDBObserver(updateUICallback, { values: true });
// Create or transaction for both reading the table and attaching the observer.
var txn = db.transaction('users', 'readonly');
// We'll start seeing changes after 'txn' is complete.
observer.observe(db, txn);
// Now we read in our initial state for the component.
var usersTable = txn.objectStore('users');
var request = usersTable.getAll();
request.onsuccess = function() {
  request.result.forEach(function(user) {
    uiComponent.addUser(user.id, user);
  });
}
txn.oncomplete = function() {
  console.log('component initialized and observer started');
}

Server sync worker

We want to synchronize changes to a webserver. Let's say we're storing descrete transformations, so it's only add operations. We rely on the guarantee that the observer will start immediately after the transaction it was created with.

// We are bad developers and don't batch our network calls.
// We just send them whenever there's a change.
var sendChangesToWebserver = function(changes) {
  var changesToSend = [];
  changes.records.get('transformations').forEach(function(record) {
    if (record.type != 'add') return;
    changesToSend.push(record.value);
  });
  sendChangesToNetwork(changesToSend);
}
var observer = new IndexedDBObserver(sendChangesToWebserver,
    { onlyExternal: true, values: true });
// Make our network call to populate the transformations.
getTranformsFromNetwork(function(transforms) {
  var txn = db.transact('transformations', 'readwrite');
  observer.observe(db, txn);
  var os = txn.objectStore('transformations');
  transforms.forEach(function(transform) {
    os.put(transform);
  });
  txn.oncomplete = function() {
    console.log('Database is initialized and we are syncing changes');
  }
});
// Listen for changes from network.
// Note: our observer won't see these changes, as it has onlyExternal: true.
this.onNetworkChanges = function(changesFromNetwork) {
  var txn = db.transact('transformations', 'readwrite');
  var os = txn.objectStore('transformations');
  transforms.forEach(function(transform) {
    os.put(transform);
  });
}

Maintaining an in-memory data cache

IDB can be slow due to disk, so it's a good idea to have an in-memory cache. Note that we don't include values here, as we want to optimize our memory usage by reading in the cache in a larger batch, and at an opportune time. We rely on the guarantee that the observer will start immediately after the transaction it was created with.

// let's assume our cache batches together reads once we get enough changes
// or a timeout occurs. So we want to give it changes, and let it optionally read
// in a bunch of data.
var usersCache = this.cache;
var updateUsersCache = function(changes) {
  usersCache.addChanges(changes.records.get('users'));
  usersCache.maybeResolveChanges(changes.transaction);
}
var observer = new IndexedDBObserver(updateUsersCache, { transaction: true });
var txn = db.transaction('users', 'readonly');
// Attach our observer.
var optionsMap = new Map();
optionsMap.put('users', { ranges: IDBKeyRange.bound(0, 1000)});
observer.observe(db, txn, optionsMap);
// Read initial contents of the cache.
var os = txn.objectStore('users');
var readRequest = os.getAll(IDBKeyRange.bound(0, 1000), 50);
readRequest.onsuccess = function() {
  var users = readRequest.result;
  usersCache.addUsers(users);
}
txn.oncomplete = function() {
  console.log("Cache initialized and listening");
}

Custom refresh logic

If we just want to know when an object store has changed. This isn't the most efficient, but this might be the 'starting block' websites use to transition to observers, as at some point they would read the database using a transaction to update their UI.

Note: This has been updated as per this comment below.

// We just grab the transaction and give it to our UI refresh logic.
var refreshDataCallback = function(changes) {
  refreshDataWithTransaction(changes.transaction);
}
// We disable records, so we just get the callback without any data.
// We ask for the transaction, which guarentees we're reading the current
// state of the database and we won't miss any changes.
var observer = new IndexedDBObserver(
    refreshDataCallback, { records: false, transaction: true });
observer.observe(db, db.transact('users', 'readonly'));

Summary

Other than the two step construction, we've split up the options. All options except ranges have moved to a flat options object passed into the observer creation. We could even just make the Map values for observer.observe(db, txn, map) be the ranges array itself, instead of an object that contains the 'ranges' attribute.

I'm also having the database and transaction required arguments. In almost every use case, the person needs the guarentee that the observation starts after the read or write to the database. Otherwise changes can be leaked, where the observer didn't see them yet or the observer sees a change that we thought it wouldn't.

Possible changes / thoughts

  1. It's a little cumbersome to have a transaction. It would be nice to remove, but we don't want people to be able to just listen to the whole database by default. We could allow the second argument to be an array of object store names, but we don't want overloaded functions.
  2. We could remove the db argument and just use the connection from the transaction. This changes the current observer pattern in the web platform though, which ties the lifetime of the observer to the first argument. Since we're tieing to the database connection, then this makes sense to leave in there.
  3. The map of objectstore options is a little weird... maybe somehow we could decompose these options into separate observe calls? This seems way more confusing though. I think moving all options but the ranges option to the observer construction simplifies this, and then we can just have it be a map<string, sequence> instead of the object as the value.
@szager-chromium
Copy link

The second argument to observer.observe() should probably be a dict. As features creep in, using a flat list of arguments will get gross.

@bfgeek
Copy link

bfgeek commented May 16, 2016

What values are in the record?

I see so far type, key, value , I assume there is also objectStore and db if the Observer is being used over multiple databases / objectStores?

@bfgeek
Copy link

bfgeek commented May 16, 2016

Why use a IDBTransaction for observing? (kinda sounds like you are just observing the effects from performing that transation :-P )

@dmurph
Copy link
Collaborator Author

dmurph commented May 16, 2016

@bfgeek
See the current (explainer)[https://github.com/dmurph/indexed-db-observers/blob/gh-pages/EXPLAINER.md] for more details about the change callback.

changes: {
  db: <object>, // The database connection object. If null, then the change
                // was from a different database connection.
  transaction: <object>, // A readonly transaction over the object stores that
                         // this observer is listening to. This is populated when
                         // includeTransaction is set in the options.
  records: Map<string, Array<object>> // The changes, outlined below.
}

Example records map:

{'objectStore1' => [{type: "add", key: IDBKeyRange.only(1), value: "val1"},
                    {type: "add", key: IDBKeyRange.only(2), value: "val2"},
                    {type: "put", key: IDBKeyRange.only(4), value: "val4"},
                    {type: "delete", key: IDBKeyRange.bound(5, 6, false, false)}],
 'objectStore2' => [{type: "add", key: IDBKeyRange.only(1), value: "val1"},
                    {type: "add", key: IDBKeyRange.only(2), value: "val2"}]}

@szager-chromium I think that's going to be the debate. I think we should require a database connection and a transaction, and then have the dict after that. This does two things:

  • Filters the object stores, as we're using the ones in the transaction
  • Tells us exactly when we're starting the observation.

I'll make it more clear that this is the intent above.

@dmurph
Copy link
Collaborator Author

dmurph commented May 16, 2016

@bfgeek for the transaction:

We pretty much always want to be able to read or write to the database right before we start observing. See the examples, in order to have our UI be correct, or for us to not miss a change to the web server, or miss a change in our cache, we need to know our pre-observation state. Otherwise we can either:

  1. Miss a record between initial read/write and our observer starting.
  2. Accidentally observe a write that we didn't want.

We also want to build in object store filtering, so the user can't just observe all changes to all object stores by default, as this could cause a lot of performance issues as the easy & default option. Including the transaction does this.

@bevis-tseng
Copy link

Is it a typo to set onlyExternal to true in the example of Server sync worker when onlyExternal means "only listen for changes from other db connections"?

@dmurph
Copy link
Collaborator Author

dmurph commented May 18, 2016

No, that makes it so that worker doesn't receive notifications for it's own changes (for example, if the webserver sends it changes). I didn't put that part in the example, I'll do that now. Instead it only receives changes from other connections, which would be real tabs or other workers.

Is that confusing? What did you think it was doing?

@bevis-tseng
Copy link

My bad. You have mentioned that clearly in the comment:
// Note: our observer won't see these changes, as it has onlyExternal: true.

@dmurph
Copy link
Collaborator Author

dmurph commented May 19, 2016

oh, I just added that, it wasn't there before (sorry for the confusion). I'm glad you had that question though, it means it's a slightly confusing feature.

@bevis-tseng
Copy link

Will there be a potential issue in 'Custom refresh logic' if it is requested while another 'write' transaction is ongoing?
According to current approach and the transaction behaviour in current indexedDB spec, the read-transaction of the observer will be completed only after write transaction is complete and the observer will be started after the read-transaction is complete.

Then, there will be a timing issue to the developers that the change in the write transaction can not be listened by this observer.

@dmurph
Copy link
Collaborator Author

dmurph commented May 20, 2016

Hm, yeah I wasn't thinking about the 'completion' time of the readonly transaction, especially if it's empty. That doesn't just complete automatically because there's no requests?

I was imagining the 'custom refresh' logic would do some sort of it's own transaction to read the current state. But this would definitley miss things, like you talked about.

I'll update that example to use the 'transaction' feature, and have the call read from that transaction. I think that makes a lot more sense for this example, and shows off that feature. How does that sound?

@bevis-tseng
Copy link

bevis-tseng commented May 23, 2016

IMHO, it won't be complete because it haven't been started yet if a write transaction with overlapping scope is ongoing according to the note in IDB spec [1].
"
Generally speaking, the above requirements mean that any transaction which has an overlapping scope with a read/write transaction and which was created after that read/write transaction, can't run in parallel with that read/write transaction.
"
However, the behaviour of test script at the bottom of this comment varies among different browsers:
For Chrome/Safari,
the read_txn.oncomplete will be done immediately before "rq_add2.oncomplete" is done.
For Firefox/Edge,
the read_txn.oncomplete will be done after write_txn.oncomplete is done.

If we'd like to make the 'custom refresh' logic works while another write transaction is ongoing, the spec in IDB [1] might have to specify the rule more precisely in this specific scenario.
If not, it will be nice to update the example with the read operation before observing the changes.
Besides, a precise warning of this limitation or more detailed explanation is also important to the observer developer. :-)

[1] http://w3c.github.io/IndexedDB/#transaction-concept

Test script of the "readonly" transaction order if there is no operation:

var db;
var write_txn, read_txn;
var rq_add1, rq_add2;
var objStore;
var open_rq = indexedDB.open("testDB");
open_rq.onupgradeneeded = function(e) {
    db = e.target.result;
      objStore = db.createObjectStore("store");
}

open_rq.onsuccess = function(event) {
    db = event.target.result;
    write_txn = db.transaction("store", "readwrite");
    write_txn.oncomplete = function() {
        console.log("write_txn.oncomplete");
    };

    objStore = write_txn.objectStore("store");

    rq_add1 = objStore.put({ animal: "Unicorn" }, 1);
    rq_add1.onsuccess = function() {
        console.log("rq_add1.oncomplete");
        read_txn = db.transaction("store", "readonly");
        read_txn.oncomplete = function() {
            console.log("read_txn.oncomplete");
        };
        rq_add2 = objStore.put({ animal: "Horse" }, 2);
        rq_add2.onsuccess = function() {
            console.log("rq_add2.oncomplete");
        }
    }
}

@dmurph
Copy link
Collaborator Author

dmurph commented May 23, 2016

I see. Do you have any suggestions? I'm thinking:

  1. Update the example to do a (fake?) read initially, so it's not empty.
  2. We could specify something about empty transactions that have an observer added, and make them perform consistently.
  3. We could fix the spec to make all browsers consistent here.
  4. or just document this inconsistant behavior?

@inexorabletash
Copy link
Member

@bevis-tseng - thanks for the note about the transaction ordering issue - I've filed https://bugs.chromium.org/p/chromium/issues/detail?id=614530 for the Blink issue.

@bevis-tseng
Copy link

So, it seems to be an ordering issue in Blink implementation.
For the IDB observer registered later than an ongoing write-transaction, we expected that the observeration will be started after the write-transaction is done, right?

Or are we still going to specify something new in IDB Spec for the transaction used by the observer, when observer in IDB is supported?

If this is taken into consideration, we have to specify this carefully about identify an empty transaction for observation, because some web developer will put the operations of a transaction in the promise which makes thing more complicated to identify an empty transaction.

@dmurph
Copy link
Collaborator Author

dmurph commented Jun 7, 2016

It seems like there isn't any specific problems w/ this change, and the transaction ordering problem is an issue in the old spec as well. I'll start modifying the spec to match this new construction method (and include these examples) and create a new issue for the transaction ordering

@dmurph
Copy link
Collaborator Author

dmurph commented Jun 7, 2016

I think w3c/IndexedDB#77 should be sufficient. New construction sematics merged, so I'm closing this.

@dmurph dmurph closed this as completed Jun 7, 2016
@inexorabletash
Copy link
Member

Spec update: w3c/IndexedDB@4ca50ad
Test PR: web-platform-tests/wpt#3154

@cmumford
Copy link

I may have missed this, but should observers be called in the order in which they registered (i.e. called observe())?

@szager-chromium
Copy link

On Tue, Jun 28, 2016 at 12:43 PM Chris Mumford [email protected]
wrote:

I may have missed this, but should observers be called in the order in
which they registered (i.e. called observe())?

Yes.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants