
implement 3.0 api #91

Closed
romange opened this issue Jun 4, 2022 · 15 comments

romange (Collaborator) commented Jun 4, 2022

If we look at commands.json:

jq -c 'to_entries[] | select(.value.since | test("^3\\..*"))' commands.json | grep -v '"cluster"' | grep -v '"geo"' | less

we will see that, besides some debugging/configuration/bitops commands, only the TOUCH command needs to be implemented. Dragonfly implements its cache a bit differently from Redis, but it still has a notion of "upgrading" an entry to a higher priority.
It could be that in Dragonfly, TOUCH and EXISTS are almost identical; we need to check whether there are any differences in output or actual behavior.
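
For reference, TOUCH in Redis replies with the number of keys that exist and refreshes their access time as a side effect. A minimal sketch of that shape, assuming a hypothetical shard-local table with a BumpPriority hook (names are illustrative, not the actual DbSlice API):

#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Stand-in for a shard-local table; the real DbSlice/PrimeTable API differs.
struct DbTableStub {
  std::unordered_map<std::string, int> map;   // value type is irrelevant here
  void BumpPriority(const std::string&) {}    // placeholder for the cache "upgrade"
};

// TOUCH behaves like EXISTS plus a cache-priority bump: it replies with the
// number of keys that exist and upgrades each found entry as a side effect.
size_t TouchKeys(DbTableStub& db, const std::vector<std::string>& keys) {
  size_t touched = 0;
  for (const auto& key : keys) {
    auto it = db.map.find(key);   // the same lookup EXISTS would perform
    if (it != db.map.end()) {
      db.BumpPriority(key);       // "upgrade" the entry to a higher priority
      ++touched;
    }
  }
  return touched;
}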

romange (Collaborator, Author) commented Aug 22, 2022

Update: we still need to finish the 2.x commands, the bit commands, and the WATCH/UNWATCH commands.

romange added this to the Post-Release milestone on Aug 22, 2022
romange added the enhancement (New feature or request) label on Aug 22, 2022
dranikpg (Contributor) commented Sep 8, 2022

Let's start with WATCH & UNWATCH.

How Redis does it

The whole management is really simple:

  • struct watchedKey stores basic info (key, db, client)
  • WATCH adds a watchedKey to the client's watch list c->watched_keys and to the per-db table c->db->watched_keys, which stores a list of dependent clients for every key
  • When a key is updated, all clients in c->db->watched_keys[key] are marked as CLIENT_DIRTY_CAS, so the following EXEC aborts
  • db->watched_keys is just a regular redis dict, the same type that is used for the main storage/expires (see the sketch below)
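
A compact C++ rendering of that bookkeeping, just to make the two-sided indexing explicit (types are simplified stand-ins for the Redis structs, not a copy of the C code):

#include <list>
#include <string>
#include <unordered_map>
#include <vector>

struct Client;

// Mirrors Redis' watchedKey: which key, in which db, watched by which client.
struct WatchedKey {
  std::string key;
  int db_index;
  Client* client;
};

struct Client {
  bool dirty_cas = false;             // CLIENT_DIRTY_CAS: the next EXEC must abort
  std::vector<WatchedKey> watched;    // c->watched_keys
};

struct Db {
  // db->watched_keys: key -> clients watching it (a plain dict in Redis).
  std::unordered_map<std::string, std::list<Client*>> watched_keys;
};

// WATCH registers the key on both sides: the client's list and the db's dict.
void Watch(Db& db, int db_index, Client& c, const std::string& key) {
  c.watched.push_back(WatchedKey{key, db_index, &c});
  db.watched_keys[key].push_back(&c);
}

// Called on every write; flags all watchers so that their EXEC aborts.
void SignalModifiedKey(Db& db, const std::string& key) {
  auto it = db.watched_keys.find(key);
  if (it == db.watched_keys.end())
    return;                           // the common case: nobody watches this key
  for (Client* c : it->second)
    c->dirty_cas = true;
}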

Some more thoughts

  • Instead of having another map, we could store an incremental version code for each key. The drawback is that handling expires/deletes is a bit tricky. Besides, Redis's abort check is O(1), and after the first watched key is updated, all further watches are ignored.
  • The db->watched_keys dict is queried on every update, so 99% of accesses are just reads, and 99% of those probably return nothing.
  • The watch mechanism is in many ways similar to expiration, so I expect a viable solution to look similar to it (which is in turn similar to the Redis approach). However, I feel like this adds a lot of unwanted complexity and overhead just for a minor feature.
  • This would mean adding another DashTable, analogous to ExpireTable, and updating it on writes (see the sketch below).
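
A rough sketch of that direction, with std::unordered_map standing in for the DashTable and hypothetical hook names (nothing here is the actual DbSlice API). Only watched keys get an entry, so memory stays proportional to actual WATCH usage:

#include <cstdint>
#include <string>
#include <unordered_map>

class WatchVersionTable {
 public:
  // Called by WATCH: start tracking the key and return the version the client
  // should remember.
  uint64_t Register(const std::string& key) {
    return versions_.try_emplace(key, 0).first->second;
  }

  // Called from the write paths (SET/DEL/expire) for keys present in the table.
  void BumpIfWatched(const std::string& key) {
    auto it = versions_.find(key);
    if (it != versions_.end())
      ++it->second;
  }

  // Called by EXEC: has the key changed since WATCH recorded watched_version?
  // Unknown keys are conservatively treated as changed.
  bool Changed(const std::string& key, uint64_t watched_version) const {
    auto it = versions_.find(key);
    return it == versions_.end() || it->second != watched_version;
  }

 private:
  std::unordered_map<std::string, uint64_t> versions_;
};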

What do you think?

romange (Collaborator, Author) commented Sep 9, 2022

I do think it's a minor feature. You can see what we currently store inside a db; see the DbTable class for that.
Specifically, we maintain trans_locks that we use to track ongoing transaction locks. In fact, check CheckLock, Acquire and Release in DbSlice; you could probably piggy-back on them to store some information about watched keys.

The only complexity I see is that storing connections directly inside db->watched_keys is complicated. It's doable (I think I do it in pubsub), but it's complicated: DbTable can reside in a different thread, and if you store a reference to an object whose lifetime is controlled by another thread, your life becomes complicated.

The simpler solution would be that WATCH will indeed register some keys for tracking in the appropriate DbTables. And then during EVAL we will do an additional transactional "hop" that queries DbTables regarding the state of those keys.
Also look at ConnectionState in server/conn_context.h - it contains transaction-related info in the ExecState and exec_body variables. I feel it's about time to group those within a dedicated substruct, similar to the SubscribeInfo struct. You would add the watched keys to the new struct.
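
Roughly this shape, as a sketch of the grouping (field names and types are illustrative; the real ConnectionState layout in server/conn_context.h differs):

#include <memory>
#include <string>
#include <vector>

struct ConnectionState {
  enum class ExecState { INACTIVE, COLLECT, RUN };

  // Groups transaction-related state the way SubscribeInfo groups pubsub state.
  struct ExecInfo {
    ExecState state = ExecState::INACTIVE;
    std::vector<std::string> body;          // queued MULTI commands (exec_body)
    std::vector<std::string> watched_keys;  // keys registered by WATCH
    bool watched_dirty = false;             // set once a watched key has changed
  };

  std::unique_ptr<ExecInfo> exec_info;      // allocated lazily, like SubscribeInfo
};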

romange (Collaborator, Author) commented Sep 9, 2022

Thinking about it a bit more: you will have to clean up DbTable's watched table upon connection closure, i.e. there will be some interaction between connection lifetime and DbTable state. So you can still store a pointer to a connection inside a DbTable, but you can only reference thread-safe structures through it or, alternatively, just use it as a descriptor without dereferencing it.
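
For illustration, the "descriptor without dereferencing" variant could look roughly like this (hypothetical names; the point is that the stored token is only ever compared on cleanup, never dereferenced):

#include <cstdint>
#include <iterator>
#include <string>
#include <unordered_map>
#include <vector>

// The connection is identified by an opaque token (e.g. its address); shard-side
// code only compares it, never dereferences it, so cross-thread lifetime issues
// disappear.
using ConnToken = uintptr_t;

class ShardWatchTable {
 public:
  void Add(const std::string& key, ConnToken conn) {
    watchers_[key].push_back(conn);
  }

  // Called when the owning connection closes: drop every registration carrying
  // its token. This is the cleanup interaction mentioned above.
  void RemoveConnection(ConnToken conn) {
    for (auto it = watchers_.begin(); it != watchers_.end();) {
      std::erase(it->second, conn);    // C++20: remove by value, compare only
      it = it->second.empty() ? watchers_.erase(it) : std::next(it);
    }
  }

 private:
  std::unordered_map<std::string, std::vector<ConnToken>> watchers_;
};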

dranikpg (Contributor) commented Sep 9, 2022

And then during EVAL we will do an additional transactional "hop" that queries DbTables regarding the state of those keys

The purpose of DbTable's watched table is that EXEC doesn't have to query its own keys. However, it still has to unregister all of its keys (it does so on the first watch failure). What is more, EXEC directly checks expiration for all of them 🤔 So in the end it has to query all keys at least once. It seems like with only a few watched keys, db->watched_keys adds more overhead than speedup. Do you see the real reason they're using it? Solely for the fail-fast behavior? Redis already stores a last-modified timestamp (OBJECT IDLETIME), so memory savings are probably not the issue.

you can only reference thread-safe structures through it

Sure, I thought of adding some atomic flag or making exec_state atomic - if we need it at all (see the question above).

romange (Collaborator, Author) commented Sep 9, 2022

Let's forget about fail-fast and the complexity of going over all the keys right now. Assume that the watched list is short.

Are you saying that we can store all the info in the connection and be done with it?
What will you store in the map? A dirty flag, i.e. a key->bool mapping? A key->unsigned mapping for versioning?
How will another shard know to update those keys in those connections?

dranikpg (Contributor) commented Sep 9, 2022

How will another shard know to update those keys in those connections?

There is obviously no way for it to know that.

key->unsigned mapping for versioning

Yes. If we use simple per-key version codes, we might miss a DEL-SET sequence (the re-created key would start counting from zero again). Instead, the last-modified time with nanosecond granularity seems like an option. Then, on EXEC, we just check that the time didn't change (i.e. the key has no newer timestamp).

Besides, Redis doesn't even bother to use a map for the connection's data; watch checks are simply O(k) over the watched keys.

Or is there some issue I'm not seeing?

romange (Collaborator, Author) commented Sep 9, 2022

So thread T1 holds a connection C whose watch map contains a key K that belongs to a shard in thread T2.
Now a SET operation changes K in T2. How does T2 know to search for C and update K in c->watched?

romange (Collaborator, Author) commented Sep 9, 2022

Forget multi-threading - how would even a single-threaded system know to search for C?

dranikpg (Contributor) commented Sep 9, 2022

There is no double-sided dependency. The key just cares about itself, but it stores a timestamp of its own last update (we'll have to add that at some point to support OBJECT IDLETIME).

WATCH just stores, in C, the timestamp TK of some key K at the time of the call, meaning "I still expect K to have timestamp TK when I start EXEC". If TK has changed since WATCH, then the key was modified. So the values stored in C never change - they serve as a kind of snapshot to compare against on EXEC.
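
In code, the per-connection side of that is tiny (a sketch with made-up names, not actual Dragonfly types):

#include <cstdint>
#include <string>
#include <unordered_map>

// Per-connection state for C: key -> the timestamp/version observed at WATCH time.
struct WatchSnapshot {
  std::unordered_map<std::string, uint64_t> watched;

  // WATCH key K: remember K's current last-modified stamp TK.
  void Watch(const std::string& key, uint64_t current_stamp) {
    watched.emplace(key, current_stamp);
  }

  // At EXEC time, for each watched key: still valid only if the key's current
  // stamp equals the one recorded at WATCH time (unknown keys count as changed).
  bool Unchanged(const std::string& key, uint64_t current_stamp) const {
    auto it = watched.find(key);
    return it != watched.end() && it->second == current_stamp;
  }
};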

romange (Collaborator, Author) commented Sep 9, 2022

So you suggest storing the last-modified timestamp inside DbTable for every entry in DF? Or just for the watched keys?

dranikpg (Contributor) commented Sep 9, 2022

Yes, for all of them. We can't store the timestamps selectively without query overhead.

romange (Collaborator, Author) commented Sep 9, 2022

So my problem with this is that for a pretty minor feature we need to add a space overhead of 8 bytes per key.
Even for expiry (which is much more common) we complicated things so that the memory usage would grow only with use of the feature. Please note that memory usage is the primary cost factor when using an in-memory store.

Therefore, I think trading implementation complexity for reduced memory overhead is well worth it. Regarding the OBJECT IDLETIME command - it's a debugging command, and we won't support it.

dranikpg (Contributor) commented Sep 9, 2022

Oh, okay. That answers all of my questions above. If Dragonfly doesn't intend to support IDLETIME, then adding memory overhead just for WATCH isn't worth it.

romange (Collaborator, Author) commented Oct 19, 2022

Closing as done. Will open a dedicated task for TOUCH

romange closed this as completed on Oct 19, 2022