
implement 3.0 api #91

Closed
romange opened this issue Jun 4, 2022 · 15 comments

romange (Collaborator) commented Jun 4, 2022

If we look at commands.json:

jq -c 'to_entries[] | select(.value.since | test("^3\\..*"))' commands.json | grep -v '"cluster"' | grep -v '"geo"' | less

we will see that, besides some debugging/configuration/bitops commands, only the TOUCH command needs to be implemented. Dragonfly implements its cache a bit differently from Redis, but it still has a notion of "upgrading" an entry to a higher priority.
It could be that in Dragonfly, TOUCH and EXISTS are almost identical; we need to check whether there are any differences in output or actual behavior.
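
For reference, TOUCH in Redis replies with the number of keys that exist and refreshes their access time as a side effect. A minimal sketch of that shape, assuming a hypothetical shard-local table with a BumpPriority hook (names are illustrative, not the actual DbSlice API):

#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Stand-in for a shard-local table; the real DbSlice/PrimeTable API differs.
struct DbTableStub {
  std::unordered_map<std::string, int> map;   // value type is irrelevant here
  void BumpPriority(const std::string&) {}    // placeholder for the cache "upgrade"
};

// TOUCH behaves like EXISTS plus a cache-priority bump: it replies with the
// number of keys that exist and upgrades each found entry as a side effect.
size_t TouchKeys(DbTableStub& db, const std::vector<std::string>& keys) {
  size_t touched = 0;
  for (const auto& key : keys) {
    auto it = db.map.find(key);   // the same lookup EXISTS would perform
    if (it != db.map.end()) {
      db.BumpPriority(key);       // "upgrade" the entry to a higher priority
      ++touched;
    }
  }
  return touched;
}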

romange (Collaborator, Author) commented Aug 22, 2022

Update: we still need to finish the 2.x commands, the bit commands, and the WATCH/UNWATCH commands.

romange added this to the Post-Release milestone on Aug 22, 2022
romange added the enhancement (New feature or request) label on Aug 22, 2022
dranikpg (Contributor) commented Sep 8, 2022

Let's start with WATCH & UNWATCH.

How Redis does it

The whole management is really simple:

  • struct watchedKey stores basic info (key, db, client)
  • WATCH adds a watchedKey to the client's watch list c->watched_keys and to the per-db table c->db->watched_keys, which stores a list of dependent clients for every key
  • When a key is updated, all clients in c->db->watched_keys[key] are marked as CLIENT_DIRTY_CAS, so the following EXEC aborts
  • db->watched_keys is just a regular redis dict, the same type that is used for the main storage/expires (see the sketch below)
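
A compact C++ rendering of that bookkeeping, just to make the two-sided indexing explicit (types are simplified stand-ins for the Redis structs, not a copy of the C code):

#include <list>
#include <string>
#include <unordered_map>
#include <vector>

struct Client;

// Mirrors Redis' watchedKey: which key, in which db, watched by which client.
struct WatchedKey {
  std::string key;
  int db_index;
  Client* client;
};

struct Client {
  bool dirty_cas = false;             // CLIENT_DIRTY_CAS: the next EXEC must abort
  std::vector<WatchedKey> watched;    // c->watched_keys
};

struct Db {
  // db->watched_keys: key -> clients watching it (a plain dict in Redis).
  std::unordered_map<std::string, std::list<Client*>> watched_keys;
};

// WATCH registers the key on both sides: the client's list and the db's dict.
void Watch(Db& db, int db_index, Client& c, const std::string& key) {
  c.watched.push_back(WatchedKey{key, db_index, &c});
  db.watched_keys[key].push_back(&c);
}

// Called on every write; flags all watchers so that their EXEC aborts.
void SignalModifiedKey(Db& db, const std::string& key) {
  auto it = db.watched_keys.find(key);
  if (it == db.watched_keys.end())
    return;                           // the common case: nobody watches this key
  for (Client* c : it->second)
    c->dirty_cas = true;
}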

Some more thoughts

  • Instead of having another map, we could store an incremental version code for each key. The drawback is that handling expires/deletes is a bit tricky. Besides, Redis's abort check is O(1), and after the first watched key is updated, all further watches are ignored.
  • The db->watched_keys dict is queried on every update, so 99% of accesses are just reads, and 99% of those probably return nothing.
  • The watch mechanism is in many ways similar to expiration, so I expect a viable solution to look similar to it (which is in turn similar to the Redis approach). However, I feel like this adds a lot of unwanted complexity and overhead just for a minor feature.
  • This would mean adding another DashTable, analogous to ExpireTable, and updating it on writes (see the sketch below).
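
A rough sketch of that direction, with std::unordered_map standing in for the DashTable and hypothetical hook names (nothing here is the actual DbSlice API). Only watched keys get an entry, so memory stays proportional to actual WATCH usage:

#include <cstdint>
#include <string>
#include <unordered_map>

class WatchVersionTable {
 public:
  // Called by WATCH: start tracking the key and return the version the client
  // should remember.
  uint64_t Register(const std::string& key) {
    return versions_.try_emplace(key, 0).first->second;
  }

  // Called from the write paths (SET/DEL/expire) for keys present in the table.
  void BumpIfWatched(const std::string& key) {
    auto it = versions_.find(key);
    if (it != versions_.end())
      ++it->second;
  }

  // Called by EXEC: has the key changed since WATCH recorded watched_version?
  // Unknown keys are conservatively treated as changed.
  bool Changed(const std::string& key, uint64_t watched_version) const {
    auto it = versions_.find(key);
    return it == versions_.end() || it->second != watched_version;
  }

 private:
  std::unordered_map<std::string, uint64_t> versions_;
};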

What do you think?

romange (Collaborator, Author) commented Sep 9, 2022

I do think it's a minor feature. You can see what we currently store inside a db; see the DbTable class for that.
Specifically, we maintain trans_locks that we use to track ongoing transaction locks. In fact, check CheckLock, Acquire and Release in DbSlice; you could probably piggy-back on them to store some information about watched keys.

The only complexity I see is that storing connections directly inside db->watched_keys is complicated. It's doable (I think I do it in pubsub), but it's complicated: DbTable can reside in a different thread, and if you store a reference to an object whose lifetime is controlled by another thread, your life becomes complicated.

The simpler solution would be that WATCH will indeed register some keys for tracking in the appropriate DbTables. And then during EVAL we will do an additional transactional "hop" that queries DbTables regarding the state of those keys.
Also look at ConnectionState in server/conn_context.h - it contains transaction-related info in the ExecState and exec_body variables. I feel it's about time to group those within a dedicated substruct, similar to the SubscribeInfo struct. You would add the watched keys to the new struct.
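
Roughly this shape, as a sketch of the grouping (field names and types are illustrative; the real ConnectionState layout in server/conn_context.h differs):

#include <memory>
#include <string>
#include <vector>

struct ConnectionState {
  enum class ExecState { INACTIVE, COLLECT, RUN };

  // Groups transaction-related state the way SubscribeInfo groups pubsub state.
  struct ExecInfo {
    ExecState state = ExecState::INACTIVE;
    std::vector<std::string> body;          // queued MULTI commands (exec_body)
    std::vector<std::string> watched_keys;  // keys registered by WATCH
    bool watched_dirty = false;             // set once a watched key has changed
  };

  std::unique_ptr<ExecInfo> exec_info;      // allocated lazily, like SubscribeInfo
};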

romange (Collaborator, Author) commented Sep 9, 2022

Thinking about it a bit more: you will have to clean up DbTable's watched table upon connection closure, i.e. there will be some interaction between connection lifetime and DbTable state. So you can still store a pointer to a connection inside a DbTable, but you can only reference thread-safe structures through it or, alternatively, just use it as a descriptor without dereferencing it.
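
For illustration, the "descriptor without dereferencing" variant could look roughly like this (hypothetical names; the point is that the stored token is only ever compared on cleanup, never dereferenced):

#include <cstdint>
#include <iterator>
#include <string>
#include <unordered_map>
#include <vector>

// The connection is identified by an opaque token (e.g. its address); shard-side
// code only compares it, never dereferences it, so cross-thread lifetime issues
// disappear.
using ConnToken = uintptr_t;

class ShardWatchTable {
 public:
  void Add(const std::string& key, ConnToken conn) {
    watchers_[key].push_back(conn);
  }

  // Called when the owning connection closes: drop every registration carrying
  // its token. This is the cleanup interaction mentioned above.
  void RemoveConnection(ConnToken conn) {
    for (auto it = watchers_.begin(); it != watchers_.end();) {
      std::erase(it->second, conn);    // C++20: remove by value, compare only
      it = it->second.empty() ? watchers_.erase(it) : std::next(it);
    }
  }

 private:
  std::unordered_map<std::string, std::vector<ConnToken>> watchers_;
};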

dranikpg (Contributor) commented Sep 9, 2022

And then during EVAL we will do an additional transactional "hop" that queries DbTables regarding the state of those keys

The purpose of DbTable's watched table is that EXEC doesn't have to query its own keys. However, it still has to unregister all of its keys (it does so on the first watch failure). What is more, EXEC directly checks expiration for all of them 🤔 So in the end it has to query all keys at least once. It seems like with only a few watched keys, db->watched_keys adds more overhead than speedup. Do you see the real reason they're using it? Solely for the fail-fast behavior? Redis already stores a last-modified timestamp (OBJECT IDLETIME), so memory savings are probably not the issue.

you can only reference thread-safe structures through it

Sure, I thought of adding some atomic flag or making exec_state atomic - if we need it at all (see the question above).

romange (Collaborator, Author) commented Sep 9, 2022

Let's forget about fail-fast and the complexity of going over all the keys right now. Assume that the watched list is short.

Are you saying that we can store all the info in the connection and be done with it?
What will you store in the map? A dirty flag, i.e. a key->bool mapping? A key->unsigned mapping for versioning?
How will another shard know to update those keys in those connections?

dranikpg (Contributor) commented Sep 9, 2022

How will another shard know to update those keys in those connections?

There is obviously no way for it to know that.

key->unsigned mapping for versioning

Yes. If we use simple per-key version codes, we might miss a DEL-SET sequence (the re-created key would start counting from zero again). Instead, the last-modified time with nanosecond granularity seems like an option. Then, on EXEC, we just check that the time didn't change (i.e. the key has no newer timestamp).

Besides, Redis doesn't even bother to use a map for the connection's data; watch checks are simply O(k) over the watched keys.

Or is there some issue I'm not seeing?

romange (Collaborator, Author) commented Sep 9, 2022

So thread T1 holds a connection C whose watch map contains a key K that belongs to a shard in thread T2.
Now a SET operation changes K in T2. How does T2 know to search for C and update K in c->watched?

romange (Collaborator, Author) commented Sep 9, 2022

Forget multi-threading - how would even a single-threaded system know to search for C?

dranikpg (Contributor) commented Sep 9, 2022

There is no double-sided dependency. The key just cares about itself, but it stores a timestamp of its own last update (we'll have to add that at some point to support OBJECT IDLETIME).

WATCH just stores, in C, the timestamp TK of some key K at the time of the call, meaning "I still expect K to have timestamp TK when I start EXEC". If TK has changed since WATCH, then the key was modified. So the values stored in C never change - they serve as a kind of snapshot to compare against on EXEC.
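
In code, the per-connection side of that is tiny (a sketch with made-up names, not actual Dragonfly types):

#include <cstdint>
#include <string>
#include <unordered_map>

// Per-connection state for C: key -> the timestamp/version observed at WATCH time.
struct WatchSnapshot {
  std::unordered_map<std::string, uint64_t> watched;

  // WATCH key K: remember K's current last-modified stamp TK.
  void Watch(const std::string& key, uint64_t current_stamp) {
    watched.emplace(key, current_stamp);
  }

  // At EXEC time, for each watched key: still valid only if the key's current
  // stamp equals the one recorded at WATCH time (unknown keys count as changed).
  bool Unchanged(const std::string& key, uint64_t current_stamp) const {
    auto it = watched.find(key);
    return it != watched.end() && it->second == current_stamp;
  }
};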

romange (Collaborator, Author) commented Sep 9, 2022

So you suggest storing the last-modified timestamp inside DbTable for every entry in DF? Or just for the watched keys?

dranikpg (Contributor) commented Sep 9, 2022

Yes, for all of them. We can't store the timestamps selectively without query overhead.

romange (Collaborator, Author) commented Sep 9, 2022

So my problem with this is that for a pretty minor feature we need to add a space overhead of 8 bytes per key.
Even for expiry (which is much more common) we complicated things so that the memory usage would grow only with use of the feature. Please note that memory usage is the primary cost factor when using an in-memory store.

Therefore, I think trading implementation complexity for reduced memory overhead is well worth it. Regarding the OBJECT IDLETIME command - it's a debugging command, and we won't support it.

dranikpg (Contributor) commented Sep 9, 2022

Oh, okay. That answers all of my questions above. If Dragonfly doesn't intend to support IDLETIME, then adding memory overhead just for WATCH isn't worth it.

romange (Collaborator, Author) commented Oct 19, 2022

Closing as done. Will open a dedicated task for TOUCH

romange closed this as completed on Oct 19, 2022