recognizing hot keys #257

Closed
romange opened this issue Aug 25, 2022 · 4 comments · Fixed by #951
Labels
enhancement New feature or request

Comments

romange commented Aug 25, 2022

Motivation: https://blog.box.com/introducing-memsniff-robust-memcache-traffic-analyzer

For large-scale deployments, caching teams would like to learn about hot keys in real time so that they can handle them in a special way.

Currently, teams develop sniffers (see the link) to do so. This is not a very elegant approach, and it is very CPU intensive. We could integrate this into DF and provide native support for it.

romange added the enhancement label Aug 25, 2022

romange commented Aug 25, 2022

Algorithms that might help:
https://www.usenix.org/conference/atc18/presentation/gong
https://www.cs.technion.ac.il/users/wwwb/cgi-bin/tr-get.cgi/2016/CS/CS-2016-01.pdf

We should choose the simpler one, not necessarily the most sophisticated one.

romange commented Sep 3, 2022

After reading these papers, I think it's better to start with HeavyKeeper.
We do not even need to maintain a min-heap data structure, because query complexity is not the issue here; we need to focus on fast updates.
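
As a rough illustration of what a HeavyKeeper-style counter without a min-heap could look like, here is a minimal Python sketch. It is not Dragonfly's implementation: the class name, the parameters (`depth`, `width`, `decay_base`), the use of `blake2b` for hashing, and the 32-bit fingerprint are all assumptions made for the example. Each bucket holds a fingerprint and a count; a key that lands on a bucket owned by a different fingerprint decays that count with probability `decay_base ** -count`, so large counts are hard to evict.

```python
import hashlib
import random


class HeavyKeeper:
    """Illustrative count-with-exponential-decay sketch, no min-heap."""

    def __init__(self, depth=4, width=1024, decay_base=1.08, seed=0):
        self.depth = depth
        self.width = width
        self.decay_base = decay_base
        self.rng = random.Random(seed)
        # Each bucket is [fingerprint, count]; count == 0 means empty.
        self.rows = [[[0, 0] for _ in range(width)] for _ in range(depth)]

    def _hash(self, key, salt_index):
        # Salted hash so that every row uses an independent hash function.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=salt_index.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little")

    def _fingerprint(self, key):
        return self._hash(key, self.depth) & 0xFFFFFFFF

    def _slot(self, key, row):
        return self._hash(key, row) % self.width

    def add(self, key):
        fp = self._fingerprint(key)
        for row in range(self.depth):
            bucket = self.rows[row][self._slot(key, row)]
            if bucket[1] == 0:
                # Empty bucket: claim it.
                bucket[0], bucket[1] = fp, 1
            elif bucket[0] == fp:
                # Same fingerprint: plain increment.
                bucket[1] += 1
            elif self.rng.random() < self.decay_base ** -bucket[1]:
                # Different fingerprint: decay, and take over if it hits zero.
                bucket[1] -= 1
                if bucket[1] == 0:
                    bucket[0], bucket[1] = fp, 1

    def estimate(self, key):
        fp = self._fingerprint(key)
        counts = []
        for row in range(self.depth):
            stored_fp, count = self.rows[row][self._slot(key, row)]
            if count > 0 and stored_fp == fp:
                counts.append(count)
        return max(counts, default=0)
```

Updates touch only `depth` buckets and never reorder anything, which is the property the comment above asks for; ranking is deferred to query time.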

Super-long commented

I have read the HeavyKeeper paper and am implementing a HeavyKeeper structure that does not use a min-heap.

The basic idea for adding hotspot awareness to Dragonfly is to keep one HeavyKeeper per proactor and, when hotspot information is needed, sort its contents to find the hottest few keys.
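
The "sort only when asked" part of this idea might look like the Python sketch below. It assumes, purely for illustration, that each bucket stores the key itself next to its count (a real sketch may store only a fingerprint) and that a proactor's sketch is exposed as a list of rows; `hottest_keys` is a hypothetical helper name.

```python
import heapq


def hottest_keys(rows, k):
    """Return the k (key, count) pairs with the largest counts.

    rows: list of rows; each row is a list of buckets, where a bucket is
    either None (empty) or a (key, count) tuple.
    """
    best = {}
    for row in rows:
        for bucket in row:
            if bucket is None:
                continue
            key, count = bucket
            # A key may appear in several rows; keep its largest count.
            if count > best.get(key, 0):
                best[key] = count
    # Ranking cost is paid here, at query time, not on every update.
    return heapq.nlargest(k, best.items(), key=lambda kv: kv[1])
```

For example, `hottest_keys([[("a", 5), None, ("b", 2)], [("a", 7), ("c", 3), None]], 2)` scans six buckets once and ranks three distinct keys.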

I intend to do the following:
Step 1. Complete and test a HeavyKeeper structure without a min-heap (work in progress).
Step 2. Work out how HeavyKeeper is used in each Dragonfly proactor.
Step 3. Decide what each proactor can do when it detects a hotspot.
Step 4. Work out how to aggregate hotspot information from all proactors when Dragonfly users request it.
Step 5. Decide whether hotspot information needs to be persisted to support querying hotspot history.
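
One possible shape for the aggregation question in step 4 is sketched below in Python. It assumes each proactor can report its local candidates as a mapping from key to estimated count; `aggregate_hot_keys` and that reporting format are hypothetical, not an existing Dragonfly API. Counts for the same key are summed, which is also harmless if every key is only ever counted by one proactor.

```python
from collections import Counter


def aggregate_hot_keys(per_proactor_counts, k):
    """Merge per-proactor {key: count} reports and return the global top k."""
    total = Counter()
    for counts in per_proactor_counts:
        # Counter.update adds counts for keys that are already present.
        total.update(counts)
    return total.most_common(k)
```

With reports `[{"a": 10, "b": 3}, {"c": 7}, {"b": 5}]`, key `b` is seen by two proactors and its counts are combined before ranking.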

First I will focus on implementing and testing HeavyKeeper while continuing to dive into Dragonfly's code. Subsequent questions may need to be discussed with community members.

This is not something that can be done quickly, so I think these ideas will appear in many PRs that

romange commented Oct 27, 2022

@Super-long please join our Discord server https://discord.gg/HsPjXGVH85 and say hello. I will add you to our #dev channel.

romange added a commit that referenced this issue Feb 20, 2023
Part of the heavy keeper algo, required for #257.

Signed-off-by: Roman Gershman <[email protected]>
romange added a commit that referenced this issue Feb 20, 2023
Part of the heavy keeper algo, required for #257.
Also see #446 for the initial (abandoned) PR.

Signed-off-by: Roman Gershman <[email protected]>
romange added a commit that referenced this issue Feb 21, 2023
Part of the heavy keeper algo, required for #257.
Also see #446 for the initial (abandoned) PR.

Signed-off-by: Roman Gershman <[email protected]>
romange added a commit that referenced this issue Feb 21, 2023
Part of the heavy keeper algo, required for #257.
Also see #446 for the initial (abandoned) PR.

Signed-off-by: Roman Gershman <[email protected]>