chore: collection size #3844
tests/dragonfly/seeder_test.py
Outdated
keys = await async_client.keys()
assert (await async_client.llen(keys[0])) == 1
assert len(await async_client.lpop(keys[0])) == 10_000
A question:
if element_size_ratio=1/2 then data_size is 10k ** (1/2) == 100 and variance is 1. So dsize=100. My question is, why do we xor the dsize? That is: LG_funcs.esize = math.ceil(dsize ^ delement_ratio), which is how many elements a given type should contain (that is, llen(keys[0])).
So to summarize:
- Why do we xor this? LG_funcs.esize = math.ceil(dsize ^ delement_ratio)
- Why do we express the number of elements per set via all of this? Can't we just be specific about how many elements of a given size we want?
There is something I am missing, so I am asking here 😄
- It's not xor, it's power 🙃
- If it's too difficult and fragile, no one will use it properly. It's just a 0/1 slider: 0 means smallest possible elements, 1 means biggest possible
For 1. power is **, not caret ^, which is xor?
I brainfarted. It's Lua, not Python 😮‍💨 🤦
Now it all makes sense. Ignore my blindness...
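The mix-up above comes down to operator meaning: in Lua `^` is exponentiation, while in Python `^` is bitwise XOR and `**` is the power operator. A minimal Python sketch of the difference follows; the values `dsize = 100` and `ratio = 0.5` are illustrative assumptions and are not taken from the seeder script.

```python
# Illustrative values only (assumed, not read from the seeder script).
dsize = 100
ratio = 0.5

# In Python, ** is exponentiation (what Lua spells as ^).
power_result = dsize ** 2

# In Python, ^ is bitwise XOR and is only defined for integers.
xor_result = dsize ^ 2


def xor_with_float_fails() -> bool:
    """Return True if `int ^ float` raises TypeError, as Python defines it."""
    try:
        dsize ^ ratio  # XOR is not defined between int and float
    except TypeError:
        return True
    return False


print(power_result, xor_result, xor_with_float_fails())
```

So an expression like `dsize ^ ratio` with a fractional ratio would not even run as Python, which is a quick hint that the snippet under review is Lua.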
Lol @dranikpg, we commented at the exact same time. I had not read your link, but I somehow noticed, and you replied at the exact moment I figured it out and replied 🤣
LGTM, maybe wait for Adi?
tests/dragonfly/seeder/__init__.py
Outdated
@@ -79,13 +79,15 @@ def __init__(
data_size=100,
variance=5,
samples=10,
element_size_ratio=1 / 3,
Can we use element count instead?
When writing a test case, I want to define the total size of the data structure and the number of elements it will have; each element's size will then be total size / element count.
I find an element count parameter much more intuitive than element size ratio, which defines the element size as total size ^ element size ratio.
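To make the two parametrizations in this comment concrete, here is a rough Python sketch. Both helper names are hypothetical and are not part of the seeder. The first derives the element size from an element count, as proposed; the second computes which ratio would give roughly the same element size under the `total size ^ ratio` rule.

```python
import math


def element_size_from_count(total_size: int, element_count: int) -> int:
    """Hypothetical helper: each element gets an equal share of the total size."""
    if element_count < 1:
        raise ValueError("element_count must be at least 1")
    # Never go below one byte per element.
    return max(1, total_size // element_count)


def ratio_for_count(total_size: int, element_count: int) -> float:
    """Hypothetical helper: ratio r such that total_size ** r ~= total_size / element_count."""
    if total_size <= 1:
        raise ValueError("total_size must be greater than 1")
    if element_count < 1:
        raise ValueError("element_count must be at least 1")
    return math.log(total_size / element_count) / math.log(total_size)


# Example: 10_000 bytes split into 100 elements gives 100 bytes per element,
# which corresponds to a ratio of about 1/2.
print(element_size_from_count(10_000, 100), ratio_for_count(10_000, 100))
```

The count-based form states the test author's intent directly, while the ratio form has to be inverted through a logarithm to hit a specific count.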
7ce25b5 to 76fc7aa
Updated to collection size parameter 🎩
Signed-off-by: Vladislav Oleshko <[email protected]>
76fc7aa to 9eda2e4
Fixes #3840
Added element_size_ratio parameter to both static and dynamic seeder
data size ~ data volume, element size = data size ^ ratio, element count = data volume / element size
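The relation in the description can be sketched in Python as follows. This is an illustration under assumptions, not the seeder's actual code: the function name `plan_collection` is hypothetical, and the rounding (ceiling for element size, floor division for count) is a guess loosely modelled on the `math.ceil` call quoted in the review.

```python
import math


def plan_collection(data_size: int, ratio: float) -> tuple[int, int]:
    """Hypothetical sketch: return (element_size, element_count).

    element_size  = data_size ^ ratio   (written ** in Python)
    element_count = data_size / element_size
    """
    if data_size < 1:
        raise ValueError("data_size must be at least 1")
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio is a slider between 0 and 1")
    element_size = math.ceil(data_size ** ratio)
    # Assumed rounding: keep at least one element per collection.
    element_count = max(1, data_size // element_size)
    return element_size, element_count


# The two ends of the slider for a 10_000-byte collection:
print(plan_collection(10_000, 0))  # smallest elements, many of them
print(plan_collection(10_000, 1))  # one element holding all the data
```

With a ratio of 1 the sketch yields a single 10_000-byte element, which is consistent with the `llen(...) == 1` and `len(lpop(...)) == 10_000` assertions in the reviewed test, assuming that test uses a data size of 10_000.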