
feat(tools): Add Zipfian cache testing tool #640

Merged · 6 commits · Jan 7, 2023

Conversation

devangjhabakh (Contributor):

Signed-off-by: Ubuntu [email protected]

Adds a cache testing tool that produces values according to a Zipfian distribution, as per issue #253.

parser.add_argument('-c', '--count', type=int, default=100000, help='total number of incrby operations')
parser.add_argument('-u', '--uri', type=str, default='redis://localhost:6379', help='Redis server URI')
parser.add_argument('-a', '--alpha', type=float, default=1.0, help='alpha value used for the Zipf distribution')
parser.add_argument('-n', '--number', type=int, default=30, help='the number of values to be used in the distribution')
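
For readers following along, here is a minimal sketch (not part of the diff) of how the parsed arguments might feed a client; it continues the argparse snippet above, and redis.Redis.from_url is the standard redis-py way to build a client from a URI:

    import redis

    args = parser.parse_args()
    # Build the client from the -u/--uri argument, e.g. redis://localhost:6379.
    r = redis.Redis.from_url(args.uri)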
Contributor Author:

It's worth noting here that if we really want to see the number of misses go up, we'll have to fill our cache, which would mean that we'll need a far higher number.

distMap = [x / zeta[-1] for x in zeta]

# Generate an array of uniform 0-1 pseudo-random values:
u = np.random.random(numSamples)
Collaborator:

I understand that you took the implementation from here:
https://stackoverflow.com/questions/31027739/python-custom-zipf-number-generator-performing-poorly

It's fine, but I am worried about the memory usage of this program - we will need to generate traces of hundreds of millions/billions of records to cause evictions. To solve this, we should convert rand_zipf into a generator that yields batches of np.random.random(numSamples), so we generate one batch at a time without repeating the preprocessing step on every call to rand_zipf. If you know what I am talking about, please make the adjustments; otherwise leave it as is.

Contributor Author:

I have fixed the numSamples bottleneck by rewriting the code to look something like this:

    # Calculate Zeta values from 1 to n:
    tmp = np.power( np.arange(1, n+1), -alpha )
    zeta = np.r_[0.0, np.cumsum(tmp)]

    # Store the translation map:
    distMap = [x / zeta[-1] for x in zeta]

    # Generate values from the distribution based on 0-1 pseudo-random values:

    for i in range(numSamples):
        yield np.searchsorted(distMap, np.random.random())

However, it's worth noting that there exists another significant bottleneck -- the distMap. It's likely going to be a huge data structure as well (if we really want to start seeing cache misses, that is). I think it might be a bit more challenging to remove that bottleneck, but it may be possible (I still have to spend some more time thinking about how, if at all, it can be done). Do you think I should spend time on it? Another workaround for forcing misses out of the cache is to simply limit the amount of memory the cache is allowed.
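
A minimal sketch of that memory-limit workaround, assuming redis-py and a Redis-compatible server that accepts CONFIG SET (the 256mb figure is only an example):

    import redis

    r = redis.Redis.from_url('redis://localhost:6379')

    # Cap server memory so evictions start well before the trace is exhausted.
    r.config_set('maxmemory', '256mb')
    # Evict least-recently-used keys once the cap is reached.
    r.config_set('maxmemory-policy', 'allkeys-lru')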

romange (Collaborator) commented Jan 4, 2023:

I actually thought to leave the numSamples batch as is, but instead to have:

while True:
    u = np.random.random(numSamples)
    v = np.searchsorted(distMap, u)
    samples = [t-1 for t in v]
    yield samples

and pass the pipeline length as numSamples.
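
For clarity, here is a hedged sketch of the batched generator this comment describes; rand_zipf_batched is a name chosen here for illustration:

    import numpy as np

    def rand_zipf_batched(n, alpha, num_samples):
        """Yield batches of num_samples Zipf(alpha) variates over 0..n-1, forever."""
        # The preprocessing runs once, not once per batch.
        tmp = np.power(np.arange(1, n + 1), -alpha)
        zeta = np.r_[0.0, np.cumsum(tmp)]
        dist_map = zeta / zeta[-1]

        while True:
            u = np.random.random(num_samples)
            # searchsorted maps each uniform draw to its bucket in the CDF.
            v = np.searchsorted(dist_map, u)
            yield [t - 1 for t in v]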

Collaborator:

I would not worry about distMap right now. Yes, it's big, but I think it's smaller than the total number of writes we are going to send to the server.

Contributor Author:

Ah, I understand. Shall fix accordingly!


distribution_values = rand_zipf(args.number, args.alpha, args.count)
for idx, val in enumerate(distribution_values):
    r.incrby(str(val), 1)
Collaborator:

Following my previous comment, we should use the pipelining capabilities of Redis/DF and send data in batches. Otherwise, sending a billion items will take a long time. Can you please add a pipeline length parameter p?

Finally, I see you read statistics from info(), which is fine, but now that I think of it, using incr does not bring additional value because you do not read its response to learn whether it was a hit or a miss. In that case, we should change it to set <key> <val> NX, which will allow us to increase the value length and put pressure on the server to evict items. val can be a constant string of a specified length (-d argument).
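
A sketch of what that could look like with redis-py; the key batch and payload here are illustrative, and set(..., nx=True) maps to the SET <key> <val> NX command:

    import redis

    r = redis.Redis.from_url('redis://localhost:6379')

    val = 'x' * 1024            # constant payload; the length would come from -d
    batch = [3, 1, 1, 7, 2]     # one batch of keys from the Zipf generator

    pipe = r.pipeline()
    for key in batch:
        # SET key val NX writes only when the key is absent, so the reply
        # (True on write, None otherwise) doubles as a miss/hit signal.
        pipe.set(str(key), val, nx=True)
    replies = pipe.execute()
    misses = sum(1 for reply in replies if reply)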

Contributor Author:

Can add pipelining!

I was also thinking about how incr doesn't tell us whether the item was a hit or a miss - there is no hit/miss response from redis-py, at least. I think the value-length increase sounds like a great idea; I'll change it to that!

romange (Collaborator) commented Jan 4, 2023:

Just a small correction - the value length should be constant. I did not mean to append to values, but just to use a constant val = 'x' * args.d for the "set key val" command. The thought is that with larger values it will be easier to put memory pressure on the server, and we will need fewer keys and fewer samples.

Contributor Author:

Yep!

Contributor Author:

Do we also want to support multiprocessing/multiple workers? Shall I add a -w flag for that?

Collaborator:

I think we can avoid multi-processing for now.

romange (Collaborator) commented Jan 4, 2023:

Thanks! Overall looks good! There are some comments that should allow us to make the tool more usable.

romange (Collaborator) commented Jan 4, 2023 via email:

Signed-off-by: Ubuntu <[email protected]>
devangjhabakh force-pushed the djj/cache_testing_tool branch from fcf852e to 3b50043 on January 5, 2023 at 12:57.
devangjhabakh (Contributor Author):

Hey @romange, I seem to have implemented pipelining, and it works on my end (albeit, for some reason, on my machine the pipelined version seems to run slower than the non-pipelined one). Can you take a look and see if the changes seem okay? This is my first time working with the redis pipeline, so I may have made a mistake.

total_count = 0
for idx, values in enumerate(distribution_values_generator):
    total_count += len(values)
    p = r.pipeline()
Collaborator:

add transaction=False

Contributor Author:

Oh wow, that sped things up significantly. Thanks!
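
For reference, a sketch of the adjusted loop; it continues the snippet above, with payload standing in for the constant value string:

    total_count = 0
    for idx, values in enumerate(distribution_values_generator):
        total_count += len(values)
        # transaction=False sends a plain pipelined batch instead of wrapping
        # it in MULTI/EXEC, which avoids the transaction overhead.
        p = r.pipeline(transaction=False)
        for val in values:
            p.set(str(val), payload, nx=True)
        p.execute()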

tools/cache_testing.py (outdated comment, resolved)
romange (Collaborator) left a comment:

🙏

romange merged commit f4457be into dragonflydb:main on Jan 7, 2023.