Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add benchmarking application. No test clients yet. #629

Conversation

Yury-Fridlyand
Copy link
Collaborator

@Yury-Fridlyand Yury-Fridlyand commented Nov 22, 2023

Issue #, if available:
This PR contains the benchmarking application required by #511.
To reduce PR size, there are no testable clients included.
Part 1 of 4.

Description of changes:
Follow up for #494 and #505.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

ClientName.JEDIS, ClientName.JEDIS_ASYNC, ClientName.LETTUCE, ClientName.LETTUCE_ASYNC
// ClientName.LETTUCE,
// ClientName.LETTUCE_ASYNC,
ClientName.BABUSHKA_ASYNC, ClientName.BABUSHKA,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's the default configuration - lets leave it only with ClientName.BABUSHKA_ASYNC. remove all others

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the default we have in other languages is to run all clients if no clients are specified.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 842a222 - changed to all clients

ChosenAction.GET_NON_EXISTING, new ArrayList<>(),
ChosenAction.SET, new ArrayList<>());
int iterationIncrement = iterationCounter.getAndIncrement();
int clientIndex = iterationIncrement % clients.size();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not to set it only once, inside the while loop?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is set outside the loop due to debug logging (off by default) outside the loop - see block on lines 185-189

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may not want to execute the loop. For example, if increment size < tasks.
Or, more realistically, some tasks don't get executed until later in the run, and may not receive an increment index until there's no more work to be done. We see this when running 1000 tasks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@acarbonetto - so, if we don't get into the loop - why do we need to set client_index at all? and regarding tasks not being used at all - how many tasks do you see that actually do the work?
@Yury-Fridlyand debugging the client index doesn't have any meaning as the client index is suppose to change and isn't fixed per task

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got the client_index code from here: https://github.com/aws/babushka/blob/main/benchmarks/node/node_benchmark.ts#L66 so that we uniformly distribute clients to tasks.

regarding tasks not being used at all - how many tasks do you see that actually do the work?

It seems to cap the number of tasks used at just over 100. I think this number is based on the number of processors or threads available to the event loop. So when we run this with 100 tasks, they are all used fairly uniformly. When we run with 1000 tasks, only about 100 of the tasks get any work (distributed somewhat uniformly over those 100 tasks).

? () ->
((AsyncClient) client)
.asyncGet(generateKeySet())
.get(ASYNC_OPERATION_TIMEOUT_SEC, TimeUnit.SECONDS)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrapping async tasks with timeouts usually has significant impact on the performance. We should configure timeout inside the client and skip this timeout wrapping in the benchmark. We compare async clients to sync clients, and the sync isn't being wrapped with a timeout, so the benchmark should minimize these differences.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good. I remove timeout here and will add client timeouts in scope of #632

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 842a222.

@@ -209,6 +204,7 @@ public enum ClientName {
LETTUCE("Lettuce"),
LETTUCE_ASYNC("Lettuce async"),
BABUSHKA_ASYNC("Babushka async"),
BABUSHKA("Babushka"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this client? let's remove clients which we won't benchmark or don't exist (sync lettuce, sync babushka, async jedis). I'm fine with Babushka being called Babushka instead of babushka async.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed. We wanted to compare async Babushka vs async Lettuce and sync Babushka vs sync Lettuce.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in 842a222

ClientName.JEDIS, ClientName.JEDIS_ASYNC, ClientName.LETTUCE, ClientName.LETTUCE_ASYNC
// ClientName.LETTUCE,
// ClientName.LETTUCE_ASYNC,
ClientName.BABUSHKA_ASYNC, ClientName.BABUSHKA,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the default we have in other languages is to run all clients if no clients are specified.

@@ -248,13 +245,15 @@ public RunConfiguration() {
concurrentTasks = new int[] {100, 1000};
clients =
new ClientName[] {
// ClientName.BABUSHKA_ASYNC,
ClientName.JEDIS, ClientName.JEDIS_ASYNC, ClientName.LETTUCE, ClientName.LETTUCE_ASYNC
// ClientName.LETTUCE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why leave the commented out code?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in 842a222

};
host = "localhost";
port = 6379;
clientCount = new int[] {1};
clientCount = new int[] {1, 2};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these aren't the defaults used in other languages benchmarks.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 842a222

@@ -248,13 +245,15 @@ public RunConfiguration() {
concurrentTasks = new int[] {100, 1000};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing 1, 10

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in 842a222

}

private static double stdDeviation(List<Long> latencies, Double avgLatency) {
double stdDeviation =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 842a222

List<Long> latencies = entry.getValue();

if (latencies.size() == 0) {
results.put(action, new LatencyResults(0, 0, 0, 0, 0, 0));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this situation should be impossible. why silently ignore this error?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 842a222

System.out.println("Total hits: " + totalHits);
}

public static void testClientSetGet(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this in a single, huge, very indented function? can it be split into subroutines?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored in 28e5fe0


var clientName = clients.get(0).getName();

System.out.printf(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please log data size too

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 842a222

iterationIncrement + 1, iterations, clientIndex + 1, clientCount);
}

var actions = getActionMap(clients.get(clientIndex), dataSize, async);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you create a new map on every iteration, instead of creating a single map once per client?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clientIndex is changing every loop iteration

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then pass clientIndex or client as an argument the functions in the map, and reuse the map.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in f7dbd3d.

@@ -18,6 +19,10 @@ dependencies {
implementation 'io.lettuce:lettuce-core:6.2.6.RELEASE'
implementation 'commons-cli:commons-cli:1.5.0'
implementation group: 'org.apache.commons', name: 'commons-lang3', version: '3.13.0'
implementation group: 'com.google.code.gson', name: 'gson', version: '2.10.1'

compileOnly 'org.projectlombok:lombok:1.18.30'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure we need lombok just for ConnectionSettings. We can talk about it, or consider removing it later if we don't use lombok throughout.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lombok removed in 842a222


long DEFAULT_TIMEOUT_MILLISECOND = 1000;

Future<T> asyncConnectToRedis(ConnectionSettings connectionSettings);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if asyncConnectToRedis is defined here, why are we extending Client? Seems like we should be reusing connectToClient instead.
note: this means we will have to update and separate our JavaClient into an async and sync client.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in 842a222. We don't test (don't benchmark) connection - so all clients do a sync connect.

return (Math.floor(Math.random() * SIZE_SET_KEYSPACE) + 1) + "";
}

public interface Operation {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can make this private

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No - it is used in IT (#632)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should investigate this and find a better way to test then.

Signed-off-by: Yury-Fridlyand <[email protected]>
case JEDIS_ASYNC:
// run testClientSetGet on JEDIS pseudo-async client
System.out.println("Run JEDIS pseudo-async client");
break;
case LETTUCE:
// run testClientSetGet on LETTUCE sync client
System.out.println("Run LETTUCE sync client");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's an async client

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I deleted the wrong reference. Restored.

Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
try {
actions.get(action).go();
} catch (Exception e) {
// timed out - exception from Future::get
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Operator is equally unspecific, but we can can change that to only ignore some specific handled types. Shachar is right in that we need to report timeout errors from upstream.

}

public interface Operation {
void go() throws Exception;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can be more explicit instead of throwing Exceptions

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 859c9e6

return (Math.floor(Math.random() * SIZE_SET_KEYSPACE) + 1) + "";
}

public interface Operation {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should investigate this and find a better way to test then.

ChosenAction.GET_NON_EXISTING, new ArrayList<>(),
ChosenAction.SET, new ArrayList<>());
int iterationIncrement = iterationCounter.getAndIncrement();
int clientIndex = iterationIncrement % clients.size();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may not want to execute the loop. For example, if increment size < tasks.
Or, more realistically, some tasks don't get executed until later in the run, and may not receive an increment index until there's no more work to be done. We see this when running 1000 tasks.

Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
"one of:"
+ " all|jedis|jedis_async|lettuce|lettuce_async|babushka_async|all_async|all_sync"
+ " [all]")
"one of: all|jedis|jedis_async|lettuce|lettuce_async|"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should remove all irrelevant clients

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d3f707f.

ALL("All"),
ALL_SYNC("All sync"),
ALL_ASYNC("All async");
JEDIS("Jedis"), // async
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

async -> sync?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d3f707f.

Future<String> asyncGet(String key);

default <T> T waitForResult(Future<T> future) {
return waitForResult(future, DEFAULT_TIMEOUT_MILLISECOND);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to wrap the results with timeouts

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a protection against infinite hangs. If something happen, a CI run won't finish and will block running tests for other clients.
Please confirm that I can remove it.


default <T> T waitForResult(Future<T> future, long timeout) {
try {
return future.get(timeout, TimeUnit.MILLISECONDS);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

try {
return future.get(timeout, TimeUnit.MILLISECONDS);
} catch (TimeoutException ignored) {
return null;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't want to ignore errors

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d3f707f.

}
}

System.out.println();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To prettify the printed output - it adds blank lines between test reports to ease reading.

Copy link
Contributor

@shachlanAmazon shachlanAmazon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no serious design issues.
@barshaul leaving to you for final verifications and merging once all remaining comments are fixed.

// run testClientSetGet on JEDIS sync client
System.out.println("Run JEDIS sync client");
// run testClientSetGet on JEDIS pseudo-sync client
System.out.println("Run JEDIS async client");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you log Jedis as async? there's no such thing, and we shouldn't create it.
Babushka & Lettuce are async, Jedis is sync. that's all we need.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, it was uncleaned reference to that client. It is already deleted.
Fixed in d3f707f.

ChosenAction.GET_EXISTING, new ArrayList<>(),
ChosenAction.GET_NON_EXISTING, new ArrayList<>(),
ChosenAction.SET, new ArrayList<>());
int iterationIncrement = iterationCounter.getAndIncrement();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you increment this outside of the loop? wouldn't it make more sense to just increment it at the beginning of the while loop instead of before the loop AND inside the loop?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d3f707f.

int clientIndex = iterationIncrement % clients.size();

if (debugLogging) {
System.out.printf(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notice that logging the client index here is misleading, since the client index changes in the loop.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d3f707f.

var actions = getActionMap(clients.get(clientIndex), dataSize, async);
// operate and calculate tik-tok
Pair<ChosenAction, Long> result = measurePerformance(actions);
if (result != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when would this be null?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! It is never happen now.
Fixed in d3f707f.

@@ -51,15 +52,21 @@ public static Pair<ChosenAction, Long> measurePerformance(Map<ChosenAction, Oper
long before = System.nanoTime();
try {
actions.get(action).go();
} catch (Exception e) {
// timed out - exception from Future::get
} catch (TimeoutException ignored) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why ignore timeouts? if the timeouts are set to a generous value, such as 1 second, the benchmark shouldn't timeout. if it does, it's probably an error.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 859c9e6.

actions.put(
client ->
() -> {
if (async) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: why drop the ternary operator syntax?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Java does not accept this in void expressions. I was also disappointed.
We ignore results of get/set, Operation interface defines a void function.

"one of:"
+ " all|jedis|jedis_async|lettuce|lettuce_async|babushka_async|all_async|all_sync"
+ " [all]")
"one of: all|jedis|jedis_async|lettuce|lettuce_async|"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just all |jedis|lettuce|babushka
(if you also want all sync / all async that's fine, but I don't see the benefit there)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, thank you.
Fixed in d3f707f.

ALL("All"),
ALL_SYNC("All sync"),
ALL_ASYNC("All async");
JEDIS("Jedis"), // async
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sync

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d3f707f.

});
}

public static Map<ChosenAction, Function<Client, Operation>> getActionMap(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: instead of a function returning a function, Operation can just take a Client as an argument, no?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D'oh!
Fixed in d3f707f.

Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
@barshaul
Copy link
Collaborator

barshaul commented Dec 5, 2023

@Yury-Fridlyand Please pull latest changes & resolve conflicts, squash the commits and merge

@Yury-Fridlyand Yury-Fridlyand merged commit b5fc593 into valkey-io:main Dec 6, 2023
2 checks passed
@Yury-Fridlyand Yury-Fridlyand deleted the integ-java-client-milestone-1-benchmarking branch December 6, 2023 03:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
java issues and fixes related to the java client
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants