Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

concurrent-api: save the timestamp of the SingleToFuture.get() calls #3051

Merged

Conversation

bryce-anderson
Copy link
Contributor

Motivation:

It can be very difficult to know if a thread is slow or completely stuck without taking multiple thread dumps and comparing them.

Modifications:

Try to mitigate this by saving the time stamp of the last blocking .get() call. This should then show up in heap dumps and we can at least compare them to see if they've all started blocking at about the same time or if there is a big spread. The latter would suggest they are truly stuck.

Motivation:

It can be very difficult to know if a thread is slow or completely
stuck without taking multiple thread dumps and comparing them.

Modifications:

Try to mitigate this by saving the time stamp of the last blocking
`.get()` call. This should then show up in heap dumps and we can
at least compare them to see if they've all started blocking at
about the same time or if there is a big spread. The latter would
suggest they are truly stuck.
@@ -45,6 +45,9 @@ abstract class SourceToFuture<T> implements Future<T> {
@Nullable
private volatile Object value;

// The timestamp of the last `.get()` call. This is intended to help with debugging stuck threads via heap dumps.
private long lastGetTimestamp;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me if this needs some stronger ordering semantics or not. I imagine there is some sort of store guarantee from latch.await() but I'm not certain.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current approach lgtm, we should not have cases when multiple threads trying to invoke get() at the same future. Most cases call get() once and if they need to "retry" (in case of get timeout), they likely gonna do it from the same thread.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative implementation might be a bit more desirable: #3052. The benefits of that PR are that each call to future.get() is tracked but only if we're using the BlockingUtils, which should narrow down the usage to blocking client and server operations. It technically an allocate another object (which is likely a drop in the ocean) but that object should be trivial to stack allocate so chances are good it doesn't have any allocation overhead.

@@ -45,6 +45,9 @@ abstract class SourceToFuture<T> implements Future<T> {
@Nullable
private volatile Object value;

// The timestamp of the last `.get()` call. This is intended to help with debugging stuck threads via heap dumps.
private long lastGetTimestamp;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current approach lgtm, we should not have cases when multiple threads trying to invoke get() at the same future. Most cases call get() once and if they need to "retry" (in case of get timeout), they likely gonna do it from the same thread.

@@ -91,7 +94,12 @@ public final boolean isDone() {
public final T get() throws InterruptedException, ExecutionException {
final Object value = this.value;
if (value == null) {
latch.await();
lastGetTimestamp = System.currentTimeMillis();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts on System.currentTimeMillis() vs System.nanoTime()? I don't think for debugging precision matters, but I'm hesitant with possibility of System.currentTimeMillis() to return t2 < t1 for two consecutive calls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only using it as a timestamp so I don't think we need monotonic guarantees.

@bryce-anderson bryce-anderson changed the title concurent-api: save the timestamp of the SingleToFuture.get() calls concurrent-api: save the timestamp of the SingleToFuture.get() calls Aug 29, 2024
@bryce-anderson
Copy link
Contributor Author

This should be mutually exclusive with an alternative, #3052.

@bryce-anderson bryce-anderson merged commit 249efc7 into apple:main Sep 4, 2024
11 checks passed
@bryce-anderson bryce-anderson deleted the bl_anderson/save-timestamp-on-block branch September 4, 2024 23:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants