-
Notifications
You must be signed in to change notification settings - Fork 30.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
async_hooks: add currentResource #21313
Conversation
Remove the need for the destroy hook in the basic APM case.
528f321
to
15ffe72
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what you want is a per-Environment
stack (possibly implemented as a JS Array
) where you can push/pop the resources and their async IDs?
In particular, it seems like this would replace the async ID stack in that case, and follow its current semantics wrt unhandled exceptions
@@ -357,6 +360,8 @@ function emitAfterScript(asyncId) { | |||
if (async_hook_fields[kAfter] > 0) | |||
emitAfterNative(asyncId); | |||
|
|||
setCurrentResource(null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have nested emitBefore(…, res1)
– emitBefore(…, res2)
– emitAfter(…)
– emitAfter(…)
, then after the first emitAfter
we want the current resource to be res1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, shouldn't setCurrentResource
be pushCurrentResource
/popCurrentResource
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have nested emitBefore(…, res1) – emitBefore(…, res2) – emitAfter(…) – emitAfter(…), then after the first emitAfter we want the current resource to be res1
I thought the same. However I struggled to execute user code after emitAfter
. Can you add an example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should work:
const async_hooks = require("async_hooks");
const fs = require("fs");
let indent = 0;
function init(asyncId, type, triggerAsyncId, resource) {
const eid = async_hooks.executionAsyncId();
const indentStr = ' '.repeat(indent);
fs.writeSync(1, `${indentStr}${type}(${asyncId}): trigger: ${triggerAsyncId} execution: ${eid}\n`);
}
function before(asyncId) {
const indentStr = ' '.repeat(indent);
fs.writeSync(1, `${indentStr}before: ${asyncId}\n`);
indent += 2;
}
function after(asyncId) {
indent -= 2;
const indentStr = ' '.repeat(indent);
fs.writeSync(1, `${indentStr}after: ${asyncId}\n`);
}
function destroy(asyncId) {
const indentStr = ' '.repeat(indent);
fs.writeSync(1, `${indentStr}destroy: ${asyncId}\n`);
}
async_hooks.createHook({ init, before, after, destroy }).enable();
class MyResource extends async_hooks.AsyncResource {
constructor() {
super("XXX");
}
}
const res = new MyResource();
setImmediate(() => {
const indentStr = ' '.repeat(indent);
fs.writeSync(1, `${indentStr}between before\n`);
res.runInAsyncScope(() => {
const indentStr = ' '.repeat(indent);
fs.writeSync(1, `${indentStr}inner \n`);
});
fs.writeSync(1, `${indentStr}between after\n`);
});
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this only happening with runInAsyncScope? That would be easy to fix without adding a full stack where we push/pop.
Adding a full stack is also a serious refactoring of the async_hooks machinery, something that this PR currently avoids.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mcollina We do have a full stack in place currently, though?
Is this only happening with runInAsyncScope?
It can happen with all mechanisms that trigger async hooks.
void AsyncWrap::EmitBefore(Environment* env, double async_id, | ||
v8::Local<v8::Object> resource) { | ||
v8::Local<v8::Context> context = env->isolate()->GetCurrentContext(); | ||
context->SetEmbedderData(CURRENT_RESOURCE_FIELD, resource); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
async_hooks are currently implemented as an per-Environment, not a per-Isolate or per-Context thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's one thing that we want to change (soonish). cc @hashseed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d assume we want it to be a per-Isolate thing then, not per-Context like this currently is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In any case, I’d prefer not to have this PR change it only halfway. If we do switch towards per-Isolate state (which I agree is a good idea), we should move all of async_hooks over to that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we'll switch towards per-Context for the PromiseHook
as well. The fact that this is Isolate-wide was a design mistake on our end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mcollina Using embedder fields doesn’t seem simpler than storing it on the Environment
… but I guess that depends on Benedikt’s reasoning here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @bmeurer already mentioned, we eventually want to reach a state where we have the hooks per context and not per isolate. Storing on the isolate also opens up potential optimization in the future, e.g. if we expose a JS function that can access embedder slots.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Storing on the context opens up optimization potential 😁
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
erm. yeah. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems like an odd choice on V8’s side, since the µtask queue is also per-Isolate … unless you also want to migrate that?
But even so, I think for Node.js per-Environment would still make more sense?
// If the user's callback failed then the after() hooks will be called at the | ||
// end of _fatalException(). | ||
Emit(env, async_id, AsyncHooks::kAfter, | ||
env->async_hooks_after_function()); | ||
|
||
context->SetEmbedderData(CURRENT_RESOURCE_FIELD, v8::Null(isolate)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there was an uncaught exception that was handled by an uncaughtException
handles, this isn’t reached. For the async ID stack, we currently handle that case by using clear_async_id_stack()
in the C++ code/clearAsyncIdStack()
on the JS side
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make an example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mcollina I think this only becomes observable after we clean up the push/pop thing, I’ll do it then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM. Not so happy with the inconsistent naming.
const common = require('../common.js'); | ||
const { | ||
createHook, | ||
currentResource, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be named executionAsyncResource
for consistency?
@@ -25,6 +25,7 @@ const { | |||
emitBefore, | |||
emitAfter, | |||
emitDestroy, | |||
currentResource |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned above, how about using executionAsyncResource
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! A few nits (in addition to the other feedback by others).
.listen(common.PORT) | ||
.on('listening', function() { | ||
|
||
bench.http({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would prefer a benchmark that doesn't use http
. It'll make it hard to notice incremental improvements to this code because the spread and overhead of http is so high.
@@ -27,6 +27,10 @@ | |||
#include "v8.h" | |||
#include "v8-profiler.h" | |||
|
|||
// TODO(mcollina): should this be moved in a more public space? | |||
// embedders will not be able to use this field |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A general nit for comments: start with a capital and end with .
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the calls to emitBefore() and emitAfter() in class AsyncResource needs to be also adapted to pass the current resource.
It's not directly related to the changes of this PR but I think it should be noted. To my understanding the current implementation of |
@@ -0,0 +1,165 @@ | |||
'use strict'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO some comments explaining what this benchmark is testing will help readers going forward.
|
||
var cr = currentResource(); | ||
if (cr) { | ||
resource[cls] = cr[cls]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this necessarily pertinent to the intent of the benchmark here, but I'll call it out nonetheless: If I'm reading correctly, this results in every resource sharing the same object instance for cls
, and every setCLS
call will just overwrite the state
field. For correctness here, each resource needs a copy (e.g., resource[cls] = {...cr[cls]}
), or you need to maintain a "list to the root" (e.g., resource[cls] = {props:{}, parent: cr[cls]}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per conversation, here's an example of where values would be over-written:
function f0() {
setCLS("a");
setTimeout(function f1() {
setCLS("b");
setTimeout(function f2() {
getCLS(); // expect "b", will get "c"
}, 100);
}, 10);
setTimeout(function f3() => {
getCLS(); // expect "a", will get "b"
setCLS("c");
setTimeout(function f4() {
getCLS(); // get "c"
}, 25);
}, 25);
};
f();
Order of execution is f0
, f1
, f3
, f4
, f2
.
But f0->f1->f2
is one path of the tree, and another indepdent path is f0->f3->f4
. Values visible on one path should not be visible on the path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think branch containment really matters for this test. We just need to verify that an object can be passed through the tree via attachment to the current resource in some form.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think branch containment really matters for this test.
Agreed, but it came up on the diag call & I offered to post example code. :)
|
||
const server = createServer(function(req, res) { | ||
currentResource()[sym] = { state: req.url }; | ||
setTimeout(function() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
function() {
-> () => {
?
// Validate the ids. An id of -1 means it was never set and is visible on the | ||
// call graph. An id < -1 should never happen in any circumstance. Throw | ||
// on user calls because async state should still be recoverable. | ||
validateAsyncId(asyncId, 'asyncId'); | ||
validateAsyncId(triggerAsyncId, 'triggerAsyncId'); | ||
|
||
setCurrentResource(resource); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert here is that currentResource === undefined || currentResource === null
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to red-X this, just want to make sure that we do iron out the issues that are being pointed out here (since this is already getting approvals)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I would certainly advise against this, as there are some fundamental issues.
We make no guarantees that the resource
object is unique for every async operation. Our documentation even says you shouldn't depend on this:
In some cases the resource object is reused for performance reasons, it is thus not safe to use it as a key in a WeakMap or add properties to it.
CLS should properly follow the triggerAsyncId
graph and not the executeAsyncId
graph. Following the executeAsyncId
is almost guaranteed to make you lose context at some point. Certainly, all the APM implementations I have seen, follow the triggerAsyncId
at some point.
There is also the option for following the resolve
graph in case of promises, which again this doesn't really consider.
Not a fundamental issue: The resource
objects MUST be pushed to a stack in emitBefore
for this to work. Doing this could lead to some unexpected garbage collection or at least inefficient escape analysis. I don't really think it is the case, as emitAfter
is guaranteed to empty the stack in each tick, but I think it is worth thinking about.
All In all, I think this is a pretty bad idea. The first point makes it logically invalid. The second point makes me doubt if it will be useful.
As a side note, this is how you benchmark this:
./node benchmark/scatter.js benchmark/async_hooks/current-resource-vs-destroy.js | tee scatter.csv
cat scatter.csv | Rscript benchmark/scatter.R --xaxis method --category type --plot scatter-plot.png
method type rate confidence.interval
async current-resource 5227.464 41.83307
async destroy 4934.202 36.62712
callbacks current-resource 6469.581 59.48585
callbacks destroy 6251.789 65.31779
See also my |
I've looked for this type of object reuse in Node.js but I didn't see any with different asyncId. Could you point me to it?
I both agree and disagree with this sentence. Consider the adapted example from our docs: const async_hooks = require('async_hooks');
async_hooks.createHook({
init(asyncId, type, triggerAsyncId) {
const eid = async_hooks.executionAsyncId();
const r = async_hooks.executionAsyncResource();
fs.writeSync(
1, `${type}(${asyncId}): trigger: ${triggerAsyncId} execution: ${eid} resource: ${r}\n`);
}
}).enable();
require('net').createServer((conn) => {}).listen(8080);
// $ nc localhost 8080
//
// Output
// TCPSERVERWRAP(5): trigger: 1 execution: 1 resource: null
// TickObject(6): trigger: 5 execution: 1 resource: null
// TCPWRAP(7): trigger: 5 execution: 0 resource: null
// TickObject(8): trigger: 7 execution: 7 resource: [object TCP]
// TickObject(9): trigger: 8 execution: 8 resource: [object Object]
// TickObject(10): trigger: 8 execution: 8 resource: [object Object] When the While avoiding Are there any other cases that I'm missing?
Look at https://github.com/elastic/apm-agent-nodejs/blob/a3131d526639c5434628f28b49f5ef2c083cc4e9/lib/instrumentation/async-hooks.js (cc @watson). It's important to note that https://github.com/GoogleCloudPlatform/cloud-trace-nodejs/blob/ebd33d802e5ab758e4317b547bdc846707f9dfd5/src/cls/async-hooks.ts follows the triggerId (cc @ofrobots). @ofrobots will |
I'm pretty sure the HTTPParser is reused. Making the context of that wrong will affect all contexts in a HTTP server and also requests. I don't think it is possible to avoid without making new HTTPParser instances every time. And for sure doing that will be a much greater performance penalty. |
Yep. Reused if there are any available, created if there aren't. |
Here it is: Lines 458 to 473 in 1744205
I think this was an interesting experiment. However, I don't think it's possible to provide a generic enough API for this to be correct and useful. It might be possible to use the concept to implement a crude form of CLS directly in core. This will solve both problems, as we could erase the state after use and pass along the "cls" state together with the |
@mcollina There are so many variations of CLS one can implement that it makes my head spin. I think it would be more valuable to look into optimizing the
|
that would likely be problematic if the |
@mcollina You mean collect the ones for which the handlers ran during |
I don't think that is an issue. |
Regarding reuse of resources: it may happen that one more reuse will be added in #22951 as the callback passed by user is used as resource and this is allowed several times. |
As discussed in nodejs/diagnostics#248 and nodejs#21313, reusing HTTPParser resource is a blocker for landing `require('async_hooks').currentAsyncResource()`, since reusing an async resource interanlly would make it infeasible to use them reliably in WeakMaps. Two suggestions came up: have a wrapper around the HTTPParser which we would use as the async resource, or stop reusing HTTPParser in our HTTP server/client code. This commit implements the latter, since we have a better GC now and reusing HTTPParser might make sense anymore. This also will avoid one extra JS->C++ call in some cases, which should improve performance for these cases.
Change resource being used, previously HTTParser was being reused. We are now using IncomingMessage and ClientRequest objects. The goal here is to make the async resource unique for each async operatio Refs: nodejs#24330 Refs: nodejs/diagnostics#248 Refs: nodejs#21313 Co-authored-by: Matheus Marchini <[email protected]>
Change resource being used, previously HTTParser was being reused. We are now using IncomingMessage and ClientRequest objects. The goal here is to make the async resource unique for each async operatio Refs: #24330 Refs: nodejs/diagnostics#248 Refs: #21313 Co-authored-by: Matheus Marchini <[email protected]> PR-URL: #25094 Reviewed-By: Matteo Collina <[email protected]> Reviewed-By: Benedikt Meurer <[email protected]> Reviewed-By: Anna Henningsen <[email protected]>
As discussed in nodejs/diagnostics#248, nodejs#21313 and https://docs.google.com/document/d/1g8OrG5lMIUhRn1zbkutgY83MiTSMx-0NHDs8Bf-nXxM/preview reusing the resource object is a blocker for landing a resource based async hooks API and get rid of the promise destroy hook. This PR ensures that HttpAgent uses the a new resource object in case the socket handle gets reused.
As discussed in nodejs/diagnostics#248, #21313 and https://docs.google.com/document/d/1g8OrG5lMIUhRn1zbkutgY83MiTSMx-0NHDs8Bf-nXxM/preview reusing the resource object is a blocker for landing a resource based async hooks API and get rid of the promise destroy hook. This PR ensures that HttpAgent uses the a new resource object in case the socket handle gets reused. PR-URL: #27581 Reviewed-By: Matteo Collina <[email protected]> Reviewed-By: Rich Trott <[email protected]>
As discussed in nodejs/diagnostics#248, #21313 and https://docs.google.com/document/d/1g8OrG5lMIUhRn1zbkutgY83MiTSMx-0NHDs8Bf-nXxM/preview reusing the resource object is a blocker for landing a resource based async hooks API and get rid of the promise destroy hook. This PR ensures that HttpAgent uses the a new resource object in case the socket handle gets reused. PR-URL: #27581 Reviewed-By: Matteo Collina <[email protected]> Reviewed-By: Rich Trott <[email protected]>
Remove the need for the destroy hook in the basic APM case.
APM and continuation local storage users can use only the init hook.
This lead to a 5% improvement in throughput. in the attached benchmark:
I've also measured this against a simple Hapi v17 server and it leads to a 15% improvement.
cc @nodejs/diagnostics @bmeurer @MayaLekova
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passes