Explicit keying between function code object and function call event #23

lachrist · 2021-06-12T19:28:48Z

How comes the location field of function code objects is not required but only recommended? I though it was used to match against function call events? While the path might be recovered from the position of the function code object inside classMap, its line number is definitely lost.

The text was updated successfully, but these errors were encountered:

kgilpin · 2021-06-12T22:01:15Z

Sometimes the location is not available - for example for native code functions in Ruby. I’m not sure what you mean by line number being lost.

…

On Sat, Jun 12, 2021 at 3:29 PM Laurent Christophe ***@***.***> wrote: How comes the location field of function code objects is not required but only recommended? I though it was used to match against function call events? While the path might be recovered from the position of the function code object inside classMap, its line number is definitely lost. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#23>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAAVC6YU7EKOZHXCMZQ33SLTSOYPXANCNFSM46TAJ55Q> .

lachrist · 2021-06-13T06:00:18Z

I'm just trying to picture an algorithm capable of linking a function code object without the location field to a function call event.
This might be a good place to scratch an itch I have since the beginning and propose a more explicit link.
I see two options.

The first options consists in adding an id field to function code object which should be unique to the entire classMap.
Advantages:

Make explicit the link between functions code objects and function call events:
With the current spec there exists multiple reasonable ways to key function code objects.
With this change, the key is made explicit.
It will help produce (clients) and consume (viewers) appmaps correctly.
I'm sure this would also help newcomers better understand the appmap spec.
Avoid redundancy and reduce appmap size:
The fields method_id, defined_class, path and lineno are no longer required.
If present, they could be used to verify that the function_id is correct.
Simpler detection of function call event:
Currently, the viewer must answer (one of) the following questions to decide wether an event is a function call event.
Does it have some other event specific fields (sql_query, http_client_request, http_server_response)?
Does it have function call specific fields (method_id, defined_class, path, lineno)?
After this change, the view would simply have to query for the presence of a single field: function_id.
The ability the make explicit misses:
Currently, the viewer creates a function code object if it could not find one that matches the infos provided by the function call event.
This kind of silent recovery made it harder for me to solve my flawed appmaps.
By providing null to function_id we could make explicit that we do not have a corresponding function call object.
In that case, the older function call specific fields (method_id, defined_class, path, lineno) would be used to generate one.
I must say that I'm not sure why this silent recovery exists in the first place.

Example:

{
  "version": "1.7.0",
  "metadata": {},
  "classMap": [{
    "type": "package",
    "name": "main.js",
    "children": [{
      "type": "class",
      "name": "C",
      "children": [{
        "type": "function",
        "name": "f",
        "id": 123
      }]
    }]
  }],
  "events": [{
    "event": "call",
    "id": 456,
    "thread_id": 123456789,
    "function_id": 123
  }, {
    "event": "return",
    "id": 789,
    "thread_id": 123456789,
    "parent_id": 456
  }]
}

The second options is similar but would use a JSON pointer as unique id.
This eliminates the need for an id field in function code objects and is probably even more explicit.
However creating a hash map for function code objects is slightly more complex as it must keep track of the JSON path while it is visiting the classMap.
Example:

{
  "version": "1.7.0",
  "metadata": {},
  "classMap": [{
    "type": "package",
    "name": "main.js",
    "children": [{
      "type": "class",
      "name": "C",
      "children": [{
        "type": "function",
        "name": "f",
        "id": 123
      }]
    }]
  }],
  "events": [{
    "event": "call",
    "id": 456,
    "thread_id": 123456789,
    "function_id": "/classMap/0/children/0/children/0"
  }, {
    "event": "return",
    "id": 789,
    "thread_id": 123456789,
    "parent_id": 456
  }]
}

kgilpin · 2021-06-13T13:33:36Z

In the Event handler there’s no available data that could be used as the event id. The event handler knows the location and the method id, that’s basically all. That’s why it’s used as the id.

…

On Sun, Jun 13, 2021 at 2:00 AM Laurent Christophe ***@***.***> wrote: I'm just trying to picture an algorithm capable of linking a function code object without the location field to a function call event. This might be a good place to scratch an itch I have since the beginning and propose a more explicit link. I see two options. The first options consists in adding an id field to function code object which should be unique to the entire classMap. Advantages: - *Make explicit the link between functions code objects and function call events:* With the current spec there exists multiple reasonable ways to key function code objects. With this change, the key is made explicit. It will help produce (clients) and consume (viewers) appmaps correctly. I'm sure this would also help newcomers better understand the appmap spec. - *Avoid redundancy and reduce appmap size:* The fields method_id, defined_class, path and lineno are no longer required. If present, they could be used to verify that the function_id is correct. - *Simpler detection of function call event:* Currently, the viewer must answer (one of) the following questions to decide wether an event is a function call event. Does it have some other event specific fields (sql_query, http_client_request, http_server_response)? Does it have function call specific fields (method_id, defined_class, path, lineno)? After this change, the view would simply have to query for the presence of a single field: function_id. - *The ability the make explicit misses:* Currently, the viewer creates a function code object if it could not find one that matches the infos provided by the function call event. This kind of silent recovery made it harder for me to solve my flawed appmaps. By providing null to function_id we could make explicit that we do not have a corresponding function call object. In that case, the older function call specific fields (method_id, defined_class, path, lineno) would be used to generate one. I must say that I'm not sure why this silent recovery exists in the first place. Example: { "version": "1.7.0", "metadata": {}, "classMap": [{ "type": "package", "name": "main.js", "children": [{ "type": "class", "name": "C", "children": [{ "type": "function", "name": "f", "id": 123 }] }] }], "events": [{ "event": "call", "id": 456, "thread_id": 123456789, "function_id": 123 }, { "event": "return", "id": 789, "thread_id": 123456789, "parent_id": 456 }] } The second options is similar but would use a JSON pointer <https://datatracker.ietf.org/doc/html/rfc6901> as unique id. This eliminates the need for an id field in function code objects and is probably even more explicit. However creating a hash map for function code objects is slightly more complex as it must keep track of the JSON path while it is visiting the classMap. Example: { "version": "1.7.0", "metadata": {}, "classMap": [{ "type": "package", "name": "main.js", "children": [{ "type": "class", "name": "C", "children": [{ "type": "function", "name": "f", "id": 123 }] }] }], "events": [{ "event": "call", "id": 456, "thread_id": 123456789, "function_id": "/classMap/0/children/0/children/0" }, { "event": "return", "id": 789, "thread_id": 123456789, "parent_id": 456 }] } — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#23 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAAVC6375OMTC4EFK2MRSQLTSRCP5ANCNFSM46TAJ55Q> .

kgilpin · 2021-06-13T13:35:20Z

Maybe the method object id … not sure

…

On Sun, Jun 13, 2021 at 9:33 AM Kevin Gilpin ***@***.***> wrote: In the Event handler there’s no available data that could be used as the event id. The event handler knows the location and the method id, that’s basically all. That’s why it’s used as the id. On Sun, Jun 13, 2021 at 2:00 AM Laurent Christophe < ***@***.***> wrote: > I'm just trying to picture an algorithm capable of linking a function > code object without the location field to a function call event. > This might be a good place to scratch an itch I have since the beginning > and propose a more explicit link. > I see two options. > > The first options consists in adding an id field to function code object > which should be unique to the entire classMap. > Advantages: > > - *Make explicit the link between functions code objects and function > call events:* > With the current spec there exists multiple reasonable ways to key > function code objects. > With this change, the key is made explicit. > It will help produce (clients) and consume (viewers) appmaps > correctly. > I'm sure this would also help newcomers better understand the appmap > spec. > - *Avoid redundancy and reduce appmap size:* > The fields method_id, defined_class, path and lineno are no longer > required. > If present, they could be used to verify that the function_id is > correct. > - *Simpler detection of function call event:* > Currently, the viewer must answer (one of) the following questions to > decide wether an event is a function call event. > Does it have some other event specific fields (sql_query, > http_client_request, http_server_response)? > Does it have function call specific fields (method_id, defined_class, > path, lineno)? > After this change, the view would simply have to query for the > presence of a single field: function_id. > - *The ability the make explicit misses:* > Currently, the viewer creates a function code object if it could not > find one that matches the infos provided by the function call event. > This kind of silent recovery made it harder for me to solve my flawed > appmaps. > By providing null to function_id we could make explicit that we do > not have a corresponding function call object. > In that case, the older function call specific fields (method_id, > defined_class, path, lineno) would be used to generate one. > I must say that I'm not sure why this silent recovery exists in the > first place. > > Example: > > { > "version": "1.7.0", > "metadata": {}, > "classMap": [{ > "type": "package", > "name": "main.js", > "children": [{ > "type": "class", > "name": "C", > "children": [{ > "type": "function", > "name": "f", > "id": 123 > }] > }] > }], > "events": [{ > "event": "call", > "id": 456, > "thread_id": 123456789, > "function_id": 123 > }, { > "event": "return", > "id": 789, > "thread_id": 123456789, > "parent_id": 456 > }] > } > > The second options is similar but would use a JSON pointer > <https://datatracker.ietf.org/doc/html/rfc6901> as unique id. > This eliminates the need for an id field in function code objects and is > probably even more explicit. > However creating a hash map for function code objects is slightly more > complex as it must keep track of the JSON path while it is visiting the > classMap. > Example: > > { > "version": "1.7.0", > "metadata": {}, > "classMap": [{ > "type": "package", > "name": "main.js", > "children": [{ > "type": "class", > "name": "C", > "children": [{ > "type": "function", > "name": "f", > "id": 123 > }] > }] > }], > "events": [{ > "event": "call", > "id": 456, > "thread_id": 123456789, > "function_id": "/classMap/0/children/0/children/0" > }, { > "event": "return", > "id": 789, > "thread_id": 123456789, > "parent_id": 456 > }] > } > > — > You are receiving this because you commented. > > > Reply to this email directly, view it on GitHub > <#23 (comment)>, > or unsubscribe > <https://github.com/notifications/unsubscribe-auth/AAAVC6375OMTC4EFK2MRSQLTSRCP5ANCNFSM46TAJ55Q> > . >

lachrist · 2021-06-13T15:57:32Z

So I'm not sure how the other clients work.
But for JavaScript, I rewrite every single functions to insert hook at the entrance (call) of the function and at its exit (return / throw).
As I'm performing this instrumentation, I register a function code object to the classMap.
So as I'm inserting the entrance hook, I can easily fetch the unique key that have been talking about and insert it as a literal.
An instrumented function would then look like this:

function f (x) {
  const $id = $.eventID += 1;
  $.record({
    event: "call",
    id: $id,
    function_id: 123 // Literal computed during instrumentation
  });
  try {
    ... // Do the original operations here
  } finally {
    $.record({
      event: "return",
      id: $.eventID += 1,
      parent_id: $id
    });
  }
}

If I understand you correctly, the other clients rely on builtin reflection mechanisms after the code has been loaded?
Either the reflection is invoked when functions are created and this is similar to my case.
Either the reflection is invoked when functions are actually applied.
In that case, I would hoop that you have access to the reference of the function being called (at least in the JS proxy API it is the case).
If you have that reference, maintaining a mapping from reference to unique id should do the trick right?

Now I think that I better understand the design decision of keying based on the combination of path, lineno, defined_class, method_id.
Nonetheless, I would argue that the explicit keying that I'm championing offers a more language / instrumentation agnostic option.

Feel free to close the issue if you think that this should not be a priority discussion right now.

kgilpin · 2021-06-13T18:38:33Z

Yes I think you’re right. Let’s keep this open.

…

On Sun, Jun 13, 2021 at 11:57 AM Laurent Christophe < ***@***.***> wrote: So I'm not sure how the other clients work. But for JavaScript, I rewrite every single functions to insert hook at the entrance (call) of the function and at its exit (return / throw). As I'm performing this instrumentation, I register a function code object to the classMap. So as I'm inserting the entrance hook, I can easily fetch the unique key that have been talking about and insert it as a literal. An instrumented function would then look like this: function f (x) { const $id = $.eventID += 1; $.record({ event: "call", id: $id, function_id: 123 // Literal computed during instrumentation }); try { ... // Do the original operations here } finally { $.record({ event: "return", id: $.eventID += 1, parent_id: $id }); }} If I understand you correctly, the other clients rely on builtin reflection mechanisms after the code has been loaded? Either the reflection is invoked when functions are created and this is similar to my case. Either the reflection is invoked when functions are actually applied. In that case, I would hoop that you have access to the reference of the function being called (at least in the JS proxy API it is the case). If you have that reference, maintaining a mapping from reference to unique id should do the trick right? Now I think that I better understand the design decision of keying based on the combination of path, lineno, defined_class, method_id. Nonetheless, I would argue that the explicit keying that I'm championing offers a more language / instrumentation agnostic option. Feel free to close the issue if you think that this should not be a priority discussion right now. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#23 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAAVC63NLGYIKX7IUGRVOY3TSTIPPANCNFSM46TAJ55Q> .

lachrist · 2021-06-14T09:14:04Z

I actually got a real world example of clash in the juice shop.

Output of the validation tool:

Invalid appmap:
detected a function code object clash in the classMap: {
  type: 'function',
  source: null,
  labels: [ [length]: 0 ],
  comment: null,
  name: '()',
  location: 'build/lib/startup/restoreOverwrittenFilesWithOriginals.js:14',
  static: false
}

Faulty line:

// build/lib/startup/restoreOverwrittenFilesWithOriginals.js:14
const exists = (path) => access(path).then(() => true).catch(() => false);

Faulty appmap segment:

[{
  "type": "class",
  "name": "arrow#508",
  "bound": false,
  "children": [
    {
      "type": "function",
      "source": null,
      "labels": [],
      "comment": null,
      "name": "()",
      "location": "build/lib/startup/restoreOverwrittenFilesWithOriginals.js:14",
      "static": false
    }
  ]
},
{
  "type": "class",
  "name": "arrow#509",
  "bound": false,
  "children": [
    {
      "type": "function",
      "source": null,
      "labels": [],
      "comment": null,
      "name": "()",
      "location": "build/lib/startup/restoreOverwrittenFilesWithOriginals.js:14",
      "static": false
    }
  ]
}]

As soon as multiple anonymous functions are created on the same line, we have a function code object clash.
Even if I provide unique name for anonymous code objects, we agreed to put those on the "fake" parent class.
And as far as I understand, the parent class is not used for keying.
N.B.: Enabling function code entity to have children would remove the need for the "fake" parent class and solve the issue.

lachrist changed the title ~~Location of function code object only recommended?~~ Explicit keying between function code object and function call event Jun 14, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Explicit keying between function code object and function call event #23

Explicit keying between function code object and function call event #23

lachrist commented Jun 12, 2021

kgilpin commented Jun 12, 2021 via email

lachrist commented Jun 13, 2021

kgilpin commented Jun 13, 2021 via email

kgilpin commented Jun 13, 2021 via email

lachrist commented Jun 13, 2021

kgilpin commented Jun 13, 2021 via email

lachrist commented Jun 14, 2021

Explicit keying between function code object and function call event #23

Explicit keying between function code object and function call event #23

Comments

lachrist commented Jun 12, 2021

kgilpin commented Jun 12, 2021 via email

lachrist commented Jun 13, 2021

kgilpin commented Jun 13, 2021 via email

kgilpin commented Jun 13, 2021 via email

lachrist commented Jun 13, 2021

kgilpin commented Jun 13, 2021 via email

lachrist commented Jun 14, 2021