Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicit keying between function code object and function call event #23

Open
lachrist opened this issue Jun 12, 2021 · 7 comments
Open

Comments

@lachrist
Copy link
Contributor

How comes the location field of function code objects is not required but only recommended? I though it was used to match against function call events? While the path might be recovered from the position of the function code object inside classMap, its line number is definitely lost.

@kgilpin
Copy link
Contributor

kgilpin commented Jun 12, 2021 via email

@lachrist
Copy link
Contributor Author

I'm just trying to picture an algorithm capable of linking a function code object without the location field to a function call event.
This might be a good place to scratch an itch I have since the beginning and propose a more explicit link.
I see two options.

The first options consists in adding an id field to function code object which should be unique to the entire classMap.
Advantages:

  • Make explicit the link between functions code objects and function call events:
    With the current spec there exists multiple reasonable ways to key function code objects.
    With this change, the key is made explicit.
    It will help produce (clients) and consume (viewers) appmaps correctly.
    I'm sure this would also help newcomers better understand the appmap spec.
  • Avoid redundancy and reduce appmap size:
    The fields method_id, defined_class, path and lineno are no longer required.
    If present, they could be used to verify that the function_id is correct.
  • Simpler detection of function call event:
    Currently, the viewer must answer (one of) the following questions to decide wether an event is a function call event.
    Does it have some other event specific fields (sql_query, http_client_request, http_server_response)?
    Does it have function call specific fields (method_id, defined_class, path, lineno)?
    After this change, the view would simply have to query for the presence of a single field: function_id.
  • The ability the make explicit misses:
    Currently, the viewer creates a function code object if it could not find one that matches the infos provided by the function call event.
    This kind of silent recovery made it harder for me to solve my flawed appmaps.
    By providing null to function_id we could make explicit that we do not have a corresponding function call object.
    In that case, the older function call specific fields (method_id, defined_class, path, lineno) would be used to generate one.
    I must say that I'm not sure why this silent recovery exists in the first place.

Example:

{
  "version": "1.7.0",
  "metadata": {},
  "classMap": [{
    "type": "package",
    "name": "main.js",
    "children": [{
      "type": "class",
      "name": "C",
      "children": [{
        "type": "function",
        "name": "f",
        "id": 123
      }]
    }]
  }],
  "events": [{
    "event": "call",
    "id": 456,
    "thread_id": 123456789,
    "function_id": 123
  }, {
    "event": "return",
    "id": 789,
    "thread_id": 123456789,
    "parent_id": 456
  }]
}

The second options is similar but would use a JSON pointer as unique id.
This eliminates the need for an id field in function code objects and is probably even more explicit.
However creating a hash map for function code objects is slightly more complex as it must keep track of the JSON path while it is visiting the classMap.
Example:

{
  "version": "1.7.0",
  "metadata": {},
  "classMap": [{
    "type": "package",
    "name": "main.js",
    "children": [{
      "type": "class",
      "name": "C",
      "children": [{
        "type": "function",
        "name": "f",
        "id": 123
      }]
    }]
  }],
  "events": [{
    "event": "call",
    "id": 456,
    "thread_id": 123456789,
    "function_id": "/classMap/0/children/0/children/0"
  }, {
    "event": "return",
    "id": 789,
    "thread_id": 123456789,
    "parent_id": 456
  }]
}

@kgilpin
Copy link
Contributor

kgilpin commented Jun 13, 2021 via email

@kgilpin
Copy link
Contributor

kgilpin commented Jun 13, 2021 via email

@lachrist
Copy link
Contributor Author

So I'm not sure how the other clients work.
But for JavaScript, I rewrite every single functions to insert hook at the entrance (call) of the function and at its exit (return / throw).
As I'm performing this instrumentation, I register a function code object to the classMap.
So as I'm inserting the entrance hook, I can easily fetch the unique key that have been talking about and insert it as a literal.
An instrumented function would then look like this:

function f (x) {
  const $id = $.eventID += 1;
  $.record({
    event: "call",
    id: $id,
    function_id: 123 // Literal computed during instrumentation
  });
  try {
    ... // Do the original operations here
  } finally {
    $.record({
      event: "return",
      id: $.eventID += 1,
      parent_id: $id
    });
  }
}

If I understand you correctly, the other clients rely on builtin reflection mechanisms after the code has been loaded?
Either the reflection is invoked when functions are created and this is similar to my case.
Either the reflection is invoked when functions are actually applied.
In that case, I would hoop that you have access to the reference of the function being called (at least in the JS proxy API it is the case).
If you have that reference, maintaining a mapping from reference to unique id should do the trick right?

Now I think that I better understand the design decision of keying based on the combination of path, lineno, defined_class, method_id.
Nonetheless, I would argue that the explicit keying that I'm championing offers a more language / instrumentation agnostic option.

Feel free to close the issue if you think that this should not be a priority discussion right now.

@kgilpin
Copy link
Contributor

kgilpin commented Jun 13, 2021 via email

@lachrist lachrist changed the title Location of function code object only recommended? Explicit keying between function code object and function call event Jun 14, 2021
@lachrist
Copy link
Contributor Author

I actually got a real world example of clash in the juice shop.

Output of the validation tool:

Invalid appmap:
detected a function code object clash in the classMap: {
  type: 'function',
  source: null,
  labels: [ [length]: 0 ],
  comment: null,
  name: '()',
  location: 'build/lib/startup/restoreOverwrittenFilesWithOriginals.js:14',
  static: false
}

Faulty line:

// build/lib/startup/restoreOverwrittenFilesWithOriginals.js:14
const exists = (path) => access(path).then(() => true).catch(() => false);

Faulty appmap segment:

[{
  "type": "class",
  "name": "arrow#508",
  "bound": false,
  "children": [
    {
      "type": "function",
      "source": null,
      "labels": [],
      "comment": null,
      "name": "()",
      "location": "build/lib/startup/restoreOverwrittenFilesWithOriginals.js:14",
      "static": false
    }
  ]
},
{
  "type": "class",
  "name": "arrow#509",
  "bound": false,
  "children": [
    {
      "type": "function",
      "source": null,
      "labels": [],
      "comment": null,
      "name": "()",
      "location": "build/lib/startup/restoreOverwrittenFilesWithOriginals.js:14",
      "static": false
    }
  ]
}]

As soon as multiple anonymous functions are created on the same line, we have a function code object clash.
Even if I provide unique name for anonymous code objects, we agreed to put those on the "fake" parent class.
And as far as I understand, the parent class is not used for keying.
N.B.: Enabling function code entity to have children would remove the need for the "fake" parent class and solve the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants