Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plugin API for tracing #342

Merged
merged 105 commits into from
Feb 27, 2024
Merged

Plugin API for tracing #342

merged 105 commits into from
Feb 27, 2024

Conversation

petervdonovan
Copy link
Contributor

@petervdonovan petervdonovan commented Jan 30, 2024

This PR turns tracing into a separately-compiled library instead of a part of the core runtime.

  • pass all tests on platforms other than macOS
  • get tracing tools working again
  • separate APIs provided by core and APIs provided by plugins
  • check for consistency in compile arguments at initialization time
  • prepare RFC
  • define and implement platform API exposed to plugins
  • define and implement logging API exposed to plugins
  • allow plugins to be provided via lingo (EDIT: This will be a separate effort and should not block this PR)
    • [ ] improve lingo test framework
    • [ ] write a failing test
    • [ ] pass the failing test

Update: I have decided that there is no reason not to review the internal RFC for the tracing plugin before reviewing this PR. The internal RFC is short and probably quick to read and provides explanation about what is going on in this PR.

@lhstrh lhstrh changed the title Create a plugin API for tracing Plugin API for tracing Jan 30, 2024
I want the metadata available as early as possible. The start time we
cannot do much about, but with fed id, we can know as soon as we want.
@petervdonovan petervdonovan force-pushed the tracing-refactoring branch 3 times, most recently from 78f300c to ef3077a Compare January 31, 2024 21:32
@petervdonovan
Copy link
Contributor Author

I think that this PR is once again ready for review. Changes since it was last reviewed are in the spirit of Christian's comment above. Because the platform directory had already had to be librarified earlier in the development of this branch, and because the logging API was already separate, I found it to be easiest to put those in their own directories already. I also decided to set a precedent of keeping separate api and impl directories, with different CMakeLists, to emphasize that they were different CMake targets; this is something that @cmnrd and I discussed on Tuesday and seemed to agree upon.

Of course this directory structure does not need to be final. We can keep iterating on it later, ideally using an RFC, when this issue rises back to the top of our priority list.

@petervdonovan
Copy link
Contributor Author

Here's a positive unintended consequence of the refactoring that Christian recommended: it seems like VS Code actually understands our code now because it shows syntax errors, including wrong field names, and documentation on hover.

image

I guess this is what we get for doing things the "right" way.

This is compile-time facts that do not change as the program runs, so
mutating it will only reduce debuggability.
The purpose of the prefix is to prevent name collisions.
The idea was that in order to be thread safe we needed to use a thread
id in order to access memory that is not shared with other threads. If
there are multiple environments than what was being done before this
commit had a race condition.
The no-arguments way of calling fedsd makes an assumption about the name
of the RTI trace file. This change makes that assumption correct again.
@petervdonovan petervdonovan marked this pull request as draft February 25, 2024 23:53
This is only a problem when the user calls LF API functions from a
thread that was not created by the runtime.
@petervdonovan petervdonovan marked this pull request as ready for review February 26, 2024 04:24
Copy link
Contributor

@cmnrd cmnrd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! This is quite a significant refactoring, and I am amazed and glad to see that this has an immediate positive effect on the tooling.

I left a bunch of comments. Most of them are about known issues that I have documented in #360 or that we should still document as issues (but I am lacking the details). I also made a few small requests for changes, which I think should be addressable rather quickly. If not, feel free to ignore them. None of them are crucial.

Personally, I am not a fan of the build scripts that this PR adds, but I'll leave it to the main devs of this repo to decide. I think it is more important to try to educate people on how to interact with cmake than to provide simple scripts. In the case where you have to frequently switch between cmake configurations (like you probably did), I would recommend creating multiple build directories with descriptive names, initializing them once with the configuration that you want. Then you just cd into the build directory with the intended configuration and run make or cmake --build to (re)compile this configuration. This should work flawlessly, even if you update cmake scripts in your changes.

logging/api/logging_macros.h Show resolved Hide resolved
core/reactor.c Show resolved Hide resolved
core/reactor_common.c Show resolved Hide resolved
core/trace.c Outdated Show resolved Hide resolved
logging/api/logging.h Show resolved Hide resolved
low_level_platform/README.md Outdated Show resolved Hide resolved
low_level_platform/api/low_level_platform.h Outdated Show resolved Hide resolved
tag/api/tag.h Outdated Show resolved Hide resolved
test/Tests.cmake Show resolved Hide resolved
Copy link
Contributor

@cmnrd cmnrd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like commenting does not override a prior "Request changes", so here is my approval.

@petervdonovan
Copy link
Contributor Author

Looks like this PR has 2 approvals, so I don't think more people are obligated to give another in-depth review.

I'm going to leave this open for at least 24 hours in case anyone else has comments, and also to give time for the companion PR lf-lang/lingua-franca#2192 to get approved, and then I'll merge this.

@erlingrj
Copy link
Collaborator

erlingrj commented Feb 26, 2024

Great work Peter! I am curious to understand the need for the two-level platform API. This should not block the merging, I am just interested to hear your thoughts.

It seems that the "higher-level" platform API does not really provide any functionality, and just calls the lower-level platform API. Is the problem you are solving the fact that we have these typedef pthread_t lf_thread_t etc, which are platform-dependent? And you don't really want the tracing plugin to care about whether we are using C11 or pthreads?

Another solution,which I think is more elegant, could be to define them in low_level_platform.h as:

typedef struct lf_thread_t {
  int id;
  void * priv
}

And so on,

Then each particular platform implementation can use the *priv field for their private data, which would be dynamically created at creation. This would make it so that we cannot just define lf_mutex_t and be done, but we always need to create it calling e.g. lf_mutex_create() which returns a lf_mutx_t * with allocated memory in the priv field, the priv field would then contain the pthread_mutex_t. A nice thing is that the pthread-details wont "leak" out of the platform source files. Currently we include pthread stuff in the platform header files and it is thus in scope for all sorts of files, even user-written reaction bodies which needs to include platform.h

This is the sort of inheritance they do in the Linux device driver implementations which I have just started learning about.

@petervdonovan
Copy link
Contributor Author

petervdonovan commented Feb 26, 2024

It seems that the "higher-level" platform API does not really provide any functionality, and just calls the lower-level platform API.

Correct. This is how I intended the "higher-level" platform API to remain for the foreseeable future -- just a wrapper around the lower-level platform API.

Is the problem you are solving the fact that we have these typedef pthread_t lf_thread_t etc, which are platform-dependent? And you don't really want the tracing plugin to care about whether we are using C11 or pthreads?

Yes, exactly.

A nice thing is that the pthread-details wont "leak" out of the platform source files. Currently we include pthread stuff in the platform header files and it is thus in scope for all sorts of files, even user-written reaction bodies which needs to include platform.h

Really I was fixated on getting an interface where you know what it will compile to, regardless of compilation options, without having to understand any preprocessor code. This "leaking" is not exactly the problem that I was trying to solve.

However, now that you mention it I think you're right, this is an important added benefit; furthermore, it is solved both by what you describe and by what I have done with low_level_platform.h and with platform.h. We have already had discussions and taken pains to contain the damage caused by windows.h, so this is an issue that I think we already collectively understand and care about, and yes, the same principle applies to containing the areas where pthreads stuff is in scope, too.

Then each particular platform implementation can use the *priv field for their private data, which would be dynamically created at creation. This would make it so that we cannot just define lf_mutex_t and be done, but we always need to create it calling e.g. lf_mutex_create() which returns a lf_mutx_t * with allocated memory in the priv field, the priv field would then contain the pthread_mutex_t.

I see this proposal as being very similar to what I have done with low_level_platform.h and platform.h. If I understand, the significant difference is that you seem to be proposing that:

  1. the mutex and thread representations that use void pointers to abstract the platform differences become the only representations that are made visible to the whole code base. So outside of the platform abstraction, none of the code would be able to see the differences, not even using sizeof, unless they dereference the void pointer and make assumptions about what is in the memory that it points to (which would be obviously wrong).
  2. we develop a habit of combining these void pointers with other metadata, like thread IDs, to improve code reuse.

Item 2 is obviously a generally good idea to do, in those cases where it makes sense. I would argue that item 2 is compatible with the route that I have already taken and is not dependent on item 1; we can always create structs in low_level_platform that have both fields whose size is platform-dependent and fields that we define that are not platform-dependent, and those structs will still overall be like the ones we use already and would have the disadvantages that are addressed by item 1. So item 2 seems like no big deal.

Item 1, however, does represent an important tradeoff. Last Monday I discussed this with Edward and Christian and I promised that my attempts to create this void* kind of modularity (where some interfaces don't depend on the preprocessor) would never force us to accept a negative impact on memory footprint at runtime. I promised that there would be an escape hatch.

The tradeoff is as follows. If the preprocessor-independent void* approach is the only approach, then the advantage is that the whole code base is brought up into a different and stronger kind of modularity that could force the simplification of a bunch of stuff, as well as making it easier for us to mix and match useful functionalities between the different runtimes, which could potentially break down siloing among LF contributors. In short, it would offer all the advantages of supporting both approaches (as I have done with low_level_platform and platform), but it would offer them in a way that is even better and cleaner and more strongly enforced. But the disadvantage is that when people are forced to use void*, then that can increase the number of allocations that they have to make, and it can increase memory fragmentation -- at least in principle.

Are these disadvantages practically significant? I haven't measured the memory footprint or the performance differences, but I also don't want to have to make those measurements, and I don't want to have to engage in a time-consuming discussion about those measurements. If we set the precedent that anything that could conceivably have overhead is provided with some sort of escape hatch that isn't constrained to have the void* kind of modularity, then sure, we get a little more mess in our code base, but we also liberate ourselves from having to worry and discuss too much about this issue. The worry and discussion can be contained to the specific cases where we want to actually implement a specific plugin, and we need to decide which platform interface to use.

@lhstrh lhstrh merged commit 4c4d4a1 into main Feb 27, 2024
14 of 29 checks passed
@lhstrh lhstrh deleted the tracing-refactoring branch February 27, 2024 05:48
@lhstrh lhstrh added the feature New feature label Apr 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants