Plugin API for tracing #342

petervdonovan · 2024-01-30T03:28:59Z

This PR turns tracing into a separately-compiled library instead of a part of the core runtime.

pass all tests on platforms other than macOS
get tracing tools working again
separate APIs provided by core and APIs provided by plugins
check for consistency in compile arguments at initialization time
prepare RFC
define and implement platform API exposed to plugins
define and implement logging API exposed to plugins
allow plugins to be provided via lingo (EDIT: This will be a separate effort and should not block this PR)
- ~~[ ] improve lingo test framework~~
- ~~[ ] write a failing test~~
- ~~[ ] pass the failing test~~

Update: I have decided that there is no reason not to review the internal RFC for the tracing plugin before reviewing this PR. The internal RFC is short and probably quick to read and provides explanation about what is going on in this PR.

There are corresponding changes in main lingua-franca repo.

This is cheating because it assumes that the compiled binary is on the classpath.

Much code cleanup to come.

Not completely librarify it because the headers are all intermingled :( But at least tracing HelloWorld compiles!

Errors which only show up in CI...?

I want the metadata available as early as possible. The start time we cannot do much about, but with fed id, we can know as soon as we want.

petervdonovan · 2024-02-24T08:52:06Z

I think that this PR is once again ready for review. Changes since it was last reviewed are in the spirit of Christian's comment above. Because the platform directory had already had to be librarified earlier in the development of this branch, and because the logging API was already separate, I found it to be easiest to put those in their own directories already. I also decided to set a precedent of keeping separate api and impl directories, with different CMakeLists, to emphasize that they were different CMake targets; this is something that @cmnrd and I discussed on Tuesday and seemed to agree upon.

Of course this directory structure does not need to be final. We can keep iterating on it later, ideally using an RFC, when this issue rises back to the top of our priority list.

petervdonovan · 2024-02-25T20:55:58Z

Here's a positive unintended consequence of the refactoring that Christian recommended: it seems like VS Code actually understands our code now because it shows syntax errors, including wrong field names, and documentation on hover.

I guess this is what we get for doing things the "right" way.

This is compile-time facts that do not change as the program runs, so mutating it will only reduce debuggability.

The purpose of the prefix is to prevent name collisions.

The idea was that in order to be thread safe we needed to use a thread id in order to access memory that is not shared with other threads. If there are multiple environments than what was being done before this commit had a race condition.

The no-arguments way of calling fedsd makes an assumption about the name of the RTI trace file. This change makes that assumption correct again.

This is only a problem when the user calls LF API functions from a thread that was not created by the runtime.

cmnrd

Great! This is quite a significant refactoring, and I am amazed and glad to see that this has an immediate positive effect on the tooling.

I left a bunch of comments. Most of them are about known issues that I have documented in #360 or that we should still document as issues (but I am lacking the details). I also made a few small requests for changes, which I think should be addressable rather quickly. If not, feel free to ignore them. None of them are crucial.

Personally, I am not a fan of the build scripts that this PR adds, but I'll leave it to the main devs of this repo to decide. I think it is more important to try to educate people on how to interact with cmake than to provide simple scripts. In the case where you have to frequently switch between cmake configurations (like you probably did), I would recommend creating multiple build directories with descriptive names, initializing them once with the configuration that you want. Then you just cd into the build directory with the intended configuration and run make or cmake --build to (re)compile this configuration. This should work flawlessly, even if you update cmake scripts in your changes.

logging/api/logging_macros.h

core/reactor.c

core/reactor_common.c

core/trace.c

logging/api/logging.h

low_level_platform/README.md

low_level_platform/api/low_level_platform.h

tag/api/tag.h

test/Tests.cmake

cmnrd

Looks like commenting does not override a prior "Request changes", so here is my approval.

petervdonovan · 2024-02-26T18:55:31Z

Looks like this PR has 2 approvals, so I don't think more people are obligated to give another in-depth review.

I'm going to leave this open for at least 24 hours in case anyone else has comments, and also to give time for the companion PR lf-lang/lingua-franca#2192 to get approved, and then I'll merge this.

erlingrj · 2024-02-26T19:39:09Z

Great work Peter! I am curious to understand the need for the two-level platform API. This should not block the merging, I am just interested to hear your thoughts.

It seems that the "higher-level" platform API does not really provide any functionality, and just calls the lower-level platform API. Is the problem you are solving the fact that we have these typedef pthread_t lf_thread_t etc, which are platform-dependent? And you don't really want the tracing plugin to care about whether we are using C11 or pthreads?

Another solution,which I think is more elegant, could be to define them in low_level_platform.h as:

typedef struct lf_thread_t {
  int id;
  void * priv
}

And so on,

Then each particular platform implementation can use the *priv field for their private data, which would be dynamically created at creation. This would make it so that we cannot just define lf_mutex_t and be done, but we always need to create it calling e.g. lf_mutex_create() which returns a lf_mutx_t * with allocated memory in the priv field, the priv field would then contain the pthread_mutex_t. A nice thing is that the pthread-details wont "leak" out of the platform source files. Currently we include pthread stuff in the platform header files and it is thus in scope for all sorts of files, even user-written reaction bodies which needs to include platform.h

This is the sort of inheritance they do in the Linux device driver implementations which I have just started learning about.

petervdonovan · 2024-02-26T21:07:47Z

It seems that the "higher-level" platform API does not really provide any functionality, and just calls the lower-level platform API.

Correct. This is how I intended the "higher-level" platform API to remain for the foreseeable future -- just a wrapper around the lower-level platform API.

Is the problem you are solving the fact that we have these typedef pthread_t lf_thread_t etc, which are platform-dependent? And you don't really want the tracing plugin to care about whether we are using C11 or pthreads?

Yes, exactly.

A nice thing is that the pthread-details wont "leak" out of the platform source files. Currently we include pthread stuff in the platform header files and it is thus in scope for all sorts of files, even user-written reaction bodies which needs to include platform.h

Really I was fixated on getting an interface where you know what it will compile to, regardless of compilation options, without having to understand any preprocessor code. This "leaking" is not exactly the problem that I was trying to solve.

However, now that you mention it I think you're right, this is an important added benefit; furthermore, it is solved both by what you describe and by what I have done with low_level_platform.h and with platform.h. We have already had discussions and taken pains to contain the damage caused by windows.h, so this is an issue that I think we already collectively understand and care about, and yes, the same principle applies to containing the areas where pthreads stuff is in scope, too.

Then each particular platform implementation can use the *priv field for their private data, which would be dynamically created at creation. This would make it so that we cannot just define lf_mutex_t and be done, but we always need to create it calling e.g. lf_mutex_create() which returns a lf_mutx_t * with allocated memory in the priv field, the priv field would then contain the pthread_mutex_t.

I see this proposal as being very similar to what I have done with low_level_platform.h and platform.h. If I understand, the significant difference is that you seem to be proposing that:

the mutex and thread representations that use void pointers to abstract the platform differences become the only representations that are made visible to the whole code base. So outside of the platform abstraction, none of the code would be able to see the differences, not even using sizeof, unless they dereference the void pointer and make assumptions about what is in the memory that it points to (which would be obviously wrong).
we develop a habit of combining these void pointers with other metadata, like thread IDs, to improve code reuse.

Item 2 is obviously a generally good idea to do, in those cases where it makes sense. I would argue that item 2 is compatible with the route that I have already taken and is not dependent on item 1; we can always create structs in low_level_platform that have both fields whose size is platform-dependent and fields that we define that are not platform-dependent, and those structs will still overall be like the ones we use already and would have the disadvantages that are addressed by item 1. So item 2 seems like no big deal.

Item 1, however, does represent an important tradeoff. Last Monday I discussed this with Edward and Christian and I promised that my attempts to create this void* kind of modularity (where some interfaces don't depend on the preprocessor) would never force us to accept a negative impact on memory footprint at runtime. I promised that there would be an escape hatch.

The tradeoff is as follows. If the preprocessor-independent void* approach is the only approach, then the advantage is that the whole code base is brought up into a different and stronger kind of modularity that could force the simplification of a bunch of stuff, as well as making it easier for us to mix and match useful functionalities between the different runtimes, which could potentially break down siloing among LF contributors. In short, it would offer all the advantages of supporting both approaches (as I have done with low_level_platform and platform), but it would offer them in a way that is even better and cleaner and more strongly enforced. But the disadvantage is that when people are forced to use void*, then that can increase the number of allocations that they have to make, and it can increase memory fragmentation -- at least in principle.

Are these disadvantages practically significant? I haven't measured the memory footprint or the performance differences, but I also don't want to have to make those measurements, and I don't want to have to engage in a time-consuming discussion about those measurements. If we set the precedent that anything that could conceivably have overhead is provided with some sort of escape hatch that isn't constrained to have the void* kind of modularity, then sure, we get a little more mess in our code base, but we also liberate ourselves from having to worry and discuss too much about this issue. The worry and discussion can be contained to the specific cases where we want to actually implement a specific plugin, and we need to decide which platform interface to use.

edwardalee and others added 13 commits January 29, 2024 18:47

Start on refactoring tracing to make it extensible

206703e

Tracing compiles, but produces invalid files.

165723a

Fix compilation error for federated.

ad92c5a

Fix tracing function name

6dcbb38

Modify trace.h

314a1b3

Report time etc. correctly

d75c23d

Get minimal example working for default library.

7440a95

There are corresponding changes in main lingua-franca repo.

Get stubbed default tracing to compile.

ce80edb

This is cheating because it assumes that the compiled binary is on the classpath.

Bugfix in getting physical time.

3045b7e

First working example (HelloWorld, not federated)

9223bcd

Much code cleanup to come.

Semilibrarify platform.

c4424e5

Not completely librarify it because the headers are all intermingled :( But at least tracing HelloWorld compiles!

Update lingua-franca-ref.txt

7ce5397

Get RTI to build.

6eea01b

lhstrh changed the title ~~Create a plugin API for tracing~~ Plugin API for tracing Jan 30, 2024

petervdonovan force-pushed the tracing-refactoring branch from f009a26 to 6eea01b Compare January 30, 2024 20:59

petervdonovan added 8 commits January 30, 2024 13:20

Address more compiler errors.

d0640a1

Address more compiler errors.

dd57a0b

Errors which only show up in CI...?

Provide the start time.

66c07b8

Fix bug in getting logical time.

2b0838a

NEVER -> NEVER_TAG

722a000

Address more compiler errors.

9969ab8

Fix another compile error.

644384e

Fix compiler errors

706f721

petervdonovan force-pushed the tracing-refactoring branch from 48461fa to 706f721 Compare January 31, 2024 05:09

Vacillate about how to provide fed id

8f6c290

I want the metadata available as early as possible. The start time we cannot do much about, but with fed id, we can know as soon as we want.

petervdonovan force-pushed the tracing-refactoring branch 3 times, most recently from 78f300c to ef3077a Compare January 31, 2024 21:32

Try again to work on Windows

98b7df0

petervdonovan force-pushed the tracing-refactoring branch from ef3077a to 98b7df0 Compare January 31, 2024 21:34

Minor cleanups.

f5c780e

petervdonovan force-pushed the tracing-refactoring branch from 5f4ef6b to f5c780e Compare February 24, 2024 08:45

petervdonovan added 2 commits February 24, 2024 13:43

Fix Python compilation failure.

61200e9

Address a superficial FIXME

5b70836

petervdonovan added 5 commits February 25, 2024 13:10

Constify version info.

5c759fb

This is compile-time facts that do not change as the program runs, so mutating it will only reduce debuggability.

Add prefix to tracepoint function for uniformity.

cb35ed0

The purpose of the prefix is to prevent name collisions.

Fix serious bug in choosing trace buffer

8fc506c

The idea was that in order to be thread safe we needed to use a thread id in order to access memory that is not shared with other threads. If there are multiple environments than what was being done before this commit had a race condition.

Check compatibility with fedsd

d468f1f

The no-arguments way of calling fedsd makes an assumption about the name of the RTI trace file. This change makes that assumption correct again.

Fix compile error in previous commit

d5729ce

petervdonovan marked this pull request as draft February 25, 2024 23:53

Fix thread safety issue.

bc192e3

This is only a problem when the user calls LF API functions from a thread that was not created by the runtime.

petervdonovan force-pushed the tracing-refactoring branch from e27cf26 to bc192e3 Compare February 26, 2024 00:26

petervdonovan added 2 commits February 25, 2024 16:37

Fix error from silly separate testing build.

ffd28d3

Fix another segfault

ed9202c

petervdonovan marked this pull request as ready for review February 26, 2024 04:24

Allow to use INT_MAX if log level does not matter

f0c1454

cmnrd reviewed Feb 26, 2024

View reviewed changes

cmnrd approved these changes Feb 26, 2024

View reviewed changes

petervdonovan mentioned this pull request Feb 26, 2024

Version core runtime #366

Open

petervdonovan added 2 commits February 26, 2024 10:35

Address comments from code review

fc31bcc

Rename trace.c -> tracepoint.c

ed1d76b

Merge branch 'main' into tracing-refactoring

a39c105

lhstrh merged commit 4c4d4a1 into main Feb 27, 2024
14 of 29 checks passed

lhstrh deleted the tracing-refactoring branch February 27, 2024 05:48

lhstrh added the feature New feature label Apr 27, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Plugin API for tracing #342

Plugin API for tracing #342

petervdonovan commented Jan 30, 2024 •

edited

Loading

petervdonovan commented Feb 24, 2024

petervdonovan commented Feb 25, 2024

cmnrd left a comment

cmnrd left a comment

petervdonovan commented Feb 26, 2024

erlingrj commented Feb 26, 2024 •

edited

Loading

petervdonovan commented Feb 26, 2024 •

edited

Loading

Plugin API for tracing #342

Plugin API for tracing #342

Conversation

petervdonovan commented Jan 30, 2024 • edited Loading

petervdonovan commented Feb 24, 2024

petervdonovan commented Feb 25, 2024

cmnrd left a comment

Choose a reason for hiding this comment

cmnrd left a comment

Choose a reason for hiding this comment

petervdonovan commented Feb 26, 2024

erlingrj commented Feb 26, 2024 • edited Loading

petervdonovan commented Feb 26, 2024 • edited Loading

petervdonovan commented Jan 30, 2024 •

edited

Loading

erlingrj commented Feb 26, 2024 •

edited

Loading

petervdonovan commented Feb 26, 2024 •

edited

Loading