Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wpilib] Tracer Overhaul #7099

Open
wants to merge 72 commits into
base: main
Choose a base branch
from
Open

Conversation

oh-yes-0-fps
Copy link
Contributor

@oh-yes-0-fps oh-yes-0-fps commented Sep 19, 2024

Example usage kinda

resolves #6583
resolves #609

@oh-yes-0-fps oh-yes-0-fps requested a review from a team as a code owner September 19, 2024 05:11
@oh-yes-0-fps oh-yes-0-fps requested a review from a team as a code owner September 19, 2024 05:12
@rzblue
Copy link
Member

rzblue commented Sep 19, 2024

I don't think Tracer should be a static global, as it is used in multiple places that shouldn't have mixed outputs.

@oh-yes-0-fps
Copy link
Contributor Author

oh-yes-0-fps commented Sep 19, 2024

I don't think Tracer should be a static global, as it is used in multiple places that shouldn't have mixed outputs.

It's being used differently here, it's similar to a flame graph with a cascading list of durations.
There are no mixed outputs, the tracer covers the whole robot and everything that happens during its runtime.
Also the tracer is more of ThreadLocal static and supports tracing multiple threads at once or restricting it to 1 thread.
It also has some tooling to attempt to isolate gc time from execution time but that can be turned off on a per thread basis.
image

@oh-yes-0-fps
Copy link
Contributor Author

If performance is a concern we can make it so after a certain number of nests it is sent to only datalog instead.

@rzblue
Copy link
Member

rzblue commented Sep 19, 2024

Ok, that's pretty cool.

A couple things:

  • Tracer is part of the public API, so we will need to do a deprecation cycle on the old interface

  • I think we should talk about how we want to handle telemetry like this; much like alerts we don't want it to be printed to the console, but it needs to be obvious how to access it. I also don't usually like tying things to NT directly, but maybe that's fine for now.

  • have you done any benchmarks on the rio? I doubt performance will be a significant regression from the current implementation but it'd be good to double check.

@oh-yes-0-fps
Copy link
Contributor Author

Performance wise it's quite good, we ran a version close to this all of the 2024 season based on some cursory testing it seems to use less than a ms per loop. The current implementation is more refined with 1/3 as many string allocs so it should be retested.

I don't like doing 1 topic per duration but there isn't really a way around it.

In terms of displaying to user im unsure of better ways to convey information. The watchdog will still print when overruns happen but not the epoch times. With glass, ascope and maybe elastic being packaged with wpilib I think teams should be expected to be able to use a dashboard to take full advantage of wpilib.

@oh-yes-0-fps
Copy link
Contributor Author

oh-yes-0-fps commented Sep 19, 2024

In terms of deprecation of the old tracer api, we could just name this something else while still removing tracer from watchdog. Profiler isn't a terrible name but also not great.

Whats the etiquette on just leaving the old api with deprecation warnings and just having all the functions no-op or smthn until we can remove it.

@rzblue
Copy link
Member

rzblue commented Sep 19, 2024

Whats the etiquette on just leaving the old api with deprecation warnings and just having all the functions no-op or smthn until we can remove it.

That's worse than removing it without deprecation.

@KangarooKoala
Copy link
Contributor

Generally the route for changing to a new API is that you make the new API with a different name and deprecate the old one (without doing any other changes to the old version)- The point of deprecation is that you can still use it without breakage. When it comes time to remove the old API (usually a year after deprecation), you completely remove it.
I can't think of any case where you'd want to change methods to no-op, since that's a silent breakage.

@oh-yes-0-fps
Copy link
Contributor Author

So I guess this is a broader question but what is the worry of changing the public API for something when adding it in a new major version? Especially something that is currently not used very much by users.

@calcmogul
Copy link
Member

calcmogul commented Sep 19, 2024

So I guess this is a broader question but what is the worry of changing the public API for something when adding it in a new major version?

Ostensibly, it lets teams migrate their code gradually over the course of a year rather than breaking it all at once. In my experience though, teams either fix the deprecation warnings immediately because they think they're errors, or they don't migrate until we force them to by removing the old API.

At least deprecation warnings provide guidance on how to upgrade though; immediate removal or API change gives no guidance whatsoever and can be frustrating, especially if we broke working code from the previous year.

I'm not sure what we'd name the new class instead though, because Tracer is the most obvious name.

@oh-yes-0-fps
Copy link
Contributor Author

This class's functionality is now static, it wouldn't be hard to make an instance of it to keep the functionality of the old tracer. It would just be confusing but using proper docs and deprecation warnings could help with that.

@oh-yes-0-fps oh-yes-0-fps changed the title [wpilibj] Tracer Overhaul [wpilib] Tracer Overhaul Sep 23, 2024
@oh-yes-0-fps
Copy link
Contributor Author

Is there any documentation for the time units Tracer publishes in?

Where would the best place to document those be? maybe it should even be placed in NT?

@Gold856
Copy link
Contributor

Gold856 commented Oct 3, 2024

Documentation in NT and the class doc would be good.

wpilibc/src/main/native/cpp/Tracer.cpp Outdated Show resolved Hide resolved
wpilibc/src/main/native/cpp/Tracer.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@Gold856 Gold856 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Java has a code example, but C++ doesn't, for parity, C++ should have one too.

wpilibc/src/main/native/include/frc/Tracer.h Outdated Show resolved Hide resolved
wpilibj/src/main/java/edu/wpi/first/wpilibj/Tracer.java Outdated Show resolved Hide resolved
wpilibj/src/main/java/edu/wpi/first/wpilibj/Tracer.java Outdated Show resolved Hide resolved
wpilibj/src/main/java/edu/wpi/first/wpilibj/Tracer.java Outdated Show resolved Hide resolved
@github-actions github-actions bot added component: wpilibj WPILib Java component: wpilibc WPILib C++ component: command-based WPILib Command Based Library component: epilogue Annotation-based logging library labels Dec 18, 2024
@oh-yes-0-fps
Copy link
Contributor Author

Java has a code example, but C++ doesn't, for parity, C++ should have one too.

I don't know idomatic C++ code

* one can determine which parts of an operation consumed the most time.
* A Utility class for tracing code execution time. Will put info to
* NetworkTables under the "Tracer" table. The times outputted by the {@link
* Tracer} are in milliseconds.
Copy link
Contributor

@Gold856 Gold856 Dec 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Tracer} are in milliseconds.
* Tracer} are in milliseconds.
* <p>Example inside {@code Robot.cpp}
*
* <pre><code>
*
* void Robot::RobotPeriodic() {
* frc2::CommandScheduler::GetInstance().Run();
* frc::Tracer::TraceFunc("MyVendorDep", [] { MyVendorDep::UpdateAll(); });
* }
* </code></pre>
*
* <p>Example inside a {@code Drive Subsystem}
*
* <pre><code>
* // Subsystem periodics are automatically traced
* void DriveSubsystem::Periodic() {
* for (auto module : modules) {
* frc::Tracer::TraceFunc("Module" + module.getName(), [this] { module.update(); });
* }
* frc::Tracer::TraceFunc("Gyro", [this] { gyro.update(); });
* }
* </code></pre>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

traceFunc should be TraceFunc and MyVendorDep::updateAll should be [] { MyVendorDep::UpdateAll(); }, but yep, that would work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component: command-based WPILib Command Based Library component: epilogue Annotation-based logging library component: wpilibc WPILib C++ component: wpilibj WPILib Java
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add loop timing telemetry
6 participants