-
Notifications
You must be signed in to change notification settings - Fork 465
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Re-enable coverage #16913
Re-enable coverage #16913
Conversation
b4fb7f6
to
843e26d
Compare
843e26d
to
93210e5
Compare
38b1b3a
to
0e3ca6a
Compare
ci/nightly/coverage.py
Outdated
[ | ||
"rust-cov", | ||
"export", | ||
"--format=lcov", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason to export as lcov?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm in the middle of experimenting with a bunch of ways of presenting the output. Originally I exported the JSON, which no tools seem to consume at all... so at least lcov output can be consumed by lcov's genhtml. Which I know is more useful than llvm-cov show's own HTML format which is completely unusable because it produces a giant, multi-hundred-megabyte HTML file. Suggestions appreciated though!
ci/nightly/coverage.py
Outdated
"rust-profdata", | ||
"merge", | ||
"-sparse", | ||
*Path("test/sqllogictest/coverage").glob("*.profraw"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure each profraw file has a unique name? environmentd and clusterd might be colliding with same name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's the problem with the environment variable set in mzcompose; FYI, reviewing this right now is probably not very useful since I am working on it. Or you could take it over as it is if you want?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nono, just wrote down my thoughts as I read it out of interest, not a real review.
I have a whole bunch of issues open regarding llvm-cov, but that was with C++, I'd be interested to see a coverage report and if things work well with Rust. I basically reimplemented |
@@ -29,6 +29,7 @@ junit_*.xml | |||
perf.data* | |||
**/*.rej | |||
**/*.orig | |||
**/*.profraw |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can also add .profdata
0e3ca6a
to
9591786
Compare
@tokenrove Any update on this? I might try working on PR code coverage soon, and would take this PR as the basis for my work. |
Yes, I was going to suggest that I get rid of the commits that add this to nightly, and just get the other commits merged so mzcompose can be invoked with --coverage, and then you can take the rest. |
Alright, that would be fine. |
9591786
to
ba58d69
Compare
I haven't re-run the nightly so I haven't checked if this still does the right thing, but I'll do that now. |
instrument-coverage has been stable since Rust 1.60. I am removing the unresolved symbols warning hack for Nix because, like the commentors who closed the reference issues, it's unclear to me what would cause this, and I'd rather resolve it by fixing the actual issue. Looking at related issues like rust-lang/rust#92769, it's probably an artifact of the implementation in nightly, which hopefully is fixed now that it's stable.
Necessary to run rust-profdata and rust-cov for working with coverage data.
Without this, we end up mutating the same config again, which then causes docker compose to fail with: services.sqllogictest.environment array items[1,2] must be unique
ba58d69
to
dce2ab2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't affect anything except coverage, which is unused currently anyway. So fine by my to submit, then I'll start on https://github.com/MaterializeInc/materialize/issues/18340
The goal here is to avoid sqllogictest and clusterd from overwriting each-other's profraw outputs. %p is by PID, which one would expect to be sufficient, and %Nm allows up to 9 concurrent profiles for the same build-ID; perhaps the two together is overkill, but this seems like a reasonable starting point.
We can now use the stable Rust toolchain and hopefully this no longer suffers from the aforementioned OOMs. Just fast sqllogictests for now, with coverage in the raw JSON format until we figure out what we want to do with it.
So that one can conveniently locally run environmentd or sqllogictest with coverage.
4d741e3
to
402003b
Compare
Just to note: we don't get accurate results for clusterd currently because it gets killed by SIGTERM and libc's atexit handlers don't run, which would normally invoke |
Based on my experience |
Oof, good to know.
I considered this but signal handling in Rust is still a disaster. It should be sufficient to catch SIGTERM and then make sure the program exits so that I had hopes that rust-lang/rust#102900 would fix this since I get a ton of |
The %c problems are independent of Rust, see https://clang.llvm.org/docs/SourceBasedCodeCoverage.html
I tried enabling the Linux support for %c, but the results were nonsensical. I didn't spend more effort on getting it working. |
Right, but this error message is pertinent, as it prevents any profiling from happening: https://github.com/llvm/llvm-project/blob/0eabf59528f3c3f64923900cae740d9f26c45ae8/compiler-rt/lib/profile/InstrProfilingFile.c#L574 Maybe we need to add the In fact, this helps: diff --git a/misc/python/materialize/cli/run.py b/misc/python/materialize/cli/run.py
index 964c2e1ac..5069dca69 100644
--- a/misc/python/materialize/cli/run.py
+++ b/misc/python/materialize/cli/run.py
@@ -268,7 +268,9 @@ def _build(args: argparse.Namespace, extra_programs: list[str] = []) -> int:
features += ["tokio-console"]
env["RUSTFLAGS"] = env.get("RUSTFLAGS", "") + " --cfg=tokio_unstable"
if args.coverage:
- env["RUSTFLAGS"] = env.get("RUSTFLAGS", "") + " -Cinstrument-coverage"
+ env["RUSTFLAGS"] = (env.get("RUSTFLAGS", "")
+ + " -Cinstrument-coverage"
+ + " -Cllvm-args=-runtime-counter-relocation")
if args.features:
features.extend(args.features.split(","))
if features:
diff --git a/misc/python/materialize/mzbuild.py b/misc/python/materialize/mzbuild.py
index 08f6f0cda..c23c9b60d 100644
--- a/misc/python/materialize/mzbuild.py
+++ b/misc/python/materialize/mzbuild.py
@@ -232,6 +232,7 @@ class CargoBuild(CargoPreImage):
if rd.coverage:
self.rustflags += [
"-Cinstrument-coverage",
+ "-Cllvm-args=-runtime-counter-relocation",
]
if len(self.bins) == 0 and len(self.examples) == 0:
raise ValueError("mzbuild config is missing pre-build target") I'll open a followup PR to add this though perhaps it's a rustc bug that this doesn't get supplied. |
I mean I changed the LLVM/compiler-rt source code to enable bias variables on Linux, but they didn't work then. |
This has been languishing for a bit, and the "hard part" is figuring out exactly what llvm-cov invocations we want to best present the information, what tests should run in nightly to get useful coverage information, and so on. So I'm going to punt on these bits and aim to get the rest merged, which enables using mzcompose with --coverage and does demonstrate adding the fast sqllogictests with coverage to nightly, which is useful as a starting point to experiment with that coverage data. We can iterate on the rest.
Motivation
Checklist
This PR has adequate test coverage / QA involvement has been duly considered.
This PR evolves an existing
$T ⇔ Proto$T
mapping (possibly in a backwards-incompatible way) and therefore is tagged with aT-proto
label.This PR includes the following user-facing behavior changes: