Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix some logger issues #1739

Merged
merged 6 commits into from
Nov 27, 2024
Merged

Conversation

vyasr
Copy link
Contributor

@vyasr vyasr commented Nov 27, 2024

Description

This fixes a handful of issues uncovered in downstream CI after #1722.

The following are bugs introduced in #1722:

  • When using rmm from the build directory rather than an installation, the namespaced targets are not present so we must generate aliases.
  • The RMM_LOGGING_ASSERT macro is never used in rmm itself, so we didn't catch that it was still using the old version of the logger.

While fixing the above, I also uncovered that building fmt in this environment unearths a gcc bug.

The following are underlying issues uncovered by #1722:

  • spdlog's fmt CMake linkage is determined at build time. As a result, the conda package for spdlog is hardcoded to use fmt as a library (static or shared depends on what the fmt::fmt target winds up being when a consumer using spdlog finds fmt in CMake), which means that is propagated to all consumers of the librmm package via its CMake. This means that we often wind up with both fmt_header_only and fmt as link targets for many RAPIDS libraries. For now, this PR makes it so that if rapids_cpm_find(spdlog) does not find a copy of spdlog locally, the cloned version will use an external header-only fmt via rapids-cmake's logic, which ensures that packages like wheels do not export a libfmt or libspdlog dependency. However, in environments where rapids_cpm_find(spdlog) does find a preexisting package, we allow that package's fmt linkage to propagate. In conda environments, we know that this fmt linkage is to the library, so we keep fmt as part of rmm's runtime dependencies (by placing it in host and relying on the run export) so that libfmt is always available in environments using rmm.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@vyasr vyasr added bug Something isn't working non-breaking Non-breaking change labels Nov 27, 2024
@vyasr vyasr self-assigned this Nov 27, 2024
Copy link

copy-pr-bot bot commented Nov 27, 2024

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@github-actions github-actions bot added CMake conda cpp Pertains to C++ code labels Nov 27, 2024
@vyasr
Copy link
Contributor Author

vyasr commented Nov 27, 2024

/ok to test

@github-actions github-actions bot added the ci label Nov 27, 2024
@vyasr
Copy link
Contributor Author

vyasr commented Nov 27, 2024

/ok to test

@vyasr vyasr changed the title Fix/logger issues Fix some logger issues Nov 27, 2024
@vyasr
Copy link
Contributor Author

vyasr commented Nov 27, 2024

/ok to test

@vyasr
Copy link
Contributor Author

vyasr commented Nov 27, 2024

We will need to resolve the fmt linkage issue above in a more robust manner in the future. I don't think there's a general solution without either 1) forcing spdlog to always be downloaded as part of the build of any package that depends on rmm for logging or 2) post facto direct modification of the spdlog_header_only target (i.e. rewriting the link libraries and compile defs directly). Neither is a particularly good option. We could make the first somewhat more palatable by having the download be controlled by some flag that is set to on by default (since that's what we'll want for everything in RAPIDS). I will prototype that after this PR is merged.

@vyasr
Copy link
Contributor Author

vyasr commented Nov 27, 2024

/ok to test

@vyasr vyasr marked this pull request as ready for review November 27, 2024 06:26
@vyasr vyasr requested review from a team as code owners November 27, 2024 06:26
@harrism
Copy link
Member

harrism commented Nov 27, 2024

The RMM_LOGGING_ASSERT macro is never used in rmm itself, so we didn't catch that it was still using the old version of the logger.

Not true. This macro is used in many places (search shows about 50!) in RMM. The problem is we don't build RMM debug in CI, and you probably didn't test the debug build locally.

Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am approving assuming that this is necessary unblock downstream CI, but I have some questions. This doesn’t seem to go as far as we wanted with removing fmt / spdlog.

@@ -17,6 +17,14 @@ function(find_and_configure_spdlog)

include(${rapids-cmake-dir}/cpm/spdlog.cmake)
rapids_cpm_spdlog(
# The conda package for fmt is hard-coded to assume that we use a preexisting fmt library. This
Copy link
Contributor

@bdice bdice Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it viable to remove conda package dependencies and always get spdlog/fmt as header-only deps via CPM? That way we should never interact with a local spdlog or local/shared fmt.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it viable to remove conda package dependencies

No. The problem is that the logger is compiled not by rmm itself but by packages that build against rmm, and we cannot guarantee that spdlog or fmt will never exist in the environment in which that build is taking place. For example, if cuml has some other dependency that requires spdlog, then during cuml's conda build spdlog will be in the environment when its copy of the rmm logger is built. You can think of how the new build works as similar to what we do with gtest/gmock, with the additional complexity that the build is done by the user of rmm rather than rmm itself.

always get spdlog/fmt as header-only deps via CPM

Yes, with caveats. This is what I was alluding to in my message above where I said

We could make the first somewhat more palatable by having the download be controlled by some flag that is set to on by default (since that's what we'll want for everything in RAPIDS). I will prototype that after this PR is merged.

The big caveat is that we do not want to require this all the time, otherwise it will make offline builds of any package built against rmm impossible because you will have CPMAddPackage calls embedded in a dependency (the only workarounds would be to manually modify CPM-cloned code inside the build directory). We need to write the CMake so as to make this behavior possible to opt out of.

# pulled by CPM and installed as a part of the rmm packages. However,
# building against librmm still requires these headers. They are also
# added as a run requirement via the packages' run_exports.
# We need fmt here for now because the conda spdlog package is hard-coded
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we removed the spdlog package dependencies…? So this is an “ABI constraint” that is meant to align downstream packages that depend on spdlog, even though fmt is not used/linked as a shared dependency in librmm?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your comment isn't correct, but I'm afraid I don't fully understand your explanation so I may not be able to address your confusion correctly. Basically what is happening is that (because rmm is an interface target) it becomes part of downstream packages build to find spdlog. When that package, say cudf, finds a conda-provided spdlog in the environment, that version of spdlog says to use the fmt shared library even if you use the spdlog_header_only target. Therefore, we actually do have an fmt shared library dependency. That is what I tried to explain in the PR description.

Does that help?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I'm sorry I just realized that there is a typo here. The first sentence should read "The conda package for spdlog is hard-coded to assume that we use a preexisting fmt library"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the spdlog conda package already has a dependency on fmt. Why not just let that dependency suffice? Why add it to rmm?

Copy link
Contributor

@bdice bdice Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And spdlog run-exports itself so two different consumers of spdlog will only be usable at runtime with compatible spdlog and fmt versions — and that’s enforced by the solver / dependency graph. https://github.com/conda-forge/spdlog-feedstock/blob/97ea289e6733b60e4c619e9cb1e0ebc329358936/recipe/meta.yaml#L18

I’m not convinced we need a fmt dependency here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized I never answered this question. The problem is that spdlog could be in the build environment as a transitive dependency even though we no longer have it as a dependency at all, resulting in linkage to fmt without spdlog or fmt being part of the runtime dependency list of our conda package (run exports aren't transitive). But yes, you could also fix this problem by simply putting spdlog back into the dependency list right now.

@vyasr
Copy link
Contributor Author

vyasr commented Nov 27, 2024

/merge

@rapids-bot rapids-bot bot merged commit d4066fa into rapidsai:branch-25.02 Nov 27, 2024
60 checks passed
@vyasr vyasr deleted the fix/logger_issues branch November 27, 2024 14:49
@vyasr
Copy link
Contributor Author

vyasr commented Nov 27, 2024

The RMM_LOGGING_ASSERT macro is never used in rmm itself, so we didn't catch that it was still using the old version of the logger.

Not true. This macro is used in many places (search shows about 50!) in RMM. The problem is we don't build RMM debug in CI, and you probably didn't test the debug build locally.

Thanks Mark, I was hasty in my assessment since I was trying to get everything unblocked. The reason for my confusion is that something in cudf was triggering the issue, which can be seen in this CI run. Evidently somehow debug asserts are being enabled when building a particular example. We should figure out why that is happening. I'll look into it (or ask for help while I look into the other bits).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working ci CMake conda cpp Pertains to C++ code non-breaking Non-breaking change
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants