Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cc_test should respect exec_compatible_with for test action #23202

Open
keith opened this issue Aug 2, 2024 · 7 comments
Open

cc_test should respect exec_compatible_with for test action #23202

keith opened this issue Aug 2, 2024 · 7 comments
Labels
team-Configurability platforms, toolchains, cquery, select(), config transitions type: bug untriaged

Comments

@keith
Copy link
Member

keith commented Aug 2, 2024

Description of the bug:

If you have a simple constraint like this:

constraint_setting(name = "something")

constraint_value(
    name = "has_something",
    constraint_setting = ":something",
)

And you create a cc_test target that requires that setting:

cc_test(
    name = "test",
    srcs = ["test.cc"],
    exec_compatible_with = ["//:has_something"],
    target_compatible_with = ["//:has_something"],
)

and you're using remote exec with multiple platforms:

platform(
    name = "base-platform",
    constraint_values = [
        "@platforms//cpu:aarch64",
        "@platforms//os:linux",
    ],
)

platform(
    name = "something-platform",
    constraint_values = [
        "@platforms//cpu:x86_64",
        "@platforms//os:linux",
        ":has_something",
    ],
)

with:

build:remote --platforms=//:something-platform
build:remote --extra_execution_platforms=//:base-platform,//:something-platform

Building the cc_test target results in compilation happening on the something-platform, linking happening on base-platform and the test action most importantly happening on the base-platform.

It seems like theoretically this flag exists to help with this:

--experimental_add_exec_constraints_to_targets='//:test'=//:has_something

But this doesn't help in this case (unsurprisingly I think since you've already manually annotated the target with exec_compatible_with)

Passing --use_target_platform_for_tests does appear to fix this, but this flag has the downside that if you were to run 2 tests at once where not all required :has_something, they would both run on the something-platform instead of the base-platform, which is undesirable.

If you create a py_test (picked at random as another option) and create it the same way, it works as I'm expecting.

I assume this is because cc_test has exec_groups defined

exec_groups = {
"cpp_link": exec_group(toolchains = cc_helper.use_cpp_toolchain()),
# testing.ExecutionInfo defaults to an exec_group of "test".
"test": exec_group(toolchains = [config_common.toolchain_type(_CC_TEST_TOOLCHAIN_TYPE, mandatory = False)]),
},
, but I would still expect a manually passed exec_compatible_with to override these?

If I delete the test exec_group (and deal with the fallout of that) it fixes the issue. I guess I would assume that any exec_compatible_with would be merged with the exec_group value (which isn't even passed in this case)?

Which category does this issue belong to?

No response

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

No response

Which operating system are you running Bazel on?

No response

What is the output of bazel info release?

1acbd42 (or 7.x)

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.

No response

Have you found anything relevant by searching the web?

There are many threads on related topics here. Most about splitting the compilation / linking / testing across multiple exec groups, which I would also like for other cases. But I think this case is distinct.

Any other information, logs, or outputs that you want to share?

No response

@keith
Copy link
Member Author

keith commented Aug 2, 2024

Here's a failing test case demonstrating this issue: #23203

If i remove all exec_groups from cc_test it passes, which is what I expected behavior wise, that the passed exec_compatible_with / target_compatible_with would override / be combined with any from the default exec_groups.

Note that passing --use_target_platform_for_tests doesn't fix the test case either since the way it's currently configured nothing should be executed on platform_a just to make the assertions easier

@satyanandak satyanandak added the team-Rules-CPP Issues for C++ rules label Aug 5, 2024
@sluongng
Copy link
Contributor

sluongng commented Aug 5, 2024

cc: @katre

@fmeum
Copy link
Collaborator

fmeum commented Aug 7, 2024

If your use case really doesn't need to specify exec_compatible_with on a per-target level, you could register a toolchain with the required (global) constraint and a bit of undocumented logic as @bazel_tools//tools/cpp:test_runner_toolchain_type. (Edit: See this draft version of a guide: https://github.com/trybka/scraps/blob/master/cc_test.md)

Adding exec constrains per-target is still a use case that should be supported. I do think that having exec_compatible_with add to all exec groups would not be the correct solution though: If you have a (hypothetical) cc_test target that compiles 1M lines of GPU DSL and thus requires compilation to run on a machine with fast CPU cores, you wouldn't want the //:fast_cpu constraint to be picked up by the test action group, simply because you may not have machines that satisfy both //:fast_cpu and //:has_gpu. Similar behavior is already problematic with --incompatible_allow_tags_propagation affecting both test and compilation actions (I know, it was me who flipped it...).

Instead, it would seem more flexible to be able to add constraints to individual exec groups. Since Bazel doesn't offer an attribute type for dict[str, list[Label]], we can't just have this:

cc_test(
    ...,
    exec_compatible_with = [
        "//:fast_cpu",
    ],
    exec_group_exec_compatible_with = {
        "test": ["//:has_gpu"],
    },
)

But what we could probably have is this, where each exec group declared on the rule results in a new <exec group>_exec_compatible_with attribute added to the rule:

cc_test(
    ...,
    # Applies to the default (unspecified) exec group only.
    exec_compatible_with = [
        "//:fast_cpu",
    ],
    test_exec_compatible_with = [
        "//:has_gpu",
    ],
)

@keith
Copy link
Member Author

keith commented Aug 7, 2024

Seems like doing that toolchain setup would potentially work, given the lack of docs and public usages I was a bit wary when I saw that might be an option.

I agree that merging the exec_compatible_with with all exec_groups isn't ideal, and your suggested solution is the ideal, but also I think merging would still be preferred since you can't control those exec groups today without modifying the bazel source either AFAIUI?

Another option as a workaround is to stop using cc_test and replace it with a cc_binary and a custom test rule that just runs that binary as the test. Here's a minimal example:

def _binary_test_impl(ctx):
    output = ctx.actions.declare_file(ctx.label.name)
    ctx.actions.symlink(
        target_file = ctx.file.binary,
        output = output,
    )

    return [
        DefaultInfo(
            executable = output,
            runfiles = ctx.attr.binary[DefaultInfo].default_runfiles,
        ),
        ctx.attr.binary[RunEnvironmentInfo],
    ]

binary_test = rule(
    implementation = _binary_test_impl,
    attrs = {
        "binary": attr.label(allow_single_file = True),
    },
    test = True,
)

@comius comius added team-Configurability platforms, toolchains, cquery, select(), config transitions and removed team-Rules-CPP Issues for C++ rules labels Aug 23, 2024
@keith
Copy link
Member Author

keith commented Jan 17, 2025

FWIW I tried implementing a @bazel_tools//tools/cpp:test_runner_toolchain_type and it seems to suffer from the same issue that it can't affect only the test action's exec platform. You can influence the entire target by adding no-remote-exec = 1 to your target platform definition, but then the C++ source files also get compiled locally instead of remotely.

@fmeum
Copy link
Collaborator

fmeum commented Jan 18, 2025

FWIW I tried implementing a @bazel_tools//tools/cpp:test_runner_toolchain_type and it seems to suffer from the same issue that it can't affect only the test action's exec platform.

It should be able to do this since the toolchain requirement is declared only on the test exec group. Do you have an example you could share?

@keith
Copy link
Member Author

keith commented Jan 21, 2025

repro.zip

here's my example, you'll have to mess with remote exec settings a bit. AFAICT even though the toolchain is correctly selected:

INFO: ToolchainResolution: Performing resolution of @@bazel_tools//tools/cpp:test_runner_toolchain_type for target platform //:host_platform
      ToolchainResolution:   Toolchain //:cool_prod_linux_runner is compatible with target platform, searching for execution platforms:
      ToolchainResolution:     Incompatible execution platform //:remote-platform; mismatching values: has_gpu
      ToolchainResolution:     Compatible execution platform //:host_platform
      ToolchainResolution: Recap of selected @@bazel_tools//tools/cpp:test_runner_toolchain_type toolchains for target platform //:host_platform:
      ToolchainResolution:   Selected //:cool_prod_linux_runner to run on execution platform //:host_platform
INFO: ToolchainResolution: Target platform //:host_platform: Selected execution platform //:host_platform, type @@bazel_tools//tools/cpp:test_runner_toolchain_type -> toolchain //:cool_prod_linux_runner

Using aquery I still see the execution platform for the test action as the remote platform:

action 'Testing //:foo'
  Mnemonic: TestRunner
  Target: //:foo
  Configuration: k8-fastbuild
  Execution platform: //:remote-platform
  ActionKey: e610873afb0374a3bd906acd6d7fe2b6960f7f40a990bf320c25ae1a69cc93ef
  Inputs: [bazel-out/k8-fastbuild/bin/foo.runfiles, external/bazel_tools/tools/test/generate-xml.sh, external/bazel_tools/tools/test/test-setup.sh]
  Outputs: [bazel-out/k8-fastbuild/testlogs/foo/test.cache_status, bazel-out/k8-fastbuild/testlogs/foo/test.log, bazel-out/k8-fastbuild/testlogs/foo/test.outputs (TreeArtifact)]
  Command Line: (exec external/bazel_tools/tools/test/test-setup.sh \
    ./foo)
# Configuration: cbd01ce6488ab10fe4729b43a6c68c916ee2e33972315881127eb214c6fcb2cf
# Execution platform: //:remote-platform

And running the test ends up on the remote cluster.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team-Configurability platforms, toolchains, cquery, select(), config transitions type: bug untriaged
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants