bazelbuild / bazel

a fast, scalable, multi-language and extensible build system

Home Page:https://bazel.build

incompatible_enable_cc_toolchain_resolution: Turn on toolchain resolution for cc rules

katre opened this issue · comments

Flag: --incompatible_enable_cc_toolchain_resolution
Available since: 0.23
Will be flipped in: 7.0.0

Description

C++ toolchain resolution is an improved mechanism that selects the appropriate C++ toolchain (compiler) based on a pair of target and execution platforms (think cross-compilation from Linux to Mac). It replaces the old mechanism, which relies on setting the --crosstool_top flag. More information about platforms and toolchain resolution is available at https://bazel.build/reference/be/platforms-and-toolchains

Current status

  • C++ toolchain resolution has been in use in a large repository at Google for a year without significant problems.
  • rules_apple and rules_swift lack toolchainization, however they work using platform_mapping
  • Bazel's C++ toolchain autoconfiguration already supports toolchain resolution.

Migration

If you're not using C++, or you don't use custom --crosstool_top, --cpu, or
--compiler values and don't use select on these options, you can stop reading
now: there is nothing for you to migrate.

Migration before the flag is flipped (only in case you're building for custom platforms, using custom C++ toolchains or using custom rules invoking cc_common.compile and cc_common.link):

  1. Add C++ platform definitions, register C++ toolchains
  2. Add missing C++ toolchain requirements to rules (Already added to OSS rules)
  3. Update bazelrc configuration to use custom platforms
  4. Update Starlark configuration transitions to use platforms (ones affecting --cpu and --crosstool_top, in cases where platform_mappings isn’t used)
  5. Update or fix configurations (bazelrc) to use --platforms (where --cpu or --crosstool_top is used)
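
For step 4 in the list above, a Starlark transition that previously set --cpu can set --platforms instead. The following is only a sketch: the platform label //platforms:linux_aarch64 and the names to_linux_aarch64 and _to_linux_aarch64_impl are hypothetical.

```starlark
# transitions.bzl (hypothetical file)

def _to_linux_aarch64_impl(settings, attr):
    # Before the migration this would typically have returned
    # {"//command_line_option:cpu": "aarch64"}.
    return {
        "//command_line_option:platforms": "//platforms:linux_aarch64",
    }

to_linux_aarch64 = transition(
    implementation = _to_linux_aarch64_impl,
    inputs = [],
    outputs = ["//command_line_option:platforms"],
)
```

The transition is attached in the usual way, for example as cfg = to_linux_aarch64 on a rule attribute.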

Migration after the flag is flipped:

  1. Replace cpu- and os-based selects with platform-based selects (for example, selects on cc_target_os, target_cpu, and others)
  2. Remove --crosstool_top, --cpu flags from bazelrcs and from Starlark configuration transitions
  3. Remove part of platform_mappings configuration needed for C++
  4. Remove _cc_toolchain implicit dependency
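
As a rough illustration of step 1 in the list above, a select keyed on the legacy --cpu flag can be rewritten to key on platform constraints. The target names (legacy_arm64, is_linux_arm64, foo) and the source file names are hypothetical; the constraint labels are the ones from the canonical platforms repository.

```starlark
# BUILD (hypothetical package)

# Before: the condition depends on the legacy --cpu flag.
config_setting(
    name = "legacy_arm64",
    values = {"cpu": "aarch64"},
)

# After: the same condition expressed as platform constraints.
config_setting(
    name = "is_linux_arm64",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:aarch64",
    ],
)

cc_library(
    name = "foo",
    srcs = ["foo.cc"] + select({
        ":is_linux_arm64": ["foo_arm64.cc"],
        "//conditions:default": ["foo_generic.cc"],
    }),
)
```

Once every select uses the constraint-based setting, the legacy config_setting can be deleted.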

Additional details

1. Add C++ platform definitions, register C++ toolchains

Please use constraint_settings and constraint_values from the canonical Bazel
Platforms Repository (don't hesitate to propose missing targets!). If there is a
need for C++-specific constraints, feel free to upload a PR to the rules_cc
repository. It is extremely important that the whole ecosystem uses the same
constraints: we can only reuse libraries and toolchains when we speak the same
language.
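
A minimal sketch of what step 1 might look like for a cross-compiling setup, assuming a hypothetical //toolchains package and a hypothetical cc_toolchain target named my_aarch64_cc_toolchain defined elsewhere in that package. Only the @platforms constraints and the C++ toolchain type label are real names.

```starlark
# toolchains/BUILD (hypothetical package)

# The platform being built for.
platform(
    name = "linux_aarch64",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:aarch64",
    ],
)

# Declares when the cc_toolchain may be selected.
toolchain(
    name = "cc_toolchain_linux_aarch64",
    exec_compatible_with = [
        "@platforms//os:linux",
        "@platforms//cpu:x86_64",
    ],
    target_compatible_with = [
        "@platforms//os:linux",
        "@platforms//cpu:aarch64",
    ],
    toolchain = ":my_aarch64_cc_toolchain",  # hypothetical cc_toolchain target
    toolchain_type = "@bazel_tools//tools/cpp:toolchain_type",
)

# WORKSPACE: make the toolchain visible to toolchain resolution.
register_toolchains("//toolchains:cc_toolchain_linux_aarch64")
```

A build would then select the platform with --platforms=//toolchains:linux_aarch64.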

2. Add missing C++ toolchain requirements to rules

Owners of Starlark rules that depend on the C++ toolchain will need to
declare a dependency on the C++ toolchain type.

Before:

foo = rule(
    implementation = _foo_impl,
    attrs = {
        "_cc_toolchain": attr.label(default = Label("@bazel_tools//tools/cpp:current_cc_toolchain")),
    },
)

After:

foo = rule(
    implementation = _foo_impl,
    attrs = {
        "_cc_toolchain": attr.label(default = Label("@bazel_tools//tools/cpp:current_cc_toolchain")),
    },
    toolchains = use_cpp_toolchain(),
)

See the docs and use @rules_cc//cc:find_cc_toolchain.bzl (if using Bazel >= 0.27) or @bazel_tools//tools/cpp:toolchain_utils.bzl (otherwise) to locate the current C++ toolchain. Also see the examples for general usage.
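
For reference, here is a sketch of how the pieces might fit together in a .bzl file. It assumes a Bazel version whose @bazel_tools//tools/cpp:toolchain_utils.bzl exports both find_cpp_toolchain and use_cpp_toolchain; the rule name foo and the file name are hypothetical.

```starlark
# foo.bzl (hypothetical file)
load(
    "@bazel_tools//tools/cpp:toolchain_utils.bzl",
    "find_cpp_toolchain",
    "use_cpp_toolchain",
)

def _foo_impl(ctx):
    # Returns the current C++ toolchain whether or not toolchain
    # resolution is enabled, so the rule works before and after the flip.
    cc_toolchain = find_cpp_toolchain(ctx)

    # ... pass cc_toolchain to cc_common.compile / cc_common.link here ...
    return []

foo = rule(
    implementation = _foo_impl,
    attrs = {
        # Still needed while the legacy (non-resolution) path is supported.
        "_cc_toolchain": attr.label(
            default = Label("@bazel_tools//tools/cpp:current_cc_toolchain"),
        ),
    },
    toolchains = use_cpp_toolchain(),
)
```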

Note that this flag will supersede and replace the previous --enabled_toolchain_types flag, which was introduced before the incompatible change policy was formulated, and which was only ever used for the cc rules anyway.

People who are interested: @hlopko, @lberki, @nlopezgi.

Will native transitions like

splitOptions.get(BuildConfiguration.Options.class).cpu = androidOptions.cpu;
also need attention?

Yes, @aragos and I have been discussing a plan. We will probably not be able to flip this flag until we have a solution, so I am removing the "Will be flipped in" tag.

Please do not assign issues to more than one team

When testing with this flag on in rules_go at dd527c7d with Bazel 0.26.0rc8 on macOS, I'm seeing this error:

$ bazel build --incompatible_enable_cc_toolchain_resolution tests/legacy/examples/cgo:sub
> > Loading: 
Loading: 0 packages loaded
Analyzing: target //tests/legacy/examples/cgo:sub (0 packages loaded, 0 targets configured)
INFO: Analyzed target //tests/legacy/examples/cgo:sub (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
bazel: Entering directory `/private/var/tmp/_bazel_jayconrod/95575c789b578fe26b1744a32454af42/execroot/io_bazel_rules_go/'
[0 / 2] [Prepa] BazelWorkspaceStatusAction stable-status.txt
ERROR: /Users/jayconrod/go/src/github.com/bazelbuild/rules_go/tests/legacy/examples/cgo/BUILD.bazel:29:1: GoCompilePkg tests/legacy/examples/cgo/darwin_amd64_stripped/sub%/github.com/bazelbuild/rules_go/examples/cgo/sub.a failed (Exit 1) builder failed: error executing command bazel-out/host/bin/external/go_sdk/builder compilepkg -sdk external/go_sdk -installsuffix darwin_amd64 -src tests/legacy/examples/cgo/sub/floor.go -importpath ... (remaining 19 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox
ld: framework not found UIKit
clang: error: linker command failed with exit code 1 (use -v to see invocation)
compilepkg: error running subcommand: exit status 1
bazel: Leaving directory `/private/var/tmp/_bazel_jayconrod/95575c789b578fe26b1744a32454af42/execroot/io_bazel_rules_go/'
Target //tests/legacy/examples/cgo:sub failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.488s, Critical Path: 0.29s
INFO: 0 processes.
FAILED: Build did NOT complete successfully
FAILED: Build did NOT complete successfully

It looks like -framework UIKit is being added to the C++ link flags. This target doesn't set link flags on its own, so I think this is coming from cc_common. Is that expected?

I'm investigating the UIKit issue.

Yay for surprises :) So the osx C++ toolchain doesn't yet work with platforms because of this TODO. @katre kindly volunteered to fix it :)

--action_env=BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN=1 cannot be used with --incompatible_enable_cc_toolchain_resolution.
I was running some additional tests with --incompatible_enable_cc_toolchain_resolution on RBE (although I don't think this has to do with RBE), and I was unsure whether the flag was working as I expected (specifically, that it was using the remote cc_toolchain I indicated instead of the local one). To check, I tried adding BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN=1, but my build failed with the following error:

INFO: ToolchainResolution: Selected execution platform @rbe_default//config:platform, type @bazel_tools//tools/jdk:toolchain_type -> toolchain @bazel_tools//tools/jdk:dummy_toolchain
ERROR: /root/.cache/bazel/_bazel_root/7e958634aed2e0b9513fa7cce861a282/external/local_config_cc/BUILD:28:1: in cc_toolchain_suite rule @local_config_cc//:toolchain: cc_toolchain_suite '@local_config_cc//:toolchain' does not contain a toolchain for cpu 'k8'

Note that without the BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN flag my builds run fine, and when I check with --toolchain_resolution_debug it does seem that my remote builds are picking the toolchain I want them to pick and not the local one. But perhaps this means that it's time to remove support for BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN and provide some other way to not register local_config_cc (i.e., I really want to be able to run builds with local_config_cc disabled altogether, so that I can be super sure all actions can only run remotely with the toolchain config I explicitly set).

@laurentlb I think the migration-0.25 tag should be removed from this issue because of this issue that was cherry picked into 0.26 #8330

@nlopezgi In the error message I see cc_toolchain_suite, that shouldn't be used at all with --incompatible_enable_cc_toolchain_resolution, is it possible that you were building the cc_toolchain_target explicitly? The expected error message is:

ERROR: While resolving toolchains for target //:too: no matching toolchains found for types @bazel_tools//tools/cpp:toolchain_type

Can you describe a repro case if there's still a bug?

I just uploaded #8459, which AFAIK removes the need for the BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN variable (but I kept it around for cases like the one you mention, where you want to be super sure local_config_cc doesn't register any toolchain). The original use case, preventing Bazel from autoconfiguring the local C++ toolchain when remote toolchains are fully configured, is fixed by #8459 (toolchain targets will still be generated, but the system will only be inspected after the C++ toolchain is already selected).

Hi Marcel,
You can repro this error using as base this GCB build script which tests a very simple cc target (//examples/remotebuildexecution/hello_world/cc:say_hello_test, i.e., I am not building the cc_toolchain_target)
The GCB yaml file shows all the flags needed to repro. You will need to change the --remote_instance_name to point to some instance of RBE you have access to (or maybe do a local build inside the bazel container, it probably also can be repro'd there?).

The error reproduces when building with --action_env=BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN=1 (build log)
but does not happen when --action_env=BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN=1 is omitted (build log).
See also my comment in #8459 about how this will impact remote execution.

Hi @nlopezgi, the problem with BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN was fixed by #8459. Thanks for reporting!

To be explicit - I don't think we need to postpone flipping this flag because of BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN bug - users can still migrate incrementally in 0.26 (with --incompatible_enable_cc_toolchain_resolution disabled), and they will get the correct behavior in 0.27 (with --incompatible_enable_cc_toolchain_resolution enabled).

We found a case with master / 0.27.0 where this doesn't work for iOS builds #8716

Removing the bazel 1.0 label and extending the migration window.

@katre Hello :) Is this still happening? Seems like 2.x is no longer a migration window for this flag, is this expected?

I actually don't care (or know) so much about this flag itself, but I just noticed that we have a special case for this flag in the bazel_bootstrap_distfile_test, which causes the bootstrap to run twice: https://github.com/bazelbuild/bazel/blob/master/src/test/shell/bazel/bazel_bootstrap_distfile_test.sh#L76

This is too expensive for CI (it takes over 10 minutes on Windows). Is there a cheaper way to test this rather than running a full bootstrap (which builds Bazel twice, serially)?

I'll investigate.

So we're in a tricky situation:

  1. The Configurability team absolutely wants to flip this flag.
  2. The Rules-C++ team member who was championing this is no longer on the team.
  3. We didn't flip this for Bazel 1.0 because of the follow-on problems with Android and iOS rules.
  4. No one is championing those either.

I'm working on a plan to get these addressed and flipped, but it won't be near-term. So to alleviate CI pain, is it possible to disable the test normally, and run it once per day or so, just to catch any regressions?

(CC @aiuto and @lberki in case I am mis-stating the case for flipping this flag)

As a note, we don't appear to be testing this with the migration test pipeline, is that intentional? See https://buildkite.com/bazel/bazelisk-plus-incompatible-flags/builds/393

We're close to making this possible technically (i.e. all the supporting C++ work is done). The big challenge is 3) above: how to not inadvertently break depending projects that aren't in C++ and don't themselves understand platforms.

We have an interim solution. But as John says, we need someone to guide the process so that projects that might be affected don't get caught off guard without knowing what to do. Ideally this just involves defining a couple of strategic platform mappings in a common place.

This test is basically the only reason now why our presubmits are 17 instead of 12 minutes and I don't see any way how to make it faster.

I read everything I could find about this flag and I still don't understand why we have to run a full Bazel bootstrap using compile.sh as part of our tests to verify that this flag works.

Couldn't we alternatively build e.g. a helloworld.cc with this flag enabled to ensure that it works?

@philwo : how does that test make our presubmits 5 minutes slower? My mental model is that we need to do at least one bootstrap test and if we do one, doing two in parallel takes the same amount of wall time.

@katre : in order to test this with incompatible flags, the migration-* tag needs to be on the bug. I think we just simply forgot to add them.

@katre: remind me again, what's the problem with (3)? I thought platform mapping makes this a no-op for Android and iOS and the only thing that needs to be done after the flag flip is to change their select() statements.

That said, we should totally have well-known platform definitions to replace the well-known --cpu values.

@lberki We are hardware-limited on CI, especially on macOS. Anything we can do to cause the tests to use less hardware resources has massive impact. For example, with my "fix our test framework to not extract 500 JDKs per run" change, I moved us from being so insanely I/O bound that the only way to run our tests was in a RAM disk to being no longer I/O bound at all. Thus we were able to get rid of RAM disks (= switch from highmem to cheaper standard machines) and get rid of the local SSDs (save money and complexity, allows use of machines with faster CPUs that do not have SSDs available).

Another little two-line fix then got our postsubmit on Linux down from 50 to 35 minutes. This high impact is only possible because our code and tests eat resources like there's no tomorrow.

This test is the top 1 on my list of "slowest tests on the CI".

A Bazel bootstrap easily occupies the entire machine's capacity (all cores, lots of memory, much I/O). While it does so, the processing power spent on this is not available for other tests. Thus, by running two bootstraps, you occupy the machine's entire capacity twice for the duration of the test. There's nothing gained by running them in parallel - a Bazel bootstrap already is "parallel", because it's Bazel (which is quite good at parallel execution ^^) building itself.

I think we should just remove this test_bootstrap_with_cc_rules_using_platforms test. I talked with Marcel about it, and this test is not to test that the flag works, but it's to test that our code base can still be built with that flag on. That's nice, but that's exactly why we have the Bazelisk migration pipeline on CI, which runs in the night, when using a lot of capacity is fine.

WDYT? If you agree, I'll go ahead and remove the test, so that we just rely on the Bazelisk pipeline. If not, I'd disable it for presubmit, but note that postsubmits aren't free either - we only have so many Macs and while they're running a postsubmit test, they can't run a presubmit.

I was thinking we could run the two bootstraps in two different machines and take less wall time that way.

As long as bootstrapping Bazel with the flag set is tested by the incompatible flag pipeline, I'm fine with removing the test. Does it run the bootstrap with the incompatible flags set, though? I'm not sure what happens with the "inner" Bazel instances (the ones that are being tested, not the one that runs the tests)

We cannot just run the two bootstraps on different machines, because we simply do not have enough Mac machines. 💸😀

If this flag is marked as migration ready, then it will be included in the "Bazelisk + Incompatible flags" pipeline. I just verified what that actually does. In that case, it would run a bazel build --incompatible_enable_cc_toolchain_resolution //src:bazel where bazel is the latest release and the source code it tries to build is Bazel HEAD. Which means, it verifies that the latest Bazel release with that flag flipped can still build Bazel from scratch.

Unless there's some specific reason why we have to test this flag using ./compile.sh, it seems like this is exactly what we want to test. WDYT?

SGTM. As long as the code path is being tested regularly to prevent rot, I am fine with this.

Sounds like a plan. AFAIU the bootstrapping process relies on Bazel itself to build C++ code and the only part that's special-cased is Java compilation, so it should be all fine.

The most recent status here is that the flag is ready to flip, except that Android and Apple rules don't handle platforms well, so there's no clear way to manage those transitions. Configurability is working with the Android rule owners to fix those rules, and then we'll look into Apple rules when that's in place.

Hello everyone. I've observed the toolchain resolution mechanism ignores the --platforms and --host_platform command line options. The case is the following: I wish to select MSVC or clang-cl on Windows from the command line. I've defined two platforms in my BUILD:

platform(
    name = "x64_windows-clang-cl",
    constraint_values = [
        "@platforms//cpu:x86_64",
        "@platforms//os:windows",
        "@bazel_tools//tools/cpp:clang-cl",
    ],
)

platform(
    name = "x64_windows-msvc",
    constraint_values = [
        "@platforms//cpu:x86_64",
        "@platforms//os:windows",
        "@bazel_tools//tools/cpp:msvc",
    ],
)

And registered both in WORKSPACE:

register_execution_platforms(
    ":x64_windows-msvc",
    ":x64_windows-clang-cl",
)

register_toolchains(
    "@local_config_cc//:cc-toolchain-x64_windows-clang-cl",
)

The order in register_execution_platforms does matter: Bazel just takes the first platform and looks for a toolchain that satisfies all the constraints defined by that platform. If there is such a toolchain, Bazel just uses it. And if I wish to use the second platform and run:

$ bazel build ... --incompatible_enable_cc_toolchain_resolution --host_platform=//:x64_windows-clang-cl --platforms=//:x64_windows-clang-cl

Bazel just ignores it and still uses x64_windows-msvc as the execution platform.

Maybe it is because toolchain resolution depends on the execution platform, not on the host or target one, while the --host_platform and --platforms command line options are for the host and target platforms respectively.

Now I see only one workaround: not to register execution platforms in WORKSPACE but to use the --extra_execution_platforms option instead. When one wishes to set a config_setting for the platform, the --platforms option should also be used.

Should I open a new issue about this?

@katre thank you for the advice to open an issue and provide a small project to reproduce. I've opened issue #11522 and put a minimum possible project to demonstrate the observed behavior there: https://github.com/samolisov/bazel-cc-platform-demo Please, have a look. Thank you.

Hi!
I spent a bit of time trying to transition our monorepo to the new toolchain resolution for C++, as we are cross-compiling from Windows to Linux and Stadia using a custom toolchain definition. Before I file an issue: looking at the code, I see that the code that would stop us from pulling in the legacy cc_configure is commented out:
// TODO(hlopko): Uncomment once Bazel tests pass with --all_incompatible_changes
// } else if (semantics.incompatibleUseCcConfigureFromRulesCc()) {
//   suffix = ruleClassProvider.getDefaultWorkspaceSuffix();
Is this issue stale?

Can you give a file and line number for that comment (or link directly to it on https://cs.opensource.google/bazel/bazel)?

Are you having actual errors? We haven't fully enabled this because of issues with Android and Apple builds not correctly using platforms and toolchains, but pure C++ should work (assuming your toolchains are defined and registered properly).

If you are having an actual error, please open a new issue. If you can add steps to reproduce, that will make debugging much simpler.

This is in: https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/skyframe/WorkspaceASTFunction.java
While putting together a small repro for you, I found that the issue was on my side, since we don't rely on the toolchains produced by cc_configure anyway.
I now have everything working, but I was wondering what the path forward is for our dependencies that rely on the cpu flag to set copts or different includes, for example:
https://boringssl.googlesource.com/boringssl/+/refs/heads/master/util/BUILD.toplevel
https://github.com/grpc/grpc/blob/master/third_party/cares/cares.BUILD
https://github.com/abseil/abseil-cpp/blob/master/absl/BUILD.bazel
The abseil example seems to be mixing constraints and old-style cpu flags. So I'm not 100% sure whether constraints are deduced automatically in some contexts, which would make it backward compatible to go ahead and upstream changes that convert all those configs to constraints rather than relying on platform_mappings.

You will need to use a platform_mapping file to correctly set the --cpu flag from your platform (and other flags, of course).

I was going to point you to the platform mapping docs, then discovered that I had accidentally deleted them (see #11581). So read the older docs, from release 3.1.0, which I will restore shortly.

For other people reading this, cc_configure is already migrated for incompatible_enable_cc_toolchain_resolution and should be working correctly. If it doesn't, it's a new bug, so file an issue and we'll look into it.

The line of code @jelmansouri referenced is not related to this incompatible flag, it's related to our effort to move cc_configure out of the Bazel binary and into rules_cc repository, and that is orthogonal to toolchains/platforms migration.

Thanks @hlopko for the clarification. I managed to turn on the incompatible_enable_cc_toolchain_resolution flag and have everything compiling without any issues (given I have a platform_mappings file).

This has been downgraded since we're stalled on it. We'll be unblocked when we can finish migrating Android builds to use toolchain resolution, which is being tracked as #11749. We may also need to migrate Apple/iOS rules before we can flip this, too, but we'll know more once we're further with the Android effort.

I flipped --incompatible_enable_cc_toolchain_resolution today with 4.0.0-rc7 (essentially), provided my own toolchains, removed the --crosstool_top flag, and promptly got the following:

$ BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN=1 bazel build -c opt --platforms=@aos//tools/platforms:linux_aarch64 //...
ERROR: /home/austin/.cache/bazel/_bazel_austin/71ec74b5bb538661db62db908afb04ce/external/local_config_cc/BUILD:29:19: in cc_toolchain_suite rule @local_config_cc//:toolchain: cc_toolchain_suite '@local_config_cc//:toolchain' does not contain a toolchain for cpu 'aarch64'
ERROR: Analysis of target '//build:docker_rootfs' failed; build aborted: Analysis of target '@local_config_cc//:toolchain' failed
INFO: Elapsed time: 0.836s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded, 0 targets configured)

We don't have an Android or iOS build, so I would have thought I could flip it without waiting. Happy to file another issue, but this seemed relevant.

You said you provided your own toolchains? Try using --toolchain_resolution_debug to see what toolchains are being considered and why they aren't being used.

Heavily redacted due to all the python and go toolchains (like 100 lines of noise...). Happy to provide an un-filtered dump if you want.

INFO: ToolchainResolution: Target platform @aos//tools/platforms:linux_x86: Selected execution platform @local_config_platform//:host, type @bazel_tools//tools/cpp:toolchain_type -> toolchain @llvm_toolchain//:cc-clang-linux-k8
INFO: ToolchainResolution:   Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: execution @local_config_platform//:host: Selected toolchain @llvm_toolchain//:cc-clang-linux-k8
INFO: ToolchainResolution:     Type @bazel_tools//tools/python:toolchain_type: execution platform @local_config_platform//:host: Rejected toolchain @io_bazel_rules_docker//toolchains:default_container_py_runtime_pair; mismatching values: run_in_container
INFO: ToolchainResolution: Target platform @aos//tools/platforms:linux_aarch64: Selected execution platform @io_bazel_rules_docker//platforms:local_container_platform, type @bazel_tools//tools/python:toolchain_type -> toolchain @io_bazel_rules_docker//toolchains:default_container_py_runtime_pair, type @bazel_tools//tools/cpp:toolchain_type -> toolchain @llvm_toolchain//:cc-clang-linux-aarch64
INFO: ToolchainResolution: Target platform @aos//tools/platforms:linux_aarch64: Selected execution platform @local_config_platform//:host, type @bazel_tools//tools/python:toolchain_type -> toolchain @bazel_tools//tools/python:_autodetecting_py_runtime_pair, type @bazel_tools//tools/cpp:toolchain_type -> toolchain @llvm_toolchain//:cc-clang-linux-aarch64
INFO: ToolchainResolution:   Type @bazel_tools//tools/cpp:toolchain_type: target platform @aos//tools/platforms:linux_aarch64: execution @io_bazel_rules_docker//platforms:local_container_platform: Selected toolchain @llvm_toolchain//:cc-clang-linux-aarch64
ERROR: /home/austin/.cache/bazel/_bazel_austin/71ec74b5bb538661db62db908afb04ce/external/local_config_cc/BUILD:29:19: in cc_toolchain_suite rule @local_config_cc//:toolchain: cc_toolchain_suite '@local_config_cc//:toolchain' does not contain a toolchain for cpu 'aarch64'

Toolchain resolution succeeded for C++, and the correct toolchains were selected. But, for some reason, the default local_config_cc is still being looked at, maybe because it is the default for --crosstool_top or something like that. This ends up being a problem because I'm cross-compiling for aarch64, and local_config_cc isn't going to be able to find a compiler for that.

Yes, I'm not sure why @local_config_cc is being loaded, please file a separate issue for this.

What is the new way to select a specific toolchain without crosstool_top if multiple toolchains apply? I found documentation and examples covering a different toolchain for a different platform, but not different toolchains for the same platform.
The use case that I am trying to model is something like:

  • Host: Windows, x64
  • Target: Windows, x64
  • Toolchains: MSVC toolchain, gcc toolchain, clang toolchain

One of the toolchains is the one that will be used to deliver the executable but the idea is to use the other ones in the CI to make sure that the code is not relying on compiler specific features.

With the new way I would declare a platform that is Windows x64 and then I would declare the 3 toolchains. I would register the toolchains but then the missing part is how I specify which toolchain I should use. This is what I would have specified with --crosstool_top
The only option that I see would be to not register any toolchain at all, and then on the command line provide the --extra_toolchains with the specific toolchain to be used.

Then I could optionally add it to the bazelrc in order to have all the magic done with only --config=gcc:
build:gcc --platforms=//platforms:windows_x64 --extra_toolchains=//toolchains:gcc
build:clang --platforms=//platforms:windows_x64 --extra_toolchains=//toolchains:clang
build:msvc --platforms=//platforms:windows_x64 --extra_toolchains=//toolchains:msvc

Is this the way to go? Is there a better way? Thanks

Bonus: In case this is the way to go, would this have the problem that the local_config_cc would still be used due to the bug #12712? Thanks

The best approach would be to continue to use the --compiler flag, with the new target_settings attribute on the toolchain rule, like this:

config_setting(
    name = "is_msvc",
    values = {"compiler": "msvc"},
)
config_setting(
    name = "is_clang",
    values = {"compiler": "clang"},
)
config_setting(
    name = "is_gcc",
    values = {"compiler": "gcc"},
)

toolchain(
    name = "msvc_toolchain",
    toolchain = "msvc_cc_toolchain",
    toolchain_type = "@bazel_tools//tools/cpp:toolchain_type",
    exec_compatible_with = ...,
    target_compatible_with = ....,
    target_settings = [":is_msvc"],
)
toolchain(
    name = "clang_toolchain",
    toolchain = "clang_cc_toolchain",
    toolchain_type = "@bazel_tools//tools/cpp:toolchain_type",
    exec_compatible_with = ...,
    target_compatible_with = ....,
    target_settings = [":is_clang"],
)
toolchain(
    name = "gcc_toolchain",
    toolchain = "gcc_cc_toolchain",
    toolchain_type = "@bazel_tools//tools/cpp:toolchain_type",
    exec_compatible_with = ...,
    target_compatible_with = ....,
    target_settings = [":is_gcc"],
)
toolchain(
    name = "default_toolchain",
    toolchain = "clang_cc_toolchain", # Or whichever is the default
    toolchain_type = "@bazel_tools//tools/cpp:toolchain_type",
    exec_compatible_with = ...,
    target_compatible_with = ....,
    # No target_settings present
)

This will allow you to pass --compiler=msvc to select the msvc toolchain, etc, while still respecting your other constraints for the execution and target platform.

Thanks a lot @katre, I like your approach a lot. I will try it out and recommend it around; I find it very elegant.

@katre @philwo do we want to do this for 5.0, or should it be postponed again?

Postpone. We're making progress in Q4, but we're not ready for this to be flipped yet.

Thanks for the update!

@katre in #7260 (comment) you wrote about using target_settings to select the right compiler. How would this work when different compilers should be used on the host/exec and target platform?

Example:

  • Host and target platform constraints are for both @platforms//cpu:x86_64 and @platforms//os:linux
  • For the exec/host platform gcc compiler should be used
  • For the target platform clang should be used

I'm assuming you have a good reason for this, so the thing you need to do is use the --compiler and --host_compiler flags to indicate that clang is the main compiler but gcc should be used for host/exec. (I'm also assuming you are using toolchain resolution, if not then see the docs for cc_toolchain_suite).

Then, you can define a set of config_setting targets and use them in toolchain, like this:

config_setting(
    name = "use_clang",
    values = {"compiler": "clang"},
)
config_setting(
    name = "use_gcc",
    values = {"compiler": "gcc"},
)
toolchain(
    name = "clang_toolchain",
    exec_compatible_with = ...,
    target_compatible_with = ....,
    target_settings = [":use_clang"],
    toolchain_type = "@bazel_tools//tools/cpp:toolchain_type",
    toolchain = ":actual_clang_cc_toolchain",
)
toolchain(
    name = "gcc_toolchain",
    exec_compatible_with = ...,
    target_compatible_with = ....,
    target_settings = [":use_gcc"],
    toolchain_type = "@bazel_tools//tools/cpp:toolchain_type",
    toolchain = ":actual_gcc_cc_toolchain",
)

This way, toolchain resolution will know which one to prefer based on the compiler flag for the given configuration.

@katre I tried your proposal but ran into some CROSSTOOL / toolchain resolution issues.

As we are using toolchain resolution, we run into the problem that the compiler is not defined in cc_toolchain_suite. My understanding is that --cpu and --compiler will be deprecated once the migration to platforms is done.

bazelisk build --compiler="gcc" //apps/app1-c:app1-c

bazel build --compiler="gcc" //apps/app1-c:app1-c
ERROR: /home/build/.cache/bazel/_bazel_build/a207a0a4f446163b8764f61791b0b536/external/local_config_cc/BUILD:28:19: in cc_toolchain_suite rule @local_config_cc//:toolchain: cc_toolchain_suite '@local_config_cc//:toolchain' does not contain a toolchain for cpu 'k8' and compiler 'gcc'.
ERROR: /home/build/.cache/bazel/_bazel_build/a207a0a4f446163b8764f61791b0b536/external/local_config_cc/BUILD:28:19: Analysis of target '@local_config_cc//:toolchain' failed
ERROR: Analysis of target '//apps/app1-c:app1-c' failed; build aborted:
INFO: Elapsed time: 0.267s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded, 0 targets configured)

Regarding this issue I am wondering how it could work with the --compiler and --host_compiler flags.

Oh, because the rules currently work both ways, both need to be defined (because the decision of which to follow happens too late to not read the data from the "wrong" side). So you need to have matching toolchain and cc_toolchain_suite setup, unfortunately.

@katre is there any ticket to follow this? I mean that for some situations cc_toolchain_suite is still needed?

No, we don't have any plans to fix this short of enabling toolchain resolution and removing the legacy support. You can file a feature request for this and the C++ team can prioritize it, if you'd like.

@comius Should this block 6.0 release cut?

It looks like this flag was flipped, but my comments above from 2019 still apply, so this breaks all uses of the Apple rules. It seems unfortunate to require all users to flip this flag off, although I get the desire to push this forward since it clearly hasn't been moving in the last few years. Is the recommendation that all downstream Apple users disable this flag?

Apple/Android builds failing was our rationale for not flipping this flag earlier.

@comius @lberki what's your current thinking on that?

It looks like target_settings works in certain contexts and not others. Modifying the above example:

This just doesn't work! It doesn't matter what flavor is set to, you always get the "gcc" toolchain.

But if I use "compiler" in my config_settings, then it works.

!@#$%

load("@bazel_skylib//rules:common_settings.bzl", "string_flag")

string_flag(
    name = "flavor",
    build_setting_default = "gcc",
    values = [
        "gcc",
        "clang",
    ],
)

config_setting(
    name = "use_gcc",
    flag_values = {":flavor": "gcc"},
)

config_setting(
    name = "use_clang",
    flag_values = {":flavor": "clang"},
)

toolchain(
    name = "clang_toolchain",
    exec_compatible_with = ...,
    target_compatible_with = ....,
    target_settings = [":use_clang"],
    toolchain_type = "@bazel_tools//tools/cpp:toolchain_type",
    toolchain = ":actual_clang_cc_toolchain",
)

toolchain(
    name = "gcc_toolchain",
    exec_compatible_with = ...,
    target_compatible_with = ....,
    target_settings = [":use_gcc"],
    toolchain_type = "@bazel_tools//tools/cpp:toolchain_type",
    toolchain = ":actual_gcc_cc_toolchain",
)

For tracking:
cc @meteorcloudy

Failures: https://buildkite.com/bazel/bazelisk-plus-incompatible-flags/builds/1339

To fix:

The rules_foreign_cc breakage is caused by the rules_android_ndk breakage: the only thing broken there is one of the examples that uses rules_android_ndk.

Hi all, confirming if we still plan to flip this flag in 7.0?

Should this issue be closed since the flag is now flipped?

Closing the issue when the flag is removed. Or perhaps leaving it open, so that users can more easily find migration instructions?