VSCode/Brackets + Santa cause Xcode autoconfiguration to fail on macOS #4603

Closed
vicb opened this issue Feb 8, 2018 · 34 comments

@vicb

vicb commented Feb 8, 2018

Description of the problem / feature request:

I got an error when trying to build Angular (I'm on the Angular team, berchet@).

The error is:

$ bazel test packages/core/test/render3
ERROR: /private/var/tmp/_bazel_berchet/136114fe9f12514cf56a27652da0b4c4/external/local_config_cc/BUILD:50:5: in apple_cc_toolchain rule @local_config_cc//:cc-compiler-darwin_x86_64: Xcode version must be specified to use an Apple CROSSTOOL
ERROR: Analysis of target '//packages/core/test/render3:render3' failed; build aborted: Analysis of target '@local_config_cc//:cc-compiler-darwin_x86_64' failed; build aborted
INFO: Elapsed time: 2.277s
FAILED: Build did NOT complete successfully (34 packages loaded)
ERROR: Couldn't start the build. Unable to run tests

Note: I had this working in the past few weeks (& using multiple times a day).

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

On a gMac:

  1. clone the angular repo at https://github.com/angular/angular
  2. run yarn in the root folder to install dependencies
  3. execute bazel test packages/core/test/render3 at the root

yarn and bazel must be installed via brew

What operating system are you running Bazel on?

This seems to be the issue: four of us ran into this bug after a High Sierra 10.13.3 update.

What's the output of bazel info release?

 bazel version
Build label: 0.9.0-homebrew
Build target: bazel-out/darwin-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Sun Jul 12 12:24:01 +49936 (1513677414241)
Build timestamp: 1513677414241
Build timestamp as int: 1513677414241

Have you found anything relevant by searching the web?

I have searched the web and asked my teammates. Miscellaneous findings:

  • rebooting helps,
  • xcode-select -s /Applications/Xcode.app/Contents/Developer before rebooting seems to help,
  • when you are in a working state, never ever run bazel clean, or you could end up in the broken state.

The error seems to originate in this BUILD file:

# /private/var/tmp/_bazel_berchet/136114fe9f12514cf56a27652da0b4c4/external/local_config_xcode/BUILD

package(default_visibility = ['//visibility:public'])

xcode_config(name = 'host_xcodes')
# Error: Invoking xcode-locator failed, return code 256, stderr: java.io.IOException: Cannot run program "/private/var/tmp/_bazel_berchet/136114fe9f12514cf56a27652da0b4c4/external/local_config_xcode/./xcode-locator-bin" (in directory "/private/var/tmp/_bazel_berchet/136114fe9f12514cf56a27652da0b4c4/external/local_config_xcode"): error=1, Operation not permitted, stdout: 

Executing /private/var/tmp/_bazel_berchet/136114fe9f12514cf56a27652da0b4c4/external/local_config_xcode/./xcode-locator-bin from the CLI returns

{
	"9.2.0": "/Applications/Xcode.app/Contents/Developer",
	"9.2": "/Applications/Xcode.app/Contents/Developer",
	"9": "/Applications/Xcode.app/Contents/Developer",
}

permissions are -rwxr-xr-x

EDIT: Turned out to be an interaction with VSCode, see the fix below

@vicb
Author

vicb commented Feb 8, 2018

Just tried to update bazel to

Build label: 0.10.0-homebrew
Build target: bazel-out/darwin-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Wed Jan 10 02:02:06 +50057 (1517480013726)
Build timestamp: 1517480013726
Build timestamp as int: 1517480013726

I get the same exact error (after a bazel clean)

@sergiocampama
Contributor

I've seen this multiple times before, but haven't been able to reproduce consistently.

I believe there's some sort of global state that prevents xcode-locator-bin from being invoked within java. If you cat /private/var/tmp/_bazel_berchet/136114fe9f12514cf56a27652da0b4c4/external/local_config_xcode/BUILD you'll see the error.

Can you try this just to diagnose?

  1. bazel clean --expunge
  2. if 1 doesn't work, reboot
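
The cat check suggested above can also be scripted. The following is only a minimal sketch: the function name find_locator_errors is hypothetical, and it assumes the failure is recorded as a comment line containing the text "xcode-locator failed", as in the BUILD file quoted in the initial report.

```python
from typing import List


def find_locator_errors(build_text: str) -> List[str]:
    """Return comment lines of a generated BUILD file that record an
    xcode-locator failure (hypothetical helper, not part of Bazel)."""
    errors = []
    for line in build_text.splitlines():
        stripped = line.strip()
        # The failure is assumed to be written as a '#' comment, as in the
        # local_config_xcode/BUILD file quoted earlier in this thread.
        if stripped.startswith("#") and "xcode-locator failed" in stripped:
            errors.append(stripped.lstrip("# ").strip())
    return errors
```

To use it, read the text of external/local_config_xcode/BUILD under your Bazel output base (bazel info output_base prints that directory) and pass it to the function. An empty list suggests the autoconfiguration did not record this particular failure.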

@sergiocampama
Contributor

fyi @c-parsons @jmmv

@vicb
Author

vicb commented Feb 8, 2018

@sergiocampama

If you cat /private/[...]/local_config_xcode/BUILD you'll see the error.

True, I even did that in my initial report!

bazel clean --expunge

  1. starting from a working state (after a reboot),
    1a. bazel test ... -> ok
    1b. bazel test ... -> ok
    ...
    1Z. bazel test ... -> ok
  2. bazel clean
  3. bazel test ... -> fails (~3s)
  4. bazel clean --expunge
  5. bazel test ... -> fails (~25s)

It takes longer to fail which doesn't look like an improvement to me ;)

@sergiocampama
Contributor

My bad, I missed that part of your report :)

Yeah, that matches my experience: after a reboot everything works, but once you clean, it gets messed up. I believe that bazel clean just removes the output directories, but bazel clean --expunge deletes all of the execution root stuff, which is why it takes longer to start.

For now the only workaround I know of is to reboot.

What I find most interesting about this is that xcode-locator-bin gets called twice, once from xcode_configure.bzl and another time from osx_cc_configure.bzl. By tracing the code and the output of local_config_cc/BUILD, it appears that the call from osx_cc_configure.bzl does indeed work, but the one in xcode_configure.bzl doesn't.

@jmmv
Contributor

jmmv commented Feb 8, 2018

It could be that sandboxing is blocking the execution of xcode-locator-bin, which results in the obscure error written to the BUILD file. Could you maybe try your reproduction steps after sticking:

build --genrule_strategy=standalone
build --spawn_strategy=standalone

into ~/.blazerc and see if you can trigger the problem?

Just a thought. The fact that an invalid BUILD file is generated at all and left behind is concerning.

@sergiocampama
Contributor

I looked into that, but repository_ctx.execute does not use any strategy, it just runs the commands passed through JavaSubprocess, so not sure if a sandbox is used at all for repository execute actions

@vicb
Author

vicb commented Feb 8, 2018

@jmmv I just tried that and it didn't help.

I started from a broken state:

  • bazel test ... -> fails
  • bazel clean
  • bazel test ... -> fails
  • bazel clean --expunge
  • bazel test ... -> fails

@alexeagle
Contributor

Interestingly, I did the High Sierra upgrade on my personal Mac and didn't see this, so it could be something about the Managed Software Update version of Xcode in Google's corp Mac setup.

@alexeagle
Contributor

@vicb @jmmv your instructions said to put the options in .blazerc rather than .bazelrc so you may not have tested what you thought...

@jmmv
Contributor

jmmv commented Feb 8, 2018

@sergiocampama confirmed by code inspection that the xcode-locator-bin process is being executed directly via Java's subprocess facilities and that no sandboxing is involved... so that's not the case.

However, we have just discovered something that may be getting in the way. Think... antivirus software and Santa. No good explanation on what exactly is wrong yet though.

@jmmv
Contributor

jmmv commented Feb 8, 2018

Alright, so it could be that, or... something more crazy: VSCode.

It seems that VSCode holds an open file handle on the xcode-locator-bin file as soon as it's created, which could be blocking executions until whatever VSCode wants to do completes. This may or may not be a problem in itself, or it may be a problem only in combination with Santa.

However, @sergiocampama and I have confirmed that launching and closing VSCode makes the problem appear and vanish consistently.

Add:

"files.exclude": {"bazel-*": true}

to your workspace configuration to ignore the Bazel trees and things will work. (That's why I had never seen this, because I've always had this setting. Removing it made the problem appear pretty quickly.)
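
For anyone who wants to apply that setting from a script, here is a minimal sketch. The helper name with_bazel_excluded is hypothetical; it assumes the workspace settings live in .vscode/settings.json and have already been loaded into a plain dict. Note that VSCode settings files may contain comments, which Python's json module will not parse, so this is only an illustration.

```python
import json


def with_bazel_excluded(settings: dict) -> dict:
    """Return a copy of a VSCode settings dict with "bazel-*" added to
    "files.exclude", leaving any existing exclusions untouched."""
    merged = dict(settings)
    excludes = dict(merged.get("files.exclude", {}))
    excludes["bazel-*"] = True
    merged["files.exclude"] = excludes
    return merged


# Example: merge into existing settings and print the result as JSON.
print(json.dumps(with_bazel_excluded({"editor.tabSize": 2}), indent=2))
```

The function copies rather than mutates its input so that the original settings can still be written back unchanged if the merge is not wanted.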

@alexeagle
Contributor

alexeagle commented Feb 8, 2018 via email

@menny
Contributor

menny commented Feb 9, 2018

I saw something very similar to that error, and I did have VSCode running at the time.
I also have symlink_prefix pointing to a sub-folder under the WORKSPACE root.

@sergiocampama
Contributor

So I have "files.exclude": {"bazel-*": true} set in my VSCode settings, but having VSCode open on the working repository still makes this error appear. I have to close VSCode before I can compile correctly.

@vicb
Author

vicb commented Feb 15, 2018

@sergiocampama "files.exclude": {"bazel-*": true} did help me most of the time. I seldom see the issue now (only after my MBP resumes from sleep it seems)

@alexeagle
Contributor

Several members of the Angular team continue to have this problem occasionally.

alexeagle added a commit to alexeagle/angular that referenced this issue Feb 22, 2018
It causes headaches on MacOS High Sierra, see bazelbuild/bazel#4603
vicb pushed a commit to angular/angular that referenced this issue Feb 22, 2018
It causes headaches on MacOS High Sierra, see bazelbuild/bazel#4603

PR Close #22375
smdunn pushed a commit to smdunn/angular that referenced this issue Feb 28, 2018
It causes headaches on MacOS High Sierra, see bazelbuild/bazel#4603

PR Close angular#22375
leo6104 pushed a commit to leo6104/angular that referenced this issue Mar 25, 2018
It causes headaches on MacOS High Sierra, see bazelbuild/bazel#4603

PR Close angular#22375
@jasonaden

@jmmv This fails more than occasionally. It's at least 4-5 times per week, meaning on average I have to go through the steps to fix this on a daily basis. It would really help if we could get this fixed.

@lfpino
Contributor

lfpino commented Mar 29, 2018

@sergiocampama can you please take a look since Julio is on vacation?

@sergiocampama
Contributor

I believe this is related to some framework that VSCode is built on, because the same problem happened when I tried switching to Brackets. I've also tried Atom and Sublime 3, and those two work correctly.

I don't really know what we can do from this side of the table. We could file a bug against VSCode but I don't expect this to be looked at anytime soon. For my part, I've switched to Sublime 3 and I'm not seeing these errors any more.

@sergiocampama sergiocampama removed their assignment Mar 29, 2018
@jmmv
Contributor

jmmv commented Mar 30, 2018 via email

@Lunarsong

Same issue here. I suspected it might have been the following plugin, but it doesn't seem to be the case :/
https://marketplace.visualstudio.com/items?itemName=DevonDCarew.bazel-code

@alexeagle
Contributor

alexeagle commented Apr 2, 2018 via email

@sergiocampama
Contributor

I've just tried disabling git support ("git.enabled": false) but VSCode still opens the file.

@alexeagle
Contributor

alexeagle commented Apr 2, 2018 via email

@jmmv
Contributor

jmmv commented May 11, 2018

I've continued to look into this without much success so far. Some more observations:

  • It doesn't matter if the symlinks are present within the workspace. I have a project that doesn't cause Bazel to create the bazel-* symlinks (which means there is nothing in the workspace pointing to the output base), and VSCode still insists on opening files in the temporary directory. In fact, it seems to hold open descriptors on a lot of /private/var/tmp.

  • I have tried to run VSCode with extensions disabled (--disable-extensions) and that made no difference.

  • The inability to run xcode-locator-bin remains for multiple seconds after it's triggered.

  • I've tried to SIGSTOP all Code Helper processes while the problem is active in an attempt to freeze the situation, but the xcode-locator-bin eventually becomes runnable anyway.

  • I've tried compiling a custom program by hand in the directory where xcode-locator-bin lives and running it immediately afterwards, and could not reproduce the problem.

  • lsof reports the file as being open in read-only mode only.

  • fs_usage shows a ton of getattrlist and open calls on the file.

  • lsof also reports KQUEUE being used on the directories. Have tried to write a simple program that uses kqueue to monitor a directory and compile code within it... and have been unable to trigger the problem.

  • Have written a trivial test that compiles a C++ program and then immediately tries to race execution with open/closes... without success.
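
The observation that the binary stays unrunnable for several seconds can be measured with a small probe. This is a minimal sketch, not the test described in the list above: the function name time_until_runnable and its parameters are hypothetical, and it assumes the failure surfaces in Python as PermissionError, which is how "Operation not permitted" (errno 1) is reported when launching a process fails.

```python
import subprocess
import time
from typing import List, Optional


def time_until_runnable(cmd: List[str], timeout: float = 60.0,
                        interval: float = 0.5) -> Optional[float]:
    """Retry launching cmd until it can be executed.

    Returns the seconds elapsed before the first successful launch, or
    None if the command was still not permitted when the timeout expired.
    """
    start = time.monotonic()
    while True:
        try:
            # Only the launch matters here; output and exit code are ignored.
            subprocess.run(cmd, stdout=subprocess.DEVNULL,
                           stderr=subprocess.DEVNULL, check=False)
            return time.monotonic() - start
        except PermissionError:
            if time.monotonic() - start >= timeout:
                return None
            time.sleep(interval)
```

Calling it with the full path to xcode-locator-bin right after a bazel clean --expunge and a failed build would give a rough figure for how long the blocked state lasts on a given machine.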

@jmmv jmmv changed the title from "bazel test fails on MacOs" to "VSCode and Brackets cause Xcode autoconfiguration to fail on macOS" May 11, 2018
@jmmv
Contributor

jmmv commented May 11, 2018

Going back to the Santa theory again. I haven't been able to trigger the problem yet with the Santa kernel module unloaded, so I'm increasingly convinced that Santa is messing up the VSCode interaction with these temporary files.

From the Santa logs, grepping for local_config_xcode/xcode-locator-bin after a bazel clean --expunge and doing a build:

[2018-05-11T11:25:22.665Z] I santad: action=WRITE|path=/private/var/tmp/_bazel_jmmv/0726a2d22e6587da2201c0b3821a90e2/external/local_config_xcode/xcode-locator-bin|pid=85027|ppid=85025|process=ld|processpath=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld|uid=86328|user=jmmv|gid=5000|group=eng

That's it. There is just a WRITE entry. No EXEC entry.

At this point, if I keep executing xcode-locator-bin by hand, it will eventually execute successfully after many seconds and Santa logs:

[2018-05-11T11:35:56.825Z] I santad: action=EXEC|decision=ALLOW|reason=UNKNOWN|sha256=6936a9d9d0a11d846aff11b3cc75c08bcb7b4ad4faa37ecd5f726b56039310b7|pid=85920|ppid=1633|uid=86328|user=jmmv|gid=5000|group=eng|mode=M|path=/private/var/tmp/_bazel_jmmv/0726a2d22e6587da2201c0b3821a90e2/external/local_config_xcode/xcode-locator-bin|args=/private/var/tmp/_bazel_jmmv/0726a2d22e6587da2201c0b3821a90e2/external/local_config_xcode/xcode-locator-bin

EXEC claims that the execution was allowed... but the EXEC event is never mentioned when running xcode-locator-bin fails. This could either mean that the kernel is denying executions even before they reach Santa, or that Santa is misbehaving. I favor the latter, but I'm biased. No idea yet really.
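
To check a longer log for the same pattern (WRITE entries without a matching EXEC), something like the following could be used. It is a minimal sketch: the helper names are hypothetical, and the parser assumes every entry looks like the two santad lines quoted above, with fields separated by "|" and no "|" inside any value.

```python
import re
from typing import Dict, Iterable, List, Optional

# Matches "[timestamp] LEVEL santad: key=value|key=value|..."
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+\S+\s+santad:\s+(?P<fields>.*)$")


def parse_santa_line(line: str) -> Optional[Dict[str, str]]:
    """Split one santad log line into a dict of its key=value fields."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = {"timestamp": match.group("ts")}
    for field in match.group("fields").split("|"):
        key, sep, value = field.partition("=")
        if sep:
            record[key] = value
    return record


def actions_for_path(lines: Iterable[str], needle: str) -> List[str]:
    """List the actions logged for entries whose path contains needle."""
    actions = []
    for line in lines:
        record = parse_santa_line(line)
        if record and needle in record.get("path", ""):
            actions.append(record.get("action", ""))
    return actions
```

Feeding it the lines of the Santa log with the needle "local_config_xcode/xcode-locator-bin" returns the sequence of actions for that file. A result containing only WRITE corresponds to the failing case described above.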

@jmmv
Contributor

jmmv commented May 11, 2018

I've filed google/santa#260 with Santa to start a discussion with them.

@jmmv
Contributor

jmmv commented May 11, 2018

Confirmed that Santa is causing the failures (see google/santa#260 (comment)). But also figured that Bazel is acting suspiciously in that it's rebuilding xcode-locator-bin over and over again, which it shouldn't be doing, and that's probably tickling Santa in the wrong way. Will look into this.

@jmmv jmmv changed the title from "VSCode and Brackets cause Xcode autoconfiguration to fail on macOS" to "VSCode/Brackets + Santa cause Xcode autoconfiguration to fail on macOS" May 11, 2018
@jmmv
Contributor

jmmv commented May 29, 2018

Oops, I didn't want to close this just yet, but the commit message didn't do what I intended.

While the Santa issue still remains, I think my recent commit should prevent this problem from happening again. Could anyone try to reproduce by using a HEAD-built Bazel with 5b02559 in it?

@jmmv jmmv reopened this May 29, 2018
@LinboLen

The bazel-out folder really annoys me.

@jmmv
Contributor

jmmv commented Jun 20, 2018

Alright! So the Santa bug was fixed (google/santa#260) and should have rolled out by now. And we did as much as we could on the Bazel side to avoid hitting the bug in the first place (#5196). So I am hoping this is all fixed and can now be closed.

@jmmv jmmv closed this as completed Jun 20, 2018
alexeagle added a commit to alexeagle/rules_nodejs that referenced this issue Feb 4, 2019
bazelbuild/bazel#4603
seems to be fixed so these are convenient
alexeagle added a commit to alexeagle/rules_nodejs that referenced this issue Feb 5, 2019
Now that bazelbuild/bazel#4603 is resolved, it's convenient to have them again
gregmagolan pushed a commit to bazel-contrib/rules_nodejs that referenced this issue Feb 5, 2019
Now that bazelbuild/bazel#4603 is resolved, it's convenient to have them again