test_java_binary_versions from javac_binary_test.py is flaky #12293

Closed
Eric-Arellano opened this issue Jul 7, 2021 · 22 comments · Fixed by #12425 or #13046
Labels: backend: JVM, flaky-test

Comments

@Eric-Arellano
Contributor

This failed in the main branch: https://github.com/pantsbuild/pants/runs/3010851715?check_suite_focus=true#step:11:919

src/python/pants/backend/java/compile/javac_binary_test.py::test_java_binary_versions FAILED [100%]

=================================== FAILURES ===================================
__________________________ test_java_binary_versions ___________________________

rule_runner = RuleRunner(build_root=/tmp/_BUILD_ROOT63jg1kew)

    def test_java_binary_versions(rule_runner: RuleRunner) -> None:
        # default version is 1.11
>       assert "javac 11.0" in run_javac_version(rule_runner)

src/python/pants/backend/java/compile/javac_binary_test.py:57: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
src/python/pants/backend/java/compile/javac_binary_test.py:46: in run_javac_version
    description="",
src/python/pants/testutil/rule_runner.py:212: in request
    self.scheduler.product_request(output_type, [Params(*inputs)])
src/python/pants/engine/internals/scheduler.py:561: in product_request
    self._raise_on_error([t for _, t in throws])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pants.engine.internals.scheduler.SchedulerSession object at 0x7f50e94f77d0>
throws = [Throw(exc=ProcessExecutionFailure('Process \'\' failed with exit code 1.\nstdout:\n\nstderr:\n++ ./cs-x86_64-pc-linux...)\n+ javac_path=/bin/javac\n\n', engine_traceback=['pants.engine.process.fallible_to_exec_result_or_raise', 'select'])]

    def _raise_on_error(self, throws: list[Throw]) -> NoReturn:
        exception_noun = pluralize(len(throws), "Exception")
    
        if self._scheduler.include_trace_on_error:
            throw = throws[0]
            etb = throw.engine_traceback
            python_traceback_str = throw.python_traceback or ""
            engine_traceback_str = ""
            others_msg = f"\n(and {len(throws) - 1} more)" if len(throws) > 1 else ""
            if etb:
                sep = "\n  in "
                engine_traceback_str = "Engine traceback:" + sep + sep.join(reversed(etb)) + "\n"
            raise ExecutionError(
                f"{exception_noun} encountered:\n\n"
                f"{engine_traceback_str}"
                f"{python_traceback_str}"
                f"{others_msg}",
>               wrapped_exceptions=tuple(t.exc for t in throws),
            )
E           pants.engine.internals.scheduler.ExecutionError: 1 Exception encountered:
E           
E           Engine traceback:
E             in select
E             in pants.engine.process.fallible_to_exec_result_or_raise
E           Traceback (most recent call last):
E             File "/tmp/process-executionkWZHpy/src/python/pants/engine/process.py", line 262, in fallible_to_exec_result_or_raise
E               description.value,
E           pants.engine.process.ProcessExecutionFailure: Process '' failed with exit code 1.
E           stdout:
E           
E           stderr:
E           ++ ./cs-x86_64-pc-linux java-home --jvm adopt:1.11
E           Downloading https://github.com/AdoptOpenJDK/openjdk11-binaries/releases/download/jdk-11.0.11%2B9/OpenJDK11U-jdk_x64_linux_hotspot_11.0.11_9.tar.gz
E           Still downloading:
E           https://github.com/AdoptOpenJDK/openjdk11-binaries/releases/download/jdk-11.0.11%2B9/OpenJDK11U-jdk_x64_linux_hotspot_11.0.11_9.tar.gz (8.20 %, 15801778 / 192792051)
E           
E           Still downloading:
E           https://github.com/AdoptOpenJDK/openjdk11-binaries/releases/download/jdk-11.0.11%2B9/OpenJDK11U-jdk_x64_linux_hotspot_11.0.11_9.tar.gz (80.22 %, 154656178 / 192792051)
E           
E           Downloaded https://github.com/AdoptOpenJDK/openjdk11-binaries/releases/download/jdk-11.0.11%2B9/OpenJDK11U-jdk_x64_linux_hotspot_11.0.11_9.tar.gz
E           Extracting
E             /home/runner/.cache/coursier/v1/https/github.com/AdoptOpenJDK/openjdk11-binaries/releases/download/jdk-11.0.11%252B9/OpenJDK11U-jdk_x64_linux_hotspot_11.0.11_9.tar.gz
E           in
E             /home/runner/.cache/coursier/jvm/[email protected]
E           Extraction failed: java.nio.file.FileSystemException: /home/runner/.cache/coursier/jvm/[email protected]/jdk-11.0.11+9 -> /home/runner/.cache/coursier/jvm/[email protected]: Directory not empty
E           Exception in thread "main" java.nio.file.FileSystemException: /home/runner/.cache/coursier/jvm/[email protected]/jdk-11.0.11+9 -> /home/runner/.cache/coursier/jvm/[email protected]: Directory not empty
E           	at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:417)
E           	at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:267)
E           	at java.nio.file.Files.move(Files.java:1421)
E           	at coursier.jvm.JvmCache.$anonfun$tryExtract$1(JvmCache.scala:66)
E           	at coursier.jvm.JvmCache.$anonfun$withLockFor$1(JvmCache.scala:267)
E           	at coursier.cache.CacheLocks$.loop$1(CacheLocks.scala:72)
E           	at coursier.cache.CacheLocks$.withLockOr(CacheLocks.scala:98)
E           	at coursier.jvm.JvmCache.withLockFor(JvmCache.scala:267)
E           	at coursier.jvm.JvmCache.tryExtract(JvmCache.scala:48)
E           	at coursier.jvm.JvmCache.$anonfun$get$9(JvmCache.scala:140)
E           	at coursier.util.Task$.wrap(Task.scala:84)
E           	at coursier.util.Task$.$anonfun$delay$2(Task.scala:49)
E           	at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
E           	at scala.util.Success.$anonfun$map$1(Try.scala:255)
E           	at scala.util.Success.map(Try.scala:213)
E           	at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
E           	at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
E           	at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
E           	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
E           	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
E           	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
E           	at java.lang.Thread.run(Thread.java:834)
E           	at com.oracle.svm.core.thread.JavaThreads.threadStartRoutine(JavaThreads.java:517)
E           	at com.oracle.svm.core.posix.thread.PosixJavaThreads.pthreadStartRoutine(PosixJavaThreads.java:193)
E           + javac_path=/bin/javac

src/python/pants/engine/internals/scheduler.py:501: ExecutionError
- generated xml file: /tmp/process-executionkWZHpy/src.python.pants.backend.java.compile.javac_binary_test.py.tests.xml -


=========================== short test summary info ============================
FAILED src/python/pants/backend/java/compile/javac_binary_test.py::test_java_binary_versions
============================== 1 failed in 19.55s ==============================

cc @patricklaw. I haven't had time to investigate

@patricklaw
Member

From my read of the stack trace (I haven't debugged further yet), I have two hypotheses:

  • This is a bug in Coursier, potentially only tickled by attempting to bootstrap JVMs in parallel.
  • We're running out of some resource in the test container and it's manifesting in a weird way like this.

I'll dig in further, but for now I'm going to try to reduce the number of unique JVMs bootstrapped by our tests in order to discount or at least reduce the disk pressure in the case of the latter.
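
For illustration only: the `Directory not empty` message in the trace is the error a rename produces when its destination directory already has contents. The sketch below reproduces that class of failure sequentially, under the unconfirmed assumption from the first hypothesis that another bootstrap had already populated the destination. The paths and the helper names `move_into_place` and `simulate_collision` are hypothetical and are not Coursier's code.

```python
import errno
import os
import tempfile


def move_into_place(extracted_dir: str, final_dir: str) -> None:
    """Rename a freshly extracted directory to its final cache location."""
    os.rename(extracted_dir, final_dir)


def simulate_collision() -> int:
    """Return the errno raised when the destination is already populated."""
    with tempfile.TemporaryDirectory() as root:
        final_dir = os.path.join(root, "jvm")
        staged_dir = os.path.join(root, "staging", "jdk")
        os.makedirs(staged_dir)
        # Assumption: a concurrent bootstrap already wrote into the destination.
        os.makedirs(os.path.join(final_dir, "bin"))
        try:
            move_into_place(staged_dir, final_dir)
        except OSError as e:
            return e.errno
    return 0


if __name__ == "__main__":
    code = simulate_collision()
    print(errno.errorcode.get(code, "no error"))
```

If the rename had found an empty or missing destination it would have succeeded, which fits the failure appearing only intermittently.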

patricklaw added a commit that referenced this issue Jul 13, 2021
…uce CI flakiness (#12325)

See #12293 (comment) for motivation.  This commit has no functional change, and almost certainly doesn't reduce the efficacy of the existing tests in any substantial way.

[ci skip-rust]
[ci skip-build-wheels]
@Eric-Arellano
Contributor Author

A related failure:

02:28:20.51 [WARN] Completed: test - src/python/pants/backend/java/compile/javac_test.py:tests failed (exit code 1).
============================= test session starts ==============================
collecting ... collected 6 items

src/python/pants/backend/java/compile/javac_test.py::test_compile_no_deps FAILED [ 16%]
src/python/pants/backend/java/compile/javac_test.py::test_compile_jdk_versions PASSED [ 33%]
src/python/pants/backend/java/compile/javac_test.py::test_compile_with_deps PASSED [ 50%]
src/python/pants/backend/java/compile/javac_test.py::test_compile_with_missing_dep_fails PASSED [ 66%]
src/python/pants/backend/java/compile/javac_test.py::test_compile_with_maven_deps PASSED [ 83%]
src/python/pants/backend/java/compile/javac_test.py::test_compile_with_missing_maven_dep_fails PASSED [100%]

=================================== FAILURES ===================================
_____________________________ test_compile_no_deps _____________________________

rule_runner = RuleRunner(build_root=/tmp/_BUILD_ROOT48cu_5na)

    def test_compile_no_deps(rule_runner: RuleRunner) -> None:
        rule_runner.write_files(
            {
                "BUILD": dedent(
                    """\
                    coursier_lockfile(
                        name = 'lockfile',
                        maven_requirements = [],
                        sources = [
                            "coursier_resolve.lockfile",
                        ],
                    )
    
                    java_library(
                        name = 'lib',
                        dependencies = [
                            ':lockfile',
                        ]
                    )
                    """
                ),
                "coursier_resolve.lockfile": CoursierResolvedLockfile(entries=())
                .to_json()
                .decode("utf-8"),
                "ExampleLib.java": JAVA_LIB_SOURCE,
            }
        )
    
        compiled_classfiles = rule_runner.request(
            CompiledClassfiles,
            [
                CompileJavaSourceRequest(
>                   target=rule_runner.get_target(address=Address(spec_path="", target_name="lib"))
                )
            ],
        )

src/python/pants/backend/java/compile/javac_test.py:109: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
src/python/pants/testutil/rule_runner.py:212: in request
    self.scheduler.product_request(output_type, [Params(*inputs)])
src/python/pants/engine/internals/scheduler.py:561: in product_request
    self._raise_on_error([t for _, t in throws])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pants.engine.internals.scheduler.SchedulerSession object at 0x7fd10e8db490>
throws = [Throw(exc=ProcessExecutionFailure('Process \'Compile //ExampleLib.java:lib with javac\' failed with exit code 1.\nstd...ts.backend.java.compile.javac.compile_java_source', 'pants.backend.java.compile.javac.compile_java_source', 'select'])]

    def _raise_on_error(self, throws: list[Throw]) -> NoReturn:
        exception_noun = pluralize(len(throws), "Exception")
    
        if self._scheduler.include_trace_on_error:
            throw = throws[0]
            etb = throw.engine_traceback
            python_traceback_str = throw.python_traceback or ""
            engine_traceback_str = ""
            others_msg = f"\n(and {len(throws) - 1} more)" if len(throws) > 1 else ""
            if etb:
                sep = "\n  in "
                engine_traceback_str = "Engine traceback:" + sep + sep.join(reversed(etb)) + "\n"
            raise ExecutionError(
                f"{exception_noun} encountered:\n\n"
                f"{engine_traceback_str}"
                f"{python_traceback_str}"
                f"{others_msg}",
>               wrapped_exceptions=tuple(t.exc for t in throws),
            )
E           pants.engine.internals.scheduler.ExecutionError: 1 Exception encountered:
E           
E           Engine traceback:
E             in select
E             in pants.backend.java.compile.javac.compile_java_source
E             in pants.backend.java.compile.javac.compile_java_source
E             in pants.engine.process.fallible_to_exec_result_or_raise
E           Traceback (most recent call last):
E             File "/tmp/process-execution7MBKa8/src/python/pants/engine/process.py", line 262, in fallible_to_exec_result_or_raise
E               description.value,
E           pants.engine.process.ProcessExecutionFailure: Process 'Compile //ExampleLib.java:lib with javac' failed with exit code 1.
E           stdout:
E           
E           stderr:
E           ++ ./cs-x86_64-pc-linux java-home --jvm adopt:1.11
E           Downloading https://github.com/AdoptOpenJDK/openjdk11-binaries/releases/download/jdk-11.0.11%2B9/OpenJDK11U-jdk_x64_linux_hotspot_11.0.11_9.tar.gz
E           Still downloading:
E           https://github.com/AdoptOpenJDK/openjdk11-binaries/releases/download/jdk-11.0.11%2B9/OpenJDK11U-jdk_x64_linux_hotspot_11.0.11_9.tar.gz (7.42 %, 14296333 / 192792051)
E           
E           Still downloading:
E           https://github.com/AdoptOpenJDK/openjdk11-binaries/releases/download/jdk-11.0.11%2B9/OpenJDK11U-jdk_x64_linux_hotspot_11.0.11_9.tar.gz (79.17 %, 152642829 / 192792051)
E           
E           Downloaded https://github.com/AdoptOpenJDK/openjdk11-binaries/releases/download/jdk-11.0.11%2B9/OpenJDK11U-jdk_x64_linux_hotspot_11.0.11_9.tar.gz
E           Extracting
E             /home/runner/.cache/coursier/v1/https/github.com/AdoptOpenJDK/openjdk11-binaries/releases/download/jdk-11.0.11%252B9/OpenJDK11U-jdk_x64_linux_hotspot_11.0.11_9.tar.gz
E           in
E             /home/runner/.cache/coursier/jvm/[email protected]
E           Extraction failed: java.nio.file.FileSystemException: /home/runner/.cache/coursier/jvm/[email protected]/jdk-11.0.11+9 -> /home/runner/.cache/coursier/jvm/[email protected]: Directory not empty
E           Exception in thread "main" java.nio.file.FileSystemException: /home/runner/.cache/coursier/jvm/[email protected]/jdk-11.0.11+9 -> /home/runner/.cache/coursier/jvm/[email protected]: Directory not empty
E           	at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:417)
E           	at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:267)
E           	at java.nio.file.Files.move(Files.java:1421)
E           	at coursier.jvm.JvmCache.$anonfun$tryExtract$1(JvmCache.scala:66)
E           	at coursier.jvm.JvmCache.$anonfun$withLockFor$1(JvmCache.scala:267)
E           	at coursier.cache.CacheLocks$.loop$1(CacheLocks.scala:72)
E           	at coursier.cache.CacheLocks$.withLockOr(CacheLocks.scala:98)
E           	at coursier.jvm.JvmCache.withLockFor(JvmCache.scala:267)
E           	at coursier.jvm.JvmCache.tryExtract(JvmCache.scala:48)
E           	at coursier.jvm.JvmCache.$anonfun$get$9(JvmCache.scala:140)
E           	at coursier.util.Task$.wrap(Task.scala:84)
E           	at coursier.util.Task$.$anonfun$delay$2(Task.scala:49)
E           	at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
E           	at scala.util.Success.$anonfun$map$1(Try.scala:255)
E           	at scala.util.Success.map(Try.scala:213)
E           	at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
E           	at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
E           	at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
E           	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
E           	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
E           	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
E           	at java.lang.Thread.run(Thread.java:834)
E           	at com.oracle.svm.core.thread.JavaThreads.threadStartRoutine(JavaThreads.java:517)
E           	at com.oracle.svm.core.posix.thread.PosixJavaThreads.pthreadStartRoutine(PosixJavaThreads.java:193)
E           + javac_path=/bin/javac

src/python/pants/engine/internals/scheduler.py:501: ExecutionError
- generated xml file: /tmp/process-execution7MBKa8/src.python.pants.backend.java.compile.javac_test.py.tests.xml -


=========================== short test summary info ============================
FAILED src/python/pants/backend/java/compile/javac_test.py::test_compile_no_deps
========================= 1 failed, 5 passed in 34.68s =========================

@patricklaw should we maybe add pytest retries to these tests?
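
To make the retry idea concrete, here is a minimal stdlib-only sketch of what retrying a transiently failing call looks like. It is not how Pants or any pytest plugin implements retries; the decorator name `retry` and its parameters are hypothetical. Retries would only mask the extraction failure above, not fix its cause.

```python
import functools
import time
from typing import Any, Callable


def retry(attempts: int = 3, delay_seconds: float = 0.0) -> Callable:
    """Re-run the wrapped callable until it succeeds or attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            for attempt in range(1, attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    # Give up and re-raise once the final attempt has failed.
                    if attempt == attempts:
                        raise
                    time.sleep(delay_seconds)

        return wrapper

    return decorator
```

A test body wrapped with `@retry(attempts=3)` would pass if any of its three runs succeeds, which is why this hides flakiness rather than removing it.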

@Eric-Arellano
Contributor Author

Another one:

src/python/pants/backend/java/compile/javac_binary_test.py::test_java_binary_versions FAILED [100%]

=================================== FAILURES ===================================
__________________________ test_java_binary_versions ___________________________

rule_runner = RuleRunner(build_root=/tmp/_BUILD_ROOT42x_vipb)

    def test_java_binary_versions(rule_runner: RuleRunner) -> None:
        # default version is 1.11
>       assert "javac 11.0" in run_javac_version(rule_runner)

src/python/pants/backend/java/compile/javac_binary_test.py:57: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
src/python/pants/backend/java/compile/javac_binary_test.py:46: in run_javac_version
    description="",
src/python/pants/testutil/rule_runner.py:212: in request
    self.scheduler.product_request(output_type, [Params(*inputs)])
src/python/pants/engine/internals/scheduler.py:561: in product_request
    self._raise_on_error([t for _, t in throws])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pants.engine.internals.scheduler.SchedulerSession object at 0x7f457acfea50>
throws = [Throw(exc=ProcessExecutionFailure('Process \'\' failed with exit code 1.\nstdout:\n\nstderr:\n++ ./cs-x86_64-pc-linux...)\n+ javac_path=/bin/javac\n\n', engine_traceback=['pants.engine.process.fallible_to_exec_result_or_raise', 'select'])]

    def _raise_on_error(self, throws: list[Throw]) -> NoReturn:
        exception_noun = pluralize(len(throws), "Exception")
    
        if self._scheduler.include_trace_on_error:
            throw = throws[0]
            etb = throw.engine_traceback
            python_traceback_str = throw.python_traceback or ""
            engine_traceback_str = ""
            others_msg = f"\n(and {len(throws) - 1} more)" if len(throws) > 1 else ""
            if etb:
                sep = "\n  in "
                engine_traceback_str = "Engine traceback:" + sep + sep.join(reversed(etb)) + "\n"
            raise ExecutionError(
                f"{exception_noun} encountered:\n\n"
                f"{engine_traceback_str}"
                f"{python_traceback_str}"
                f"{others_msg}",
>               wrapped_exceptions=tuple(t.exc for t in throws),
            )
E           pants.engine.internals.scheduler.ExecutionError: 1 Exception encountered:
E           
E           Engine traceback:
E             in select
E             in pants.engine.process.fallible_to_exec_result_or_raise
E           Traceback (most recent call last):
E             File "/tmp/process-executionmpQOgc/src/python/pants/engine/process.py", line 262, in fallible_to_exec_result_or_raise
E               description.value,
E           pants.engine.process.ProcessExecutionFailure: Process '' failed with exit code 1.
E           stdout:
E           
E           stderr:
E           ++ ./cs-x86_64-pc-linux java-home --jvm adopt:1.11
E           Downloading https://github.com/shyiko/jabba/raw/master/index.json
E           Downloaded https://github.com/shyiko/jabba/raw/master/index.json
E           Downloading https://github.com/AdoptOpenJDK/openjdk11-binaries/releases/download/jdk-11.0.11%2B9/OpenJDK11U-jdk_x64_linux_hotspot_11.0.11_9.tar.gz
E           Still downloading:
E           https://github.com/AdoptOpenJDK/openjdk11-binaries/releases/download/jdk-11.0.11%2B9/OpenJDK11U-jdk_x64_linux_hotspot_11.0.11_9.tar.gz (0.00 %, 0 / 192792051)
E           
E           Still downloading:
E           https://github.com/AdoptOpenJDK/openjdk11-binaries/releases/download/jdk-11.0.11%2B9/OpenJDK11U-jdk_x64_linux_hotspot_11.0.11_9.tar.gz (17.01 %, 32789409 / 192792051)
E           
E           Still downloading:
E           https://github.com/AdoptOpenJDK/openjdk11-binaries/releases/download/jdk-11.0.11%2B9/OpenJDK11U-jdk_x64_linux_hotspot_11.0.11_9.tar.gz (65.22 %, 125743393 / 192792051)
E           
E           Downloaded https://github.com/AdoptOpenJDK/openjdk11-binaries/releases/download/jdk-11.0.11%2B9/OpenJDK11U-jdk_x64_linux_hotspot_11.0.11_9.tar.gz
E           Extracting
E             /home/runner/.cache/coursier/v1/https/github.com/AdoptOpenJDK/openjdk11-binaries/releases/download/jdk-11.0.11%252B9/OpenJDK11U-jdk_x64_linux_hotspot_11.0.11_9.tar.gz
E           in
E             /home/runner/.cache/coursier/jvm/[email protected]
E           Extraction failed: java.nio.file.FileSystemException: /home/runner/.cache/coursier/jvm/[email protected]/jdk-11.0.11+9 -> /home/runner/.cache/coursier/jvm/[email protected]: Directory not empty
E           Exception in thread "main" java.nio.file.FileSystemException: /home/runner/.cache/coursier/jvm/[email protected]/jdk-11.0.11+9 -> /home/runner/.cache/coursier/jvm/[email protected]: Directory not empty
E           	at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:417)
E           	at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:267)
E           	at java.nio.file.Files.move(Files.java:1421)
E           	at coursier.jvm.JvmCache.$anonfun$tryExtract$1(JvmCache.scala:66)
E           	at coursier.jvm.JvmCache.$anonfun$withLockFor$1(JvmCache.scala:267)
E           	at coursier.cache.CacheLocks$.loop$1(CacheLocks.scala:72)
E           	at coursier.cache.CacheLocks$.withLockOr(CacheLocks.scala:98)
E           	at coursier.jvm.JvmCache.withLockFor(JvmCache.scala:267)
E           	at coursier.jvm.JvmCache.tryExtract(JvmCache.scala:48)
E           	at coursier.jvm.JvmCache.$anonfun$get$9(JvmCache.scala:140)
E           	at coursier.util.Task$.wrap(Task.scala:84)
E           	at coursier.util.Task$.$anonfun$delay$2(Task.scala:49)
E           	at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
E           	at scala.util.Success.$anonfun$map$1(Try.scala:255)
E           	at scala.util.Success.map(Try.scala:213)
E           	at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
E           	at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
E           	at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
E           	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
E           	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
E           	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
E           	at java.lang.Thread.run(Thread.java:834)
E           	at com.oracle.svm.core.thread.JavaThreads.threadStartRoutine(JavaThreads.java:517)
E           	at com.oracle.svm.core.posix.thread.PosixJavaThreads.pthreadStartRoutine(PosixJavaThreads.java:193)
E           + javac_path=/bin/javac

src/python/pants/engine/internals/scheduler.py:501: ExecutionError
- generated xml file: /tmp/process-executionmpQOgc/src.python.pants.backend.java.compile.javac_binary_test.py.tests.xml -


=========================== short test summary info ============================
FAILED src/python/pants/backend/java/compile/javac_binary_test.py::test_java_binary_versions
============================== 1 failed in 30.54s ==============================

@patricklaw any insight into why these are so flaky? Do you think retries make sense?

Eric-Arellano added a commit that referenced this issue Jul 21, 2021
See #12293. These seem to be failing around 5-10% of builds in CI.

Because this backend is still experimental, there's less risk in skipping the tests. We can stabilize and re-enable before productionizing the backend.

[ci skip-rust]
[ci skip-build-wheels]
patricklaw added a commit to patricklaw/pants that referenced this issue Jul 25, 2021
…antsbuild#12293.

This causes JavacBinary to pass the `--system-jvm` option to Coursier,
which selects whichever JDK Coursier happens to find already installed
on the system.  Long term, we shouldn't use this behavior in tests and
probably shouldn't even allow it at all since the inherent non-hermeticity
isn't properly captured in cache keys, but for now it allows testing
of Coursier-dependent backends in CI until pantsbuild#12293 is resolved properly.
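
As a rough sketch of the special-casing this commit message describes, the argument selection might look something like the following. The function name `coursier_java_home_args` is hypothetical, and the exact way Pants assembles the Coursier command line is an assumption; only the `java-home --jvm adopt:1.11` invocation (from the logs above) and the `--system-jvm` option (from this message) come from the thread.

```python
from typing import List


def coursier_java_home_args(jdk: str) -> List[str]:
    """Build the arguments passed to the Coursier launcher to locate a JDK home."""
    if jdk == "system":
        # Non-hermetic: whichever JDK is already installed gets picked up.
        return ["java-home", "--system-jvm"]
    # Hermetic path: Coursier downloads and caches the requested JDK.
    return ["java-home", "--jvm", jdk]


if __name__ == "__main__":
    print(coursier_java_home_args("adopt:1.11"))
    print(coursier_java_home_args("system"))
```

The `system` branch skips the download-and-extract step in which the flaky failure occurs, at the cost of the non-hermeticity the message calls out.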
patricklaw added a commit to patricklaw/pants that referenced this issue Aug 21, 2021
This implementation is just good enough to demonstrate how to use the existing Pants Java infrastructure to compile and consume Java source for the purpose of executing junit tests. This initial iteration has several limitations:

* jUnit5 (org.junit.platform:junit-platform-console:1.7.2) is hard-coded as the JUnit runner in the rule source.  As needed, this can be hoisted into a subsystem for configurability.  By design, junit4 is not supported as a **runner**, because its classpath scanning isn't powerful enough.  However, junit4 tests can still be run with the junit5 runner.

* junit_tests targets have the same requirement of java_library targets that there must be exactly 1 coursier_lockfile dependency in the transitive closure of the junit_tests target. In practice this means that any third party dependencies required by the test source must also be shared by the library targets upon which the test transitively depends. Lockfile subsetting will mostly make this a non-issue, but it's still unfortunate that all test targets are indirectly locked to all other test targets that transitively depend on the same Java library code.

* Due to pantsbuild#12293, the test runner currently hard-codes the Coursier `--system-jvm` argument.  Future revisions will expose this as an option via `junit_test` parameters and/or a junit subsystem.

[ci skip-rust]
[ci skip-build-wheels]
patricklaw added a commit to patricklaw/pants that referenced this issue Aug 21, 2021
This implementation is just good enough to demonstrate how to use the existing Pants Java infrastructure to compile and consume Java source for the purpose of executing junit tests. This initial iteration has several limitations:

* jUnit5 (org.junit.platform:junit-platform-console:1.7.2) is hard-coded as the JUnit runner in the rule source.  As needed, this can be hoisted into a subsystem for configurability.  By design, junit4 is not supported as a **runner**, because its classpath scanning isn't powerful enough.  However, junit4 tests can still be run with the junit5 runner.

* junit_tests targets have the same requirement of java_library targets that there must be exactly 1 coursier_lockfile dependency in the transitive closure of the junit_tests target. In practice this means that any third party dependencies required by the test source must also be shared by the library targets upon which the test transitively depends. Lockfile subsetting will mostly make this a non-issue, but it's still unfortunate that all test targets are indirectly locked to all other test targets that transitively depend on the same Java library code.

* Due to pantsbuild#12293, the test runner currently hard-codes the Coursier `--system-jvm` argument.  Future revisions will expose this as an option via `junit_test` parameters and/or a junit subsystem.

[ci skip-rust]
[ci skip-build-wheels]
patricklaw added a commit to patricklaw/pants that referenced this issue Aug 21, 2021
This implementation is just good enough to demonstrate how to use the existing Pants Java infrastructure to compile and consume Java source for the purpose of executing junit tests. This initial iteration has several limitations:

* jUnit5 (org.junit.platform:junit-platform-console:1.7.2) is hard-coded as the JUnit runner in the rule source.  As needed, this can be hoisted into a subsystem for configurability.  By design, junit4 is not supported as a **runner**, because its classpath scanning isn't powerful enough.  However, junit4 tests can still be run with the junit5 runner.

* junit_tests targets have the same requirement of java_library targets that there must be exactly 1 coursier_lockfile dependency in the transitive closure of the junit_tests target. In practice this means that any third party dependencies required by the test source must also be shared by the library targets upon which the test transitively depends. Lockfile subsetting will mostly make this a non-issue, but it's still unfortunate that all test targets are indirectly locked to all other test targets that transitively depend on the same Java library code.

* Due to pantsbuild#12293, the test runner currently hard-codes the Coursier `--system-jvm` argument.  Future revisions will expose this as an option via `junit_test` parameters and/or a junit subsystem.

[ci skip-rust]
[ci skip-build-wheels]
patricklaw added a commit that referenced this issue Aug 23, 2021
…12293. (#12425)

* Add special casing for '--javac-jdk=system' as a temporary hack for #12293.

This causes JavacBinary to pass the `--system-jvm` option to Coursier,
which selects whichever JDK Coursier happens to find already installed
on the system.  Long term, we shouldn't use this behavior in tests and
probably shouldn't even allow it at all since the inherent non-hermeticity
isn't properly captured in cache keys, but for now it allows testing
of Coursier-dependent backends in CI until #12293 is resolved properly.

* Also skip test_compile_jdk_versions in javac_test.py, which forces other JVM versions.

[ci skip-rust]
[ci skip-build-wheels]
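
The special casing described in this commit message could look roughly like the sketch below. This is an illustration only: the function name `coursier_jvm_args` is hypothetical and not taken from the Pants source, `--system-jvm` is the flag named above, and `--jvm` is assumed here to be the Coursier flag for requesting a specific JDK.

```python
from __future__ import annotations


def coursier_jvm_args(jdk: str) -> list[str]:
    """Translate a `--javac-jdk` value into Coursier JVM-selection arguments.

    Hypothetical helper: "system" is special-cased so that Coursier is asked
    to use whichever JDK it finds already installed.
    """
    if jdk == "system":
        return ["--system-jvm"]
    # Assumption: any other value is passed through as a Coursier JVM id.
    return ["--jvm", jdk]
```

Because the result for `"system"` depends on what is installed on the machine, the non-hermeticity concern raised in the commit message applies to any rule that uses arguments built this way.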
@Eric-Arellano
Contributor Author

Hm, this flaked on main today:

src/python/pants/backend/java/compile/javac_binary_test.py::test_java_binary_system_version FAILED [ 33%]
src/python/pants/backend/java/compile/javac_binary_test.py::test_java_binary_bogus_version_fails PASSED [ 66%]
src/python/pants/backend/java/compile/javac_binary_test.py::test_java_binary_versions SKIPPED [100%]

=================================== FAILURES ===================================
_______________________ test_java_binary_system_version ________________________

rule_runner = RuleRunner(build_root=/tmp/_BUILD_ROOT710dxgri)

    def test_java_binary_system_version(rule_runner: RuleRunner) -> None:
        rule_runner.set_options(["--javac-jdk=system"])
>       assert "javac" in run_javac_version(rule_runner)

src/python/pants/backend/java/compile/javac_binary_test.py:57: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
src/python/pants/backend/java/compile/javac_binary_test.py:34: in run_javac_version
    javac_binary = rule_runner.request(JavacBinary, [])
src/python/pants/testutil/rule_runner.py:212: in request
    self.scheduler.product_request(output_type, [Params(*inputs)])
src/python/pants/engine/internals/scheduler.py:562: in product_request
    self._raise_on_error([t for _, t in throws])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pants.engine.internals.scheduler.SchedulerSession object at 0x7f304408d090>
throws = [Throw(exc=ProcessExecutionFailure('Process \'Invoke Coursier with system-jvm to fingerprint JVM version.\' failed wit...ne.process.fallible_to_exec_result_or_raise', 'pants.backend.java.compile.javac_binary.setup_javac_binary', 'select'])]

    def _raise_on_error(self, throws: list[Throw]) -> NoReturn:
        exception_noun = pluralize(len(throws), "Exception")

        if self._scheduler.include_trace_on_error:
            throw = throws[0]
            etb = throw.engine_traceback
            python_traceback_str = throw.python_traceback or ""
            engine_traceback_str = ""
            others_msg = f"\n(and {len(throws) - 1} more)" if len(throws) > 1 else ""
            if etb:
                sep = "\n  in "
                engine_traceback_str = "Engine traceback:" + sep + sep.join(reversed(etb)) + "\n"
            raise ExecutionError(
                f"{exception_noun} encountered:\n\n"
                f"{engine_traceback_str}"
                f"{python_traceback_str}"
                f"{others_msg}",
>               wrapped_exceptions=tuple(t.exc for t in throws),
            )
E           pants.engine.internals.scheduler.ExecutionError: 1 Exception encountered:
E           
E           Engine traceback:
E             in select
E             in pants.backend.java.compile.javac_binary.setup_javac_binary
E             in pants.engine.process.fallible_to_exec_result_or_raise
E           Traceback (most recent call last):
E             File "/tmp/process-executionkqvK5b/src/python/pants/engine/process.py", line 262, in fallible_to_exec_result_or_raise
E               description.value,
E           pants.engine.process.ProcessExecutionFailure: Process 'Invoke Coursier with system-jvm to fingerprint JVM version.' failed with exit code 1.
E           stdout:
E           
E           stderr:
E           Downloading https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u292-b10/OpenJDK8U-jdk_x64_linux_hotspot_8u292b10.tar.gz
E           Still downloading:
E           https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u292-b10/OpenJDK8U-jdk_x64_linux_hotspot_8u292b10.tar.gz (66.30 %, 68308708 / 103026380)
E           
E           Downloaded https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u292-b10/OpenJDK8U-jdk_x64_linux_hotspot_8u292b10.tar.gz
E           Extracting
E             /home/runner/.cache/coursier/v1/https/github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u292-b10/OpenJDK8U-jdk_x64_linux_hotspot_8u292b10.tar.gz
E           in
E             /home/runner/.cache/coursier/jvm/[email protected]
E           Extraction failed: java.nio.file.FileSystemException: /home/runner/.cache/coursier/jvm/[email protected]/jdk8u292-b10 -> /home/runner/.cache/coursier/jvm/[email protected]: Directory not empty
E           Exception in thread "main" java.nio.file.FileSystemException: /home/runner/.cache/coursier/jvm/[email protected]/jdk8u292-b10 -> /home/runner/.cache/coursier/jvm/[email protected]: Directory not empty
E           	at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:417)
E           	at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:267)
E           	at java.nio.file.Files.move(Files.java:1421)
E           	at coursier.jvm.JvmCache.$anonfun$tryExtract$1(JvmCache.scala:66)
E           	at coursier.jvm.JvmCache.$anonfun$withLockFor$1(JvmCache.scala:267)
E           	at coursier.cache.CacheLocks$.loop$1(CacheLocks.scala:72)
E           	at coursier.cache.CacheLocks$.withLockOr(CacheLocks.scala:98)
E           	at coursier.jvm.JvmCache.withLockFor(JvmCache.scala:267)
E           	at coursier.jvm.JvmCache.tryExtract(JvmCache.scala:48)
E           	at coursier.jvm.JvmCache.$anonfun$get$9(JvmCache.scala:140)
E           	at coursier.util.Task$.wrap(Task.scala:84)
E           	at coursier.util.Task$.$anonfun$delay$2(Task.scala:49)
E           	at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
E           	at scala.util.Success.$anonfun$map$1(Try.scala:255)
E           	at scala.util.Success.map(Try.scala:213)
E           	at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
E           	at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
E           	at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
E           	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
E           	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
E           	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
E           	at java.lang.Thread.run(Thread.java:834)
E           	at com.oracle.svm.core.thread.JavaThreads.threadStartRoutine(JavaThreads.java:517)
E           	at com.oracle.svm.core.posix.thread.PosixJavaThreads.pthreadStartRoutine(PosixJavaThreads.java:193)

src/python/pants/engine/internals/scheduler.py:502: ExecutionError
- generated xml file: /tmp/process-executionkqvK5b/src.python.pants.backend.java.compile.javac_binary_test.py.tests.xml -


=========================== short test summary info ============================
FAILED src/python/pants/backend/java/compile/javac_binary_test.py::test_java_binary_system_version
=================== 1 failed, 1 passed, 1 skipped in 15.63s ====================

@Eric-Arellano Eric-Arellano reopened this Aug 23, 2021
@tdyas tdyas added the backend: JVM JVM backend-related issues label Aug 30, 2021
@tdyas tdyas self-assigned this Aug 30, 2021
@tdyas
Contributor

tdyas commented Aug 31, 2021

The failed download here is the JDK, not a fetched jar. Maybe we should just host the JDK somewhere or allow a JDK in the Pants CI Docker image?

@tdyas
Contributor

tdyas commented Aug 31, 2021

Alternate idea: embed the JDK archive in the CI Docker image and then run a web server during the integration test pointing to that JDK archive.

@tdyas
Contributor

tdyas commented Aug 31, 2021

Hmm the actual error is:

Extraction failed: java.nio.file.FileSystemException: /home/runner/.cache/coursier/jvm/[email protected]/jdk-11.0.11+9 -> /home/runner/.cache/coursier/jvm/[email protected]: Directory not empty

So the extraction failed because the directory where it was unpacking was not empty.

@tdyas
Contributor

tdyas commented Aug 31, 2021

Potentially related: coursier/coursier#1815

Theory: Multiple concurrent Coursier runs are racing with each other.
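
To make the theory concrete, here is a minimal sketch of how two writers that each extract into a private directory and then rename it into a shared final location can produce a "Directory not empty" failure. It replays the interleaving sequentially rather than spawning real Coursier processes, and every name in it (`move_into_place`, `simulate_race`, the `extract-*` directories) is hypothetical; it is not Coursier's actual extraction code.

```python
from __future__ import annotations

import errno
import os
import tempfile


def move_into_place(extracted: str, final: str) -> bool:
    """Rename an extracted directory to its final location.

    Returns False when another writer has already populated `final`.
    """
    try:
        os.rename(extracted, final)
        return True
    except OSError as e:
        # Renaming a directory onto a non-empty directory is rejected.
        if e.errno in (errno.ENOTEMPTY, errno.EEXIST):
            return False
        raise


def simulate_race() -> tuple[bool, ...]:
    """Two writers both believe the JVM is missing and both extract it."""
    with tempfile.TemporaryDirectory() as root:
        final = os.path.join(root, "jvm")
        results = []
        for writer in ("a", "b"):
            extracted = os.path.join(root, f"extract-{writer}")
            os.makedirs(os.path.join(extracted, "bin"))
            with open(os.path.join(extracted, "bin", "javac"), "w") as f:
                f.write("stub")
            results.append(move_into_place(extracted, final))
        return tuple(results)
```

In this sketch the first writer's rename succeeds and the second one hits the non-empty destination, which matches the shape of the `FileSystemException` in the logs above.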

@tdyas
Contributor

tdyas commented Aug 31, 2021

Also potentially related: coursier/coursier#1818

patricklaw added a commit to patricklaw/pants that referenced this issue Sep 1, 2021
patricklaw added a commit that referenced this issue Sep 1, 2021
This implementation is just good enough to demonstrate how to use the existing Pants Java infrastructure to compile and consume Java source for the purpose of executing junit tests. This initial iteration has several limitations:

* jUnit5 (org.junit.platform:junit-platform-console:1.7.2) is hard-coded as the JUnit runner in the rule source.  As needed, this can be hoisted into a subsystem for configurability.  By design, junit4 is not supported as a **runner**, because its classpath scanning isn't powerful enough.  However, junit4 tests can still be run with the junit5 runner.

* junit_tests targets have the same requirement of java_library targets that there must be exactly 1 coursier_lockfile dependency in the transitive closure of the junit_tests target. In practice this means that any third party dependencies required by the test source must also be shared by the library targets upon which the test transitively depends. Lockfile subsetting will mostly make this a non-issue, but it's still unfortunate that all test targets are indirectly locked to all other test targets that transitively depend on the same Java library code.

* Due to #12293, the test runner currently hard-codes the Coursier `--system-jvm` argument.  Future revisions will expose this as an option via `junit_test` parameters and/or a junit subsystem.

[ci skip-rust]
[ci skip-build-wheels]
@patricklaw
Member

So it seems like the workaround I added didn't actually fix the issue, because Coursier still needed to fetch a JVM even when `--system-jvm` was used. Presumably it couldn't find a system JVM in the CI environment and fell back to fetching one?

Tom: Those Coursier bugs look spot on. My initial suspicion was that this was a race condition of some kind in Coursier's JVM fetch logic.

@stuhood
Member

stuhood commented Sep 15, 2021

I think that there might be a fairly straightforward workaround for this issue: currently we trigger downloading the JDK in every compile process, which means that we have concurrent fetches. If we move selecting the JDK into a separate process, we'll have only a single copy running per Pants run due to memoization.
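
As a rough illustration of the memoization idea (not the Pants engine's implementation), the sketch below shows many concurrent callers sharing one memoized setup step, so the expensive fetch runs once. `MemoizedSetup`, `fake_jdk_download`, and the returned path are all hypothetical names made up for this example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class MemoizedSetup:
    """Runs a setup callable at most once and shares its result."""

    def __init__(self, setup):
        self._setup = setup
        self._lock = threading.Lock()
        self._done = False
        self._value = None

    def get(self):
        # The lock makes concurrent callers wait for the first run
        # instead of starting their own.
        with self._lock:
            if not self._done:
                self._value = self._setup()
                self._done = True
            return self._value


download_count = 0


def fake_jdk_download() -> str:
    """Stand-in for the JDK fetch; counts how often it is invoked."""
    global download_count
    download_count += 1
    return "/fake/java/home"


jdk = MemoizedSetup(fake_jdk_download)

# Eight "compile" callers request the JDK concurrently.
with ThreadPoolExecutor(max_workers=8) as pool:
    homes = list(pool.map(lambda _: jdk.get(), range(8)))
```

This kind of sharing only holds within one process, so separate processes that point at the same on-disk cache can still collide.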

@tdyas
Contributor

tdyas commented Sep 15, 2021

> I think that there might be a fairly straightforward fix for this issue: currently we trigger downloading the JDK in every compile process, which means that we have concurrent fetches. If we move selecting the JDK into a separate process, we'll have only a single copy running per Pants run due to memoization.

I'll look into this.

@tdyas
Contributor

tdyas commented Sep 19, 2021

Side note: I found the https://github.com/shyiko/jabba project today which is a JVM version manager (as pyenv is for Python). Coursier actually uses the JVM index from jabba to know where it can download JVMs from.

@tdyas
Contributor

tdyas commented Sep 21, 2021

> If we move selecting the JDK into a separate process, we'll have only a single copy running per Pants run due to memoization.

@stuhood: Does memoization only operate if pantsd is running?

@stuhood
Member

stuhood commented Sep 21, 2021

> > If we move selecting the JDK into a separate process, we'll have only a single copy running per Pants run due to memoization.
>
> @stuhood: Does memoization only operate if pantsd is running?

No, always.

tdyas pushed a commit that referenced this issue Sep 22, 2021
## Motivation

As described in #12293, multiple Coursier invocations were downloading the JDK and triggering [a race condition in Coursier's locking](coursier/coursier#1815) that caused flakiness in tests.

## Solution

This PR mitigates the issue by isolating JDK download to a single `Process`. The new `JdkSetup` type provides rules with the command to obtain the location of the JDK so they may query Coursier for JAVA_HOME. This has the benefit of still downloading in remote execution, but providing some guarantee that there will be a single download.
@Eric-Arellano
Contributor Author

FYI @tdyas, seen on main

21:05:08.45 [ERROR] Completed: Run Pytest - src/python/pants/backend/java/compile/javac_test.py:tests failed (exit code 1).
============================= test session starts ==============================
collecting ... collected 10 items

src/python/pants/backend/java/compile/javac_test.py::test_compile_no_deps FAILED [ 10%]
src/python/pants/backend/java/compile/javac_test.py::test_compile_jdk_versions SKIPPED [ 20%]
src/python/pants/backend/java/compile/javac_test.py::test_compile_multiple_source_files PASSED [ 30%]
src/python/pants/backend/java/compile/javac_test.py::test_compile_with_cycle PASSED [ 40%]
src/python/pants/backend/java/compile/javac_test.py::test_compile_with_transitive_cycle PASSED [ 50%]
src/python/pants/backend/java/compile/javac_test.py::test_compile_with_transitive_multiple_sources PASSED [ 60%]
src/python/pants/backend/java/compile/javac_test.py::test_compile_with_deps PASSED [ 70%]
src/python/pants/backend/java/compile/javac_test.py::test_compile_with_missing_dep_fails PASSED [ 80%]
src/python/pants/backend/java/compile/javac_test.py::test_compile_with_maven_deps PASSED [ 90%]
src/python/pants/backend/java/compile/javac_test.py::test_compile_with_missing_maven_dep_fails PASSED [100%]

=================================== FAILURES ===================================
_____________________________ test_compile_no_deps _____________________________

rule_runner = RuleRunner(build_root=/tmp/_BUILD_ROOTcyso078s)

    def test_compile_no_deps(rule_runner: RuleRunner) -> None:
        rule_runner.write_files(
            {
                "BUILD": dedent(
                    """\
                    coursier_lockfile(
                        name = 'lockfile',
                        requirements = [],
                        sources = [
                            "coursier_resolve.lockfile",
                        ],
                    )

                    java_sources(
                        name = 'lib',
                        dependencies = [
                            ':lockfile',
                        ]
                    )
                    """
                ),
                "coursier_resolve.lockfile": CoursierResolvedLockfile(entries=())
                .to_json()
                .decode("utf-8"),
                "ExampleLib.java": JAVA_LIB_SOURCE,
            }
        )
        coarsened_target = expect_single_expanded_coarsened_target(
            rule_runner, Address(spec_path="", target_name="lib")
        )

        compiled_classfiles = rule_runner.request(
            CompiledClassfiles,
>           [CompileJavaSourceRequest(component=coarsened_target)],
        )

src/python/pants/backend/java/compile/javac_test.py:142: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
src/python/pants/testutil/rule_runner.py:213: in request
    self.scheduler.product_request(output_type, [Params(*inputs)])
src/python/pants/engine/internals/scheduler.py:568: in product_request
    self._raise_on_error([t for _, t in throws])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pants.engine.internals.scheduler.SchedulerSession object at 0x7ff214b8a550>
throws = [Throw(exc=ValueError('Failed to determine JAVA_HOME for JDK system: Downloading https://github.com/AdoptOpenJDK/openj...ts.backend.java.compile.javac.compile_java_source', 'pants.backend.java.compile.javac.required_classfiles', 'select'])]

    def _raise_on_error(self, throws: list[Throw]) -> NoReturn:
        exception_noun = pluralize(len(throws), "Exception")

        if self._scheduler.include_trace_on_error:
            throw = throws[0]
            etb = throw.engine_traceback
            python_traceback_str = throw.python_traceback or ""
            engine_traceback_str = ""
            others_msg = f"\n(and {len(throws) - 1} more)" if len(throws) > 1 else ""
            if etb:
                sep = "\n  in "
                engine_traceback_str = "Engine traceback:" + sep + sep.join(reversed(etb)) + "\n"
            raise ExecutionError(
                f"{exception_noun} encountered:\n\n"
                f"{engine_traceback_str}"
                f"{python_traceback_str}"
                f"{others_msg}",
>               wrapped_exceptions=tuple(t.exc for t in throws),
            )
E           pants.engine.internals.scheduler.ExecutionError: 1 Exception encountered:
E           
E           Engine traceback:
E             in select
E             in pants.backend.java.compile.javac.required_classfiles
E             in pants.backend.java.compile.javac.compile_java_source
E             in pants.backend.java.compile.javac_binary.setup_javac_binary
E             in pants.backend.java.util_rules.setup_jdk
E           Traceback (most recent call last):
E             File "/tmp/process-executionwOjESy/src/python/pants/engine/internals/selectors.py", line 695, in native_engine_generator_send
E               res = func.send(arg)
E             File "/tmp/process-executionwOjESy/src/python/pants/backend/java/util_rules.py", line 45, in setup_jdk
E               f"Failed to determine JAVA_HOME for JDK {javac.options.jdk}: {java_home_result.stderr.decode('utf-8')}"
E           ValueError: Failed to determine JAVA_HOME for JDK system: Downloading https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u292-b10/OpenJDK8U-jdk_x64_linux_hotspot_8u292b10.tar.gz
E           Still downloading:
E           https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u292-b10/OpenJDK8U-jdk_x64_linux_hotspot_8u292b10.tar.gz (48.80 %, 50278251 / 103026380)
E           
E           Downloaded https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u292-b10/OpenJDK8U-jdk_x64_linux_hotspot_8u292b10.tar.gz
E           Extracting
E             /home/runner/.cache/coursier/v1/https/github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u292-b10/OpenJDK8U-jdk_x64_linux_hotspot_8u292b10.tar.gz
E           in
E             /home/runner/.cache/coursier/jvm/[email protected]
E           Extraction failed: java.nio.file.FileSystemException: /home/runner/.cache/coursier/jvm/[email protected]/jdk8u292-b10 -> /home/runner/.cache/coursier/jvm/[email protected]: Directory not empty
E           Exception in thread "main" java.nio.file.FileSystemException: /home/runner/.cache/coursier/jvm/[email protected]/jdk8u292-b10 -> /home/runner/.cache/coursier/jvm/[email protected]: Directory not empty
E           	at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:417)
E           	at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:267)
E           	at java.nio.file.Files.move(Files.java:1421)
E           	at coursier.jvm.JvmCache.$anonfun$tryExtract$1(JvmCache.scala:66)
E           	at coursier.jvm.JvmCache.$anonfun$withLockFor$1(JvmCache.scala:267)
E           	at coursier.cache.CacheLocks$.loop$1(CacheLocks.scala:72)
E           	at coursier.cache.CacheLocks$.withLockOr(CacheLocks.scala:98)
E           	at coursier.jvm.JvmCache.withLockFor(JvmCache.scala:267)
E           	at coursier.jvm.JvmCache.tryExtract(JvmCache.scala:48)
E           	at coursier.jvm.JvmCache.$anonfun$get$9(JvmCache.scala:140)
E           	at coursier.util.Task$.wrap(Task.scala:84)
E           	at coursier.util.Task$.$anonfun$delay$2(Task.scala:49)
E           	at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
E           	at scala.util.Success.$anonfun$map$1(Try.scala:255)
E           	at scala.util.Success.map(Try.scala:213)
E           	at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
E           	at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
E           	at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
E           	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
E           	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
E           	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
E           	at java.lang.Thread.run(Thread.java:834)
E           	at com.oracle.svm.core.thread.JavaThreads.threadStartRoutine(JavaThreads.java:517)
E           	at com.oracle.svm.core.posix.thread.PosixJavaThreads.pthreadStartRoutine(PosixJavaThreads.java:193)

src/python/pants/engine/internals/scheduler.py:508: ExecutionError
- generated xml file: /tmp/process-executionwOjESy/src.python.pants.backend.java.compile.javac_test.py.tests.xml -


=========================== short test summary info ============================
FAILED src/python/pants/backend/java/compile/javac_test.py::test_compile_no_deps
=================== 1 failed, 8 passed, 1 skipped in 29.59s ====================

@tdyas
Contributor

tdyas commented Sep 23, 2021

So even with #12972, concurrent instances of Coursier are still trying to download a JDK. Maybe the GitHub Actions config is missing the JDK setup config in some job where the JDK is needed?

Related, I believe we should give up on using Coursier to download the JDK. For now, we should require that it be installed in the system (whether that be local or remote).

@stuhood
Member

stuhood commented Sep 23, 2021

> So even with #12972, concurrent instances of Coursier are still trying to download a JDK. Maybe the GitHub Actions config is missing the JDK setup config in some job where the JDK is needed?
>
> Related, I believe we should give up on using Coursier to download the JDK. For now, we should require that it be installed in the system (whether that be local or remote).

That is the effect of the --system flag: so maybe enable it by default?

@tdyas
Contributor

tdyas commented Sep 23, 2021

> That is the effect of the --system flag: so maybe enable it by default?

Note the error was `Failed to determine JAVA_HOME for JDK system`, which means it was told to use a system JVM but still tried to download one.

@stuhood
Member

stuhood commented Sep 23, 2021

@tdyas : Argh. I realized the issue with #12293 (comment) ... each test process (we're up to 6ish test files that invoke the JDK) is still concurrent, and they're sharing the system named_cache directory.

@tdyas
Contributor

tdyas commented Sep 23, 2021

> @tdyas : Argh. I realized the issue with #12293 (comment) ... each test process (we're up to 6ish test files that invoke the JDK) is still concurrent, and they're sharing the system named_cache directory.

But the conflict is happening in the Coursier cache directory, right? And not the named cache dir?

@stuhood
Member

stuhood commented Sep 23, 2021

> But the conflict is happening in the Coursier cache directory, right? And not the named cache dir?

...yea, maybe. But if we're not pointing the Coursier cache directory into the named_cache directory, we should be for consistency. Depending on the degree to which the named_cache directory is being preserved in tests (it may not be at all: the RuleRunner could be using a new one for each run), that might also be another way to control this.

Additionally, we could use a stable named cache directory for integration tests, but concurrency control it with https://www.pantsbuild.org/docs/reference-pytest#section-execution-slot-var (i.e., effectively have N subdirectories)... that wouldn't avoid this issue in production when multiple repos are sharing a directory, but it might give us breathing room for tests.
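
A minimal sketch of the "N subdirectories" idea, assuming the test process can read its slot number from an environment variable. The helper name `slot_cache_dir`, the variable name `EXECUTION_SLOT`, and the `slot-N` layout are all hypothetical; the real variable name is whatever the `[pytest] execution_slot_var` option is set to.

```python
from __future__ import annotations

import os
from pathlib import Path
from typing import Mapping


def slot_cache_dir(
    base: Path,
    slot_var: str = "EXECUTION_SLOT",
    environ: Mapping[str, str] | None = None,
) -> Path:
    """Return a per-slot cache directory under `base`.

    Concurrently running test processes receive distinct slot values,
    so each one fetches into its own subdirectory.
    """
    env = os.environ if environ is None else environ
    slot = env.get(slot_var)
    if slot is None:
        # No slot information available: fall back to the shared directory.
        return base
    return base / f"slot-{slot}"
```

For example, `slot_cache_dir(Path("/tmp/named_caches"), environ={"EXECUTION_SLOT": "3"})` yields `/tmp/named_caches/slot-3`. The trade-off is up to N copies of each downloaded JDK, in exchange for never having two test processes write to the same directory at once.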

stuhood added a commit that referenced this issue Sep 30, 2021
…and use them for Coursier (#13046)

Coursier is used to fetch JVMs and artifacts for Java support. On `main`, it uses its default cache directory, but as shown in #12293, this has concurrency issues when multiple clients are fetching JVMs at the same time.

The underlying issue is likely to be fixed by coursier/coursier#2197, but in the meantime we can move to using `append_only_caches` for the `coursier` `--cache-dir` and `--jvm-dir`. Alone, this isn't sufficient to avoid concurrency issues (since fetches would simply collide in a new location): but it allows us to re-configure the cache location in `RuleRunner` tests using the `[pytest] execution_slot_var` to prevent collisions.

Along the way, support for mixing `append_only_caches` and `use_nailgun` needed fixing.

Re-enables running of JVM tests by default, and fixes #12293.