CUDA not enabled in builds of OpenJ9 for Windows #2272

Closed
keithc-ca opened this issue Nov 30, 2020 · 11 comments
Labels: bug, openj9, windows

@keithc-ca
Contributor

Windows builds with OpenJ9 used to have CUDA enabled, and apparently still anticipate including that capability (see [1] and [2]).

Recent builds (e.g. openj9-0.23.0) do not include CUDA support.

[1] https://github.com/AdoptOpenJDK/openjdk-build/blob/master/build-farm/platform-specific-configurations/windows.sh#L152
[2] https://github.com/AdoptOpenJDK/openjdk-infrastructure/blob/master/ansible/playbooks/AdoptOpenJDK_Windows_Playbook/roles/NVidia_Cuda_Toolkit/tasks/main.yml

@M-Davies added the bug, code-tools, openj9, and windows labels Dec 1, 2020
@M-Davies removed the code-tools label Dec 1, 2020
@adamfarley self-assigned this Dec 1, 2020
@adamfarley
Contributor

adamfarley commented Dec 2, 2020

Found the problem. Script bug. Working on a fix.

Details:

cygpath -ms "C:/program files/etc"

is not removing the space as expected, so the subsequent call to "cygpath -u" splits the string at the space and returns two paths, neither of which is valid for the -f "does this file exist" check.
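
A minimal sketch of the word-splitting effect described above, assuming a POSIX-style shell such as the bash that ships with Cygwin. It does not call cygpath; `count_args` is a hypothetical helper introduced only to show how an unquoted value containing a space becomes two separate arguments.

```shell
# Illustrative only: count_args is a hypothetical helper, not part of the build scripts.
count_args() {
  echo "$#"
}

path="C:/program files/etc"

# Unquoted expansion: the shell splits the value at the space.
unquoted=$(count_args $path)
# Quoted expansion: the value stays a single argument.
quoted=$(count_args "$path")

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
```

Any command that receives the unquoted form sees "C:/program" and "files/etc" as two paths, which matches the two invalid paths reported here.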

@aahlenst
Contributor

aahlenst commented Dec 2, 2020

Is there an easy way to inspect whether CUDA is enabled in an OpenJ9 JDK? Asking because of adoptium/aqa-tests#2024.

@adamfarley
Contributor

Sure. Go to the build job and search for --enable-cuda in the config command.
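
As a rough sketch of that log search, assuming the console output of the build job has been saved to a local file (the file name `consoleFull.txt` and the helper `log_has_enable_cuda` are hypothetical):

```shell
# Hypothetical helper: succeeds if the saved console log mentions --enable-cuda.
log_has_enable_cuda() {
  # -e stops grep from treating the pattern as an option; -q suppresses output.
  grep -q -e '--enable-cuda' "$1"
}

# Example usage (assumes the log was saved as consoleFull.txt):
# log_has_enable_cuda consoleFull.txt && echo "configure was asked to enable CUDA"
```

This only confirms that the flag was passed to configure; it says nothing about a JDK that has already been downloaded.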

@aahlenst
Contributor

aahlenst commented Dec 2, 2020

@adamfarley That's not what I'm asking. I want to know whether CUDA is enabled by looking at a downloaded JDK. Is there any property set?

@adamfarley
Contributor

I don't know. I only know where CUDA is enabled in the config args because of Keith's links.

@aahlenst
Contributor

aahlenst commented Dec 2, 2020

@keithc-ca Maybe you know. And if you could tell us which platforms/versions should have CUDA enabled, that would help towards catching regressions like that in the future.

@adamfarley
Contributor

adamfarley commented Dec 2, 2020

As for this issue, the problem appears to be that most/all of our Windows machines lack a shortened (8.3) name for either the "C:/Program Files" folder or the "C:/Program Files/NVIDIA GPU Computing Toolkit" folder, though not for both.

The code then fails to find the "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v9.0/include/cuda.h" file, concludes that CUDA is not installed, and silently omits the --enable-cuda configuration flag.
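
The behaviour described above can be sketched roughly as follows. This is not the code from windows.sh: `cuda_configure_arg` is a hypothetical function, and the warning in the else branch is an assumption about what a louder failure could look like, not existing behaviour.

```shell
# Hypothetical sketch; cuda_configure_arg is not a function from the real build scripts.
# Prints --enable-cuda only when cuda.h exists under the given CUDA home.
cuda_configure_arg() {
  cuda_home="$1"
  if [ -f "$cuda_home/include/cuda.h" ]; then
    echo "--enable-cuda"
  else
    # Warn on stderr instead of skipping silently (assumed behaviour, for illustration).
    echo "WARNING: $cuda_home/include/cuda.h not found; building without CUDA" >&2
  fi
}

# Example usage; the quotes keep the path with spaces as one argument:
# cuda_configure_arg "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v9.0"
```

When the header check fails, nothing is printed on stdout, so a caller that appends the result to its configure arguments ends up with a build that has no CUDA support.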

However, while this looks like a bug, it also protects us from a later failure: we do not yet correctly handle a single configure argument that contains spaces (see #2279).

So, while I could solve this by declaring "both of these folders must have shortened names available on all Windows machines" an infrastructure issue and handing it over the fence, I think the right play here is to rework this code to use the unshortened folder names, spaces included, and to solve #2279 so that we can pass the path into the build scripts proper as a configure argument.

@adamfarley
Contributor

adamfarley commented Dec 2, 2020

After some discussion, it was pointed out that having a space makes this path make-incompatible, so we're probably better off with the infrastructure short-names solution.

Will raise an infrastructure issue. Once that's resolved, we can add a PR for this issue that checks for spaces after the short-name substitution, producing the error message we should have had in the first place when this started happening.
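
A minimal sketch of such a check, assuming the short-name conversion has already been applied to the path. `check_short_path` and its error text are hypothetical and are not taken from the PR referenced in this thread.

```shell
# Hypothetical helper: fail loudly if a path still contains a space
# after short-name substitution.
check_short_path() {
  case "$1" in
    *" "*)
      echo "ERROR: path still contains a space after short-name conversion: $1" >&2
      return 1
      ;;
    *)
      return 0
      ;;
  esac
}

# Example usage:
# check_short_path "$converted_path" || exit 1
```

Failing at this point surfaces the missing short name immediately, rather than letting the later file-existence check fail without explanation.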

Infra issue to prevent shortname issue in the future: adoptium/infrastructure#1729
Infra issue to resolve our specific niche case of the shortname issue: adoptium/infrastructure#1731

PR to check for this issue ever happening again: #2283
(Blocked on the infra issue)

@karianna added this to the December 2020 milestone Dec 2, 2020
@keithc-ca
Contributor Author

> Is there an easy way to inspect whether CUDA is enabled in an OpenJ9 JDK? Asking because of AdoptOpenJDK/openjdk-tests#2024.

Besides running a program that actually uses an available GPU, you can check by looking for the string 'cudart' in j9prt29.dll. For example, the 0.21.0 release supports CUDA:

C:\jdk\openj9-11.0.8+10-0.21.0>bin\java -version
openjdk version "11.0.8" 2020-07-14
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.8+10)
Eclipse OpenJ9 VM AdoptOpenJDK (build openj9-0.21.0, JRE 11 Windows 10 amd64-64-Bit Compressed References 20200715_679 (JIT enabled, AOT enabled)
OpenJ9   - 34cf4c075
OMR      - 113e54219
JCL      - 95bb504fbb based on jdk-11.0.8+10)

C:\jdk\openj9-11.0.8+10-0.21.0>strings bin\compressedrefs\j9prt29.dll | grep cudart
cudart64_102.dll
cudart64_101.dll
cudart64_100.dll
cudart64_92.dll
cudart64_91.dll
cudart64_90.dll

while the 0.23.0 release does not:

C:\jdk\openj9-11.0.9+11-0.23.0>bin\java -version
openjdk version "11.0.9" 2020-10-20
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.9+11)
Eclipse OpenJ9 VM AdoptOpenJDK (build openj9-0.23.0, JRE 11 Windows 10 amd64-64-Bit Compressed References 20201022_795 (JIT enabled, AOT enabled)
OpenJ9   - 0394ef754
OMR      - 582366ae5
JCL      - 3b09cfd7e9 based on jdk-11.0.9+11)

C:\jdk\openj9-11.0.9+11-0.23.0>strings bin\compressedrefs\j9prt29.dll | grep cudart
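
For scripted checks, the same test can be wrapped in a small function. This is a sketch under assumptions: `has_cudart` is a hypothetical helper, and it runs grep directly on the binary instead of piping through strings, which should give the same yes/no answer for an ASCII pattern.

```shell
# Hypothetical helper: succeeds if the given library contains the string "cudart".
has_cudart() {
  # LC_ALL=C makes grep treat the file as raw bytes; -q only sets the exit status.
  LC_ALL=C grep -q 'cudart' "$1"
}

# Example usage from a shell such as Cygwin bash, run inside the JDK directory:
# if has_cudart bin/compressedrefs/j9prt29.dll; then
#   echo "CUDA support present"
# else
#   echo "CUDA support missing"
# fi
```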

@adamfarley
Contributor

OK, CUDA is now enabled in 64-bit Windows builds using OpenJ9, as of last night's nightlies.

https://ci.adoptopenjdk.net/view/Failing%20Builds/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-windows-x64-openj9/846/consoleFull

@sxa
Member

sxa commented Dec 4, 2020

Yep this looks ok to me now. Closing :-)

@sxa closed this as completed Dec 4, 2020