Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ICU-22787 Fix ClangCL compilation on Windows #3023

Merged

Conversation

StefanStojanovic
Copy link
Contributor

@StefanStojanovic StefanStojanovic commented Jun 3, 2024

I'm working on supporting Node.js on Windows. There is an ongoing effort to enable it to be compiled with ClangCL. While working on it I noticed an issue when creating .obj file in ICU because of the *pCPU = IMAGE_FILE_MACHINE_UNKNOWN; line, and the code I made fixes that by adding a new option. Although this is a broader problem that should be fixed for multiple compilers and platforms, my change is aimed toward ClangCL and Windows, but can be used as a good foundation to work on fixes for other platforms and compilers and make corrections in future iterations.

This issue is currently our main blocker for enabling ClangCL compilation for Node.js and it'd be greatly appreciated if it could be reviewed soon.

Refs: nodejs/node#53003

Checklist
  • Required: Issue filed: https://unicode-org.atlassian.net/browse/ICU-22787
  • Required: The PR title must be prefixed with a JIRA Issue number.
  • Required: The PR description must include the link to the Jira Issue, for example by completing the URL in the first checklist item
  • Required: Each commit message must be prefixed with a JIRA Issue number.
  • Issue accepted (done by Technical Committee after discussion)
  • Tests included, if applicable
  • API docs and/or User Guide docs changed or added, if applicable

@srl295
Copy link
Member

srl295 commented Jun 3, 2024

#3013

@StefanStojanovic StefanStojanovic changed the title ICU-22787 Fix ClangCL cross-compilation on Windows ICU-22787 Fix ClangCL compilation on Windows Jun 4, 2024
@markusicu markusicu requested review from srl295 and markusicu June 6, 2024 16:49
} else if (strcmp(optCpuArch, "arm64") == 0) {
*pCPU = IMAGE_FILE_MACHINE_ARM64;
} else {
// This should never happen.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, I'd say that it'd be easier to debug if the process terminated with an error message if that which should never happen ever happened anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I've changed it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this code never will be executed, I recommend keeping it as simple as possible, it really doesn't need to print any custom error message that never will be printed anyway and a simple source code comment would suffice:

#include <exception>
std::terminate();  // Unreachable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it based on your previous comment:

In that case, I'd say that it'd be easier to debug if the process terminated with an error message if that which should never happen ever happened anyway.

Although I didn't think about it before you mentioned it, this log makes sense to me. For example, imagine someone makes a typo and passes -c X64 or -c amr64, this log would help them fix it much quicker.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, imagine someone makes a typo and passes -c X64 or -c amr64, this log would help them fix it much quicker.

Wait a second, your original comment here was "This should never happen." but the user making a typo on the commandline is the total opposite of that, that's something that certainly is going to happen, sooner or later.

But verifying that the commandline arguments are correct should be done already in main() in genccode.c, in the same way that the arguments to -a are being verified there by calling the checkAssemblyHeaderName() helper function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the confusion. My comment was misleading. Sorry for that. I checked the code now and saw the checkAssemblyHeaderName helper, I'll add that type of check to the -c option as well. Thanks for pointing that out. Once I add that the else branch should really never happen, and then the change you suggested with std::terminate() will be correct.

@@ -131,6 +133,7 @@ main(int argc, char* argv[]) {
"\t-o or --object write a .obj file instead of .c\n"
"\t-m or --match-arch file.o match the architecture (CPU, 32/64 bits) of the specified .o\n"
"\t ELF format defaults to i386. Windows defaults to the native platform.\n"
"\t-c or --cpu-arch Specify a CPU architecture for which to write a .obj file for ClangCL on Windows\n"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be easier to use this tool correctly if this commandline flag was #ifdef'ed out altogether on platforms where it isn't implemented?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, I've only #ifdefed out the parts in genccode.c, as I didn't want to make writeObjectCode signature different depending on which compiler is used on Windows. I just pass it NULL if Clang isn't used. If you think I should make the signature #ifdefed too, let me know.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to leave it a generic argument, even if it is only currently valid for ClangCL. It may be needed in other situations.

@@ -847,7 +852,24 @@ getArchitecture(uint16_t *pCPU, uint16_t *pBits, UBool *pIsBigEndian, const char
# if defined(_M_IX86)
*pCPU = IMAGE_FILE_MACHINE_I386;
# else
*pCPU = IMAGE_FILE_MACHINE_UNKNOWN;
// Linker for ClangCL doesn't handle IMAGE_FILE_MACHINE_UNKNOWN the same as
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I read the existing U_PLATFORM_HAS_WIN32_API block correctly it's all about Microsoft's link.exe, so it seems odd to now mix in another linker in that. Wouldn't this become easier to read if Microsoft's and LLVM's linkers got each their separate #if block?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In platform.h for U_PLATFORM_HAS_WIN32_API it says: Defines whether the Win32 API is available on the platform.. This is true for ClangCL, so I've added a new constant U_CLANG_CL to distinguish when using Clang on Windows. After that, I changed to using that constant inside the mentioned block.

Wouldn't this become easier to read if Microsoft's and LLVM's linkers got each their separate #if block

It would, but I do not know how to write linker-specific code (I'm not saying it is impossible). What can be done from my perspective is to detect the compiler (MSVC or Clang) and then infer a linker (link.exe or lld-link.exe), which is more or less what we have now after these changes.

// the target architecture.
# if defined(__clang__)
if (strcmp(optCpuArch, "x64") == 0) {
*pCPU = IMAGE_FILE_MACHINE_AMD64;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are 32-bit builds not supported? It's been a long time since I last had any reason to do one, but I wasn't aware that they're no longer supported.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I've added a strcmp(optCpuArch, "x86") == 0 branch in case someone needs it since it is still supported.

Above this code there is:

#   if defined(_M_IX86)
        *pCPU = IMAGE_FILE_MACHINE_I386;

but that would only apply if genccode was compiled targeting x86, which is not necessarily the case.

@StefanStojanovic StefanStojanovic force-pushed the mefi-cross-compile-clangcl-fix branch from 1343c73 to a539dde Compare June 7, 2024 19:20
@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • icu4c/source/common/unicode/platform.h is now changed in the branch
  • icu4c/source/tools/genccode/genccode.c is different
  • icu4c/source/tools/toolutil/pkg_genc.cpp is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@StefanStojanovic
Copy link
Contributor Author

Thanks for the review @roubert, I tried to address all your comments in a new commit.

I see 2 checks failed: jira-ticket and single-commit. I'll squash commits and keep only the original commit message once all is ready to land, but I kept 2 commits now for easier reviewing.

@markusicu
Copy link
Member

I see 2 checks failed: jira-ticket and single-commit. I'll squash commits and keep only the original commit message once all is ready to land, but I kept 2 commits now for easier reviewing.

Please wait with squashing until after feedback is resolved.

@StefanStojanovic
Copy link
Contributor Author

Please wait with squashing until after feedback is resolved.

Will do.

By the way, I see that in CI-ICU4C all MSVC builds failed. I'm not sure why that happened, but I built it locally before pushing (allinone.sln -> Build Solution) and it worked.

@StefanStojanovic
Copy link
Contributor Author

Hey everyone, I wanted to check if there is anything else for me to do after my initial changes? If you think it is better to leave the new option I can every my latest changes in genccode.c. Other than that, are the other changes good? Regards.

@roubert
Copy link
Member

roubert commented Jun 12, 2024

I see that in CI-ICU4C all MSVC builds failed. I'm not sure why that happened,

The CI is currently broken, you'll just have to wait for it to get fixed (and then those tests should pass again).

@roubert
Copy link
Member

roubert commented Jun 13, 2024

The CI is currently broken,

It has now been fixed and the tests pass.

* @internal
*/
#if (defined(_MSC_VER) && defined(__clang__) && __clang__)
# define U_CLANG_CL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend against adding a new macro to the public API for a check that's only used in one single internal tool. (That would also make the icu4c-docs-build failure disappear, without you having to actually update the documentation.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I'll remove it.

} else if (strcmp(optCpuArch, "arm64") == 0) {
*pCPU = IMAGE_FILE_MACHINE_ARM64;
} else {
// This should never happen.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this code never will be executed, I recommend keeping it as simple as possible, it really doesn't need to print any custom error message that never will be printed anyway and a simple source code comment would suffice:

#include <exception>
std::terminate();  // Unreachable.

} else if (strcmp(optCpuArch, "arm64") == 0) {
*pCPU = IMAGE_FILE_MACHINE_ARM64;
} else {
// This should never happen.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, imagine someone makes a typo and passes -c X64 or -c amr64, this log would help them fix it much quicker.

Wait a second, your original comment here was "This should never happen." but the user making a typo on the commandline is the total opposite of that, that's something that certainly is going to happen, sooner or later.

But verifying that the commandline arguments are correct should be done already in main() in genccode.c, in the same way that the arguments to -a are being verified there by calling the checkAssemblyHeaderName() helper function.

@StefanStojanovic StefanStojanovic force-pushed the mefi-cross-compile-clangcl-fix branch from a539dde to d5fe469 Compare June 20, 2024 10:30
@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • icu4c/source/common/unicode/platform.h is different
  • icu4c/source/tools/genccode/genccode.c is different
  • icu4c/source/tools/toolutil/pkg_genc.cpp is different
  • icu4c/source/tools/toolutil/pkg_genc.h is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@StefanStojanovic
Copy link
Contributor Author

@roubert I pushed the changes in a new commit. I addressed your comments and made changes based on this comment. Please let me know if there is something else I should change. Regards.

Copy link
Member

@roubert roubert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, this all looks reasonable to me now, so I'll leave it to @rp9-next (@microsoft) to take this review from here (and verify that what this does really is what it should be doing, and that this is the preferred way of doing that, and so on).

@rp9-next rp9-next requested a review from roubert August 8, 2024 08:04
@rp9-next
Copy link
Contributor

rp9-next commented Aug 8, 2024

@roubert , merging is blocked by a change requested check which I believe is outdated..
Could you dismiss it ?

@StefanStojanovic, please squash all the commits to a single commit for merging (you could use the squash branch tool as well)

@StefanStojanovic StefanStojanovic force-pushed the mefi-cross-compile-clangcl-fix branch from d5fe469 to d0712de Compare August 8, 2024 12:48
@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • icu4c/source/tools/pkgdata/pkgdata.cpp is different
  • icu4c/source/tools/toolutil/pkg_genc.cpp is different
  • icu4c/source/tools/toolutil/pkg_genc.h is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@StefanStojanovic
Copy link
Contributor Author

@StefanStojanovic, please squash all the commits to a single commit for merging (you could use the squash branch tool as well)

Thanks for the review @rp9-next. I rebased to the latest main branch and squashed commits afterward.

@roubert
Copy link
Member

roubert commented Aug 8, 2024

@roubert , merging is blocked by a change requested check which I believe is outdated.. Could you dismiss it ?

Done.

@roubert roubert dismissed their stale review August 8, 2024 17:02

Done.

@roubert roubert removed their request for review August 8, 2024 17:03
@rp9-next rp9-next merged commit 66ba099 into unicode-org:main Aug 9, 2024
99 checks passed
StefanStojanovic added a commit to JaneaSystems/node that referenced this pull request Aug 22, 2024
StefanStojanovic added a commit to JaneaSystems/node that referenced this pull request Aug 23, 2024
StefanStojanovic added a commit to JaneaSystems/node that referenced this pull request Aug 27, 2024
nodejs-github-bot pushed a commit to nodejs/node that referenced this pull request Aug 27, 2024
- Floating patch for ICU 75.x

ICU Bug: https://unicode-org.atlassian.net/browse/ICU-22787
Backport of: unicode-org/icu#3023

PR-URL: #54502
Reviewed-By: Steven R Loomis <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
RafaelGSS pushed a commit to nodejs/node that referenced this pull request Aug 30, 2024
- Floating patch for ICU 75.x

ICU Bug: https://unicode-org.atlassian.net/browse/ICU-22787
Backport of: unicode-org/icu#3023

PR-URL: #54502
Reviewed-By: Steven R Loomis <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
RafaelGSS pushed a commit to nodejs/node that referenced this pull request Aug 30, 2024
- Floating patch for ICU 75.x

ICU Bug: https://unicode-org.atlassian.net/browse/ICU-22787
Backport of: unicode-org/icu#3023

PR-URL: #54502
Reviewed-By: Steven R Loomis <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
louwers pushed a commit to louwers/node that referenced this pull request Nov 2, 2024
- Floating patch for ICU 75.x

ICU Bug: https://unicode-org.atlassian.net/browse/ICU-22787
Backport of: unicode-org/icu#3023

PR-URL: nodejs#54502
Reviewed-By: Steven R Loomis <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants