Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GNU ObjC] Clang 14 regression: class methods always return nil object with ARC and optimization #56952

Closed
triplef opened this issue Aug 5, 2022 · 13 comments

Comments

@triplef
Copy link
Member

triplef commented Aug 5, 2022

Using using Automatic Reference Counting and -O1 or better optimization using Clang 14.0.5 or 15.0 rc1 causes class methods returning an ObjC object type to always return a null pointer.

This works correctly with Clang 13, or when not using ARC, or when using -O0 optimization. Methods returning C primitive types are not affected.

The following code reproduces the issue:

#include <stdio.h>
#import "objc/runtime.h"

__attribute__((objc_root_class))
@interface TestObj { id isa; }
+ (Class)class;
+ (id)testObj;
+ (const char *)testStr;
+ (int)testNum;
@end

@implementation TestObj
+ (Class)class {
	return self;
}
+ (id)testObj {
	id testObj = @"Test";
	printf("returning testObj: 0x%p\n", testObj);
	return testObj;
}
+ (const char *)testStr {
	return "forty two";
}
+ (int)testNum {
	return 42;
}
@end

int main() {
	id testClass = [TestObj class];
	printf("testClass: %s (%p)\n", class_getName(testClass), testClass);
	TestObj *testObj = [TestObj testObj];
	printf("testObj: 0x%p\n", testObj);
	const char *testStr = [TestObj testStr];
	printf("testStr: %s\n", testStr);
	int testNum = [TestObj testNum];
	printf("testNum: %d\n", testNum);

	return 0;
}

Build command:

"C:\Program Files\LLVM-14\bin\clang" -IC:\libobjc2\include -LC:\libobjc2\lib -fobjc-runtime=gnustep-2.0 -fobjc-arc -fuse-ld=lld -lobjc -O2 -o test.exe test.m

Output:

testClass: TestObj (00007FF7CD7F1158)
returning testObj: 0xA9979F4000000024
testObj: 0x0000000000000000   // <<<< should be 0xA9979F4000000024
testStr: forty two
testNum: 42
@triplef
Copy link
Member Author

triplef commented Aug 19, 2022

Additionally I’m seeing some very strange behavior calling class methods in GNUstep Base with Clang 14 and 15rc1 using ARC and -O1 or -O2, which might be related to the above:

@try {
	id processInfoClass = [NSProcessInfo class];
	NSLog(@"processInfoClass: %@ (%p)", processInfoClass, processInfoClass);
	id processInfo = [NSProcessInfo processInfo];
	NSLog(@"processInfo: %@ (%p)", processInfo, processInfo);
	id arguments = [processInfo arguments];
	NSLog(@"arguments: %@", arguments);
} @catch (id exc) {
	NSLog(@"Exception: %@", exc);
}

This will result in the following exception when calling [NSProcessInfo processInfo], which seems to indicate that the processInfo method is called on NSObject and not NSProcessInfo:

<NSException: 0x8d20a068> NAME:NSInvalidArgumentException REASON:NSObject(class) does not recognize processInfo INFO:(null)

I can work around that by replacing [NSProcessInfo processInfo] with [processInfoClass processInfo], but then I’m getting the following exception when calling [processInfo arguments] in the next line, where the arguments method seems to be called on the NSProcessInfo class instead of an object of that class:

<NSException: 0xebb88048> NAME:NSInvalidArgumentException REASON:NSProcessInfo(class) does not recognize arguments INFO:(null)

In summary it seems like there’s some mixup with what kind of class objects the compiler returns, but it depends on the exact code, e.g. when calling just one of these methods above individually it works fine.

@llvmbot
Copy link
Member

llvmbot commented Aug 19, 2022

@llvm/issue-subscribers-clang-codegen

@davidchisnall
Copy link
Contributor

Is this Windows-only? Can you run clang with -mllvm -print-after-all and see where the message send is either being elided or where the receiver becomes 0?

My guess is that LLVM is seeing that the class global variable is initialized to a null pointer and somehow missing the fact that the function that sets it is called from the runtime.

@triplef
Copy link
Member Author

triplef commented Aug 24, 2022

Looks like this is a Windows-only issue, at least we were unable to reproduce it on Debian 11 aarch64 using Clang 15.0.0-rc2.

This is the LLVM IR dump of a further reduced test case printed below, though unfortunately I’m not fully sure how to read it:
llvm-output.txt

#import "objc/runtime.h"

__attribute__((objc_root_class))
@interface TestObj { id isa; }
+ (Class)class;
+ (id)testObj;
@end

@implementation TestObj
+ (Class)class {
	return self;
}
+ (id)testObj {
	id testObj = @"Test";
	return testObj;
}
@end

int main() {
	TestObj *testObj = [TestObj testObj];
	return testObj ? 0 : 1;
}

weliveindetail referenced this issue in weliveindetail/libobjc2 Sep 21, 2022
@weliveindetail
Copy link
Contributor

One quick find in the assembly of the libobjc2 test above: There are return values that clang-14 copies from %rax to %rdi register while clang-13 used to copy to %rcx -- so it becomes first parameter for the subsequent call.

Here is an annotated assembly diff that omits some non-printable characters from the objc_selectors:

-ClassMethods_arc_clang13_release.m.obj:	file format coff-x86-64
+ClassMethods_arc_clang14_release.m.obj:	file format coff-x86-64
 
@@ -92,7 +92,7 @@
        7: 48 8b 0d 00 00 00 00  movq	(%rip), %rcx      # $_OBJC_REF_CLASS_TestObj
        e: 48 8d 15 00 00 00 00  leaq	(%rip), %rdx      # .objc_selector_class___ (1)
       15: e8 00 00 00 00        callq	0x1a <main+0x1a>  # objc_msgSend
-      1a: 48 89 c1              movq	%rax, %rcx
+      1a: 48 89 c7              movq	%rax, %rdi
       1d: e8 00 00 00 00        callq	0x22 <main+0x22>  # objc_retainAutoreleasedReturnValue
       22: 48 89 c6              movq	%rax, %rsi
       25: 48 89 c1              movq	%rax, %rcx
@@ -104,7 +104,7 @@
       40: 48 8b 0d 00 00 00 00  movq	(%rip), %rcx      # $_OBJC_REF_CLASS_TestObj
       47: 48 8d 15 00 00 00 00  leaq	(%rip), %rdx      # .objc_selector_testObj___ (2)
       4e: e8 00 00 00 00        callq	0x53 <main+0x53>  # objc_msgSend
-      53: 48 89 c1              movq	%rax, %rcx
+      53: 48 89 c7              movq	%rax, %rdi
       56: e8 00 00 00 00        callq	0x5b <main+0x5b>  # objc_retainAutoreleasedReturnValue
       5b: 48 89 c7              movq	%rax, %rdi
       5e: 48 8d 0d 00 00 00 00  leaq	(%rip), %rcx

The difference is in two places between the calls to objc_msgSend and objc_retainAutoreleasedReturnValue:

  1. After objc_selector_class___ it looks like it "silently works" because objc_msgSend stored its return value 0x00007ff6979b58e0 in %rcx as well (or just doesn't clear the register before returning). Then objc_retainAutoreleasedReturnValue returns it again and we correctly print the line: testClass: TestObj (00007FF6979B58E0)

  2. After objc_selector_testObj___ the objc_msgSend call behaves differently and %rcx is null. Then objc_retainAutoreleasedReturnValue returns null as well and we incorrectly print the line: testObj: 0x0000000000000000

@weliveindetail
Copy link
Contributor

I used the above assembly difference for bisectioning the clang history and arrived at this commit: d61eb6c

On the current release/15.x version, I tested a change that lets Windows targets opt-out from using clang.arc.attachedcall operand bundles and it does successfully prevent this issue: weliveindetail@d2c3ec2

My guess is that this can only be a temporary fix as it likely has an impact on performance of compiled output. A proper solution would teach the x86_64 backend for Windows how to handle the operand bundle correctly.

@rnk
Copy link
Collaborator

rnk commented Sep 22, 2022

This usage of RDI seems like the bug:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86ExpandPseudo.cpp#L229

That was added in c2d44bd by @fhahn . The first parameter register on Windows is RCX, not RDI.

@fhahn
Copy link
Contributor

fhahn commented Sep 23, 2022

As discussed in https://reviews.llvm.org/D134441, if the runtime on windows excepts a different marker instruction, we should update x86ExpandPseudo.cpp. If no marker is needed on Windows, we shouldn't generatedclang.arc.attachedcall in Windows

@weliveindetail
Copy link
Contributor

Thanks for your feedback everyone!

This usage of RDI seems like the bug: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86ExpandPseudo.cpp#L229

That was added in c2d44bd by @fhahn . The first parameter register on Windows is RCX, not RDI.

Yes, I tested this and it gives the correct output as well!

As discussed in https://reviews.llvm.org/D134441, if the runtime on windows excepts a different marker instruction, we should update x86ExpandPseudo.cpp.

On Windows it just looks like a regular call following the x64 calling convention and I didn't see anything in the implementation that inspects instruction opcodes on the caller side (so far).

@davidchisnall Is GNUstep checking for optimized callers like the callerAcceptsOptimizedReturn() in Apple's ObjC runtime here? https://opensource.apple.com/source/objc4/objc4-818.2/runtime/objc-object.h.auto.html

@weliveindetail
Copy link
Contributor

If no marker is needed on Windows, we shouldn't generate clang.arc.attachedcall in Windows

Basically, this is the attempt from my first review D134441. The bundle only makes sure that the instruction sequence ends up unmodified in the binary right?

Could it have any other/non-obvious side-effects? There are several places that check for the presence of the bundle. Most of them appear to make sure the instruction sequence stays intact, but is it the case for the following two as well?
https://github.com/llvm/llvm-project/blob/release/15.x/llvm/lib/Transforms/Scalar/SCCP.cpp#L133-L139
https://github.com/llvm/llvm-project/blob/release/15.x/llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp#L246-L249

Also, it looks like the inliner can insert these bundles as well -- would I need to suppress this?
https://github.com/llvm/llvm-project/blob/release/15.x/llvm/lib/Transforms/Utils/InlineFunction.cpp#L1753-L1762

I guess without a marker, GNUstep wouldn't have a chance to implement ObjC ARC autorelease optimization and always miss autorelease elision like this: #31273

weliveindetail added a commit to weliveindetail/llvm-project that referenced this issue Sep 26, 2022
…on Windows

Fix regression llvm#56952 for Clang CodeGen on Windows.
In the Windows ABI the instruction sequence that is expanded from CALL_RVMARKER should use RCX as target register and not RDI.

Differential Revision: https://reviews.llvm.org/D134441
triplef added a commit to gnustep/tools-windows-msvc that referenced this issue Sep 27, 2022
weliveindetail added a commit that referenced this issue Sep 27, 2022
…on Windows

Fix regression #56952 for Clang CodeGen on Windows. In the Windows ABI the instruction sequence that is expanded from CALL_RVMARKER should use RCX as target register and not RDI.

Reviewed By: rnk, fhahn

Differential Revision: https://reviews.llvm.org/D134441
@weliveindetail
Copy link
Contributor

This should be fixed with commit https://reviews.llvm.org/rGed8409dfa0a92d80c021f13ca271737492522cc7. It doesn't look like GNUstep checks for optimized callers, but we keep the marker so that it will remain possible. Please find more details in the review.

@fhahn
Copy link
Contributor

fhahn commented Sep 28, 2022

Thanks for the fix!

@llvmbot
Copy link
Member

llvmbot commented Sep 28, 2022

@llvm/issue-subscribers-backend-x86

weliveindetail added a commit to weliveindetail/llvm-project that referenced this issue Oct 7, 2022
…on Windows

Fix regression llvm#56952 for Clang CodeGen on Windows. In the Windows ABI the instruction sequence that is expanded from CALL_RVMARKER should use RCX as target register and not RDI.

Reviewed By: rnk, fhahn

Differential Revision: https://reviews.llvm.org/D134441
vfdff added a commit to vfdff/llvm-project that referenced this issue Dec 5, 2023
A case for this transformation, https://gcc.godbolt.org/z/nhYcWq1WE
```
Fold
  mov     w8, llvm#56952
  movk    w8, llvm#15, lsl llvm#16
  ldrb    w0, [x0, x8]
into
  add     x0, x0, 1036288
  ldrb    w0, [x0, 3704]
```
Only support single use base, multi-use scenes are supported by PR74046.
Fix llvm#71917

TODO: support the multiple-uses with reuseing common base offset.
https://gcc.godbolt.org/z/Mr7srTjnz
vfdff added a commit to vfdff/llvm-project that referenced this issue Dec 21, 2023
A case for this transformation, https://gcc.godbolt.org/z/nhYcWq1WE
Fold
  mov     w8, llvm#56952
  movk    w8, llvm#15, lsl llvm#16
  ldrb    w0, [x0, x8]
into
  add     x0, x0, 1036288
  ldrb    w0, [x0, 3704]

Only LDRBBroX is supported for the first time.
Fix llvm#71917
vfdff added a commit to vfdff/llvm-project that referenced this issue Jan 30, 2024
A case for this transformation, https://gcc.godbolt.org/z/nhYcWq1WE
Fold
  mov     w8, llvm#56952
  movk    w8, llvm#15, lsl llvm#16
  ldrb    w0, [x0, x8]
into
  add     x0, x0, 1036288
  ldrb    w0, [x0, 3704]

Only LDRBBroX is supported for the first time.
Fix llvm#71917
veselypeta pushed a commit to veselypeta/cherillvm that referenced this issue May 20, 2024
…on Windows

Fix regression llvm/llvm-project#56952 for Clang CodeGen on Windows. In the Windows ABI the instruction sequence that is expanded from CALL_RVMARKER should use RCX as target register and not RDI.

Reviewed By: rnk, fhahn

Differential Revision: https://reviews.llvm.org/D134441
vfdff added a commit to vfdff/llvm-project that referenced this issue Aug 8, 2024
A case for this transformation, https://gcc.godbolt.org/z/nhYcWq1WE
Fold
  mov     w8, llvm#56952
  movk    w8, llvm#15, lsl llvm#16
  ldrb    w0, [x0, x8]
into
  add     x0, x0, 1036288
  ldrb    w0, [x0, 3704]

Only LDRBBroX is supported for the first time.
Fix llvm#71917

Note: This PR is try relanding the commit 32878c2 with fix crash for PR79756
  this crash is exposes when there is MOVKWi instruction in the head of a block,
but without MOVZWi
vfdff added a commit to vfdff/llvm-project that referenced this issue Aug 15, 2024
A case for this transformation, https://gcc.godbolt.org/z/nhYcWq1WE
Fold
  mov     w8, llvm#56952
  movk    w8, llvm#15, lsl llvm#16
  ldrb    w0, [x0, x8]
into
  add     x0, x0, 1036288
  ldrb    w0, [x0, 3704]

Only LDRBBroX is supported for the first time.
Fix llvm#71917

Note: This PR is try relanding the commit 32878c2 with fix crash for PR79756
  this crash is exposes when there is MOVKWi instruction in the head of a block,
but without MOVZWi
vfdff added a commit to vfdff/llvm-project that referenced this issue Aug 15, 2024
A case for this transformation, https://gcc.godbolt.org/z/nhYcWq1WE
Fold
  mov     w8, llvm#56952
  movk    w8, llvm#15, lsl llvm#16
  ldrb    w0, [x0, x8]
into
  add     x0, x0, 1036288
  ldrb    w0, [x0, 3704]

Only LDRBBroX is supported for the first time.
Fix llvm#71917

Note: This PR is try relanding the commit 32878c2 with fix crash for PR79756
  this crash is exposes when there is MOVKWi instruction in the head of a block,
but without MOVZWi
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants