Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-111506: Implement Py_SET_REFCNT() as inline function #112798

Closed
wants to merge 1 commit into from

Conversation

vstinner
Copy link
Member

@vstinner vstinner commented Dec 6, 2023

Convert Py_SET_REFCNT() and _Py_IsImmortal() static inline functions to inline functions.

Py_SET_REFCNT() function is now exported as "Py_SET_REFCNT" name in the stable ABI, instead of "_Py_SetRefcnt".

_Py_IsImmortal() has to be converted to an inline function, since non-static Py_SET_REFCNT() function cannot call a static _Py_IsImmortal() function. _Py_IsImmortal() is not part of the stable ABI.


📚 Documentation preview 📚: https://cpython-previews--112798.org.readthedocs.build/

@vstinner
Copy link
Member Author

vstinner commented Dec 6, 2023

On macOS, the build fails with:

Error:  _asyncio failed to import: dlopen(/Users/runner/work/cpython/cpython/build/lib.macosx-12.6-x86_64-3.13-pydebug/_asyncio.cpython-313td-darwin.so, 0x0002): symbol not found in flat namespace (__Py_IsImmortal)
Error:  _testcapi failed to import: dlopen(/Users/runner/work/cpython/cpython/build/lib.macosx-12.6-x86_64-3.13-pydebug/_testcapi.cpython-313td-darwin.so, 0x0002): symbol not found in flat namespace (_Py_SET_REFCNT)
Error:  _testinternalcapi failed to import: dlopen(/Users/runner/work/cpython/cpython/build/lib.macosx-12.6-x86_64-3.13-pydebug/_testinternalcapi.cpython-313td-darwin.so, 0x0002): symbol not found in flat namespace (_Py_SET_REFCNT)

Error: Failed to build some stdlib modules

Following modules built successfully but were removed because they could not be imported:
_asyncio              _testcapi             _testinternalcapi  

Windows build complains in Objects/object.c with:

'_Py_IsImmortal': redefinition; different linkage [D:\a\cpython\cpython\PCbuild\pythoncore.vcxproj]

'Py_SET_REFCNT': redefinition; different linkage [D:\a\cpython\cpython\PCbuild\pythoncore.vcxproj]

Oh, and Free Threading build complains with:

‘_Py_IsOwnedByCurrentThread’ is static but used in inline function ‘Py_SET_REFCNT’ which is not static

Include/object.h Outdated Show resolved Hide resolved
Include/object.h Outdated Show resolved Hide resolved
Include/object.h Outdated Show resolved Hide resolved
@vstinner
Copy link
Member Author

vstinner commented Dec 7, 2023

As suggested by @pitrou, I added PyAPI_FUNC(). Let's see how it goes (if it fix some build errors on Windows/macOS):

-inline void Py_SET_REFCNT(PyObject *ob, Py_ssize_t refcnt) {
+inline PyAPI_FUNC(void) Py_SET_REFCNT(PyObject *ob, Py_ssize_t refcnt) {

It seems like test_cppext pass: C++ code including <Python.h> builds successfully (without C++ compiler warning).

@zooba
Copy link
Member

zooba commented Dec 7, 2023

Is it even possible to define an inline function in a header file and export it from a DLL? Which compilation unit gets the implementation? Or does it just pick one at random (there's another declspec that does that, IIRC).

The exported function doesn't actually have to be defined in a header file. But then, why not define the real function as Py_SetRefCnt rather than following our macro style? Then define it as a macro for C users using our headers that should get the function instead.

There's no need to get into such obscure definitions. They'll inherently destabilise our code, so better to just avoid them and use what works reliably (and yes, a simple macro #define Py_SET_REFCNT Py_SetRefCnt does work reliably).

@pitrou
Copy link
Member

pitrou commented Dec 7, 2023

Is it even possible to define an inline function in a header file and export it from a DLL? Which compilation unit gets the implementation? Or does it just pick one at random (there's another declspec that does that, IIRC).

You can read about it here:
https://devblogs.microsoft.com/oldnewthing/20140109-00/?p=2123

The exported function doesn't actually have to be defined in a header file.

In this PR it doesn't, if the limited API is enabled.

There's no need to get into such obscure definitions.

Which obscure definition?

They'll inherently destabilise our code, so better to just avoid them and use what works reliably (and yes, a simple macro #define Py_SET_REFCNT Py_SetRefCnt does work reliably).

Naming aside, #112798 looks simpler than #112794 to me.

Convert Py_SET_REFCNT() and _Py_IsImmortal() static inline functions
to inline functions.

Py_SET_REFCNT() function is now exported as "Py_SET_REFCNT" name in
the stable ABI, instead of "_Py_SetRefcnt".

_Py_IsImmortal(), _Py_IsOwnedByCurrentThread() and _Py_ThreadId()
have to be converted to an inline functions, since non-static
Py_SET_REFCNT() function cannot call a static functions.
_Py_IsImmortal(), _Py_IsOwnedByCurrentThread() and _Py_ThreadId() are
not part of the stable ABI.
@vstinner
Copy link
Member Author

vstinner commented Dec 7, 2023

Oops, I replaced another PyAPI_DATA() with PyAPI_FUNC(). I also rebased the PR.

@zooba
Copy link
Member

zooba commented Dec 7, 2023

You can read about it here:

Oh neat. That's... more intelligent than I ever expected from C++.

I wonder if we can use __declspec(noinline) on an inline function, to force the compiler to always use the dllimport definition when compiling for limited ABI? That could then be bundled into a new PyAPI_INLINE_FUNC which has a comment explaining how/why the magic works.

@vstinner
Copy link
Member Author

vstinner commented Dec 7, 2023

@pitrou: Ah nice, all tests pass now.

@vstinner
Copy link
Member Author

vstinner commented Dec 7, 2023

@zooba:

I wonder if we can use __declspec(noinline) on an inline function, to force the compiler to always use the dllimport definition when compiling for limited ABI?

In the limited API, this PR declares Py_SET_REFCNT() as:

PyAPI_FUNC(void) Py_SET_REFCNT(PyObject *ob, Py_ssize_t refcnt);

It's not declared as an inline function, but as a regular function.

Implementation to export Py_SET_REFCNT() as a function, in Objects/object.c:

extern inline PyAPI_FUNC(void) Py_SET_REFCNT(PyObject *ob, Py_ssize_t refcnt);

@zooba
Copy link
Member

zooba commented Dec 7, 2023

In the limited API, this PR declares Py_SET_REFCNT() as:

Yes, I saw that. It's in its own #ifdef block, which could be simplified into an attribute if applying the noinline attribute is interpreted as "always use the exported function and never inline it". And then the attribute could be centralised entirely so that we don't have to propagate multiple declarations around our header files.

@vstinner
Copy link
Member Author

vstinner commented Dec 7, 2023

Yes, I saw that. It's in its own #ifdef block, which could be simplified into an attribute if applying the noinline attribute is interpreted as "always use the exported function and never inline it". And then the attribute could be centralised entirely so that we don't have to propagate multiple declarations around our header files.

The problem is not about adding an attribute to the function. The problem is that it's not technically possible to declare the function as inline in the limited C API.

The whole point of the limited C API is to not expose any implementation details. Py_SET_REFCNT() cannot be declared as an inline function simply because _Py_IsImmortal(), _Py_IsOwnedByCurrentThread() and _Py_ThreadId() dsn't exist in the limited C API (and I don't want to add them). I'm trying to hide asm in the limited C API of the Free Threading build.

@neonene
Copy link
Contributor

neonene commented Dec 8, 2023

MSBuild emits the following error with Release mode:

xxlimited_35.obj : error LNK2001: unresolved external symbol _imp__Py_IsImmortal [C:\a\PCbuild\xxlimited_35.vcxproj]
C:\a\PCbuild\amd64\xxlimited_35.pyd : fatal error LNK1120: 1 unresolved externals [C:\a\PCbuild\xxlimited_35.vcxproj]

@vstinner
Copy link
Member Author

vstinner commented Dec 8, 2023

xxlimited_35.obj : error LNK2001: unresolved external symbol _imp__Py_IsImmortal [C:\a\PCbuild\xxlimited_35.vcxproj]

This C extension is built for the limited C API version 3.5. In this version, Py_SET_REFCNT() is implemented as:

inline Py_ALWAYS_INLINE PyAPI_FUNC(int) _Py_IsImmortal(PyObject *op)
{
    ...
}

inline PyAPI_FUNC(void) Py_SET_REFCNT(PyObject *ob, Py_ssize_t refcnt) {
    if (_Py_IsImmortal(ob)) {
        return;
    }
    ...
}

This implementation is very fragile for the limited C API version 3.5. Python 3.11 and older don't have immortal objects and should not call _Py_IsImmortal(). But Python 3.12 changed the limited C API making the assumption that "well, it should be nice" :-) See the Stable ABI section of PEP 683 – Immortal Objects, Using a Fixed Refcount.

But at the ABI level, Py_SET_REFCNT() must not call a function _Py_IsImmortal(), since it doesn't exist in the stable ABI! _Py_IsImmortal() implementation must always be inlined. It seems like inline Py_ALWAYS_INLINE PyAPI_FUNC(int) has weaker guarantees that the _Py_IsImmortal() function will be inlined.

Or maybe I misunderstood something. Well, I'm not sure of what "unresolved external symbol _imp__Py_IsImmortal" means here.

@vstinner
Copy link
Member Author

vstinner commented Dec 8, 2023

@pitrou: Do you have an explanation/fix for the error?

xxlimited_35.obj : error LNK2001: unresolved external symbol _imp__Py_IsImmortal [C:\a\PCbuild\xxlimited_35.vcxproj]

@pitrou
Copy link
Member

pitrou commented Dec 8, 2023

Well, it means that the implementation that you quoted above is inlined in the caller:



inline PyAPI_FUNC(void) Py_SET_REFCNT(PyObject *ob, Py_ssize_t refcnt) {
    if (_Py_IsImmortal(ob)) {
        return;
    }
    ...
}

and that _Py_IsImmortal cannot be resolved in that context, no? Though I'm not sure why it isn't...

@vstinner
Copy link
Member Author

and that _Py_IsImmortal cannot be resolved in that context, no? Though I'm not sure why it isn't...

I looked at the Py_ALWAYS_INLINE attribute, but it doesn't seem to be related (same issue if I remove it on _Py_IsImmortal().

@vstinner
Copy link
Member Author

Well, it means that the implementation that you quoted above is inlined in the caller:

If it's inlined, why does the linker expect getting the symbol from Python?

@pitrou
Copy link
Member

pitrou commented Dec 11, 2023

I suppose that Py_SET_REFCNT is inlined, but _Py_IsImmortal isn't? For inlining, the inline keyword is only a hint and doesn't guarantee anything.

@vstinner
Copy link
Member Author

For inlining, the inline keyword is only a hint and doesn't guarantee anything.

_Py_IsImmortal() and Py_INCREF() are marked with Py_ALWAYS_INLINE. With MSVC, this macro is expanded as __forceinline. Right, inline and __forceinline are only "hints" and the compiler can ignore them depending on other constraints. MSVC doc:

There's no guarantee that functions will be inlined. You can't force the compiler to inline a particular function, even with the __forceinline keyword.

Oh interesting, when Python is built in debug mode, Py_ALWAYS_INLINE macro is always expanded as an empty expression:

#if defined(Py_DEBUG)
   // If Python is built in debug mode, usually compiler optimizations are
   // disabled. In this case, Py_ALWAYS_INLINE can increase a lot the stack
   // memory usage. For example, forcing inlining using gcc -O0 increases the
   // stack usage from 6 KB to 15 KB per Python function call.
#  define Py_ALWAYS_INLINE

@colesbury
Copy link
Contributor

This seems risky, especially because our header files are included by both C and C++ projects. I think static inline is basically treated the same by C and C++, but inline has different meanings. In C, you can get an undefined symbol if the compiler doesn't inline the body. In C++, I think you'll get a weak symbol.

@vstinner
Copy link
Member Author

In C, you can get an undefined symbol if the compiler doesn't inline the body.

This PR does two things:

  • Replace static inline with inline.
  • Export _Py_ThreadId(), _Py_IsOwnedByCurrentThread(), _Py_IsImmortal(), Py_SET_REFCNT() as regular functions.

Example with the current PR:

$ ./python
Python 3.13.0a2+ (heads/set_refcnt_inline:49d6671183d, Dec 12 2023, 11:44:21) [GCC 13.2.1 20231205 (Red Hat 13.2.1-6)] on linux
>>> import ctypes
>>> python = ctypes.cdll.LoadLibrary(None)
>>> python.Py_SET_REFCNT
<_FuncPtr object at 0x7f8493c42440>

@colesbury
Copy link
Contributor

Yeah, what I'm trying to say is that I think that (1) "Replace static inline with inline" is a risky way to achieve (2) "Export as regular functions" when we have other ways of exporting these as regular functions.

I think it's risky because non-static inlines are treated differently in C than in C++, have meaningfully different behavior when compiling with optimizations vs. without, and the complexity and risk is increased because Include/object.h is used in multiple different ways (like limited vs. non-limited API). It is hard to analyze all the use cases and we don't have comprehensive test coverage.

As a concrete example, an extension using Py_SET_REFCNT and targeting Py_LIMITED_API versions earlier than 3.13 may sometimes work or not depending on the optimization level. If it uses -O0 and Py_SET_REFCNT is not inlined, the extension will have an undefined symbol for Py_SET_REFCNT. That's fine if it's loaded in Python 3.13+, but not if it's loaded in 3.12 or earlier, which don't provide that symbol (i.e., don't export Py_SET_REFCNT as a function).

This was not an issue when we used "static inline" because you don't get undefined symbols like you do with non-static inlines.

@vstinner
Copy link
Member Author

@colesbury:

I think it's risky because non-static inlines are treated differently in C than in C++, have meaningfully different behavior when compiling with optimizations vs. without, and the complexity and risk is increased because Include/object.h is used in multiple different ways (like limited vs. non-limited API). It is hard to analyze all the use cases and we don't have comprehensive test coverage.

Do you prefer PR #112794 approach which keeps static inline but uses pre-processor black magic to expose the function as a regular function and as a static inline function under the same name?

@colesbury
Copy link
Contributor

@vstinner yes, although it seems like that could be simpler if we named the regular function differently from the static inline function (like we do for Py_INCREF vs. Py_IncRef)

@vstinner
Copy link
Member Author

Let's focus on the static inline API in this case.

@vstinner vstinner closed this Dec 20, 2023
@vstinner vstinner deleted the set_refcnt_inline branch December 20, 2023 12:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants