Avoiding duplicate type names #1501

kennykerr · 2023-03-16T18:21:28Z

Duplicate type names are used today to let the same struct or function have distinct per-architecture definitions. The canonical example is the CONTEXT struct. One of the reasons that it is so hard to fix "the CONTEXT problem" (#1044) is that it is very difficult to manipulate the individual per-architecture definitions since their names are ambiguous. As I started working on metadata processing and generation I've also discovered this on the Rust side where it makes it very difficult to uniquely refer to a type since the ECMA-335 TypeRef table is meant to identify type references but it only lets you specify an assembly/namespace/name tuple so there's no way to reliably identify a specific definition across assemblies.

I suggest we switch to a model where we have strictly unique type names and add an optional "Overload" field to the SupportedArchitecture attribute that lets languages find the "good name" for a type if it has been overloaded.

So instead of this:

[SupportedArchitecture(Architecture.X86)]
public struct CONTEXT {..}

[SupportedArchitecture(Architecture.X64)]
public struct CONTEXT {..}

We would have something like this:

[SupportedArchitecture(Architecture.X86, "CONTEXT")]
public struct CONTEXT_X86 {..}

[SupportedArchitecture(Architecture.X64, "CONTEXT")]
public struct CONTEXT_X64 {..}

Language bindings can hide the mangled name and only show overloaded names as before, but tools can quickly and efficiently identify and process unique definitions.

The text was updated successfully, but these errors were encountered:

mikebattista · 2023-03-16T18:40:15Z

@AArnott @timsneath @marler8997 what are your thoughts on this?

timsneath · 2023-03-16T19:29:58Z

I'm not a big fan of this.

My primary rationale is that this metadata file should IMHO be as close as possible to a faithful structured documentation of the Win32 API. Most of us are using it for language projections, but there are many other useful potential purposes for this, including research and documentation. If the struct is called CONTEXT, it really ought to be called CONTEXT in the metadata, rather than a mangled name. Otherwise ILSpy and other tools need to support searching attributes as well as the identifiers.

(This is also my rationale why I want to see inline macros and constants embedded in the metadata -- I should have confidence that the metadata is an exhaustive representation of the API.)

I'm curious though, @kennykerr: what does your proposed syntax give you that the current syntax doesn't? Can't you infer exactly the same Rust binding from either syntax?

kennykerr · 2023-03-16T20:33:33Z

I'm not merely interested in generating Rust bindings, but completely roundtripping metadata so I need to also merge and generate metadata and I'm finding that while you can technically get away with duplicate names in the TypeDef table in a single winmd, since ECMA-335 doesn't sort the TypeDef table, you cannot reliably reference such a type across assemblies, since ECMA-335 doesn't provide more than a textual name in the TypeRef table. So what I'm saying is that what we have here is strictly speaking invalid. Given a TypeRef for CONTEXT, there's no way to know which CONTEXT it refers to and as I work through the process of producing metadata, not just consuming it, I'm finding out that this is a real problem and likely why its been months with no resolution to problems like #1044. So while I appreciate that the type name should ideally match the original type, we're also dealing with the fact that there are in fact multiple types with that same name that we'd like to identify and the current approach doesn't really work in general, even if we've been able to "make do" for the purpose of generating language bindings.

marler8997 · 2023-03-16T20:50:25Z

Apologies if this was already answered in the referenced issues/links. I haven't seen any issues yet with the status quo because every platform-specific type exists in it's own "universe", as if each definition is in it's own mutually exclusive "ifdef" sections. Is there ever a time where you would need access to multiple platform-specific definitions in the same compilation? If so then this seems like a good solution.

AArnott · 2023-03-16T20:55:50Z

it makes it very difficult to uniquely refer to a type since the ECMA-335 TypeRef table is meant to identify type references but it only lets you specify an assembly/namespace/name tuple so there's no way to reliably identify a specific definition across assemblies.

This is a good thing, since otherwise a type that is only defined once in the metadata, but references another type that is defined twice in the metadata, would have to just arbitrarily pick one.

you cannot reliably reference such a type across assemblies...Given a TypeRef for CONTEXT, there's no way to know which CONTEXT it refers to

Not by type name alone, that's true. What Cswin32 has to do is consider the architecture of the source of the reference to disambiguate the target, when there are multiple possible targets.

switch to a model where we have strictly unique type names and add an optional "Overload" field to the SupportedArchitecture attribute

I remember this came up before. But it just compounds the problem that everyone else who references the type within the metadata now has to reference a specific one when they don't care to. How would a field whose type is CONTEXT reference it once there are 3 distinctly named definitions of it?

To be fair, the way the metadata expresses all this was very hard to get right in CsWin32, but it works. So I'm selfishly hoping it doesn't change.

kennykerr · 2023-03-16T20:59:30Z

The fact that such types have the same name doesn't really help if you end up ignoring the SupportedArchitecture attribute since you'll end up with duplicate type definitions, so there's no way around having to check this attribute. In Rust for example, as in C/C++, we emit all of the definitions in their own cfg/ifdef sections.

consider the architecture of the source of the reference

That's not practical in C/C++/Rust since we don't know what that is until compile time and long after code is generated.

metadata now has to reference a specific one when they don't care to.

That isn't reasonable since you cannot surely be content with "whatever" definition you happen to find.

riverar · 2023-03-16T21:28:17Z

Data point: If we move forward with removing type name ambiguity, we can trivially fix CONTEXT #1044 by excluding it from the scraper and manually define per-architecture types.

AArnott · 2023-03-17T02:42:00Z

metadata now has to reference a specific one when they don't care to.

That isn't reasonable since you cannot surely be content with "whatever" definition you happen to find.

I don't understand this point.

If the win32 headers define struct B to reference struct A, B doesn't have to know in that header or in the win32metadata which architecture A is defined in. As a result, B can be defined without #ifdef around it, even if A can only be defined that way.

kennykerr · 2023-03-17T04:14:51Z

I decided to move in a different direction, so I'll just close this issue - sorry for the distraction.

mikebattista added the usability Touch-up to improve the user experience for a language projection label Mar 16, 2023

kennykerr closed this as completed Mar 17, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Avoiding duplicate type names #1501

Avoiding duplicate type names #1501

kennykerr commented Mar 16, 2023 •

edited

Loading

mikebattista commented Mar 16, 2023

timsneath commented Mar 16, 2023

kennykerr commented Mar 16, 2023

marler8997 commented Mar 16, 2023

AArnott commented Mar 16, 2023

kennykerr commented Mar 16, 2023

riverar commented Mar 16, 2023

AArnott commented Mar 17, 2023

kennykerr commented Mar 17, 2023

Avoiding duplicate type names #1501

Avoiding duplicate type names #1501

Comments

kennykerr commented Mar 16, 2023 • edited Loading

mikebattista commented Mar 16, 2023

timsneath commented Mar 16, 2023

kennykerr commented Mar 16, 2023

marler8997 commented Mar 16, 2023

AArnott commented Mar 16, 2023

kennykerr commented Mar 16, 2023

riverar commented Mar 16, 2023

AArnott commented Mar 17, 2023

kennykerr commented Mar 17, 2023

kennykerr commented Mar 16, 2023 •

edited

Loading