x86: improve code generation for passing promoted structs #7048

Open
CarolEidt opened this issue Nov 23, 2016 · 0 comments
Labels
arch-x86; area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI); enhancement (Product code improvement that does NOT require public API changes/additions); optimization; tenet-performance (Performance related issue)

CarolEidt commented Nov 23, 2016

A number of issues remain in the handling of promoted struct arguments (GT_FIELD_LIST) after PR dotnet/coreclr#7847:

  • We currently push the fields in reverse order, which is what we want for pointer-sized fields. However, for smaller fields we should keep them in list order so that we can push the first field in the slot as 32 bits, and then store the remaining fields into that slot.
  • We don’t require the byte fields to be in byteable registers, but it would be nice if we could preference them that way. (The current mechanism for indicating preferences relates mostly to preferencing to another lclVar or tree node.)
  • If we have sub-pointer-sized fields, we always allocate an internal register, even though we may not need it. One strategy, if any of the sub-pointer-sized fields are either tree temps or last-use lclVars:
    • For the first such field, require a register (i.e. don’t make it RegOptional). This register can then also be used as a temporary register to load the remaining sub-pointer-sized fields.
    • For the first byte field, make it require a register – this may be the same register as above, or an additional one. One could fine-tune this but it’s probably not really worth it.
  • If we need to adjust the stack for padding or holes in the struct, we currently use a series of pushes of 0. We should consider whether 1) it would be better to generate a sub in some or all cases, and/or 2) we should use push eax as the legacy JIT does. My concern with the latter was that I wasn't sure how careful we need to be about what eax might contain (if it's padding it isn't guaranteed to be zero, but I'm not sure whether it's safe to just push an arbitrary value).
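
To make the proposed ordering concrete, here is a rough sketch of the slot-by-slot plan described in the first and last bullets. It is a Python model, not JIT code: `Field`, `plan_struct_push` and the pseudo-instruction strings are hypothetical names invented for illustration, and it assumes 4-byte slots and that no field straddles a slot boundary.

```python
from dataclasses import dataclass

SLOT_SIZE = 4  # assumed x86 stack slot (pointer) size in bytes


@dataclass(frozen=True)
class Field:
    name: str    # hypothetical label for the field's value
    offset: int  # byte offset within the struct
    size: int    # size in bytes


def plan_struct_push(fields, struct_size):
    """Return pseudo-instructions that place a promoted struct on the stack."""
    slot_count = (struct_size + SLOT_SIZE - 1) // SLOT_SIZE
    slots = [[] for _ in range(slot_count)]
    for f in sorted(fields, key=lambda f: f.offset):
        first_slot = f.offset // SLOT_SIZE
        last_slot = (f.offset + f.size - 1) // SLOT_SIZE
        if first_slot != last_slot:
            raise ValueError(f"field {f.name} straddles a slot boundary")
        slots[first_slot].append(f)

    plan = []
    # The stack grows downward, so slots are still visited in reverse order.
    for index in reversed(range(slot_count)):
        in_slot = slots[index]
        base = index * SLOT_SIZE
        if not in_slot:
            # Padding or hole: this is where 'sub esp, 4' or 'push eax'
            # would be the alternatives under discussion.
            plan.append("push 0")
            continue
        if in_slot[0].offset == base:
            # Push the first field of the slot as a full 32-bit value.
            plan.append(f"push {in_slot[0].name}")
            rest = in_slot[1:]
        else:
            # The slot starts with a hole, so reserve it and store every field.
            plan.append("push 0")
            rest = in_slot
        # Remaining fields are stored in list order into the slot just pushed.
        for f in rest:
            plan.append(f"mov [esp+{f.offset - base}], {f.name}")
    return plan


# struct { byte a; byte b; short c; int d; }
example = [Field("a", 0, 1), Field("b", 1, 1), Field("c", 2, 2), Field("d", 4, 4)]
print(plan_struct_push(example, 8))
# ['push d', 'push a', 'mov [esp+1], b', 'mov [esp+2], c']
```

Within a slot the fields stay in list order, while the slots themselves are still emitted from the highest offset down. The sketch ignores register requirements entirely; the byteable-register and internal-register questions from the other bullets would apply to the operands of the `mov` lines.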

category:cq
theme:structs
skill-level:expert
cost:medium
impact:small

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020