Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect logic on multiple SMSTART calls #329

Closed
FinnWilkinson opened this issue Jul 6, 2023 · 1 comment · Fixed by #339
Closed

Incorrect logic on multiple SMSTART calls #329

FinnWilkinson opened this issue Jul 6, 2023 · 1 comment · Fixed by #339
Assignees
Labels
bug Something isn't working

Comments

@FinnWilkinson
Copy link
Contributor

As part of the SME AArch64 extension, Streaming SVE Mode was introduced. This is enabled via SMSTART and disabled by SMSTOP. For any SME instruction to be executed the core must be in Streaming SVE mode.

SMSTART/SMSTOP is composed of 2 parts: SM and ZA (not to be confused with the register za). When SM goes from an enabled state (=1) to a disabled state (=0) or vice versa then all Vector (z, and subsequently v, b, h, s, d regs) and Predicate (p) registers are set to 0. Similarly, when ZA goes from an enabled state (=1) to a disabled state (=0) or vice versa then the za matrix register is zeroed out.

Currently within SimEng, on any execution of SMSTART or SMSTOP the register clearing functionality is performed for SM or ZA or both; which ever is appropriate for the instruction variation executed. However, this zeroing functionality should only be performed when the state changes from 1 to 0 or 0 to 1, not on any call.

For example, the trace below is what currently would happen in SimEng

SMSTART                    // Enter Streaming SVE mode. Zero all vector, fp, scalar, neon, predicate, and za matrix registers
dup z0.s, #3               // Duplicate the number 3 in all elements of z0 :  `z0.s = {3, 3, 3, 3}`
dup z1.s, #2               // Duplicate the number 2 in all elements of z1 : `z1.s = {2, 2, 2, 2}`
SMSTART                  // Enter Streaming SVE mode, zero all vector, fp, scalar, neon, predicate, and za matrix registers
add z2.s, z1.s, z0.s   // Add z0.s and z1.s element wise : `z2.s = {0, 0, 0, 0}`
SMSTOP                    // Exit Streaming mode. Zero all vector, fp, scalar, neon, predicate, and za matrix registers

However, the following should happen instead - effectively making the second SMSTART have no affect:

SMSTART                    // Enter Streaming SVE mode. Zero all vector, fp, scalar, neon, predicate, and za matrix registers
dup z0.s, #3               // Duplicate the number 3 in all elements of z0 :  `z0.s = {3, 3, 3, 3}`
dup z1.s, #2               // Duplicate the number 2 in all elements of z1 : `z1.s = {2, 2, 2, 2}`
SMSTART                  // Already in Streaming SVE mode. Do nothing
add z2.s, z1.s, z0.s   // Add z0.s and z1.s element wise : `z2.s = {5, 5, 5, 5}`
SMSTOP                    // Exit Streaming mode. Zero all vector, fp, scalar, neon, predicate, and za matrix registers

Whilst such a scenario (or a variation of it where single SMSTART and SMSTOP pairs aren't used in a code block) is unlikely, it would be a valid execution path and as such could cause an unexpected bug in the future.

@FinnWilkinson FinnWilkinson added the bug Something isn't working label Jul 6, 2023
@FinnWilkinson FinnWilkinson self-assigned this Jul 6, 2023
@FinnWilkinson FinnWilkinson linked a pull request Oct 30, 2023 that will close this issue
FinnWilkinson added a commit that referenced this issue Nov 2, 2023
This PR for the most part addresses issue #329 - correcting the SVCR and Streaming Mode logic introduced with SME. This is summaried as:

 - If a program is in Streaming Mode and calls SMSTART again, ZA and Vector registers are not zeroed out.
 - SVE Vector registers (and their aliases such as NEON and Scalar registers) are only zeroed out when SVCR.SM bit changes
 - SME's ZA register is only zeroed out when SVCR.ZA bit changes
 - SVCR can now be correctly updated using a regulat msr svcr, xt instruction in addition to smstart/smstop/msr svcr, #imm
 - More tests surrounding the SVCR logic have been added

Additional to this, logic for the NEON muti-struct store with post index instructions introduced in PR #335 has been corrected via a small change to the metadata to correctly identify if a #imm or xt is used as said post index.
@FinnWilkinson
Copy link
Contributor Author

Issue fixed via commit e0bd27e into dev branch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant