AMDGPU: update GFX11 wmma hazards #76143

Merged
merged 1 commit into from
Jan 24, 2024
23 changes: 2 additions & 21 deletions llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
@@ -1713,8 +1713,8 @@ bool GCNHazardRecognizer::fixWMMAHazards(MachineInstr *MI) {
if (!SIInstrInfo::isWMMA(I))
return false;

- // Src0 or Src1 of the current wmma instruction overlaps with the dest of
- // the previous wmma.
+ // Src0(matrix A) or Src1(matrix B) of the current wmma instruction overlaps
+ // with the dest(matrix D) of the previous wmma.
const Register CurSrc0Reg =
TII->getNamedOperand(*MI, AMDGPU::OpName::src0)->getReg();
const Register CurSrc1Reg =
@@ -1728,25 +1728,6 @@ bool GCNHazardRecognizer::fixWMMAHazards(MachineInstr *MI) {
return true;
}

- // Src2 of the current wmma instruction overlaps with the dest of the
- // previous wmma.
- const MachineOperand *Src2 =
- TII->getNamedOperand(*MI, AMDGPU::OpName::src2);
- const Register CurSrc2Reg = Src2->isReg() ? Src2->getReg() : Register();
-
- if (CurSrc2Reg != AMDGPU::NoRegister &&
- TRI->regsOverlap(PrevDstReg, CurSrc2Reg)) {
-
- const MachineOperand *Src2Mods =
- TII->getNamedOperand(*MI, AMDGPU::OpName::src2_modifiers);
- const bool NoSrc2Mods =
- (Src2Mods->getImm() & (SISrcMods::NEG | SISrcMods::NEG_HI)) == 0;
- // Exception: there is no hazard if the wmma instructions are of the same
- // type and there is no input modifier on src2 of the current instruction.
- return !(NoSrc2Mods && (TII->pseudoToMCOpcode(I.getOpcode()) ==
- TII->pseudoToMCOpcode(MI->getOpcode())));
- }

Comment on lines -1731 to -1749
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did this end up here in the first place?

Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was coded according to the spec, but the behavior and the spec have recently been clarified: inserting V_NOP in these cases helps neither correctness nor performance.

return false;
};
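
For readers who want to see the rule that remains after this change, here is a minimal standalone sketch. It is not the LLVM implementation: `RegRange`, `Wmma`, `hasWmmaHazard`, and `checkExamples` are hypothetical names, and registers are modeled as plain ranges of VGPR indices rather than through `TargetRegisterInfo`. The only thing it is meant to show is that an overlap between the previous dest (matrix D) and the current src2 (matrix C) no longer counts as a hazard, while an overlap with src0 or src1 still does.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a register tuple: a half-open range of
// consecutive VGPR indices [first, first + count). Not an LLVM type.
struct RegRange {
  std::uint32_t first;
  std::uint32_t count; // 0 models "no register" (e.g. an immediate operand)
};

// Two ranges overlap if they share at least one VGPR index.
constexpr bool regsOverlap(RegRange a, RegRange b) {
  return a.count != 0 && b.count != 0 &&
         a.first < b.first + b.count && b.first < a.first + a.count;
}

// Simplified model of one WMMA instruction (assumption: D = A * B + C).
struct Wmma {
  RegRange dst;  // matrix D
  RegRange src0; // matrix A
  RegRange src1; // matrix B
  RegRange src2; // matrix C
};

// Rule after this change: a hazard exists only when matrix A or matrix B of
// the current WMMA overlaps matrix D of the previous WMMA. src2 is ignored.
constexpr bool hasWmmaHazard(const Wmma &prev, const Wmma &cur) {
  return regsOverlap(prev.dst, cur.src0) || regsOverlap(prev.dst, cur.src1);
}

void checkExamples() {
  const Wmma prev{{0, 8}, {8, 8}, {16, 8}, {0, 8}};

  // src0 (v4..v11) overlaps the previous dest (v0..v7): still a hazard.
  const Wmma readsDestAsA{{24, 8}, {4, 8}, {16, 8}, {24, 8}};
  assert(hasWmmaHazard(prev, readsDestAsA));

  // Only src2 overlaps the previous dest (chained accumulation): no hazard
  // under the updated rule, regardless of opcode or src2 modifiers.
  const Wmma accumulates{{0, 8}, {8, 8}, {16, 8}, {0, 8}};
  assert(!hasWmmaHazard(prev, accumulates));
}
```

Calling `checkExamples()` from a `main` exercises both cases. The removed code additionally inspected the `NEG`/`NEG_HI` bits of `src2_modifiers` and compared opcodes; the sketch drops those inputs entirely because, per the review reply above, a V_NOP in the src2 case does not help.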
