[GPU] Disable prefetching for loops with no computation (#19695)
There is no point in prefetching if the loop does not contain any
compute ops.
Attempting it currently triggers the bug described in
#19612
This PR fixes that by bailing out of prefetching when the loop has no
compute ops.
Fixes: #19612

---------

Signed-off-by: Nirvedh Meshram <[email protected]>
nirvedhmeshram authored Jan 14, 2025
1 parent 8d1d867 commit 21b0101
Showing 2 changed files with 22 additions and 0 deletions.
@@ -205,6 +205,12 @@ class LoopPrefetcher {
getValueDependencies(compute, computeDependencies);
}
}
// If `scf.yield` is the only compute op then there is no value in doing
// prefetching.
if (computeDependencies.size() == 1) {
LDBG("Loop does not have compute so not doing prefetching." << forOp);
return failure();
}

// Restore the original order.
for (auto &op : forOp.getBody()->getOperations()) {
@@ -134,3 +134,19 @@ func.func @prefetch_add_with_if(%arg0: memref<128xf32>) {
vector.transfer_write %0, %arg0[%c0] {in_bounds = [true]} : vector<1xf32>, memref<128xf32>
return
}

// -----
// CHECK-LABEL: @noprefetch_copyback
func.func @noprefetch_copyback(%arg0: memref<128xf32>, %arg1: memref<128xf32>) {
%cst = arith.constant dense<0.000000e+00> : vector<1xf32>
%cst_0 = arith.constant 0.000000e+00 : f32
%c128 = arith.constant 128 : index
%c1 = arith.constant 1 : index
%c0 = arith.constant 0 : index
scf.for %arg2 = %c0 to %c128 step %c1{
%1 = vector.transfer_read %arg0[%arg2], %cst_0 : memref<128xf32>, vector<1xf32>
vector.transfer_write %1, %arg1[%arg2] {in_bounds = [true]} : vector<1xf32>, memref<128xf32>
}
return
}
// CHECK-NOT: gpu.barrier
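
The bail-out added in the first hunk can be modeled outside of MLIR. The following is a minimal C++ sketch, not the IREE implementation: `ToyOp`, `collectDependencies`, and `shouldPrefetch` are hypothetical names, and the assumption that compute dependencies are gathered by walking backwards from `scf.yield` is inferred from the comment in the patch rather than shown in it. Only the `computeDependencies.size() == 1` condition comes directly from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy op (not an MLIR type): a name plus the indices of the ops
// in the same loop body whose results it consumes.
struct ToyOp {
  std::string name;
  std::vector<std::size_t> operands;
};

// Adds `root` and everything it transitively depends on to `deps`.
// This stands in for the patch's getValueDependencies, whose real behavior
// is not shown in the diff.
void collectDependencies(const std::vector<ToyOp> &body, std::size_t root,
                         std::set<std::size_t> &deps) {
  if (!deps.insert(root).second)
    return; // Already visited.
  for (std::size_t operand : body[root].operands)
    collectDependencies(body, operand, deps);
}

// Mirrors the check in the patch: if the dependency set rooted at the
// terminator holds exactly one op (the terminator itself), the loop has no
// compute and prefetching is skipped.
bool shouldPrefetch(const std::vector<ToyOp> &body) {
  std::set<std::size_t> computeDependencies;
  for (std::size_t i = 0; i < body.size(); ++i) {
    if (body[i].name == "scf.yield")
      collectDependencies(body, i, computeDependencies);
  }
  return computeDependencies.size() != 1;
}

// Shaped like the loop in @noprefetch_copyback: read, write, empty yield.
std::vector<ToyOp> copyBackBody() {
  return {{"vector.transfer_read", {}},
          {"vector.transfer_write", {0}},
          {"scf.yield", {}}};
}

// A loop whose yield consumes the result of an add on the loaded value.
std::vector<ToyOp> addBody() {
  return {{"vector.transfer_read", {}},
          {"arith.addf", {0}},
          {"scf.yield", {1}}};
}
```

In this model, `shouldPrefetch(copyBackBody())` is false because the yield has no operands and is the only entry in the set, while `shouldPrefetch(addBody())` is true because the yield pulls in the add and the read it depends on.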