FindIntrinsics creates incorrect IR for casts from bool to arithmetic type #6601

shoaibkamil · 2022-02-01T19:56:18Z

Here's a small repro:

#include "Halide.h"

using namespace Halide;

class MyGen : public Generator<MyGen> {
public:
  Input<Buffer<float, 1>> in{"input"};
  Output<Buffer<float, 1>> out{"output"};
  Var x, tx;

  void generate() {
    out(x) = cast<float>(in(x) >= 0) - cast<float>(in(x) < 0);
    out.gpu_tile(x, tx, 16)
      .vectorize(tx, 4);
  }

};

HALIDE_REGISTER_GENERATOR(MyGen, mygen)

Building for a GPU target results in:

Unhandled exception: Error: Can't represent an integer with this many bits in Metal C: int4x4

Drilling down a bit, what I'm seeing is that the IR after FindIntrinsics includes expressions like:

output[ramp(((.__thread_id_x*4) + output.extent.0) + -16, 1, 4)] = 
   float32x4((int4x4)widening_sub(int2x4((x4(0.000000f) <= t19)), int2x4((t19 < x4(0.000000f)))))

The text was updated successfully, but these errors were encountered:

rootjalex · 2022-02-02T21:06:26Z

This issue is coming from here: https://github.com/halide/Halide/blob/1e7040c9ab44c31b4754aa0cdd3ca0555c8f8142/src/FindIntrinsics.cpp#L217-#L218

Which allows for this rewrite:

cast<float>(a)  - cast<float>(b) -> cast<float>(widening_sub(a, b))

As long as a and b can be losslessly cast to uints with < 32 bits. (likewise for < 64 bits if it was a cast<double>). Is this a valid rewrite that just shouldn't be applied to booleans, or should it not be a valid rewrite at all? I'm unsure of the semantics here (maybe @dsharlet or @abadams know?).

The same problem also appears exists for add, but I imagine this is rather difficult to trigger: https://github.com/halide/Halide/blob/1e7040c9ab44c31b4754aa0cdd3ca0555c8f8142/src/FindIntrinsics.cpp#L187-#L188

pranavb-ca · 2022-02-02T21:30:37Z

Minor nit: I think it allows for the rewrite

cast<float>(a)  - cast<float>(b) -> (widening_sub(a, b)

FWIW, this used to be in the Hexagon backend in Halide for the widening_add & sub operations which Dillon pulled into FindIntrinsics. We didn't however apply this to booleans. May be the check for types needs to be more restrictive.

rootjalex · 2022-02-02T21:45:37Z

@pranavb-ca I'm not sure what the difference is, could you elaborate? The cast back to float comes from here:

Halide/src/FindIntrinsics.cpp

Line 232 in 1e7040c

result = Cast::make(op->type, result);

I agree that this should probably be more strict on types, but I'm also wondering if this rewrite should be valid for, e.g.:

cast<float>((uint8)a)  - cast<float>((uint8)b) -> cast<float>((uint16)widening_sub((uint8)a, (uint8)b))

pranavb-ca · 2022-02-02T22:42:50Z

@pranavb-ca I'm not sure what the difference is, could you elaborate? The cast back to float comes from here:

Halide/src/FindIntrinsics.cpp

Line 232 in 1e7040c

result = Cast::make(op->type, result);

I agree that this should probably be more strict on types, but I'm also wondering if this rewrite should be valid for, e.g.:
cast<float>((uint8)a)  - cast<float>((uint8)b) -> cast<float>((uint16)widening_sub((uint8)a, (uint8)b))

I apologize, you are right @rootjalex. Since in the Hexagon backend we used to restrict it to the same type (different width) I assumed that was the case here as well.

rootjalex · 2022-02-02T22:50:11Z

Oh no worries @pranavb-ca ! Thanks for the clarification!

rootjalex mentioned this issue Feb 18, 2022

Disallow Type::narrow() and Type::widen() from producing bitwidths between 1 and 8 bits #6622

Merged

rootjalex closed this as completed in #6622 Feb 22, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FindIntrinsics creates incorrect IR for casts from bool to arithmetic type #6601

FindIntrinsics creates incorrect IR for casts from bool to arithmetic type #6601

shoaibkamil commented Feb 1, 2022

rootjalex commented Feb 2, 2022 •

edited

Loading

pranavb-ca commented Feb 2, 2022

rootjalex commented Feb 2, 2022

pranavb-ca commented Feb 2, 2022

rootjalex commented Feb 2, 2022

FindIntrinsics creates incorrect IR for casts from bool to arithmetic type #6601

FindIntrinsics creates incorrect IR for casts from bool to arithmetic type #6601

Comments

shoaibkamil commented Feb 1, 2022

rootjalex commented Feb 2, 2022 • edited Loading

pranavb-ca commented Feb 2, 2022

rootjalex commented Feb 2, 2022

pranavb-ca commented Feb 2, 2022

rootjalex commented Feb 2, 2022

rootjalex commented Feb 2, 2022 •

edited

Loading