
[Relax] Implement operators to read runtime DLTensor* information #16563

Merged: 4 commits into apache:main from Lunderberg:relax_unpack_ops on Feb 20, 2024

Conversation

Lunderberg (Contributor):

Relax is capable of expressing tensors whose element type is unknown. However, these must typically be replaced with a known dtype prior to compilation, as most operators require known data types prior to legalization. This can be done by using a `relax::MatchCast` node, such as accepting a parameter `arg: R.Tensor([16,16])`, then defining the dtype using `R.match_cast(arg, R.Tensor([16,16],'float16'))`.

However, using an `R.match_cast` node requires knowing which data type should be used in the new `R.Tensor`, and it raises an error for an incorrect data type. If an argument may be one of two distinct data types, `R.match_cast` cannot be used to check which data type is in use.

This commit adds Relax operators to read the runtime values of a `DLTensor*` argument. These can be used to normalize arguments prior to a compute step, for example, pre-processing a model weight that may be provided in either `float16` or `bfloat16` format.
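For concreteness, a minimal sketch of the `R.match_cast` pattern described above (the shape and dtype are illustrative):

```python
from tvm.script import relax as R


@R.function
def assert_dtype(arg: R.Tensor((16, 16))) -> R.Tensor((16, 16), "float16"):
    # match_cast *asserts* the dtype: it raises at runtime if `arg` is not
    # float16, so it cannot be used to branch between candidate dtypes.
    known = R.match_cast(arg, R.Tensor((16, 16), "float16"))
    return known
```

The operators added here instead read the `DLTensor*` fields at runtime, so a program can branch on the observed dtype rather than asserting one.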

@slyubomirsky (Contributor) left a comment:

The implementation seems solid; a very good change. I like the parameterized test cases too. The use of `_DLTensorShapeProxy` resulted in an elegant UI in TVMScript.

Idle musing: I wonder if there's any way the PrimFuncs can be unrolled in cases where the parameters are known at compile time.


Used for early checks in `expr.dtype` and `expr.shape`
accessors. While invalid usage would cause errors to be
raised durin shape inference, an earlier check makes it easier
slyubomirsky (Contributor):

Suggested change:
- raised durin shape inference, an earlier check makes it easier
+ raised during shape inference, an earlier check makes it easier

typo

Lunderberg (Contributor, Author):

Thank you, and fixed

Comment on lines +287 to +288
Exposes accessors for `DLDataType` fields `type_code`, `lanes`,
and `bits` within a `DLTensor::dtype`. Accessing these fields
slyubomirsky (Contributor):

These are good to have. Offset might also be useful to add, as it might help for memory reuse.

Lunderberg (Contributor, Author):

Good point. At the moment, I've stuck with the values that have a direct presence elsewhere in Relax, but I agree that it would be good to be able to extract any of the `DLTensor*` fields.
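For reference, the `DLDataType` fields named in this thread are already mirrored on the Python side by `tvm.runtime.DataType`, which makes for a quick sanity check (a small sketch; the printed values assume the standard dlpack type code for floats):

```python
import tvm

# DLDataType describes a dtype as (type_code, bits, lanes).
dt = tvm.runtime.DataType("float16")
print(dt.type_code)  # 2, i.e. kDLFloat
print(dt.bits)       # 16
print(dt.lanes)      # 1
```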

*/

/*!
* \file unpack.cc
slyubomirsky (Contributor):

I wonder if "unpack" is the best name. Perhaps "accessors" or "runtime_accessors" could be a little more descriptive? I'm not sure.

Member:

How about `relax.inspect` (as a sub-namespace for the ops)?

Lunderberg (Contributor, Author):

I like `relax.inspect` quite a bit, and will update the operators to that namespace.

Comment on lines +225 to +227
// TODO(Lunderberg): Make a new operator attribute
// `.set_attr<Bool>("DataDependent")`, rather than relying on
// the name of the operator.
slyubomirsky (Contributor):

I agree with this; it probably should be a separate PR.

Lunderberg (Contributor, Author):

Agreed. The note is there mostly so that the idea has a record somewhere, but it would be enough of a change that it should be part of a separate PR. As it was, I wanted to make as few changes to `LegalizeOps` as possible in this PR.

Comment on lines +192 to +204
// Improve this fallback case, as failure to legalize can
// produce unexpected errors during CodeGenVM. This could
// be done by having `R.Tensor(ndim=2)` be syntactic sugar
// for `R.Tensor(shape=[m, n])`, where `m` and `n` are new
// shape variables. This would allow legalization into
// dynamic TIR PrimFuncs.
//
// This fallback would only be applicable for cases where
// both the dtype and the dimensionality are known. While
// Relax can express a tensor with unknown dtype and
// dimensionality as `TensorStructInfo(DataType::Void(),
// kUnknownNDim)`, TIR cannot express unknown dtype or
// unknown dimensionality.
slyubomirsky (Contributor):

Interesting idea. This could be done by inserting a MatchCast that introduces the new vars. Perhaps this should be filed as an issue rather than made a long comment.

Lunderberg (Contributor, Author):

Good call, though I haven't had time to flesh out the idea yet. This would be part of a general cleanup I'd propose for the StructInfo interactions (a sketch of the MatchCast piece follows the list):

- Remove the `Optional<Expr> shape` and `int ndim` in `TensorStructInfo`. Instead, have an `Optional<ShapeStructInfo>`.
- Remove the `ndim` field from `ShapeStructInfo`. Instead, the ndim is unknown when the `Optional<Array<PrimExpr>>` is `NullOpt`.
- If the dimensionality of a `ShapeStructInfo` is known, every dimension must have an associated `PrimExpr`. The constructor that accepts `ndim` initializes fresh TIR variables to represent the unknown sizes.
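A hedged sketch of the MatchCast-based idea mentioned above, using today's TVMScript to introduce fresh shape variables for a rank-only tensor (the function and variable names are illustrative):

```python
from tvm.script import relax as R, tir as T


@R.function
def bind_shape(
    arg: R.Tensor(dtype="float32", ndim=2)
) -> R.Tensor(dtype="float32", ndim=2):
    # Fresh TIR variables give the rank-only tensor a symbolic shape [m, n],
    # which could then legalize into a dynamic TIR PrimFunc.
    m = T.int64()
    n = T.int64()
    known = R.match_cast(arg, R.Tensor((m, n), "float32"))
    return known
```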

)

@property
def dtype(self) -> "_DLTensorDTypeProxy":
slyubomirsky (Contributor):

Why does the return type have to be in quotes? I assume it has to do with the property decorator.

Lunderberg (Contributor, Author):

It's more the order of definitions in the file. Function annotations are evaluated when the class is being defined. Since the `_DLTensorDTypeProxy` class is defined lower in the file, the type annotation is provided as a string rather than as a class object.
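A generic illustration of that ordering issue (the `_TensorLike` class is a hypothetical stand-in; only `_DLTensorDTypeProxy` comes from the PR):

```python
class _TensorLike:
    @property
    def dtype(self) -> "_DLTensorDTypeProxy":
        # The quoted annotation defers evaluation: at this point in the file,
        # _DLTensorDTypeProxy has not been defined yet.
        return _DLTensorDTypeProxy(self)


class _DLTensorDTypeProxy:
    def __init__(self, tensor: _TensorLike):
        self.tensor = tensor
```

The same deferral can be obtained module-wide with `from __future__ import annotations` (PEP 563), which makes every annotation a lazily evaluated string.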

@Lunderberg (Contributor, Author):

> The implementation seems solid; a very good change. I like the parameterized test cases too. The use of `_DLTensorShapeProxy` resulted in an elegant UI in TVMScript.

Thank you! I really like polishing up the interface to be as clean as possible. (For my own sake if nothing else, as I am liable to forget a builtin-function name, but am much less likely to forget `obj.dtype`.)

> Idle musing: I wonder if there's any way the PrimFuncs can be unrolled in cases where the parameters are known at compile time.

Prior to `LegalizeOps`, this is implicitly done by the `FNormalize` implementation for each operator. After `LegalizeOps`, not so much. While the `FoldConstant` pass can compile and run TIR functions if their arguments are known, there isn't a good way to indicate that a TIR function only requires the `DLTensor` struct, and not the data itself.

@Lunderberg (Contributor, Author):

All CI tests are passing; thank you for the review, @slyubomirsky! I'll follow up with another PR to add inspection of the remainder of the `DLTensor*` fields.

@Lunderberg merged commit b218557 into apache:main on Feb 20, 2024 (18 checks passed)
@Lunderberg deleted the relax_unpack_ops branch on February 20, 2024 at 20:59
Lunderberg added a commit to Lunderberg/tvm that referenced this pull request Mar 14, 2024
A follow-up PR to apache#16563.  This PR
implements similar operators to inspect the runtime values of
`DLTensor::strides` and `DLTensor::byte_offset`.  In addition, while the
element offset is not explicitly present in the `DLTensor` struct, a
Relax operator is implemented to infer it from the `byte_offset` and
`data_type` fields, for use when interacting with the TIR
`BufferNode::elem_offset` field.
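A sketch of the inference that commit message describes, assuming `byte_offset` is an exact multiple of the element size (the helper name is hypothetical; the actual operator derives the value from the `byte_offset` and `data_type` fields at runtime):

```python
def infer_elem_offset(byte_offset: int, bits: int, lanes: int) -> int:
    # Element size in bytes, rounding up for sub-byte dtypes.
    element_bytes = (bits * lanes + 7) // 8
    assert byte_offset % element_bytes == 0, "byte_offset must be element-aligned"
    return byte_offset // element_bytes
```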
Lunderberg added a commit that referenced this pull request Mar 26, 2024
…16721)

* [TIR] LowerTVMBuiltin may use device_type from PrimFunc annotation

If an allocation occurs within a host function, it may not have a
device/host split.

* lint fix

* [Relax] Implement operators to inspect DLTensor::strides and offset

A follow-up PR to #16563.  This PR
implements similar operators to inspect the runtime values of
`DLTensor::strides` and `DLTensor::byte_offset`.  In addition, while the
element offset is not explicitly present in the `DLTensor` struct, a
Relax operator is implemented to infer it from the `byte_offset` and
`data_type` fields, for use when interacting with the TIR
`BufferNode::elem_offset` field.
thaisacs pushed a commit to thaisacs/tvm that referenced this pull request Apr 3, 2024