
[Relax] Implement operators to read runtime DLTensor* information #16563

Merged: 4 commits into apache:main from Lunderberg:relax_unpack_ops on Feb 20, 2024

Conversation

Lunderberg (Contributor):

Relax is capable of expressing tensors whose element type is unknown. However, these must typically be replaced with a known dtype prior to compilation, as most operators require known data types prior to legalization. This can be done by using a `relax::MatchCast` node, such as accepting a parameter `arg: R.Tensor([16,16])`, then defining the dtype using `R.match_cast(arg, R.Tensor([16,16],'float16'))`.

However, using an `R.match_cast` node requires knowing which data type should be used in the new `R.Tensor`, and it raises an error for an incorrect data type. If an argument may be one of two distinct data types, `R.match_cast` cannot be used to check which data type is in use.

This commit adds Relax operators to read the runtime values of a `DLTensor*` argument. These can be used to normalize arguments prior to a compute step, for example, pre-processing a model weight that may be provided in either `float16` or `bfloat16` format.
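For concreteness, a minimal sketch of the `R.match_cast` pattern described above (the shape and dtype are illustrative):

```python
from tvm.script import relax as R


@R.function
def assert_dtype(arg: R.Tensor((16, 16))) -> R.Tensor((16, 16), "float16"):
    # match_cast *asserts* the dtype: it raises at runtime if `arg` is not
    # float16, so it cannot be used to branch between candidate dtypes.
    known = R.match_cast(arg, R.Tensor((16, 16), "float16"))
    return known
```

The operators added here instead read the `DLTensor*` fields at runtime, so a program can branch on the observed dtype rather than asserting one.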

@slyubomirsky (Contributor) left a comment:

The implementation seems solid; a very good change. I like the parameterized test cases too. The use of `_DLTensorShapeProxy` resulted in an elegant UI in TVMScript.

Idle musing: I wonder if there's any way the PrimFuncs can be unrolled in cases where the parameters are known at compile time.


Used for early checks in `expr.dtype` and `expr.shape`
accessors. While invalid usage would cause errors to be
raised durin shape inference, an earlier check makes it easier
slyubomirsky (Contributor):

Suggested change:
- raised durin shape inference, an earlier check makes it easier
+ raised during shape inference, an earlier check makes it easier

typo

Lunderberg (Contributor, Author):

Thank you, and fixed

Comment on lines +287 to +288
Exposes accessors for `DLDataType` fields `type_code`, `lanes`,
and `bits` within a `DLTensor::dtype`. Accessing these fields
slyubomirsky (Contributor):

These are good to have. Offset might also be useful to add, as it might help for memory reuse.

Lunderberg (Contributor, Author):

Good point. At the moment, I've stuck with the values that have a direct presence elsewhere in Relax, but I agree that it would be good to be able to extract any of the `DLTensor*` fields.
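For reference, the `DLDataType` fields named in this thread are already mirrored on the Python side by `tvm.runtime.DataType`, which makes for a quick sanity check (a small sketch; the printed values assume the standard dlpack type code for floats):

```python
import tvm

# DLDataType describes a dtype as (type_code, bits, lanes).
dt = tvm.runtime.DataType("float16")
print(dt.type_code)  # 2, i.e. kDLFloat
print(dt.bits)       # 16
print(dt.lanes)      # 1
```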

*/

/*!
* \file unpack.cc
slyubomirsky (Contributor):

I wonder if "unpack" is the best name. Perhaps "accessors" or "runtime_accessors" could be a little more descriptive? I'm not sure.

Member:

How about `relax.inspect` (as a sub-namespace for the ops)?

Lunderberg (Contributor, Author):

I like `relax.inspect` quite a bit, and will update the operators to that namespace.

Comment on lines +225 to +227
// TODO(Lunderberg): Make a new operator attribute
// `.set_attr<Bool>("DataDependent")`, rather than relying on
// the name of the operator.
slyubomirsky (Contributor):

I agree with this; it probably should be a separate PR.

Lunderberg (Contributor, Author):

Agreed. The note is there mostly so that the idea has a record somewhere, but it would be enough of a change that it should be part of a separate PR. As it was, I wanted to make as few changes to `LegalizeOps` as possible in this PR.

Comment on lines +192 to +204
// Improve this fallback case, as failure to legalize can
// produce unexpected errors during CodeGenVM. This could
// be done by having `R.Tensor(ndim=2)` be syntactic sugar
// for `R.Tensor(shape=[m, n])`, where `m` and `n` are new
// shape variables. This would allow legalization into
// dynamic TIR PrimFuncs.
//
// This fallback would only be applicable for cases where
// both the dtype and the dimensionality are known. While
// Relax can express a tensor with unknown dtype and
// dimensionality as `TensorStructInfo(DataType::Void(),
// kUnknownNDim)`, TIR cannot express unknown dtype or
// unknown dimensionality.
slyubomirsky (Contributor):

Interesting idea. This could be done by inserting a MatchCast that introduces the new vars. Perhaps this should be filed as an issue rather than made a long comment.

Lunderberg (Contributor, Author):

Good call, though I haven't had time to flesh out the idea yet. This would be part of a general cleanup I'd propose for the StructInfo interactions (a sketch of the MatchCast piece follows the list):

- Remove the `Optional<Expr> shape` and `int ndim` in `TensorStructInfo`. Instead, have an `Optional<ShapeStructInfo>`.
- Remove the `ndim` field from `ShapeStructInfo`. Instead, the ndim is unknown when the `Optional<Array<PrimExpr>>` is `NullOpt`.
- If the dimensionality of a `ShapeStructInfo` is known, every dimension must have an associated `PrimExpr`. The constructor that accepts `ndim` initializes fresh TIR variables to represent the unknown sizes.
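A hedged sketch of the MatchCast-based idea mentioned above, using today's TVMScript to introduce fresh shape variables for a rank-only tensor (the function and variable names are illustrative):

```python
from tvm.script import relax as R, tir as T


@R.function
def bind_shape(
    arg: R.Tensor(dtype="float32", ndim=2)
) -> R.Tensor(dtype="float32", ndim=2):
    # Fresh TIR variables give the rank-only tensor a symbolic shape [m, n],
    # which could then legalize into a dynamic TIR PrimFunc.
    m = T.int64()
    n = T.int64()
    known = R.match_cast(arg, R.Tensor((m, n), "float32"))
    return known
```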

)

@property
def dtype(self) -> "_DLTensorDTypeProxy":
slyubomirsky (Contributor):

Why does the return type have to be in quotes? I assume it has to do with the property decorator.

Lunderberg (Contributor, Author):

It's more the order of definitions in the file. Function annotations are evaluated when the class is being defined. Since the `_DLTensorDTypeProxy` class is defined lower in the file, the type annotation is provided as a string rather than as a class object.
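A generic illustration of that ordering issue (the `_TensorLike` class is a hypothetical stand-in; only `_DLTensorDTypeProxy` comes from the PR):

```python
class _TensorLike:
    @property
    def dtype(self) -> "_DLTensorDTypeProxy":
        # The quoted annotation defers evaluation: at this point in the file,
        # _DLTensorDTypeProxy has not been defined yet.
        return _DLTensorDTypeProxy(self)


class _DLTensorDTypeProxy:
    def __init__(self, tensor: _TensorLike):
        self.tensor = tensor
```

The same deferral can be obtained module-wide with `from __future__ import annotations` (PEP 563), which makes every annotation a lazily evaluated string.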

@Lunderberg (Contributor, Author):

> The implementation seems solid; a very good change. I like the parameterized test cases too. The use of `_DLTensorShapeProxy` resulted in an elegant UI in TVMScript.

Thank you! I really like polishing up the interface to be as clean as possible. (For my own sake if nothing else, as I am liable to forget a builtin-function name, but am much less likely to forget `obj.dtype`.)

> Idle musing: I wonder if there's any way the PrimFuncs can be unrolled in cases where the parameters are known at compile time.

Prior to `LegalizeOps`, this is implicitly done by the `FNormalize` implementation for each operator. After `LegalizeOps`, not so much. While the `FoldConstant` pass can compile and run TIR functions if their arguments are known, there isn't a good way to indicate that a TIR function only requires the `DLTensor` struct, and not the data itself.

@Lunderberg (Contributor, Author):

All CI tests are passing; thank you for the review, @slyubomirsky! I'll follow up with another PR to add inspection of the remainder of the `DLTensor*` fields.

@Lunderberg merged commit b218557 into apache:main on Feb 20, 2024 (18 checks passed)
@Lunderberg deleted the relax_unpack_ops branch on February 20, 2024 at 20:59
Lunderberg added a commit to Lunderberg/tvm that referenced this pull request Mar 14, 2024
A follow-up PR to apache#16563.  This PR
implements similar operators to inspect the runtime values of
`DLTensor::strides` and `DLTensor::byte_offset`.  In addition, while the
element offset is not explicitly present in the `DLTensor` struct, a
Relax operator is implemented to infer it from the `byte_offset` and
`data_type` fields, for use when interacting with the TIR
`BufferNode::elem_offset` field.
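A sketch of the inference that commit message describes, assuming `byte_offset` is an exact multiple of the element size (the helper name is hypothetical; the actual operator derives the value from the `byte_offset` and `data_type` fields at runtime):

```python
def infer_elem_offset(byte_offset: int, bits: int, lanes: int) -> int:
    # Element size in bytes, rounding up for sub-byte dtypes.
    element_bytes = (bits * lanes + 7) // 8
    assert byte_offset % element_bytes == 0, "byte_offset must be element-aligned"
    return byte_offset // element_bytes
```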
Lunderberg added a commit that referenced this pull request Mar 26, 2024
…16721)

* [TIR] LowerTVMBuiltin may use device_type from PrimFunc annotation

If an allocation occurs within a host function, it may not have a
device/host split.

* lint fix

* [Relax] Implement operators to inspect DLTensor::strides and offset

A follow-up PR to #16563.  This PR
implements similar operators to inspect the runtime values of
`DLTensor::strides` and `DLTensor::byte_offset`.  In addition, while the
element offset is not explicitly present in the `DLTensor` struct, a
Relax operator is implemented to infer it from the `byte_offset` and
`data_type` fields, for use when interacting with the TIR
`BufferNode::elem_offset` field.
thaisacs pushed a commit to thaisacs/tvm that referenced this pull request Apr 3, 2024