[ROCM/CodeGen] added initial math functions support to rocm #553
// num_signature means number of arguments used to query signature
template<unsigned id, int num_signature>
inline void DispatchLLVMPureIntrin(const TVMArgs& targs, TVMRetValue* rv) {
Create a new file, intrin_rule_llvm.h, and move the two template functions there so they can be shared with intrin_rule_llvm.cc.
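A minimal sketch of the header-sharing pattern suggested here, assuming the dispatch helper stays a function template. `LoweredCall` and the `std::vector<double>` argument type are hypothetical stand-ins for TVM's own types (the PR uses `TVMArgs` and `TVMRetValue`), so this shows the shape of the refactor rather than the actual TVM code:

```cpp
// intrin_rule_llvm.h (hypothetical sketch, not the actual TVM header)
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the lowered call; the real code would build a TVM Call node.
struct LoweredCall {
  std::string kind;          // label for the kind of lowered call
  unsigned intrin_id;        // LLVM intrinsic id taken from the template argument
  int num_signature;         // number of arguments used to query the signature
  std::vector<double> args;  // forwarded call arguments
};

// A function template defined in a header can be included from several .cc
// files (for example a CPU rule file and a ROCm rule file) without violating
// the one-definition rule, which is what makes the move to a header work.
template <unsigned id, int num_signature>
inline LoweredCall DispatchLLVMPureIntrin(const std::vector<double>& args) {
  return LoweredCall{"llvm_pure_intrin", id, num_signature, args};
}
```

Each rule file would then include the header and instantiate the template with the intrinsic id it needs.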
call->type, "llvm_intrin", cargs, Call::Intrinsic);
}

TVM_REGISTER_GLOBAL("tvm.intrin.rule.llvm.rocm.prefetch")
tvm.intrin.rule.rocm.prefetch
Actually, we can delete this line, as prefetch is only intended for CPU for now.
TVM_REGISTER_GLOBAL("tvm.intrin.rule.llvm.rocm.prefetch")
.set_body(DispatchLLVMIntrin<::llvm::Intrinsic::prefetch, 0>);

TVM_REGISTER_GLOBAL("tvm.intrin.rule.llvm.rocm.exp")
tvm.intrin.rule.rocm.exp
Make the same change in the following functions.
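To illustrate why the registered name matters, here is a small self-contained sketch. It assumes rules are looked up by a key of the form `tvm.intrin.rule.<target>.<op>`, with `rocm` as the target name; `RuleRegistry` and `FindRule` are hypothetical helpers written for this example, not TVM APIs:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical rule type: returns the name of the symbol to emit.
using Rule = std::function<std::string()>;

// Hypothetical global registry keyed by the full rule name.
inline std::map<std::string, Rule>& RuleRegistry() {
  static std::map<std::string, Rule> registry;
  return registry;
}

// Assumed lookup scheme: the key is built from the target name and the op.
inline const Rule* FindRule(const std::string& target, const std::string& op) {
  auto& reg = RuleRegistry();
  auto it = reg.find("tvm.intrin.rule." + target + "." + op);
  return it == reg.end() ? nullptr : &it->second;
}
```

Under that assumption, a rule registered as `tvm.intrin.rule.llvm.rocm.exp` would not be found when the target is `rocm`, which is consistent with the rename requested in this thread.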
This is superseded by #570.
Able to generate
declare f32 @expf()
but should be
declare f32 @llvm.exp.f32()