
[ROCM] fix dropout and remove hipcub, test=develop #31455

Merged
merged 1 commit into PaddlePaddle:develop from qili93:rocm_fix_dropout on Mar 8, 2021

Conversation

qili93 (Contributor) commented Mar 6, 2021

PR types

Bug fixes

PR changes

OPs

Describe

Fix the ROCm failure of the dropout op and remove the duplicate hipcub namespace.

paddle-bot-old bot commented Mar 6, 2021

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@qili93 qili93 requested review from smallv0221 and removed request for smallv0221 March 8, 2021 04:47
wangxicoding (Contributor) left a comment:

LGTM

Review comment on the diff hunk @@ -41,7 +41,11 @@ in struct GpuLaunchConfig:
inline GpuLaunchConfig GetGpuLaunchConfig1D(
const platform::CUDADeviceContext& context, int element_count,
#ifdef PADDLE_WITH_HIP
int max_threads = 256) {
A reviewer (Contributor) commented:

Why is this 256 here, different from CUDA? Could you add a comment explaining it?

qili93 (Contributor, author) replied Mar 8, 2021:

Got it, thanks. I will add a comment in the next PR. This is a general difference between the ROCm and CUDA platforms.

I previously filed a similar issue against HIP: ROCm/HIP#2235

The HIP team's reply recommended changing the thread count to 256. The symptom in that issue, and in other OPs, is that with 1024 threads ROCm throws a GPU memory access fault; 512 fails as well, and only after switching to 256 threads does the kernel run normally.

The Dropout OP behaves more strangely: it does not core dump, but once the data exceeds 1024 elements, everything after the first 1024 outputs is wrong. For example, with a 32x64 input and prob=1.0, the grad should be all 1s, but under ROCm the first 32x32 block is computed as 1 and the second 32x32 block as 0. After changing to 256 threads the results are correct.

@qili93 qili93 requested a review from GaoWei8 March 8, 2021 08:21
GaoWei8 (Contributor) left a comment:

LGTM for op benchmark ci

@qili93 qili93 merged commit f937796 into PaddlePaddle:develop Mar 8, 2021
@qili93 qili93 deleted the rocm_fix_dropout branch March 8, 2021 08:28
3 participants