improve performance of depthwise_conv2d #31099

Merged
merged 2 commits into from
Mar 4, 2021

@zhangting2020 zhangting2020 commented Feb 22, 2021

PR types

Performance optimization

PR changes

APIs

Describe

improve performance of depthwise_conv2d

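When the data format is NHWC, the original implementation works as follows:

  • Forward: the C++ side transposes the input to NCHW, computes in NCHW, and then transposes the output back to NHWC, which introduces 2 transposes.
  • Backward: the C++ side transposes input and out_grad to NCHW, computes input_grad, and then transposes input_grad back to NHWC, for 3 transposes.
  • The C++ side implements transpose with Eigen, which has not been deeply optimized.

Changes in this PR:
By inserting the transposes in the Python API instead, the forward and backward passes can save 1 transpose, and the transpose itself also performs relatively better.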
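
The idea of handling the layout change at the Python level can be sketched roughly as below. This is only an illustration, not the code from this PR: it uses NumPy as a stand-in for the framework, the function names `depthwise_conv2d_nchw` and `depthwise_conv2d_nhwc` are hypothetical, and the kernel is a naive reference (stride 1, no padding) rather than the real depthwise_conv2d op.

```python
import numpy as np


def depthwise_conv2d_nchw(x, w):
    """Naive depthwise conv (hypothetical helper, stride 1, no padding).

    x: input of shape (N, C, H, W)
    w: one kernel per channel, shape (C, kH, kW)
    """
    n, c, h, width = x.shape
    _, kh, kw = w.shape
    out_h, out_w = h - kh + 1, width - kw + 1
    out = np.zeros((n, c, out_h, out_w), dtype=x.dtype)
    for i in range(kh):
        for j in range(kw):
            # Each channel is multiplied only by its own kernel weight.
            weight = w[:, i, j].reshape(1, c, 1, 1)
            out += x[:, :, i:i + out_h, j:j + out_w] * weight
    return out


def depthwise_conv2d_nhwc(x_nhwc, w):
    """NHWC wrapper: the layout change lives in the Python layer."""
    x_nchw = np.transpose(x_nhwc, (0, 3, 1, 2))   # NHWC -> NCHW
    out_nchw = depthwise_conv2d_nchw(x_nchw, w)   # kernel only sees NCHW
    return np.transpose(out_nchw, (0, 2, 3, 1))   # NCHW -> NHWC


x = np.ones((1, 4, 4, 2))                               # N=1, H=4, W=4, C=2
w = np.stack([np.ones((3, 3)), 2.0 * np.ones((3, 3))])  # one 3x3 kernel per channel
out = depthwise_conv2d_nhwc(x, w)
print(out.shape)
```

In a framework with automatic differentiation, transposes inserted this way are ordinary ops, so their gradients are handled by the framework's own transpose kernel and the convolution kernel itself only ever deals with NCHW data.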

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

paddle-bot-old bot commented Mar 2, 2021

Sorry to inform you that the CIs for fc0a2e9 passed more than 7 days ago. To prevent PR conflicts, you need to re-run all CIs manually.

@zhangting2020 zhangting2020 force-pushed the depthwise_conv2d branch 2 times, most recently from c7f12c3 to bffe1bb March 4, 2021 05:14

@Xreki Xreki left a comment

LGTM

@zhangting2020 zhangting2020 merged commit dcce54e into PaddlePaddle:develop Mar 4, 2021