Optimize range op #30811

Merged
merged 7 commits into from
Mar 11, 2021
thisjiang
Contributor

PR types

Performance optimization

PR changes

OPs

Describe

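Motivation
paddle.fluid.layers.range has three copy points, each of which copies one variable from the GPU to the CPU. These copies are very time-consuming.

Optimization 1
Changes:

  1. Place the input arguments on the CPU instead of the GPU: in the Python-side range op, pass force_cpu=True when fill_constant converts a variable to a tensor.
  2. Check whether each input argument is on the CPU or the GPU; if it is already on the CPU, no copy is needed.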
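
The two changes above can be illustrated with a small, framework-free sketch. Everything in it is hypothetical (`FakeTensor`, `read_scalar`, `range_values`, the `copies` counter) and it is not Paddle's kernel code. It only models the idea that a scalar input already on the CPU can be read directly, while one on the GPU costs a device-to-host copy.

```python
from dataclasses import dataclass


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a one-element tensor; not a Paddle class."""
    value: float
    place: str  # "cpu" or "gpu"


copies = {"d2h": 0}  # counts simulated device-to-host copies


def read_scalar(t: FakeTensor) -> float:
    """Return the scalar held by t, copying only if it is not on the CPU."""
    if t.place == "cpu":
        return t.value      # already on the host: no copy needed
    copies["d2h"] += 1      # simulate one GPU -> CPU copy
    return FakeTensor(t.value, "cpu").value


def range_values(start: FakeTensor, end: FakeTensor, step: FakeTensor) -> list:
    """Build the range on the host from three scalar inputs."""
    s, e, st = read_scalar(start), read_scalar(end), read_scalar(step)
    if st == 0:
        raise ValueError("step must be non-zero")
    out, v = [], s
    while (st > 0 and v < e) or (st < 0 and v > e):
        out.append(v)
        v += st
    return out
```

Calling `range_values` with three `"gpu"` inputs increments `copies["d2h"]` three times, matching the three copy points described in the motivation. With three `"cpu"` inputs, which is the effect the description attributes to `force_cpu=True`, the counter stays at zero.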

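Results:
Comparison with competing frameworks (cost measured over 1000 runs):

| Condition | paddle | tf.range | torch.arange | Optimization 1 |
| --- | --- | --- | --- | --- |
| size=100 | 0.314474 | 0.071642 | 0.0051 | 0.226910 |
| size=100000 | 0.304597 | 0.138057 | 27.3776 | 0.228088 |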
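
A measurement of this kind can be sketched as a simple timing loop. This is only an assumed harness, not the script used for the numbers above: the description does not state the unit or whether the cost is a total or a per-call figure. `time_calls` and `make_range` are hypothetical names, and the workload is a pure-Python placeholder.

```python
import time


def time_calls(fn, runs=1000):
    """Return total wall-clock seconds for `runs` back-to-back calls of fn."""
    fn()  # warm-up call, excluded from the measurement
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return time.perf_counter() - start


def make_range(size):
    """Placeholder workload; swap in the range op under test."""
    return lambda: list(range(size))


if __name__ == "__main__":
    for size in (100, 100000):
        print(size, time_calls(make_range(size)))
```

When the op under test launches GPU kernels asynchronously, the device generally has to be synchronized before the final clock read, otherwise the loop mostly measures launch time.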

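Mask R-CNN dynamic-graph speed comparison (V100-SXM2-32GB; the first 18 ips values, averaged over 4 runs):

| Version | ips |
| --- | --- |
| Before optimization | 6.698619444 |
| After optimization | 7.189104167 |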
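
For reference, the relative changes implied by the reported figures can be worked out directly. The helper name `relative_change` is introduced here for illustration only; the inputs are the values from the two result tables.

```python
def relative_change(before, after):
    """Fractional change from before to after (positive means an increase)."""
    return (after - before) / before


# Mask R-CNN throughput (ips), before vs. after the optimization
ips_gain = relative_change(6.698619444, 7.189104167)

# Single-op cost, original paddle vs. Optimization 1
cost_small = relative_change(0.314474, 0.226910)  # size=100
cost_large = relative_change(0.304597, 0.228088)  # size=100000

print(f"{ips_gain:.1%} {cost_small:.1%} {cost_large:.1%}")
```

This works out to roughly a 7.3% gain in ips, and a drop in single-op cost of about 28% for size=100 and about 25% for size=100000.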

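Remaining issues:

  1. The cost is about the same for size=100 and size=100000, so most of the time is probably spent launching the GPU kernel. On the other hand, running entirely on the CPU is even slower, because space must first be allocated for the temporary variables on the CPU.
  2. In a single-op benchmark the op is still slower than the competing frameworks. The suspected reason is that the inputs and outputs in TensorFlow and PyTorch are all on the CPU, so no allocation or copy is needed. Note: for the TensorFlow implementation see sequence_ops.cc#L35; for the PyTorch implementation see utility_ops.h#L1420. Further analysis is pending.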

@paddle-bot-old

paddle-bot-old bot commented Feb 1, 2021

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@paddle-bot-old

paddle-bot-old bot commented Feb 9, 2021

Sorry to inform you that c9c9561's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.

Contributor

@Xreki Xreki left a comment

LGTM

@Xreki Xreki merged commit 9ed6c89 into PaddlePaddle:develop Mar 11, 2021
@thisjiang thisjiang deleted the optimize-range branch April 13, 2021 03:09
thisjiang added a commit to thisjiang/Paddle that referenced this pull request Apr 13, 2021