Optimization results
Environment: Win10 WSL/Ubuntu 20.04 + GCC 9.3.0 (x86_64)
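Notes
`stars` is changed to a Struct of Arrays (SoA) layout, and the computation code is updated to match. Plain C arrays are used rather than `std::array<float, N>` because templates seemed unnecessary here: the computation logic does not depend much on the STL or templates. When time permits, the two array types could be compared for how well they optimize.
Common subexpressions such as `G * dt` are extracted, and some constants are marked `const` in the hope that the compiler can optimize them further.
Note that not every common subexpression can be extracted. `1 / RAND_MAX * 2` is one that cannot: `1 / RAND_MAX` is integer division and evaluates to 0 (`0.0f` once converted to float), which would break the computation.
Comparing the assembly before and after enabling compiler optimization, many of the scalar instructions in the original `step` function are replaced by combinations of SIMD instructions such as `mulps`, `addps`, and `shufps`, and the loops are unrolled, which speeds up the computation considerably.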
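The following is a minimal sketch of the ideas described above, not the code from this pull request. The names `stars`, `step`, `G`, and `dt` come from the description; the `Stars` field names, the body count `N`, the softening term `eps`, all constant values, and the `frand` helper are assumptions made for illustration. It shows an SoA layout built from plain C arrays, `G * dt` computed once as a `const`, and why `1 / RAND_MAX * 2` cannot be extracted as a constant.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdlib>

// Hypothetical constants; the real values are not given in the description.
constexpr std::size_t N = 48;
constexpr float G = 0.001f;
constexpr float eps = 0.001f;
constexpr float dt = 0.01f;

// Struct of Arrays: one contiguous C array per field, so that the inner
// loop reads neighbouring floats and is a candidate for vectorization.
struct Stars {
    float px[N], py[N], pz[N];
    float vx[N], vy[N], vz[N];
    float mass[N];
};

// 1 / RAND_MAX is integer division, so this "constant" is zero.
static_assert(1 / RAND_MAX * 2 == 0, "integer division truncates to 0");

// Uniform value in [-1, 1]; the cast forces floating-point division.
float frand() {
    const float r = static_cast<float>(std::rand()) / RAND_MAX * 2.0f - 1.0f;
    assert(r >= -1.0f && r <= 1.0f);
    return r;
}

void step(Stars &s) {
    const float Gdt = G * dt;       // common subexpression, computed once
    const float eps2 = eps * eps;   // softening avoids division by zero
    for (std::size_t i = 0; i < N; ++i) {
        float ax = 0.0f, ay = 0.0f, az = 0.0f;
        for (std::size_t j = 0; j < N; ++j) {
            const float dx = s.px[j] - s.px[i];
            const float dy = s.py[j] - s.py[i];
            const float dz = s.pz[j] - s.pz[i];
            const float d2 = dx * dx + dy * dy + dz * dz + eps2;
            const float inv = s.mass[j] / (d2 * std::sqrt(d2));
            ax += dx * inv;
            ay += dy * inv;
            az += dz * inv;
        }
        s.vx[i] += ax * Gdt;
        s.vy[i] += ay * Gdt;
        s.vz[i] += az * Gdt;
    }
    // Positions are advanced only after every velocity has been updated.
    for (std::size_t i = 0; i < N; ++i) {
        s.px[i] += s.vx[i] * dt;
        s.py[i] += s.vy[i] * dt;
        s.pz[i] += s.vz[i] * dt;
    }
}
```

Whether the compiler actually emits packed instructions for a loop like this depends on the compiler version and flags, so the generated assembly is worth inspecting rather than assuming.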