This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

update fc with nnpck #5570

Merged
merged 1 commit into from
Mar 25, 2017
4 changes: 2 additions & 2 deletions docs/how_to/nnpack.md
@@ -2,7 +2,7 @@

[NNPACK](https://github.com/Maratyszcza/NNPACK) is an acceleration package for neural network computations, which can run on x86-64, ARMv7, or ARM64 architecture cpus. it's very useful for us using NNPACK to speed up running speed when deploy the trained model on mobile device.

-MXNet(nnvm branch) has integrated NNPACK for forward propagation(only inference) in convolution/max-pooling/fully-connected, so you may consider using NNPACK now.
+MXNet has integrated NNPACK for forward propagation(only inference) in convolution/max-pooling/fully-connected, so you may consider using NNPACK now.


### Conditions
@@ -15,7 +15,7 @@ The following table will tell you which satisfaction will NNPACK work.
|:--------- |:---------- |
|convolution |2d convolution `and` no-bias=False `and` dilate=(1,1) `and` num_group=1 `and` batch-size = 1 or batch-size > 1 && stride = (1,1);|
|pooling | max-pooling `and` kernel=(2,2) `and` stride=(2,2) `and` pooling_convention=full |
-|fully-connected| batch-size = 2^n |
+|fully-connected| without any restrictions |

### Build/Install NNPACK with MXNet

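The conditions table in this hunk can be read as a set of eligibility predicates. The following is a minimal sketch of that reading, not MXNet code: the structs `ConvParams` and `PoolParams` and the three function names are hypothetical, and the grouping of the convolution batch-size clause as `batch == 1 || (batch > 1 && stride == (1,1))` is an assumption about how the table row is meant to be parsed.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical parameter structs, introduced only for this illustration.
struct ConvParams {
  int num_spatial_dims;            // 2 for a 2d convolution
  bool no_bias;                    // the table requires no-bias=False
  std::pair<int, int> dilate;
  int num_group;
  std::pair<int, int> stride;
};

struct PoolParams {
  std::string pool_type;           // "max", "avg", ...
  std::pair<int, int> kernel;
  std::pair<int, int> stride;
  std::string pooling_convention;  // "full" or "valid"
};

// Convolution row: every clause must hold. The batch-size clause is
// assumed to mean: batch == 1, or batch > 1 with stride (1,1).
inline bool ConvUsesNNPACK(const ConvParams& p, std::size_t batch_size) {
  const std::pair<int, int> ones(1, 1);
  const bool batch_ok =
      (batch_size == 1) || (batch_size > 1 && p.stride == ones);
  return p.num_spatial_dims == 2 && !p.no_bias && p.dilate == ones &&
         p.num_group == 1 && batch_ok;
}

// Pooling row: max-pooling with a 2x2 kernel, 2x2 stride, "full" convention.
inline bool PoolUsesNNPACK(const PoolParams& p) {
  const std::pair<int, int> twos(2, 2);
  return p.pool_type == "max" && p.kernel == twos && p.stride == twos &&
         p.pooling_convention == "full";
}

// Fully-connected row after this change: no batch-size restriction.
inline bool FullyConnectedUsesNNPACK(std::size_t /*batch_size*/) {
  return true;
}
```

Note that the table describes shape and parameter conditions only; the `fully_connected.cc` change in this PR still dispatches to NNPACK solely for `mshadow::kFloat32`.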
14 changes: 5 additions & 9 deletions src/operator/fully_connected.cc
@@ -36,15 +36,11 @@ Operator* CreateOp<cpu>(FullyConnectedParam param, int dtype,
     const size_t batch_size = (*in_shape)[0][0];
     // nnp_fully_connected_inference will do optimization for batch-size = 1
     // nnp_fully_connected_output will do optimization for batch-size > 1
-    // but just found FullyConnected in NNPACK result is wrong when batch_size != 2^n
-    // so here only using NNPACK when batch_size = 2^n.
-    if ((batch_size == 1) || ((batch_size > 1) && (!(batch_size & (batch_size - 1))))) {
-      switch (dtype) {
-      case mshadow::kFloat32:
-        return new NNPACKFullyConnectedOp<cpu, float>(param);
-      default:
-        break;
-      }
+    switch (dtype) {
+    case mshadow::kFloat32:
+      return new NNPACKFullyConnectedOp<cpu, float>(param);
+    default:
+      break;
     }
 #endif
   switch (dtype) {
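The guard removed in this hunk relies on a bit trick to test whether `batch_size` is a power of two. As a minimal standalone sketch of what that condition accepted (the helper names `IsPowerOfTwo` and `OldGuardAllowsNNPACK` are hypothetical and do not appear in MXNet):

```cpp
#include <cstddef>

// n & (n - 1) clears the lowest set bit of n. The result is zero only
// when n had exactly one bit set, i.e. n is 1, 2, 4, 8, ...
// Zero is excluded explicitly because 0 & (0 - 1) is also zero.
inline bool IsPowerOfTwo(std::size_t n) {
  return n != 0 && (n & (n - 1)) == 0;
}

// The removed condition, restated verbatim as a function.
// It accepts batch_size == 1 and every larger power of two.
inline bool OldGuardAllowsNNPACK(std::size_t batch_size) {
  return (batch_size == 1) ||
         ((batch_size > 1) && (!(batch_size & (batch_size - 1))));
}
```

Under the old guard a batch of 6 or 12 fell through to the default CPU operator, while 1, 2, 4, 8, and so on went to `NNPACKFullyConnectedOp`. After this change the batch size no longer affects the choice, and only `dtype` does.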