Nnvm RNN op with icc and mkldnn memory cache #10

Open

wants to merge 4 commits into base: master2
Conversation

@lihaofd (Owner) commented Jan 22, 2019

Description

(Brief description on what this PR is about)

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

@TaoLv left a comment

Reviewed part of the change. Will review the rest once I get time. Ping @ciyongch.

namespace op {

namespace mkldnn_rnn_enum {
enum RNNModeType {kRnnRelu, kRnnTanh, kLstm, kGru};
It's already defined in operator/rnn-inl.h.

lihaofd (Owner, Author): fixed

}

template <typename DType>
mkldnn::memory::data_type GetMKLDNNDataType() {
Looks strange and unsafe to me. Please refer to the example at https://en.cppreference.com/w/cpp/language/typeid.

lihaofd (Owner, Author): fixed

There's already a function "get_mkldnn_type()" in mkldnn_base-inl.h, can we just reuse that function instead of creating a similar one?

lihaofd (Owner, Author): fixed
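
A minimal sketch of the suggested reuse, assuming get_mkldnn_type() in mkldnn_base-inl.h takes an mshadow type flag, as it does elsewhere in MXNet:

  // Map the template type through the existing helper instead of
  // defining a second GetMKLDNNDataType<DType>() lookup.
  mkldnn::memory::data_type dtype = get_mkldnn_type(mshadow::DataType<DType>::kFlag);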

return algo;
}

void ReorderForWeight(mkldnn::memory src,
Why ReorderForWeight? Looks like a general reorder function, not only for weight.

lihaofd (Owner, Author): fixed

void ReorderForWeight(mkldnn::memory src,
mkldnn::memory dst) {
auto r = mkldnn::reorder(src, dst);
stream(stream::kind::eager).submit({r}).wait();
What's stream here? If it's mkldnn::stream, you need to call MKLDNNStream::Get().

lihaofd (Owner, Author): fixed
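
A hedged sketch of what the reviewer suggests, assuming the MKLDNNStream helper from mkldnn_base-inl.h with the RegisterPrim()/Submit() interface other MXNet MKLDNN operators use:

  // Queue the reorder on MXNet's shared MKLDNN stream rather than
  // building an eager mkldnn::stream inline.
  auto r = mkldnn::reorder(src, dst);
  MKLDNNStream::Get()->RegisterPrim(r);
  MKLDNNStream::Get()->Submit();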


auto dst_desc = mkldnn::memory::desc(dst_cds, mkldnn_dtype, dst_format);
auto concat_pd = mkldnn::concat::primitive_desc(dst_desc, concat_dimension, srcs_pd);
auto dst = mkldnn::memory(concat_pd.dst_primitive_desc());
You will allocate temporary memory here...

lihaofd (Owner, Author): fixed
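
One way to avoid the temporary, sketched under the assumption that the caller already owns a buffer of the right size (dst_ptr is a hypothetical name):

  // The MKLDNN 0.x memory(primitive_desc, handle) constructor wraps an
  // existing buffer instead of allocating a new one.
  auto dst = mkldnn::memory(concat_pd.dst_primitive_desc(), dst_ptr);
  auto c = mkldnn::concat(concat_pd, inputs, dst);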

mkldnn::memory::dims dst_cds,
mkldnn::memory::data_type mkldnn_dtype,
int concat_dimension,
std::vector<DType*> srcs_data) {
It's not necessary to have DType here. Use void*, then you needn't make this a template function.

lihaofd (Owner, Author): fixed
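
A sketch of how the non-template signature could look, with the parameter list inferred from the call sites in this diff:

  // void* erases the element type; the mkldnn_dtype argument already
  // carries the type information MKLDNN needs.
  void ConcatData(mkldnn::memory::format src_format,
                  mkldnn::memory::format dst_format,
                  std::vector<mkldnn::memory::dims> srcs_cds,
                  mkldnn::memory::dims dst_cds,
                  mkldnn::memory::data_type mkldnn_dtype,
                  int concat_dimension,
                  std::vector<void*> srcs_data);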

for (size_t i = 0; i < srcs_cds.size(); i++) {
auto desc = memory::desc(srcs_cds[i], mkldnn_dtype, src_format);
auto mpd = memory::primitive_desc(desc, cpu_engine);
auto src_memory = mkldnn::memory({desc, cpu_engine}, srcs_data[i]);
auto src_memory = mkldnn::memory(mpd, srcs_data[i]);

lihaofd (Owner, Author): fixed

const int T,
const int N,
int I,
const int H,
Why are T, N, H const but L, D, I not const? Any special consideration?

lihaofd (Owner, Author): fixed

@@ -39,6 +39,7 @@
#include "./math_functions-inl.h"
#include "./operator_common.h"
#include "./rnn_impl.h"
#include "./nn/mkldnn/mkldnn_rnn_impl.h"

Add "#if MXNET_USE_MKLDNN == 1" around the MKLDNN-related header file.

lihaofd (Owner, Author): fixed
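
A minimal sketch of the guarded include:

  #if MXNET_USE_MKLDNN == 1
  #include "./nn/mkldnn/mkldnn_rnn_impl.h"
  #endif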

param_.Init(kwargs);
#if MXNET_USE_MKLDNN == 1
template<typename DType>
static RNNOp<DType> &GetMKLDNNRNNOp(const RNNParam &param,

Move the MKLDNN functions to the MKLDNN-related files.

lihaofd (Owner, Author): Many of the classes and structs are shared here; if we move everything to the MKLDNN-related file, it will create a lot of duplicated code.

Agree to move the MKLDNN code to mkldnn_rnn_impl.h. Please refer to what we did for other operators.

std::shared_ptr<RNNOp<DType>> op(new RNNOp<DType>(param));
auto ins_ret = ops.insert(std::pair<RNNSignature, std::shared_ptr<RNNOp<DType> > >(key, op));
CHECK(ins_ret.second);
it = ins_ret.first;

Refer to mkldnn_convolution.cc and use the AddToCache() function here. Why not add data/output here?
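
A hedged sketch of the signature-based caching pattern in mkldnn_convolution.cc; MKLDNNRNNSignature, the AddToCache() helper's exact signature, and the AddSign() calls are assumptions modeled on that file:

  // Hash the input/output arrays into the key so a cached op is never
  // reused for mismatched shapes, and let the shared helper do the insert.
  MKLDNNRNNSignature key(param);
  key.AddSign(data);     // input NDArray
  key.AddSign(output);   // output NDArray, per the review question
  auto it = ops.find(key);
  if (it == ops.end()) {
    auto op = std::make_shared<RNNOp<DType>>(param);
    it = AddToCache(&ops, key, op);  // replaces the hand-rolled insert()
  }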

if (param_.mode == rnn_enum::kLstm) {
CHECK_EQ(in_shape->size(), 4U) << "Input:[data, parameters, state, cell_state]";
} else {
CHECK_EQ(in_shape->size(), 3U) << "Input:[data, parameters, state]";

Better to make the error message more comprehensible, such as "expected 4 inputs ... but got x inputs"?
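
A sketch of what that message could look like (the exact wording is an assumption):

  CHECK_EQ(in_shape->size(), 4U)
      << "Expected 4 inputs [data, parameters, state, cell_state] for LSTM, "
      << "but got " << in_shape->size() << " inputs";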

ConcatData(mkldnn::memory::format::ldgoi, mkldnn::memory::format::ldgoi,
{weights_iter_r_tz_0, weights_iter_r_tz_0}, weights_iter_tz_0,
mkldnn_dtype, 1, srcs_data1);
}

Combine these two if blocks into one.

lihaofd (Owner, Author): fixed


auto user_src_layer_memory_0 = mkldnn::memory(
{ user_src_layer_md_0, cpu_engine }, x_0);

Can we use SetNewMem here the way the current MKLDNN ops do?

}
}
// go to next L - 1 layers.
// If D = 2, do it layer by layer. If D = 1, fused L - 1 layers

For the case of I != H, I think the two-stage approach is reasonable given the MKLDNN API restriction.
For the case of I == H, do you think the two stages can be combined into one? If so, only one primitive is needed.

memory::dims dst_layer_tz_0 = {T, N, D * H};
memory::dims src_iter_tz_0 = {1, D, nstates, N, H}; // ldsnc
memory::dims dst_iter_tz_0 = {1, D, nstates, N, H}; // ldsnc
int offset1 = 0, offset2 = 0;

Minor comment on variable naming: please use more meaningful names. Maybe change back_xxx to l2r_xxx or r2l_xxx?

lihaofd (Owner, Author): No, in the non-MKLDNN path the GRU/LSTM/vRNN naming is based on back_xxx; it is better to keep them the same.

back_b_ptr = b_ptr + single_b_size * 2;
}
DType* back_wx_0 = back_w_ptr;
DType* back_wh_0 = back_w_ptr + I * H * ngates;

Move all the back_xxx assignments into the if (D == 2) block; default values here are enough.

lihaofd (Owner, Author): Will try it, but some of them have ordered definitions, so not all of them can be moved.

@TaoLv commented Jan 25, 2019

Cannot see your fixes. Please push them here. @lihaofd

@lihaofd (Owner, Author) commented Jan 25, 2019

@TaoLv I am still fixing the code. If I pushed it now, the code related to many of the comments couldn't be found easily; I will push the fixes once I have addressed them all.

@TaoLv commented Jan 25, 2019

I see. So it's "will fix" rather than "fixed".

if ((*in_type)[i] == -1) {
(*in_type)[i] = dtype;
} else {
UNIFORM_TYPE_CHECK((*in_type)[i], dtype, ListArguments(param_)[i]);
This may exceed the boundary of the list if i > 3.
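
One possible guard, sketched under the assumption that ListArguments(param_) only names the declared inputs:

  // Fall back to a generic name once i passes the end of the argument list.
  const auto args = ListArguments(param_);
  if (i < args.size()) {
    UNIFORM_TYPE_CHECK((*in_type)[i], dtype, args[i]);
  } else {
    UNIFORM_TYPE_CHECK((*in_type)[i], dtype, "input_" + std::to_string(i));
  }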

DispatchMode wanted_mode = DispatchMode::kFCompute;
#if MXNET_USE_MKLDNN == 1
wanted_mode = DispatchMode::kFComputeEx;
#endif
This doesn't work for GPU.
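
A hedged sketch of the usual fix, following the dev_mask check other MKLDNN operators apply in storage-type inference:

  // Only take the FComputeEx (MKLDNN) path on CPU; GPU stays on FCompute.
  DispatchMode wanted_mode = DispatchMode::kFCompute;
  #if MXNET_USE_MKLDNN == 1
  if (dev_mask == mshadow::cpu::kDevMask)
    wanted_mode = DispatchMode::kFComputeEx;
  #endif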

const std::vector<NDArray> &inputs,
const std::vector<OpReqType> &req,
const std::vector<NDArray> &outputs) {
RNNParam& param = (RNNParam&)nnvm::get<RNNParam>(attrs.parsed);
Much of the code in this function should be moved to MKLDNNRNNForward.


auto prim_desc_0 = mkldnn::rnn_forward::primitive_desc(layer_desc_0, cpu_engine);
auto dst_layer_memory_0 = mkldnn::memory(prim_desc_0.dst_layer_primitive_desc());
auto dst_iter_memory_0 = mkldnn::memory(prim_desc_0.dst_iter_primitive_desc());
It will allocate memory.

auto prim_desc = mkldnn::rnn_forward::primitive_desc(layer_desc, cpu_engine);
auto dst_layer_memory = mkldnn::memory(prim_desc.dst_layer_primitive_desc());
dst_layer_memory.set_data_handle(y);
auto dst_iter_memory = mkldnn::memory(prim_desc.dst_iter_primitive_desc());
It will allocate memory.

auto prim_desc = mkldnn::rnn_forward::primitive_desc(layer_desc, cpu_engine);
auto dst_layer_memory = mkldnn::memory(prim_desc.dst_layer_primitive_desc());
dst_layer_memory.set_data_handle(y);
auto dst_iter_memory = mkldnn::memory(prim_desc.dst_iter_primitive_desc());
Memory allocation again.
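
A sketch of how the extra allocation could be avoided, assuming the caller owns the destination state buffer (hy is a hypothetical pointer):

  // Like dst_layer_memory above, hand the primitive an existing buffer
  // instead of letting the memory object allocate one.
  auto dst_iter_memory = mkldnn::memory(prim_desc.dst_iter_primitive_desc(), hy);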

user_src_iter_memory, (*weight_bias_memory)[3],
(*weight_bias_memory)[4], (*weight_bias_memory)[5],
dst_layer_memory, dst_iter_memory, null_memory_));
stream(stream::kind::eager).submit(rnn_net2).wait();
Use MKLDNNStream here as well.

}

template <typename DType>
void MKLDNNRNNForward(bool state_outputs,
This function is too long to review and maintain. Can you split it into several functions and combine them here? I guess some blocks can also be shared when we implement the backward pass.

auto dst = mkldnn::memory(concat_pd.dst_primitive_desc());

auto c = mkldnn::concat(concat_pd, inputs, dst);
stream(stream::kind::eager).submit({c}).wait();
Use MKLDNNStream here as well.

@TaoLv commented Jan 25, 2019

No need to wait until all comments are addressed. GitHub will maintain the review history.
