Add backward implementation for LSTM operator. #5115
qingqing01 merged 11 commits into PaddlePaddle:develop from
Conversation
…the activation function pointer. It will be fixed later.
paddle/operators/lstm_op.cc
Outdated
```cpp
" - Bias = {b_c, b_i, b_f, b_o, W_ic, W_fc, W_oc}.")
    .AsDispensable();
AddOutput("Hidden",
          "(LoDTensor) the hidden state lod tensor of LSTM operator. "
```
Regarding "the hidden state of LSTM operator": isn't the "lod tensor" in the middle redundant?
Done. Removed "lod tensor" and fixed the shape info.
paddle/operators/lstm_op.cc
Outdated
```cpp
"(LoDTensor) the hidden state lod tensor of LSTM operator. "
"The shape and lod is the same with the `Input`.");
AddOutput("Cell",
          "(LoDTensor) the cell state lod tensor of LSTM operator. "
          "The shape and lod is the same with the `Input`.");
AddOutput("BatchCellPreAct",
          "(LoDTensor) This LoDTensor is get in the forward and used "
          "in the backward.")
```
paddle/operators/lstm_op.h
Outdated
```cpp
auto* batch_cell_pre_act = ctx.Input<LoDTensor>("BatchCellPreAct");

auto* hidden_g = ctx.Input<LoDTensor>(framework::GradVarName("Hidden"));
// auto* cell_g = ctx.Input<LoDTensor>(framework::GradVarName("Cell"));
```
paddle/operators/lstm_op.h
Outdated
```cpp
// auto bias_g_e = EigenVector<T>::Flatten(bias_mat);
// auto gate_g_e = EigenMatrix<T>::From(batch_gate_g);
// Eigen::array<int, 1> dims{{0}};
// bias_g_e.device(ctx.GetEigenDevice<Place>()) = gate_g_e.sum(dims);
```
This is the equivalent code in Eigen, but Eigen does not support the double type on GPU devices, so GEMV is used instead. I removed these lines.
paddle/operators/lstm_op.h
Outdated
```cpp
lstm_grad.gateGrad = gate_g.data<T>();
lstm_grad.outputGrad = out_g.data<T>();

if (n != 0) {
```
```cpp
      ASSERT_FLOAT_EQ(data_c[i], sum);
    }
  }
}
```
This looks like it duplicates a lot of code from the .cc unit test. Will sharing that code be considered later?
You could create an issue and paste its URL here, so it won't be forgotten later.
Which parts specifically?
qingqing01
left a comment
@luotao1 Thanks for your review.
The remaining enhancements are listed in the comments: #5115 (comment)
There are also TODO comments in the code.
wangkuiyi
left a comment
I am far from an expert on this change, but it seems to have stabilized for a while, so I approved it.
Fixes #5114.
Some enhancements will be done in the next PR:
The activations are currently hard-coded (Sigmoid and Tanh) since there is a bug in the activation function pointer. Activations specified by users will be supported later.