Add backward implementation for LSTM operator. #5115
qingqing01 merged 11 commits into PaddlePaddle:develop from
Conversation
…the activation function pointer. It will be fixed later.
paddle/operators/lstm_op.cc
Outdated
```cpp
" - Bias = {b_c, b_i, b_f, b_o, W_ic, W_fc, W_oc}.")
    .AsDispensable();
AddOutput("Hidden",
          "(LoDTensor) the hidden state lod tensor of LSTM operator. "
```
Regarding "the hidden state of LSTM operator": isn't the "lod tensor" in the middle redundant?
Done. Removed "lod tensor" and fixed the shape info.
paddle/operators/lstm_op.cc
Outdated
```cpp
"(LoDTensor) the hidden state lod tensor of LSTM operator. "
"The shape and lod is the same with the `Input`.");
AddOutput("Cell",
          "(LoDTensor) the cell state lod tensor of LSTM operator. "
          "The shape and lod is the same with the `Input`.");
AddOutput("BatchCellPreAct",
          "(LoDTensor) This LoDTensor is get in the forward and used "
          "in the backward.")
```
paddle/operators/lstm_op.h
Outdated
```cpp
auto* batch_cell_pre_act = ctx.Input<LoDTensor>("BatchCellPreAct");

auto* hidden_g = ctx.Input<LoDTensor>(framework::GradVarName("Hidden"));
// auto* cell_g = ctx.Input<LoDTensor>(framework::GradVarName("Cell"));
```
paddle/operators/lstm_op.h
Outdated
```cpp
// auto bias_g_e = EigenVector<T>::Flatten(bias_mat);
// auto gate_g_e = EigenMatrix<T>::From(batch_gate_g);
// Eigen::array<int, 1> dims{{0}};
// bias_g_e.device(ctx.GetEigenDevice<Place>()) = gate_g_e.sum(dims);
```
This is the equivalent code in Eigen, but Eigen does not support the double type on GPU devices, so GEMV is used instead. I removed these lines.
paddle/operators/lstm_op.h
Outdated
```cpp
lstm_grad.gateGrad = gate_g.data<T>();
lstm_grad.outputGrad = out_g.data<T>();

if (n != 0) {
```
```cpp
      ASSERT_FLOAT_EQ(data_c[i], sum);
    }
  }
}
```
This looks like it duplicates a lot of code from the .cc unit test. Will sharing that code be considered later?
You could create an issue and paste its URL here, so it won't be forgotten later.
Which parts specifically?
qingqing01
left a comment
@luotao1 Thanks for your review.
The remaining enhancements are listed in the comments: #5115 (comment)
There are also TODO comments in the code.
wangkuiyi
left a comment
I am far from an expert on this change, but it seems to have stabilized for a while, so I approved it.
Fixes #5114.
Some enhancements will be done in the next PR:
The activations are currently hard-coded (Sigmoid and Tanh) since there is a bug in the activation function pointer. Activations specified by users will be supported later.