Update activations for MKL-DNN #10597
Conversation
Force-pushed 771ac6f to 598ae8e (Compare)
Which changes do you mean?

The changes for registering the activation operators (look at the changes in activation_op.cc, e.g. the REGISTER_ACTIVATION_OP_MAKER macro that creates the XXXOpMaker classes, the new macros for registering operators such as INPLACE, etc.).
Why change activation_op.cc?
- Did the previous registration not work? I see that the unit test test_activation_mkldnn_op.py works well.
- The current changes are not suitable for other devices. For example, if there were an amd_relu_op, how should it be defined?
Force-pushed 598ae8e to bf447fc (Compare)
Why save only src_memory? How about dst_memory?
eltwise_grad needs the input data (e.g. look at ReluAddInput("X", ...) in activation_op.cc). I have no direct access to this data in the eltwise_grad function; only eltwise_forward has access to it.
dst_memory is not used in eltwise_grad.
For example, sqrt_bwd in MKL-DNN has a different implementation than in PaddlePaddle:

```cpp
// MKL-DNN: the gradient is computed from the forward input s
template <typename T>
T sqrt_bwd(T dd, T s) {
  return s > 0 ? dd / (2 * ::sqrtf(s)) : 0;
}

// PaddlePaddle: the gradient is computed from the forward output out
void operator()(Device d, X x, Out out, dOut dout, dX dx) const {
  const Out out_conj = Eigen::numext::conj(out);
  dx.device(d) = static_cast<T>(0.5) * dout / out_conj;
}
```
This is the reason I need the input data in eltwise_grad.
I know why src_memory is needed, but my point is: why not also save dst_memory to the context to save time, since you have already saved src?
I can only see forward_pd being set on every forward iteration, but not the existing one being retrieved?
forward_pd is retrieved in the eltwise_grad() function, on the line with dev_ctx.GetBlob(key_eltwise_pd).
Actually, my point is that I can only see you create this pd and set it into the context on every forward iteration, but never reuse it in the next iteration. Could we avoid this recreation across iterations to improve performance?
Yes, of course. I will improve it in the next commit.
I added support for reusing memory buffers to the MKL-DNN activations.
Force-pushed bf447fc to f6404f0 (Compare)
Could you possibly continue your review? I added support for reusing memory buffers.
tensor-tang left a comment
Sorry for the late reply, I have one question.
Is it possible for two fc ops to have the same src dims? Would this still be right? They would share the same hash code.
Everything will be reusable between them except the input data. I improved the hash code for the input data (key_src_data).
Force-pushed f6404f0 to 24904b9 (Compare)
Updated activations for MKL-DNN after changes in PaddlePaddle activations.