Row convolution operation (#2373)
Conversation
Please add the row_conv information to layers.rst, and could you also add the layer.multiplex information while you are at it (#2308)?
hedaoyuan left a comment:
Also, when implementing the compute kernels, prefer templates over the real type; that way, adding support for float16 or other data types later will be much easier.
paddle/function/RowConvOp.cpp (outdated):

    CHECK_EQ(in.shape().ndims(), 2UL);
    CHECK_EQ(out.shape().ndims(), 2UL);
    CHECK_EQ(in.shape()[1], out.shape()[1]);
    CHECK_EQ(in.shape()[0], out.shape()[0]);
The three lines 147-149 can be replaced with CHECK(in.shape() == out.shape());
paddle/function/RowConvOp.cpp (outdated):

    CHECK_EQ(in.shape().ndims(), 2UL);
    CHECK_EQ(outGrad.shape().ndims(), 2UL);
    CHECK_EQ(in.shape()[1], outGrad.shape()[1]);
    CHECK_EQ(in.shape()[0], outGrad.shape()[0]);
    CHECK(in.shape() == outGrad.shape());
    CHECK(in.shape() == inGrad.shape());
    MatrixPtr wGrad = weight_->getWGrad();
    size_t h = getInputValue(0)->getHeight();
    size_t w = getInputValue(0)->getWidth();
    outputs.addArg(
It would be better to split the inputGrad and weightGrad computation into two separate Functions; that way we can avoid creating an empty argument:

    if (inGrad) {
      backwardInput(...);
    }
    if (wGrad) {
      backwardWeight(...);
    }
Added a TODO; this will be split into two Functions in a follow-up.
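For what it's worth, a hedged sketch of what the two split functions would compute, using plain buffers in place of Paddle's BufferArgs (single sequence, row-major steps x width layout; the names follow the suggestion above):

    #include <algorithm>
    #include <vector>

    // d(input): each output-row gradient flows back to the context rows it read.
    void backwardInput(const std::vector<float>& outGrad,  // (steps x width)
                       const std::vector<float>& w,        // (context x width)
                       std::vector<float>& inGrad,         // (steps x width), accumulated
                       int steps, int width, int context) {
      for (int t = 0; t < steps; ++t)
        for (int j = 0; j < std::min(context, steps - t); ++j)
          for (int d = 0; d < width; ++d)
            inGrad[(t + j) * width + d] += w[j * width + d] * outGrad[t * width + d];
    }

    // d(filter): pairs each output-row gradient with the input rows it saw.
    void backwardWeight(const std::vector<float>& outGrad,  // (steps x width)
                        const std::vector<float>& in,       // (steps x width)
                        std::vector<float>& wGrad,          // (context x width), accumulated
                        int steps, int width, int context) {
      for (int t = 0; t < steps; ++t)
        for (int j = 0; j < std::min(context, steps - t); ++j)
          for (int d = 0; d < width; ++d)
            wGrad[j * width + d] += in[(t + j) * width + d] * outGrad[t * width + d];
    }

With this split, the layer only calls the function whose gradient buffer actually exists, so no empty argument is ever constructed.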
    resetOutput(height, width);

    const auto startPos = getInput(0).sequenceStartPositions->getVector(useGpu_);
    wDims_ = TensorShape({contexLength_, width});
wDims_ is the shape of weight_, right? Is contexLength_ != weight_.height_ here?
Here contexLength_ == weight_.height_.
OK, then it would be more readable to build wDims from the weight attributes: TensorShape(weight.height_, weight.width_).
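A minimal sketch of that change (hedged: it assumes weight_->getW() exposes getHeight()/getWidth(), as Paddle matrices generally do):

    // Derive the shape from the weight itself rather than repeating
    // contexLength_; the two are equal, but this states the intent directly.
    wDims_ = TensorShape(
        {weight_->getW()->getHeight(),   // == contexLength_
         weight_->getW()->getWidth()});  // == width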
    __shared__ real sw[BLOCK_H][BLOCK_W];

    for (int i = tidy; i < context; i += blky)
If context > 32, could we add an outer for loop here and then drop KeRowConv2?
In the paper the context is 19, and cases where this value exceeds 32 should be rare, so I split it into two kernels; that keeps each kernel's reads and writes relatively simple.
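For reference, a hedged sketch of the single-kernel alternative the comment describes: tile the filter along the context dimension so one kernel covers any context length. The launch geometry, the real typedef, and the starts layout are assumptions modeled on the snippets in this thread, not the merged code.

    typedef float real;  // Paddle's real, assumed float for this sketch

    template <int BLOCK_H, int BLOCK_W>
    __global__ void KeRowConvTiled(real* out, const real* in, const real* w,
                                   const int* starts, int numSeq, int width,
                                   int context) {
      const int tidx = threadIdx.x;  // feature (column) index
      const int tidy = threadIdx.y;  // row index within a tile
      const int blky = blockDim.y;
      __shared__ real sw[BLOCK_H][BLOCK_W];

      for (int s = blockIdx.x; s < numSeq; s += gridDim.x) {
        const int begin = starts[s];
        const int steps = starts[s + 1] - begin;
        // The extra outer loop: walk the filter in BLOCK_H-row tiles, which
        // is what would make a separate KeRowConv2 unnecessary.
        for (int base = 0; base < context; base += BLOCK_H) {
          const int tile = min(BLOCK_H, context - base);
          // Stage one tile of the filter into shared memory.
          for (int r = tidy; r < tile; r += blky) {
            if (tidx < width) sw[r][tidx] = w[(base + r) * width + tidx];
          }
          __syncthreads();
          // Accumulate this tile's contribution for each step of the sequence.
          for (int j = tidy; j < steps; j += blky) {
            real sum = 0;
            for (int t = 0; t < tile && j + base + t < steps; ++t) {
              if (tidx < width)
                sum += sw[t][tidx] * in[(begin + j + base + t) * width + tidx];
            }
            if (tidx < width) out[(begin + j) * width + tidx] += sum;
          }
          __syncthreads();
        }
      }
    }

Whether this wins over the two-kernel version depends on occupancy and on how often context exceeds BLOCK_H; with context around 19, the author's two simple kernels are a reasonable trade-off.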
    }

    template <>
    void RowConvGrad<DEVICE_TYPE_CPU>(const CpuMatrix& outG,
Implementing filterG and inG in one kernel does not improve computational performance here; it would be better to write them as two separate kernel functions.
    // check
    CHECK_EQ(2UL, inputs.size());
    CHECK_EQ(1UL, outputs.size());
    CHECK_EQ(outputs[0].getArgType(), ADD_TO);
forward could also implement ASSIGN_TO, which would speed up inference.
Added two TODOs; will address them in a follow-up PR.
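A minimal sketch of that TODO (hedged: zeroMem() is assumed from paddle::Matrix, and outMat is a hypothetical handle to outputs[0]'s matrix):

    // Accept both arg types; for ASSIGN_TO, clear the output first so the
    // kernels' accumulating writes behave like plain assignment.
    CHECK(outputs[0].getArgType() == ADD_TO ||
          outputs[0].getArgType() == ASSIGN_TO);
    if (outputs[0].getArgType() == ASSIGN_TO) {
      outMat->zeroMem();
    }

A fuller optimization would have the kernels write with = instead of +=, skipping the extra pass entirely; the zeroMem() form is just the smallest change.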
@qingqing01 There may be a problem here: in DS2 the row convolution needs to be added after the RNNs, i.e. the output type of its input layer is sequence, not dense vector. Let's discuss whether it is necessary to develop a sequence-type 2-D conv (the existing sequence-type conv is 1-D and has no stride); otherwise we would need repeated back-and-forth conversions between sequence and dense_vector, which could hurt performance. Also, is the lookahead conv kernel not supported in this implementation?
The input and output of the row convolution implemented here are both of sequence type.
Since Row Conv's input supports the sequence type, that should not be necessary.
I don't quite understand what this sentence is referring to.
You mean the filter? https://github.com/PaddlePaddle/Paddle/pull/2373/files#diff-7017557ab2e2d717f6816645d86eee5cR30 supports a learnable filter (weight) here, and users can configure the size of the conv kernel; it is just that the context_len configured here equals
Got it. Thanks @qingqing01.
Fix #2228
Add a row convolution function.
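To make the thread self-contained, here is a hedged CPU reference of what the new Function computes, written from the DS2 lookahead-convolution formulation discussed above (plain buffers stand in for Paddle matrices; starts mirrors sequenceStartPositions with numSeq + 1 offsets):

    #include <algorithm>
    #include <vector>

    // out[t][d] += sum_{j < context} w[j][d] * in[t + j][d], clipped at each
    // sequence end; rows never mix across sequences.
    void rowConvRef(std::vector<float>& out,        // (totalSteps x width)
                    const std::vector<float>& in,   // (totalSteps x width)
                    const std::vector<float>& w,    // (context x width)
                    const std::vector<int>& starts, // numSeq + 1 offsets
                    int width, int context) {
      for (size_t s = 0; s + 1 < starts.size(); ++s) {
        for (int t = starts[s]; t < starts[s + 1]; ++t) {
          for (int d = 0; d < width; ++d) {
            float sum = 0.f;
            // Look ahead up to `context` rows, clipped at the sequence end.
            for (int j = 0; j < std::min(context, starts[s + 1] - t); ++j)
              sum += w[j * width + d] * in[(t + j) * width + d];
            out[t * width + d] += sum;  // ADD_TO semantics, as checked above
          }
        }
      }
    }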