Conversation
A method to copy a tensor with stride and dimension. It is useful for Crop, Concat, etc.
paddle/operators/tensor_copy.h
Outdated
| namespace operators { | ||
|
|
||
| // Copy a tensor from src to dst. | ||
| // The src and dst should be both on dev_ctx.GetPlace() |
paddle/operators/tensor_copy.h
Outdated
| // Copy a tensor from src to dst. | ||
| // The src and dst should be both on dev_ctx.GetPlace() | ||
| // | ||
| // the stride of an array (also referred to as increment, pitch or step size) is |
paddle/operators/tensor_copy_test.cc
Outdated
| ASSERT_EQ(4, dst[3]); | ||
| } | ||
|
|
||
| #ifndef PADDLE_ONLY_CPU |
There was a problem hiding this comment.
It seems that we need to split this file into two -- a .cc file and a .cu file?
There was a problem hiding this comment.
Nop, all lines in this .cc file is invoking pure C++ methods. However, some of them are not defined when ONLY_CPU.
paddle/operators/tensor_copy.h
Outdated
| const framework::DDim& dst_dim, | ||
| const framework::DDim& dst_stride, T* dst) { | ||
| using namespace detail; | ||
| TensorCopyDimVisitor<T> func(dev_ctx, src, src_stride, dst_stride, dst); |
There was a problem hiding this comment.
Why define a functor and a visitor here? Could we define a function template instead? If we do have to use the visitor pattern, it would be great to put the reason in detail/tesnor_copy.h
There was a problem hiding this comment.
We must use visitor pattern here to make DDim to Dim.
The implementation of Visitor is in detail/tensor_copy.h.
Also, @Canpio and I think this method should be given a better name, like StridedMemcpy.
| auto& cpu_place = boost::get<platform::CPUPlace>(place); | ||
| memory::Copy(cpu_place, dst, cpu_place, src, sizeof(T) * dst_dim.head); | ||
| } else { | ||
| #ifndef PADDLE_ONLY_CPU |
There was a problem hiding this comment.
不会的,那个问题和这个写法没关系的。
Add unittests for Crop and Concat
| inline void StridedMemcpy(const platform::DeviceContext& dev_ctx, const T* src, | ||
| const framework::DDim& src_stride, | ||
| const framework::DDim& dst_dim, | ||
| const framework::DDim& dst_stride, T* dst) { |
There was a problem hiding this comment.
Do we need to follow the parameter order convention in memcpy? BecauseStrideMemcpy is a name so similar to c system library function, cuda library function cudaMemcpy.
Or we may add a note to emphasize the difference.
Beside the question, I want to tocao,memcpy violates the google style. : <
There was a problem hiding this comment.
Actually, we should follow google C++ style. It follows the Google C++ style for parameter ordering.
There was a problem hiding this comment.
@dzhwinter memcpy appeared much earlier than Google C++ style. So the fact is that Google C++ style violates memcpy.
| inline void StridedMemcpy(const platform::DeviceContext& dev_ctx, const T* src, | ||
| const framework::DDim& src_stride, | ||
| const framework::DDim& dst_dim, | ||
| const framework::DDim& dst_stride, T* dst) { |
There was a problem hiding this comment.
I think we can get dst_stride from dst_dim like this:
DDim stride(const DDim& ddim) {
std::vector<int64_t> strides(ddim.size());
strides[ddim.size()-1] = 1;
for (int i = ddims.size()-2; i>=0; --i) {
strides[i] = strides[i+1] * ddim[i+1]
}
return make_ddim(strides);
}
So dst_stride is not necessary in parameter list?
There was a problem hiding this comment.
Well, we may provide another method to generate stride. However, src_stride and dst_stride are very useful in concat or other situations.
JiayiFeng
left a comment
There was a problem hiding this comment.
LGTM. The interface of calculating stride automatically can be added later.
dzhwinter
left a comment
There was a problem hiding this comment.
we can merge it ASAP since some op porting work dependents on it.
A method to copy a tensor with stride and dimension. It is useful
for Crop, Concat, etc.