Skip to content

Feature/fc converter#11043

Merged
Superjomn merged 9 commits intoPaddlePaddle:developfrom
Superjomn:feature/fc_converter
Jun 1, 2018
Merged

Feature/fc converter#11043
Superjomn merged 9 commits intoPaddlePaddle:developfrom
Superjomn:feature/fc_converter

Conversation

@Superjomn
Copy link
Contributor

No description provided.

@Superjomn Superjomn force-pushed the feature/fc_converter branch from 943a90c to 5871e04 Compare May 31, 2018 12:39
@Superjomn Superjomn requested a review from luotao1 June 1, 2018 00:26
framework::OpDesc op_desc(op, nullptr, nullptr);
PADDLE_ENFORCE_EQ(op_desc.Input("X").size(), 1);
PADDLE_ENFORCE_EQ(op_desc.Input("Y").size(), 1); // Y is a weight
PADDLE_ENFORCE_EQ(op_desc.Output("Out").size(), 1); // Y is a weight
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

61行的注释错了。

for (int h = 0; h < shape.h(); ++h) {
for (int w = 0; w < shape.w(); ++w) {
odata[h * ostrides.h() + w * ostrides.w()] =
idata[h * ostrides.h() + w * ostrides.w()];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

从函数实现看,odata[i] = idata[i],所以还需要转么?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

从 TF 里拷贝的,试了下,很奇怪,这个函数好像必须要用。���有一个 Reorder4的函数,后面用到的时候我再仔细看下。

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

后面写Reorder4的时候,可以这里再优化下:

for (int h = 0; h < shape.h(); ++h) {
  for (int w = 0; w < shape.w(); ++w) {
      int index = h * ostrides.h() + w * ostrides.w();
      odata[index] = idata[index];
namespace inference {
namespace tensorrt {

template <typename T>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reorder2和ReorderCKtoKC和函数功能和参数含义,请加一下注释。

PADDLE_ENFORCE_NOT_NULL(Y_v);
auto* Y_t = Y_v->GetMutable<framework::LoDTensor>();
// This may trigger a CPU->GPU copy.
// TODO(Superjomn) use some smarter mutable_data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Y_v从scope里读取的话,可以一开始就在gpu环境下,这里就不用拷贝了。

Copy link
Contributor Author

@Superjomn Superjomn Jun 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

更新了 comment


void DeclOutputVar(const std::string& name, const nvinfer1::Dims& dims) {
DeclVar(name, dims);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DeclParamVar和DeclOutputVar是一模一样的,需��封两个函数么?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

用的时候的确需要不同语义,这里区分下 program 和 output, 不然代码里两处都是 DeclVar,看不出区别

for (const auto& output : op_desc_->OutputArgumentNames()) {
std::vector<float> fluid_out;
std::vector<float> trt_out(200);
std::vector<float> trt_out(200, 2008.);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

请注释说下200和2008的含义。

engine_.reset(new inference::tensorrt::TensorRTEngine(
max_batch_, max_workspace, nullptr));
// TODO(Superjomn) parameters should be passed be analysised and passed from
// outside.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

34-35行注释请更新下,有两个be?

Copy link
Contributor

@luotao1 luotao1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

for (int h = 0; h < shape.h(); ++h) {
for (int w = 0; w < shape.w(); ++w) {
odata[h * ostrides.h() + w * ostrides.w()] =
idata[h * ostrides.h() + w * ostrides.w()];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

后面写Reorder4的时候,可以这里再优化下:

for (int h = 0; h < shape.h(); ++h) {
  for (int w = 0; w < shape.w(); ++w) {
      int index = h * ostrides.h() + w * ostrides.w();
      odata[index] = idata[index];
@Superjomn Superjomn merged commit 0c0c5df into PaddlePaddle:develop Jun 1, 2018
@Superjomn Superjomn deleted the feature/fc_converter branch June 1, 2018 07:39
@Superjomn Superjomn mentioned this pull request Jun 1, 2018
@luotao1 luotao1 mentioned this pull request Jun 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants