Fix 2078 by pkuyym · Pull Request #2165 · PaddlePaddle/Paddle

pkuyym · 2017-05-16T11:04:01Z

fixes #2078

…iple metrics

pkuyym · 2017-05-16T11:05:09Z

@guoshengCS Thanks for testing and for writing the documentation.

pengli09 · 2017-05-17T01:54:14Z

paddle/gserver/evaluators/ChunkEvaluator.cpp

+    this->storeLocalValues();
+    std::vector<std::string> buffers;
+    paddle::str::split(name, '.', &buffers);
+    auto it = this->values_.find(buffers[buffers.size() - 1]);


Change buffers[buffers.size() - 1] to buffers.back().
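
For readers who think in Python, the lookup in the hunk above can be sketched as follows. This is a hypothetical analogue, not Paddle code: the function name lookup_metric, the dotted name chunk_eval.precision, and the behavior for a missing key (returning None) are assumptions made for illustration. Taking the last element of the split plays the role of buffers.back().

```python
def lookup_metric(values, name):
    """Return the metric stored under the last dot-separated part of name."""
    # Same idea as buffers.back(): keep only the final component.
    key = name.split(".")[-1]
    # Assumption: a missing key yields None. The thread does not show
    # what the C++ code does in that case.
    return values.get(key)


values = {"precision": 0.5, "recall": 1.0}
print(lookup_metric(values, "chunk_eval.precision"))
```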

pengli09 · 2017-05-17T01:58:29Z

paddle/gserver/evaluators/ChunkEvaluator.cpp

+
+private:
+  void storeLocalValues() const {
+    CHECK_GT(numOutputSegments_, 0);


Change CHECK_GT to CHECK_GE: numOutputSegments_ can be 0 in practice, for example when the output sequence is O O O O.

pengli09 · 2017-05-17T01:58:33Z

paddle/gserver/evaluators/ChunkEvaluator.cpp

+private:
+  void storeLocalValues() const {
+    CHECK_GT(numOutputSegments_, 0);
+    CHECK_GT(numLabelSegments_, 0);


Change CHECK_GT to CHECK_GE.

pengli09 · 2017-05-17T01:59:52Z

paddle/gserver/evaluators/ChunkEvaluator.cpp

+  void storeLocalValues() const {
+    CHECK_GT(numOutputSegments_, 0);
+    CHECK_GT(numLabelSegments_, 0);
+    double precision = (double)numCorrect_ / numOutputSegments_;


Change it to double precision = !numOutputSegments_ ? 0 : (double)numCorrect_ / numOutputSegments_;

pengli09 · 2017-05-17T02:00:16Z

paddle/gserver/evaluators/ChunkEvaluator.cpp

+    CHECK_GT(numOutputSegments_, 0);
+    CHECK_GT(numLabelSegments_, 0);
+    double precision = (double)numCorrect_ / numOutputSegments_;
+    double recall = (double)numCorrect_ / numLabelSegments_;


Change it to double recall = !numLabelSegments_ ? 0 : (double)numCorrect_ / numLabelSegments_;
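
Taken together, the two suggestions above amount to guarding each division against a zero denominator. A minimal Python sketch of that idea follows. The function name chunk_metrics and the dictionary keys are illustrative, and the zero guard on F1 is an assumption that this thread does not spell out.

```python
def chunk_metrics(num_correct, num_output_segments, num_label_segments):
    """Compute precision, recall and F1-score with zero-safe denominators."""
    # Mirrors the suggested C++ ternaries: fall back to 0 when the count is 0.
    precision = num_correct / num_output_segments if num_output_segments else 0.0
    recall = num_correct / num_label_segments if num_label_segments else 0.0
    # Assumption: F1 is also taken to be 0 when precision + recall is 0.
    total = precision + recall
    f1 = 2.0 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "F1-score": f1}


print(chunk_metrics(0, 0, 0))  # an all-O sequence no longer divides by zero
print(chunk_metrics(2, 4, 2))
```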

pengli09 · 2017-05-17T02:30:11Z

python/paddle/trainer_config_helpers/evaluators.py

+    To make it clear, let's illustrate by a NER example.
+    Assuming that there are two named entity types including ORG and PER which are called 'chunk type' here,
+    if 'IOB' scheme were used, the label set will be extended to a set including B-ORG, I-ORG, B-PER, I-PER and O,
+    in which B-ORG for begining of ORG and I-ORG for end of ORG.


Change "end of ORG" to "inside of ORG".

pengli09 · 2017-05-17T02:33:23Z

python/paddle/trainer_config_helpers/evaluators.py

+    .. code-block:: python

-    'plain' means the whole chunk must contain exactly the same chunk label.
+    Realizing that the number of is chunk type is 2 and number of tag type is 2, it is easy to validate this.


We should change the example so that the number of chunk types differs from the number of tag types. Here both are 2, which may fail to clear up users' misunderstanding.

pengli09 · 2017-05-17T02:46:54Z

python/paddle/trainer_config_helpers/evaluators.py

-
-    For each label in the label sequence, we have:
+    To use chunk evaluator, the construction of label dict should obey the following rules:
+    (1) Use one of the listed labelling schemes. These schemes differ in ways indicating chunk boundry.


I think we should define "chunk type" and "tag type" before the following table, and add a running example showing how to label words under the different schemes. The following table is actually the protocol for assigning tag types, not the definition of the schemes, so I think we also need another table for the definitions.
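
A running example of the kind requested here could look like the sketch below. It is only an illustration: the sentence, the chunk spans and the helper name tag_chunks are made up, and the prefixes follow the IOB, IOE and IOBES descriptions quoted elsewhere in this thread (B for the beginning, I for the inside, E for the end, S for a single-word chunk).

```python
def tag_chunks(num_tokens, chunks, scheme):
    """Label tokens for chunks given as (start, end, chunk_type), end inclusive."""
    tags = ["O"] * num_tokens  # tokens outside every chunk stay O
    for start, end, chunk_type in chunks:
        for i in range(start, end + 1):
            if scheme == "IOB":
                prefix = "B" if i == start else "I"
            elif scheme == "IOE":
                prefix = "E" if i == end else "I"
            elif scheme == "IOBES":
                if start == end:
                    prefix = "S"
                elif i == start:
                    prefix = "B"
                elif i == end:
                    prefix = "E"
                else:
                    prefix = "I"
            else:
                raise ValueError("unsupported scheme: " + scheme)
            tags[i] = prefix + "-" + chunk_type
    return tags


# Hypothetical sentence with one PER chunk and one three-word ORG chunk.
tokens = ["Alice", "joined", "Acme", "Holdings", "Ltd"]
chunks = [(0, 0, "PER"), (2, 4, "ORG")]
for scheme in ("IOB", "IOE", "IOBES"):
    print(scheme, tag_chunks(len(tokens), chunks, scheme))
```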

pengli09 · 2017-05-17T02:52:02Z

python/paddle/trainer_config_helpers/evaluators.py

-    The total number of different labels is numTagType*numChunkTypes+1.
-    We support 4 labelling scheme.
-    The tag type for each of the scheme is shown as follows:
+    (2) Map can be done correctly by the listed equations.


Change "Map" to "Mapping".
Change "can be done" to "is done". "Can be done" may mislead users into thinking this is only one of several feasible options, whereas it is the only one because we hard-coded it.
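
The equations themselves are not quoted in this thread. The sketch below assumes the usual convention, tagType = label % numTagType and chunkType = label / numTagType with the "other" label O encoded as chunkType == numChunkTypes, and it assumes that the columns of the tag-type table quoted in this thread are Begin, Inside, End, Single. The name decode_label is hypothetical.

```python
# Tag-type ids per scheme, read from the table quoted in this thread
# (assumed column order: Begin, Inside, End, Single).
TAG_TYPES = {
    "IOB": ["B", "I"],
    "IOE": ["I", "E"],
    "IOBES": ["B", "I", "E", "S"],
}


def decode_label(label, scheme, chunk_types):
    """Map an integer label id back to a tag string such as B-ORG or O."""
    num_tag_types = len(TAG_TYPES[scheme])
    num_labels = num_tag_types * len(chunk_types) + 1
    if not 0 <= label < num_labels:
        raise ValueError("label id out of range: %d" % label)
    # Assumed equations (integer division for the chunk type).
    tag_type = label % num_tag_types
    chunk_type = label // num_tag_types
    # Assumed: a chunk type id equal to numChunkTypes means "other" (O).
    if chunk_type == len(chunk_types):
        return "O"
    return TAG_TYPES[scheme][tag_type] + "-" + chunk_types[chunk_type]


print([decode_label(i, "IOB", ["ORG", "PER"]) for i in range(5)])
```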

pengli09 · 2017-05-17T02:53:35Z

python/paddle/trainer_config_helpers/evaluators.py

-       IOB    0     1      -     -
-       IOE    -     0      1     -
-       IOBES  0     1      2     3
+    Continue the NER example, and the label dict should like this to satify above equations:


Change "like" to "look like".
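
For the NER example with chunk types ORG and PER under IOB, a label dict consistent with the rule that there are numTagType*numChunkTypes+1 labels could be built as sketched below. The helper name build_label_dict and the id layout (chunk index times number of tag types, plus tag index, with O last) are assumptions, since the final dict is not quoted in this thread.

```python
def build_label_dict(chunk_types, tag_names):
    """Assign ids so that id = chunk_index * len(tag_names) + tag_index."""
    label_dict = {}
    for chunk_index, chunk_type in enumerate(chunk_types):
        for tag_index, tag in enumerate(tag_names):
            label_id = chunk_index * len(tag_names) + tag_index
            label_dict[tag + "-" + chunk_type] = label_id
    # Assumed: the "other" label O takes the last id.
    label_dict["O"] = len(tag_names) * len(chunk_types)
    return label_dict


label_dict = build_label_dict(["ORG", "PER"], ["B", "I"])
print(label_dict)
```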

pkuyym

1. Override getTypeImpl instead of getType.
2. I think keeping precision, recall and F1-score in a unified map makes the code cleaner and easier to maintain, and the extra computation cost is trivial.
3. Revise the document following the review comments.

pengli09

LGTM except getNames(). Please consult other members to make the final decision.

luotao1 · 2017-05-18T08:58:16Z

Please refer to http://www.paddlepaddle.org/develop/doc_cn/howto/dev/write_docs_cn.html and check whether the generated documentation is formatted correctly.

luotao1 · 2017-05-18T09:02:44Z

python/paddle/trainer_config_helpers/evaluators.py

+      IOB      Two labels for chunk type X, B-X for chunk begining and I-X for chunk inside. 
+      IOE      Two labels for chunk type X, E-X for chunk ending and I-X for chunk inside.
+      IOBES    Four labels for chunk type X, B-X for chunk begining, I-X for chunk inside, E-X for chunk end and S-X for single word chunk. 
+    .. code-block:: python


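Line 366 can be removed; a single .. code-block::python is enough. The same applies below.
Please build the documentation and check whether it renders correctly. At a rough glance, there are some problems.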

reyoung · 2017-05-18T09:10:30Z

python/paddle/trainer_config_helpers/evaluators.py

+    The construction of label dict should obey the following rules:
+    (1) Use one of the listed labelling schemes. These schemes differ in ways indicating chunk boundry.

    .. code-block:: python


Why is the code-block language python? This looks like plain text.

..[SPACE]code-block:: [language] [EMPTY_LINE] [SPACE][SPACE][SPACE]Your texts.

.. code-block:: text

   abc def

See this documentation.
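
Applied to a Python docstring, the suggestion above could look like the sketch below. The function chunk_evaluator_stub is a hypothetical stand-in, not the real evaluator, and the column headers in the table are an assumption. The point is only the directive layout: the language text, then an empty line, then the indented body.

```python
def chunk_evaluator_stub():
    """Hypothetical stub showing a plain-text block inside a docstring.

    .. code-block:: text

       Scheme   Begin   Inside   End   Single
       IOB      0       1        -     -
       IOE      -       0        1     -
       IOBES    0       1        2     3
    """


print(chunk_evaluator_stub.__doc__)
```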

reyoung

Basically LGTM.

yangyaming added 2 commits May 16, 2017 15:52

overload several virtual functions to make ChunkEvaluator output multiple metrics

8411a73

modify usage document of chunk evaluator

a74060d

wangkuiyi requested review from pengli09 and reyoung May 16, 2017 15:41

pengli09 requested changes May 17, 2017

pkuyym commented May 17, 2017

Override getValueImpl and revise document

f3eb9cb

pengli09 reviewed May 17, 2017

lcy-seso mentioned this pull request May 18, 2017

Floating point exception #1961

Closed

luotao1 requested changes May 18, 2017

reyoung reviewed May 18, 2017

reyoung approved these changes May 18, 2017

fix document formation bugs.

dfc27aa

luotao1 approved these changes May 19, 2017

pengli09 approved these changes May 22, 2017

pkuyym merged commit d3e003b into PaddlePaddle:develop May 22, 2017

pkuyym deleted the fix-2078 branch May 24, 2017 06:30

lcy-seso mentioned this pull request Sep 4, 2017

paddle v2 CTC error is not shown in the training log #3802

Closed

fsylmxx pushed a commit to fsylmxx/Paddle that referenced this pull request Nov 25, 2025

Modified submodule update time (PaddlePaddle#2165)

2660429

* fix update time * add rerun * fix permission error * fix delete container * fix mlu env * fix mlu ci error * fix cleanup * fix cleanup

pkuyym merged 4 commits into PaddlePaddle:develop from pkuyym:fix-2078

4 participants
