Conversation

@mattbit
Member

@mattbit mattbit commented Jun 21, 2023

Same as #1169 but now on main!

@linear

linear bot commented Jun 21, 2023

GSK-1275 Importance of metrics calculated on partial data slice

User KD_A on Reddit pointed out that:

> And does the # samples refer to the # samples in the data slice? It should be the denominator of the metric for the slice. For example, if that second row's recall of 0.111 is 1 predicted positive / 9 true positive, it's debatable whether to flag that.

This is right. A data slice may contain 1000 samples, but a metric such as recall is computed only on the positive samples, which may be just a handful of the total. A low value measured on so few samples is unreliable, so the resulting detection may be a false positive.
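
To make the point concrete, here is a minimal sketch of reporting a metric together with its denominator instead of the slice size. It is an illustration only, not the code merged in this PR: the names `SliceRecall`, `recall_with_support`, and `should_flag`, and the `threshold` and `min_support` defaults, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SliceRecall:
    recall: float     # correctly predicted positives / actual positives
    support: int      # actual positives: the denominator of recall
    slice_size: int   # all samples in the slice


def recall_with_support(y_true: Sequence[int], y_pred: Sequence[int]) -> SliceRecall:
    """Compute recall on a slice and keep track of how many samples it used."""
    positives = [(t, p) for t, p in zip(y_true, y_pred) if t == 1]
    support = len(positives)
    true_pos = sum(1 for _, p in positives if p == 1)
    recall = true_pos / support if support else float("nan")
    return SliceRecall(recall, support, len(y_true))


def should_flag(result: SliceRecall, threshold: float = 0.5, min_support: int = 30) -> bool:
    """Flag a slice only if the metric is low AND enough samples back it.

    Both defaults are arbitrary values chosen for this example.
    """
    return result.support >= min_support and result.recall < threshold


# A slice of 1000 samples with only 9 positives, 1 of them predicted correctly.
y_true = [1] * 9 + [0] * 991
y_pred = [1] + [0] * 999
result = recall_with_support(y_true, y_pred)
print(round(result.recall, 3), result.support, result.slice_size)  # 0.111 9 1000
print(should_flag(result))  # False: 9 positives is below min_support
```

Here the slice has 1000 samples, but the recall of 0.111 rests on 9 of them, so the check declines to flag it. A fixed minimum support is the simplest possible guard; a statistical test on the metric would be a more principled alternative.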

@mattbit mattbit requested a review from andreybavt June 21, 2023 17:39
@mattbit mattbit merged commit d7047f6 into main Jun 21, 2023
@Hartorn Hartorn deleted the task/fix-scan-metrics branch September 13, 2023 11:34