Rollback to Revision 16
Makyen

This problem was solved over 90 years ago. (See here for xkcd author Randall's explanation).

Of course, this answer is late, so it will never get upvoted :)


Update: There are now two user scripts that augment answer lists with Wilson scores: Sort Best First and Wilson confidence rating calculator. You are welcome to try them in order to assess how much of an improvement it would be to switch to sorting by Wilson score.


[Edit] Layman summary: we want to determine what the upvote percentage would be if everyone voted. But since we only have a small sample of votes, we use fancy statistics to determine a range of percentages that we can be fairly certain the real percentage falls within. We take the lower end of that range, to err on the side of caution.
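For reference, the lower end of that range can be written out as follows. This is a transcription of the formula used in the code in this answer; the symbol names are my own shorthand, where $\hat{p}$ is the observed fraction of upvotes, $n$ is the total number of votes, and $z = 1.96$ is the value used for a 95% interval:

$$
\text{score} = \frac{\hat{p} + \dfrac{z^2}{2n} - z\sqrt{\dfrac{\hat{p}(1-\hat{p}) + \dfrac{z^2}{4n}}{n}}}{1 + \dfrac{z^2}{n}}
$$

When every vote is an upvote ($\hat{p} = 1$), this simplifies to $n / (n + z^2)$, so a single upvote scores $1 / 4.8416 \approx 0.21$ rather than 1.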

Here is the output of the equation. You'll notice that when there are many votes, the output is close-ish to positiveVotes/totalVotes, but when there are few votes it's much smaller. This is exactly what we want.

Here is some code:

///<summary>
///Returns a rating for the given post.  Larger is better.
///Based on the equation found at http://www.evanmiller.org/how-not-to-sort-by-average-rating.html
///</summary>
public double GetPostRating(int numPositiveVotes, int numNegativeVotes)
{
    int totalVotes = numPositiveVotes + numNegativeVotes;
    if(totalVotes == 0)
        return 0;

    const double z = 1.96; //z-score for a 95% confidence interval (standard normal distribution)
    double positiveRatio = ((double)numPositiveVotes)/totalVotes;

    //Crazy equation to find the "Lower bound of Wilson score confidence interval for a Bernoulli parameter"
    //Again, see the above webpage
    return (positiveRatio + z*z/(2*totalVotes) - z * Math.Sqrt((positiveRatio*(1-positiveRatio)+z*z/(4*totalVotes))/totalVotes))/(1+z*z/totalVotes);
}
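
To see the behavior described above (close to the plain ratio with many votes, much smaller with few), here is a small self-contained sketch. The class name `WilsonDemo` and the sample vote counts are hypothetical and chosen only for illustration; the rating function is repeated so the snippet runs on its own.

```csharp
using System;

public static class WilsonDemo
{
    // Same formula as GetPostRating above, repeated so this sketch is self-contained.
    public static double GetPostRating(int numPositiveVotes, int numNegativeVotes)
    {
        int totalVotes = numPositiveVotes + numNegativeVotes;
        if (totalVotes == 0)
            return 0;

        const double z = 1.96; // z-score for a 95% confidence interval
        double n = totalVotes;
        double p = numPositiveVotes / n;

        // Lower bound of the Wilson score interval
        return (p + z * z / (2 * n) - z * Math.Sqrt((p * (1 - p) + z * z / (4 * n)) / n))
               / (1 + z * z / n);
    }

    public static void Main()
    {
        // Hypothetical vote counts (upvotes, downvotes), for illustration only.
        int[,] samples = { { 1, 0 }, { 10, 0 }, { 100, 0 }, { 90, 10 }, { 900, 100 } };
        for (int i = 0; i < samples.GetLength(0); i++)
        {
            int up = samples[i, 0], down = samples[i, 1];
            double naive = (double)up / (up + down);
            Console.WriteLine($"{up,4} up / {down,3} down: ratio = {naive:F3}, Wilson lower bound = {GetPostRating(up, down):F3}");
        }
    }
}
```

With these inputs, a lone upvote scores roughly 0.21 even though its plain ratio is 1.0, while 90 up / 10 down scores roughly 0.83, already fairly close to its ratio of 0.9.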

Note that the above equation assumes upvotes and downvotes have the same frequency. Since upvotes are way more common, downvotes should ideally be weighted more harshly (in other words, three downvotes say a lot more about an answer than three upvotes).

Also, I believe newer answers should be given preferential treatment, at least for a few minutes (see my comment below).

But even without these, this is a neat improvement.


deleted 10 characters in body
WGroleau

This problem was solved in 2009. (See here for xkcd author Randall's explanation).

(While we are at it.) [<https://en.wikipedia.org/wiki/Xkcd>].

This problem was solved over 90 years ago. (See here for xkcd author Randall's explanation).

Note that the above equation assumes upvotes and downvotes have the same frequency. Since upvotes are way more common, downvotes should ideally be weighted more harshly (in other words, three downvotes say a lot more about an answer than three upvotes).

redditblog link is now also dead; so using wayback machine link again
wovano
Added SLL.
bad_coder
added 363 characters in body
user3840170
adjusted relative timeframe for increased ambiguity
ashleedawg
updated relative timeframe
Replace 'WayBackWhen' archive with direct link to Reddit Blog
Stevoisiak
Randall's comment doesn't exist anymore, found it in the way back machine
icc97
deleted 1 character in body
BlueRaja
deleted 2 characters in body
BlueRaja
added 528 characters in body
BlueRaja
added 51 characters in body
BlueRaja
Syntax highlighting
Eric
Adding code sample
BlueRaja
edited body; deleted 6 characters in body
BlueRaja
Post Made Community Wiki
BlueRaja