LOS --> Draws are irrelevant

User923005
Posts: 616
Joined: Thu May 19, 2011 1:35 am

LOS --> Draws are irrelevant

Post by User923005 » Fri Jan 24, 2014 8:59 pm

According to this and many other sources:
http://talkchess.com/forum/viewtopic.ph ... 91&t=51003
draws are irrelevant to the calculation of likelihood of superiority (LOS).

So, let's take it to the extreme.

We run one quadrillion games between engine A and engine B.
We get 1000000000000000 - 11 = 999999999999989 draws
We get 10 wins for engine A
We get 1 loss for engine A.

LOS says A is definitely much better.

Those engines are dead even. Ignoring the draws makes no sense to me.
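For reference, here is a minimal sketch of the LOS number under discussion. It assumes the normal-approximation formula often quoted for LOS, LOS = 0.5 * (1 + erf((W - L) / sqrt(2 * (W + L)))); the linked thread may use a different variant, and the function name `los` is only illustrative. With just 11 decisive games the approximation is rough.

```python
import math

def los(wins: int, losses: int) -> float:
    """Normal-approximation likelihood of superiority.

    The draw count is deliberately not a parameter: it does not
    appear anywhere in this formula.
    """
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))

# Same value whether the remaining games are 0 draws or ~10**15 draws.
print(f"LOS for 10 wins, 1 loss: {los(10, 1):.4f}")
```

Under this formula the result depends only on the wins and losses, so the quadrillion-game match and an 11-game match with no draws get the same LOS.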

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: LOS --> Draws are irrelevant

Post by BB+ » Fri Jan 24, 2014 9:38 pm

Ignoring draws is correct for the common model of LOS (though LOS itself assumes a prior distribution).

In your example, the point is that the variance is quite low. For instance, switching to Elo units (maybe not the best choice), engine A scores 0.5 + 4.5*10^(-15) and so is adjudged only about 3.13*10^(-12) Elo better, but the standard deviation is smaller still, by a factor of about 2.7. This is in line with the fair-coin expectation: the chance of a 10:1 or worse wins:losses result from a fair coin is 12/2^11, or about 1 in 170 [thus about 2.78 deviations].
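The two quantities in this paragraph can be checked with a short sketch. It assumes the usual logistic Elo model (400 * log10(s / (1 - s)) for expected score s); the helper names `tail_at_least` and `elo_from_bias` are hypothetical. The Elo conversion is written with `atanh` because the score differs from 1/2 by only a few parts in 10^15, which the direct formula would lose to rounding.

```python
import math

def tail_at_least(k: int, n: int) -> float:
    """P(at least k heads in n flips of a fair coin)."""
    return sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n

def elo_from_bias(bias: float) -> float:
    """Logistic Elo difference for a score of 0.5 + bias.

    400*log10((0.5+b)/(0.5-b)) equals (800/ln 10) * atanh(2b);
    the atanh form stays accurate when b is tiny.
    """
    return 800.0 / math.log(10.0) * math.atanh(2.0 * bias)

games = 10 ** 15
wins, losses = 10, 1

bias = (wins - losses) / (2 * games)   # score minus 1/2
p = tail_at_least(10, 11)              # 10 or 11 wins out of 11 decisive games

print(f"fair-coin tail: {p:.6f} (about 1 in {1 / p:.0f})")
print(f"Elo difference: {elo_from_bias(bias):.3e}")
```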

LOS does not answer how much an engine might be better, only the yes/no as to whether it might be, and for this draws do not matter.

To make your example more computable [without rows of zeros obscuring it], take 999989 draws, 10 wins, 1 loss. Then the score is 0.5000045, so the difference from 1/2 [the bias] is 0.0000045, while the deviation is sqrt(11)/2 * 10^(-6), or about 0.00000166, which is about 2.7 times less than the bias from 1/2.
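The scaled-down numbers can be reproduced directly. This is a sketch that treats each game as an independent outcome worth 1, 0.5 or 0 and uses the plug-in variance of those outcomes; `score_stats` is an illustrative name, not something from the thread. It follows the original example's 10 wins and 1 loss; swapping wins and losses only flips the sign of the bias.

```python
import math

def score_stats(wins: int, losses: int, draws: int):
    """Return (score, bias from 1/2, standard error of the score)."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of an outcome in {1, 0.5, 0} around the observed score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * score ** 2) / n
    std_err = math.sqrt(var / n)       # standard deviation of the mean score
    return score, score - 0.5, std_err

score, bias, std_err = score_stats(wins=10, losses=1, draws=999_989)
print(f"score   = {score:.7f}")
print(f"bias    = {bias:.7f}")
print(f"std err = {std_err:.8f}")
print(f"ratio   = {bias / std_err:.2f}")
```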

User923005
Posts: 616
Joined: Thu May 19, 2011 1:35 am

Re: LOS --> Draws are irrelevant

Post by User923005 » Fri Jan 24, 2014 9:49 pm

I suppose that my argument is that we simply cannot ignore draws completely.
At some point, it is obvious that if the draw count dominates, it adds uncertainty to the measure.
Eventually, if the count of wins + losses is extremely small compared to the draw count, then the wins + losses are very likely noise compared to the general trend.
So I think that the draws have to contribute to the standard deviation, at least. Otherwise, the calculation produces a result that I would not trust.
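One way to probe this is to hold the 10 wins and 1 loss fixed and vary only the draw count, then watch the ratio of bias to standard deviation. The sketch below assumes independent games scored 1, 0.5 or 0 and the plug-in variance; under those assumptions the ratio simplifies to (W - L) / sqrt(W + L - (W - L)^2 / N) for N total games. `bias_over_std` is a hypothetical helper name.

```python
import math

def bias_over_std(wins: int, losses: int, draws: int) -> float:
    """Bias of the score from 1/2, in units of its standard error.

    Closed form of ((W-L)/(2N)) / sqrt(var/N), where
    var = (W+L)/(4N) - ((W-L)/(2N))**2 is the per-game variance.
    """
    n = wins + losses + draws
    return (wins - losses) / math.sqrt(wins + losses - (wins - losses) ** 2 / n)

for draws in (0, 100, 10_000, 999_989, 10 ** 15 - 11):
    print(f"{draws:>16d} draws -> ratio {bias_over_std(10, 1, draws):.4f}")
```

In this model the draws do enter, but only through the (W - L)^2 / N term: adding draws lowers the ratio, and once draws dominate it settles at (W - L) / sqrt(W + L) and stops changing.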
