Re: H8 Bug
If the question arises in the context of rating evaluation, in my humble opinion it is as simple as:
- Are you rating the device? If so, H8-bug blunders leading to lost games must be included in the calculation. A Saitek Chess Challenger deserves a slightly lower CElo rating than a GK2100, because it will statistically lose some extra games due to the bug.
- Are you rating the chess engine, focusing on the author's 32K program? If so, you should use a bug-free device or, if none is available to you, neutralize the H8-bug occurrences, either by replaying the game or by using the workaround.

During a competition, that's another story. The tournament rules should apply. A long time ago, mainframe computer chess tournaments granted a decent but limited time for recovering from a computer or program crash, or a communication failure, to restore the position and restart thinking... just because it was frequent.

Best regards,
Tibono

Sorry if I keep coming back to this topic. I've spent my whole life evaluating the impact on products, production, sales and profits. So I tend to feel like something is creeping up my spine until everything is covered and examined. So I have to stop that scratch and see if I can make it go away. I wouldn't be Nick otherwise.

Your post seems to cover all the bases except for one small problem I see. We all want to be able to pull up Elo lists and look at the ratings to get a feel for where the computers stand. This is especially important when visitors who only look at a number are searching for ratings to make a purchase decision.

Let's assume this error occurred with a new computer wanting to make the list, and this new computer plays 16 games against Explorer Pro, which has a rating of 2051. We know Explorer Pro's rating includes bugs. But what about the new computer? Let's assume the sixteen games are as follows:

Without the bug, the result is in favor of Explorer: +8-4=4. Performance of the new computer: 1964 Elo, +/- about 170.
With the bug, the result is +7-5=4. Performance of the new computer: 2008 Elo.
Difference = 44 Elo.

So this new computer now makes this list (I know it needs 4 opponents, maybe 4 games each, but let's assume the extreme):

127 Mephisto Nigel Short 2012
128 Mephisto Monte Carlo IV LE 2006
129 Novag Star Ruby / Obsidian 2004
130 Phoenix Revelation Rebell 5.0 2004
131 Novag Super Expert / Forte B 6 MHz 2003
132 Mephisto Milano M 2003
133 Novag Turquoise / Emerald classic plus 2001
134 Mephisto Roma 68000 / Montreal 68000 2001
135 Saitek D+ 8 MHz R 1999
136 Novag Super Constellation 16 MHz *** 1998
137 Mephisto Dallas 68000 / Mondial 68000XL 1995
138 Fidelity Kishon Chesster 8 MHz *** 1993
139 Mephisto MM V + HG 550 1991
140 Mephisto Modena 8 MHz *** 1987
141 Mephisto Miami 24 MHz *** 1973
142 Rebel Portoroz R 1968
143 Mephisto Polgar M 1967
144 Mephisto Mega IV 1965

After the extra win and one less loss, it now sits behind Nigel Short and above Monte Carlo IV. But it should probably be sitting just below Mega IV.

So if you want to rate the Explorer Pro with its faults, that is fine. But at the same time, you are rewarding the other computer with a win that it never won.

So I think that, in a correct world, to help minimize unnecessary extra errors creeping into ratings, I would punish Explorer Pro with the loss so you can record its impact, but not give the other computer the win. Rate that game as 0-0 for the other computer and play an extra game. Explorer may or may not end up being +8-5=4 (17 games) and the other computer +4-8=4 (16 games), assuming that were possible in a rating list.

The reason is that unless all the above computers played Explorer Pro and experienced a free win, the ratings are corrupted through these decisions. However small these rating impacts are, why hand out a reward and knowingly add extra uncertainties?

Regards
Nick
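For anyone who wants to check the arithmetic in the 16-game example above, here is a minimal sketch. It assumes the standard logistic Elo curve; the post does not say which formula or lookup table was used, so the function name `performance_rating` and the method itself are assumptions for illustration only. The only figures taken from the post are the 2051 rating and the two score lines.

```python
import math

def performance_rating(opponent_rating, wins, draws, losses):
    """Performance rating against a single opponent (logistic Elo curve).

    Hypothetical helper: the thread does not state the exact method used.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("performance is undefined for a 0% or 100% score")
    # Invert the expected-score formula E = 1 / (1 + 10 ** (-d / 400)).
    return opponent_rating - 400.0 * math.log10(1.0 / score - 1.0)

EXPLORER_PRO = 2051  # rating quoted in the post

# Seen from the new computer: Explorer Pro's +8-4=4 is the newcomer's +4-8=4.
without_bug = performance_rating(EXPLORER_PRO, wins=4, draws=4, losses=8)

# One Explorer Pro win turned into a loss by the bug: Explorer Pro +7-5=4.
with_bug = performance_rating(EXPLORER_PRO, wins=5, draws=4, losses=7)

print(round(without_bug), round(with_bug), round(with_bug - without_bug))
```

With this formula the two performances come out at roughly 1962 and 2007, a couple of points away from the 1964 and 2008 quoted above, which would be consistent with a rounded lookup table having been used. Either way, the gap caused by a single flipped game over 16 games is about 45 Elo.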
Re: H8 Bug
Hi Nick,
definitely no worries about that, the whole thread is about the H8 Bug! The concern you raised is correct, I fully agree. Let me rephrase it: rating a new device should not rely heavily on a previously rated device that is known for bugs which may or may not occur. This introduces too much randomness. To make it crystal clear, the case I wrote about ('- are you rating the device?') only relates to rating the device that has the H8 bug itself. Your previous post indeed completes the coverage, thanks!

All the best,
Eric