To kick off some technical discussions
- hyatt
- Posts: 1242
- Joined: Thu Jun 10, 2010 2:13 am
- Real Name: Bob Hyatt (Robert M. Hyatt)
- Location: University of Alabama at Birmingham
- Contact:
Re: To kick off some technical discussions
As a follow-up, I must be doing _something_ right. My first program played its first move in the Fall of 1968, when I was a junior at the University of Southern Mississippi. I have not stopped working on it since, which will be 42 years this October (if memory serves me correctly). Obviously I enjoy what I am doing, which is all I need. I can't count the number of friends I have made over the years who slowly fell by the wayside as they lost interest, burned out, or became frustrated.
Re: To kick off some technical discussions
Chris Whittington wrote: we have unclear sacrifices left which are the type I concentrate on.
For sacrifices, I'd consider them "sound unless proven unsound": if Qxh7 is obviously unsound, then it can be classified as such.
Then we can reward the engine for any sacrifice it makes that doesn't evidently lose the game (it may lose after further investigation, but we don't do that investigation). This way an attempt can be made to measure style:
First, you need both opponents to be about the same strength. As I said previously, an engine with the highest amount of style that loses most of its games is undesirable. I think a difference of about 50 Elo between the opponents is about right; the idea is to take strength out of the equation so that the true style of the engine can appear (for instance, it would be disruptive if an engine of great style couldn't show it because a much stronger opponent keeps attacking and it has to defend instead of showing off). But one doesn't need to be obsessive about getting the strength exactly right.
After both engines are calibrated (I suggest a time handicap for this), you play a few games between them. From those games, you look at the sacrifices from both sides and award points as a score: for example, each time a knight captures a pawn and is recaptured the next move, add 2 points; each time a knight is left hanging and a move that doesn't save it is made, add 1 point; each time a rook captures a knight, add 1.5; when the opponent hangs the queen and the engine does not capture it, add 3 points; and so on. You would just need to watch out for long series of captures where no sacrifice is actually happening, and it'll be fine.
At the end, both engines will have some score (we ignore game results; those should only matter for calibrating the engines to the same strength and nothing else). We divide this number by the total number of games, and then we have the engine's style measured. For a more accurate measure I'd suggest repeating this process against a different engine.
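A rough sketch of what such a tally could look like in code, assuming the python-chess library is available; only two of the situations above are implemented, and the point values are the illustrative ones from the example, not a recommendation:

```python
import chess
import chess.pgn

# Hypothetical weights for two of the situations described above.
KNIGHT_FOR_PAWN_SAC = 2.0   # knight takes a pawn and is recaptured on the next move
ROOK_TAKES_KNIGHT = 1.5     # exchange-style capture

def style_score(game: chess.pgn.Game) -> float:
    """Very rough per-game style tally; the game result is ignored."""
    score = 0.0
    board = game.board()
    moves = list(game.mainline_moves())
    for i, move in enumerate(moves):
        if board.is_capture(move):
            attacker = board.piece_at(move.from_square)
            victim = board.piece_at(move.to_square)  # None for en passant
            if attacker and victim:
                # Rook captures a knight.
                if attacker.piece_type == chess.ROOK and victim.piece_type == chess.KNIGHT:
                    score += ROOK_TAKES_KNIGHT
                # Knight grabs a pawn and is recaptured on that square the next move.
                if (attacker.piece_type == chess.KNIGHT
                        and victim.piece_type == chess.PAWN
                        and i + 1 < len(moves)
                        and moves[i + 1].to_square == move.to_square):
                    score += KNIGHT_FOR_PAWN_SAC
        board.push(move)
    return score

# Average over a match, as described above:
# style = sum(style_score(g) for g in games) / len(games)
```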
The subjective part is deciding how many points to give each situation (for example, what should be rewarded more: a rook-for-pawn sacrifice far away from the action, or a knight sacrifice that opens the king's shield for a killer attack?) and which scenarios to look for (we could also assign negative values to things like useless piece shuffling, so that too much of it eventually cancels out the sacrifices).
This is just a thought experiment, so one can tweak those rewards depending on the output (for instance, an engine with a clearly better style that scores worse would easily signal which rewards need tweaking or need to be added).
The good thing is that this already can be applied to engines that are close in strength, since those already have thousands of games played that could be checked out for style measurement.
Re: To kick off some technical discussions
a) Current chess AI is nearly dead. 99% of the people here apply known algorithms, and the only research done is about changing what I call "inputs in the formula". We research whether 325 is a good value for a knight or whether 300 is better, or whether LMR must be done from the 4th move or from the 3rd. For me this research has nothing to contribute to computer chess.
I tend to agree with this somewhat. Is computer chess becoming nothing more than a bunch of "parameter tuning", or is there new thought going into some aspects of it? Obviously any one project tends to the minutiae after a suitable time (which is why some authors "start from scratch" every however-many years), but on the other hand, many projects seem so set in their ways that they don't even seem to want to rip up "legacy" code to implement new ideas (for instance, the number of top-10 engines that still don't use bitboards is rather surprising to me, but maybe bitboards are not so impressive in the 32-bit market).
To try to get a "sacrifice" engine, has anyone tried extending all bad-SEE moves, reducing moves that lead to (boring) pawn-up endgames, etc.? It will be maybe 200+ ELO worse, but perhaps more spectacular?
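Something along these lines is easy to prototype. A minimal sketch, again assuming python-chess, where "bad SEE" is approximated crudely by "capturing a cheaper, defended piece"; a real engine would use a proper static exchange evaluator, and the endgame-reduction half of the idea is omitted:

```python
import chess

PIECE_VALUE = {chess.PAWN: 100, chess.KNIGHT: 300, chess.BISHOP: 300,
               chess.ROOK: 500, chess.QUEEN: 900, chess.KING: 0}

def evaluate(board: chess.Board) -> int:
    """Plain material count from the side to move's point of view."""
    score = 0
    for piece_type, value in PIECE_VALUE.items():
        score += value * len(board.pieces(piece_type, board.turn))
        score -= value * len(board.pieces(piece_type, not board.turn))
    return score

def looks_like_sacrifice(board: chess.Board, move: chess.Move) -> bool:
    """Crude stand-in for a bad-SEE test: capturing a cheaper piece that is defended."""
    if not board.is_capture(move):
        return False
    attacker = board.piece_at(move.from_square)
    victim = board.piece_at(move.to_square)
    if attacker is None or victim is None:
        return False
    defended = bool(board.attackers(not board.turn, move.to_square))
    return defended and PIECE_VALUE[attacker.piece_type] > PIECE_VALUE[victim.piece_type]

def search(board: chess.Board, depth: int, alpha: int, beta: int) -> int:
    if depth <= 0 or board.is_game_over():
        return evaluate(board)
    for move in board.legal_moves:
        # Extend apparent sacrifices by one ply instead of letting them get reduced away.
        new_depth = depth if looks_like_sacrifice(board, move) else depth - 1
        board.push(move)
        score = -search(board, new_depth, -beta, -alpha)
        board.pop()
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha
```

Termination is still guaranteed because the extension only ever applies to captures, and the number of captures in any line is bounded by the material on the board.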
- hyatt
- Posts: 1242
- Joined: Thu Jun 10, 2010 2:13 am
- Real Name: Bob Hyatt (Robert M. Hyatt)
- Location: University of Alabama at Birmingham
- Contact:
Re: To kick off some technical discussions
BB+ wrote:
a) Current chess AI is nearly dead. 99% of the people here apply known algorithms, and the only research done is about changing what I call "inputs in the formula". We research whether 325 is a good value for a knight or whether 300 is better, or whether LMR must be done from the 4th move or from the 3rd. For me this research has nothing to contribute to computer chess.
I tend to agree with this somewhat. Is computer chess becoming nothing more than a bunch of "parameter tuning", or is there new thought going into some aspects of it? Obviously any one project tends to the minutiae after a suitable time (which is why some authors "start from scratch" every however-many years), but on the other hand, many projects seem so set in their ways that they don't even seem to want to rip up "legacy" code to implement new ideas (for instance, the number of top-10 engines that still don't use bitboards is rather surprising to me, but maybe bitboards are not so impressive in the 32-bit market).
To try to get a "sacrifice" engine, has anyone tried extending all bad-SEE moves, reducing moves that lead to (boring) pawn-up endgames, etc.? It will be maybe 200+ ELO worse, but perhaps more spectacular?
Chris is overlooking a key point. Yes, there is creativity and new ideas. For us (the Crafty group) this is a near-daily event. But once we implement a new idea, we can then test various tunings of it to really maximize the result. In the past, we often added something, it played better (sometimes with too few games which gave false positives) and we kept that change and moved on. Today we can extract every bit of "oil" from the "well" before moving on to bore another well. That part is boring, but it produces significant results. The new ideas drive the process however. We are not continually iterating over all the scoring terms and material values, trying random adjustments to see if Elo goes up or down. That we do _not_ do. We do try alternative values for most any new idea when the value is not "obvious".
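The "too few games gave false positives" point is easy to quantify. A small illustrative calculation, using the standard logistic Elo model and a normal approximation for the match score (the specific win/draw/loss counts below are made up for the example):

```python
import math

def elo_interval(wins: int, draws: int, losses: int):
    """Elo difference implied by a match result, with a rough 95% confidence band."""
    n = wins + draws + losses
    p = (wins + 0.5 * draws) / n                      # match score
    var = (wins * (1 - p) ** 2 + draws * (0.5 - p) ** 2 + losses * (0 - p) ** 2) / n
    se = math.sqrt(var / n)                           # standard error of the mean score

    def to_elo(score: float) -> float:
        score = min(max(score, 1e-6), 1 - 1e-6)
        return -400.0 * math.log10(1.0 / score - 1.0)

    return to_elo(p), to_elo(p - 1.96 * se), to_elo(p + 1.96 * se)

# 100 games at a 52% score: about +14 Elo, but the 95% band runs from roughly -42 to +71.
print(elo_interval(36, 32, 32))
# 30,000 games at the same 52% score: the same +14 Elo, with a band of roughly +11 to +17.
print(elo_interval(10800, 9600, 9600))
```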
Re: To kick off some technical discussions
hyatt wrote:
BB+ wrote:
a) Current chess AI is nearly dead. 99% of the people here apply known algorithms, and the only research done is about changing what I call "inputs in the formula". We research whether 325 is a good value for a knight or whether 300 is better, or whether LMR must be done from the 4th move or from the 3rd. For me this research has nothing to contribute to computer chess.
I tend to agree with this somewhat. Is computer chess becoming nothing more than a bunch of "parameter tuning", or is there new thought going into some aspects of it? Obviously any one project tends to the minutiae after a suitable time (which is why some authors "start from scratch" every however-many years), but on the other hand, many projects seem so set in their ways that they don't even seem to want to rip up "legacy" code to implement new ideas (for instance, the number of top-10 engines that still don't use bitboards is rather surprising to me, but maybe bitboards are not so impressive in the 32-bit market).
To try to get a "sacrifice" engine, has anyone tried extending all bad-SEE moves, reducing moves that lead to (boring) pawn-up endgames, etc.? It will be maybe 200+ ELO worse, but perhaps more spectacular?
Chris is overlooking a key point. Yes, there is creativity and new ideas. For us (the Crafty group) this is a near-daily event. But once we implement a new idea, we can then test various tunings of it to really maximize the result. In the past, we often added something, it played better (sometimes with too few games which gave false positives) and we kept that change and moved on. Today we can extract every bit of "oil" from the "well" before moving on to bore another well. That part is boring, but it produces significant results. The new ideas drive the process however. We are not continually iterating over all the scoring terms and material values, trying random adjustments to see if Elo goes up or down. That we do _not_ do. We do try alternative values for most any new idea when the value is not "obvious".
But that's not the only point made by Chris, and I don't think you fully addressed the implications of his initial point in the thread. His point was that, in effect, you test against a player pool consisting of Steinitz, Rubinstein, Tartakower, Marshall, and Maroczy, but you're never getting any better against Capablanca, Alekhine, and Lasker because they occupy a different hill in the chess landscape. And you don't necessarily know this, because you can't test your overall progress against the super-top programs with any precision.
Possible results of this deficiency:
1) You'll never see the holes in your program that only the top tier programs can see, only the ones your stable can see.
2) In plugging holes against the stable, new holes open against the top tier of which you cannot be aware, adding to the holes you already can't see.
3) Maybe you test an idea that does nothing against your stable of opponents, but it might address an unknown weakness in stronger opponents. But you'll never find out because you tossed that idea, since you "proved" in testing against your stable that it does nothing. Oops.
- thorstenczub
- Posts: 593
- Joined: Wed Jun 09, 2010 12:51 pm
- Real Name: Thorsten Czub
- Location: United States of Europe, germany, NRW, Lünen
- Contact:
Re: To kick off some technical discussions
The strength of CSTAL was that it did something the others never expected it to do.
It was beyond their horizon; they pruned it away.
Now, while the others MAXIMIZED their material score and positional score,
CSTAL followed weird ideas about attacking, pinning and forking the opponent.
The sacs were unsound, and the others expected those moves to be a WIN for them.
When the moves were played, it took until the END of the combination before the others
saw: the end position is not won for us; in fact we are in great trouble.
CSTAL saw this because it did NOT prune away the whole combination, AND it evaluated the END position with its large evaluation function.
The main job was to make the program sac unsoundly, but not to throw away pieces for nothing.
This was the main job.
We knew we were successful when it won enough games against Fritz, Genius, MChess, ...
But the main strength came from the fact that CSTAL was different, and the others could not find out what it would move.
Of course, today the search of CSTAL is too slow.
Today's chess programs ALL do much forward pruning,
and CSTAL (being from 1998) gets outsearched.
It would need time to adapt the search to the latest tricks.
But in those days it worked.
When the Mark V won in 1981, it was also very selective.
People at that time called it the B-strategy:
only following a few moves in the move list and forgetting about the others.
It worked in 1981.
Levy later tried this method with Taylor in Cyrus 68000.
CSTAL is nothing else but this idea, only combined with lots of king-safety knowledge and also some normal human endgame knowledge.
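The B-strategy idea is simple to state in code. A minimal sketch, again assuming python-chess, with a toy move ordering and an arbitrary cut-off of the first few moves; the numbers here are illustrative, not what Mark V or CSTAL actually used:

```python
import chess

PIECE_VALUE = {chess.PAWN: 100, chess.KNIGHT: 300, chess.BISHOP: 300,
               chess.ROOK: 500, chess.QUEEN: 900, chess.KING: 0}

def material(board: chess.Board) -> int:
    """Material balance from the side to move's point of view."""
    total = 0
    for piece_type, value in PIECE_VALUE.items():
        total += value * len(board.pieces(piece_type, board.turn))
        total -= value * len(board.pieces(piece_type, not board.turn))
    return total

def ordered_moves(board: chess.Board):
    """Toy ordering: captures of big pieces and checks first."""
    def key(move: chess.Move) -> int:
        victim = board.piece_at(move.to_square)
        score = PIECE_VALUE[victim.piece_type] if victim else 0
        if board.gives_check(move):
            score += 50
        return score
    return sorted(board.legal_moves, key=key, reverse=True)

def type_b_search(board: chess.Board, depth: int, width: int = 4) -> int:
    """Shannon Type B: only follow the first `width` moves of the ordered list."""
    if depth == 0 or board.is_game_over():
        return material(board)
    best = -10**9
    for move in ordered_moves(board)[:width]:   # forget about the others
        board.push(move)
        best = max(best, -type_b_search(board, depth - 1, width))
        board.pop()
    return best
```

Everything interesting then hinges on the move ordering: whatever the ordering misses is simply never searched, which is exactly the horizon the post describes.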