Hi all,
I will try to describe here a method for calculating the entropy associated with a chess position.
For a given position, this value reflects the probability of finding (by chance ...) the current arrangement of pieces among all valid and legal diagrams.
Of course, some abstractions and approximations are required in order to implement a practical method. I've settled on the following:
- All pieces are viewed as identical tokens (colour, value and piece type are ignored)
- A chessboard position is specified by the number of "pieces" in each row and each column. For example, a board with 5 pieces can be described by one of the following sentences:
1) Sentence 1: "5 columns with 1 piece each" (abbreviated hereafter as "5@1"),
2) Sentence 2: "1 column with 5 pieces",
3) Sentence 3: "2 columns with 1" and "1 column with 3", etc.
Under this framework, the problem consists in determining, for a given number of pieces on the board, the probability that a specific configuration (a "sentence", as defined above) occurs. These probabilities feed the usual entropy formula: the sum, over all sentences, of p * log(1/p).
After launching the PC fans at full speed, it appears there are 409 ways of configuring a board of 25 pieces according to a column view. Of these 409, the two most quickly found are {5@5} and {3@8;1@1}.
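For what it is worth, here is a minimal sketch of this enumeration in Python, under my assumption that the n pieces are dropped uniformly at random on the 64 squares, i.e. every one of C(64,n) placements of indistinguishable tokens is equally likely (the function names are mine). The probability of a sentence is then (number of ways to assign its counts to the 8 files) * (product of C(8,c) placements inside each file) / C(64,n):
[code]
from math import comb, factorial, log2

def column_sentences(n, files=8, ranks=8):
    """Yield every column 'sentence' for n pieces as a non-increasing tuple of
    per-file counts (empty files omitted), e.g. (5, 5, 5, 5, 5) for {5@5}."""
    def parts(remaining, max_part, files_left):
        if remaining == 0:
            yield ()
            return
        if files_left == 0:
            return
        for p in range(min(remaining, max_part), 0, -1):
            for rest in parts(remaining - p, p, files_left - 1):
                yield (p,) + rest
    yield from parts(n, ranks, files)

def sentence_probability(counts, n, files=8, ranks=8):
    """P(sentence) = (#assignments of the counts to the 8 files)
                     * prod C(8, c) placements inside each file / C(64, n)."""
    mult = {0: files - len(counts)}          # the empty files
    for c in counts:
        mult[c] = mult.get(c, 0) + 1
    assignments = factorial(files)
    for m in mult.values():
        assignments //= factorial(m)
    placements = assignments
    for c in counts:
        placements *= comb(ranks, c)
    return placements / comb(files * ranks, n)

def column_view_entropy(n):
    probs = [sentence_probability(s, n) for s in column_sentences(n)]
    assert abs(sum(probs) - 1.0) < 1e-9      # the sentences partition all placements
    return len(probs), sum(p * log2(1 / p) for p in probs)

print(column_view_entropy(25))   # number of sentences and entropy for 25 pieces
[/code]
The count printed for 25 pieces can be checked against the 409 quoted above.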
Note: beyond 32 pieces, results are immediate by replacing the word "pieces" with "holes" (empty squares): a column holding k pieces holds 8-k holes, so the data for 33 pieces (31 holes) is the same as for 31 pieces.
I have also tabulated the probability associated with each and every sentence.
As an example, the most probable column content for a set of 32 pieces is {1@2;2@3;2@4;2@5;1@6}, and the least probable is the initial-setup profile {8@4} (every column holding 4 pieces).
And for a set of 10 pieces the most probable is {3@1;2@2;1@3}.
The same method applies when "columns" is replaced by "rows". However, a given column-view sentence is not compatible with every row-view sentence for the same number of pieces.
If a board has a single column filled with 5 pieces, {1@5}, this is compatible with a row view such as {5rows@1}, but not with a row view such as {2@2;1@1}: the 5 pieces share one column, so they occupy 5 different rows with exactly 1 piece each.
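As an aside, deciding whether a column-view sentence and a row-view sentence can coexist on one board is exactly the question of whether a 0-1 matrix exists with those column sums and row sums, and the Gale-Ryser theorem answers it directly. A small sketch (function name and zero-padding convention are mine):
[code]
def compatible(col_counts, row_counts, files=8, ranks=8):
    """Gale-Ryser test: does an 8x8 0-1 board exist whose non-empty file counts
    are col_counts and whose non-empty rank counts are row_counts?"""
    cols = sorted(col_counts, reverse=True) + [0] * (files - len(col_counts))
    rows = sorted(row_counts, reverse=True) + [0] * (ranks - len(row_counts))
    if sum(cols) != sum(rows):
        return False
    # The k largest rank counts must be coverable by the files:
    # sum of the k largest rank counts <= sum over files of min(file count, k).
    return all(sum(rows[:k]) <= sum(min(c, k) for c in cols)
               for k in range(1, ranks + 1))

print(compatible([5], [1, 1, 1, 1, 1]))   # True : {1@5} fits with {5rows@1}
print(compatible([5], [2, 2, 1]))         # False: {1@5} does not fit with {2@2;1@1}
[/code]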
When a column-view specification is compatible with a row-view specification, the probability of the combined 2D view can be calculated, and with it the associated entropy.
For a 10-piece position the most probable 2D view is [{3@1;2@2;1@3},{3@1;2@2;1@3}].
For a 20-piece position the most probable 2D view is [{1@1;3@2;3@3;1@4},{1@1;3@2;3@3;1@4}]. Spoiler: a solution is given by putting a grain of rice on squares 1,2,6,8,9,11,15,17,21,27,36,38,39,41,44,51,52,56,61,64.
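Here is a sketch of how such 2D-view probabilities could be computed, still under the uniform-placement assumption above: count the 0-1 boards having one fixed assignment of the two profiles (a small memoised recursion over the files), then multiply by the number of ways each profile can be distributed over the 8 labelled files and ranks.
[code]
from functools import lru_cache
from itertools import combinations
from math import comb, factorial

def count_boards(col_counts, row_counts, files=8, ranks=8):
    """Number of 0-1 boards with one fixed assignment of the file counts
    (sorted, zero-padded) and of the rank counts."""
    cols = tuple(sorted(col_counts, reverse=True)) + (0,) * (files - len(col_counts))
    rows = tuple(sorted(row_counts, reverse=True)) + (0,) * (ranks - len(row_counts))
    if sum(cols) != sum(rows):
        return 0

    @lru_cache(maxsize=None)
    def fill(file_index, remaining):
        if file_index == files:
            return 1 if not any(remaining) else 0
        need = cols[file_index]
        candidates = [i for i, r in enumerate(remaining) if r > 0]
        total = 0
        for chosen in combinations(candidates, need):   # ranks hit by this file
            nxt = list(remaining)
            for i in chosen:
                nxt[i] -= 1
            total += fill(file_index + 1, tuple(sorted(nxt, reverse=True)))
        return total

    return fill(0, rows)

def assignments(counts, size=8):
    """Ways to distribute a multiset of non-zero counts over 8 labelled lines."""
    mult = {0: size - len(counts)}
    for c in counts:
        mult[c] = mult.get(c, 0) + 1
    ways = factorial(size)
    for m in mult.values():
        ways //= factorial(m)
    return ways

def view_probability(col_counts, row_counts):
    n = sum(col_counts)
    matches = (assignments(col_counts) * assignments(row_counts)
               * count_boards(col_counts, row_counts))
    return matches / comb(64, n)

# The 10-piece example above: column view {3@1;2@2;1@3} with the same row view.
print(view_probability([1, 1, 1, 2, 2, 3], [1, 1, 1, 2, 2, 3]))
[/code]
A strictly positive count_boards(...) also doubles as a compatibility test, consistent with the Gale-Ryser check sketched earlier.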
Note: it is particularly difficult to draw heavily loaded, high-entropy positions on paper (with pencil and eraser), although many valid solutions exist. On the contrary, the least probable positions are the easiest.
As a conclusion, it appears that positions with 4 to 7 pieces are those which provide the highest entropy, and thus the highest degree of disorder or randomness. A 6-piece board has 35 2D views, but one of them is highly probable (p = 0.4): the 2D view [{4@1;1@2},{4@1;1@2}].
In other words, for lightly loaded positions a hash code based only on the number of pieces per row and column would produce many collisions, but for heavily loaded positions (15 to 32 pieces), where the entropy is low and roughly constant, a hash code based on the 2D view's ID may be used.
Last but not least, a 32-piece board has almost 100,000 2D views, as specified above, and a table of 500,000 entries is required to record the 64-bit identifiers of all 2D views for each and every size of board.
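One possible encoding for such a 64-bit identifier (a sketch of my own, not necessarily the scheme intended here): a 2D view is just two count profiles, i.e. 16 numbers in the range 0..8, so one nibble per number packs exactly into 64 bits.
[code]
def view_id(col_counts, row_counts):
    """Pack a 2D view into a 64-bit integer: 16 nibbles, one per file/rank count.
    Profiles are sorted so that equivalent views always get the same identifier."""
    cols = sorted(col_counts, reverse=True) + [0] * (8 - len(col_counts))
    rows = sorted(row_counts, reverse=True) + [0] * (8 - len(row_counts))
    ident = 0
    for value in cols + rows:      # 16 values, each in 0..8, i.e. 4 bits each
        ident = (ident << 4) | value
    return ident

# The 6-piece view [{4@1;1@2},{4@1;1@2}] from above, i.e. counts 2,1,1,1,1 per side.
print(hex(view_id([2, 1, 1, 1, 1], [2, 1, 1, 1, 1])))
[/code]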
Perhaps, in the future, I will ask my evaluation function (in progress...) to give a bonus to the shortest paths, in the opening, towards the highest entropy levels, since the setup position is the worst position in terms of entropy.
I would be pleased to hear from you on these subjects.
Best Regards
denis.begaud@neuf.fr