Page 1 of 1

Extracting Principal Variation during search

Posted: Fri Nov 18, 2016 6:18 am
by Dabil
Hello all!

I am a fairly new chess engine programmer. I made a pretty simple chess engine that I am trying to figure out how to find the Principal Variation during a search. I have read many places that say this is one of the simplest ways to do it. However I just can't seem to get it right. A couple quick words, I am writing out the code and a very simple way just to understand how chess engines work. So I am currently simply using integer variables to represent the moves. I have found it necessary to use 4 variable to represent a single move:

moveType: ie, capture, quiet, promotion, short castle, etc.
moveFrom: what square the moving piece is moving from
moveTo:: square we are going to
Promotion: if there is a pawn promotion, what piece are we promoting to.

That should all be obvious, but I wrote it out for newbies who might stumble across this like me. ;) Later I will pack all this information into a single integer as an optimization, I know this isn't ideal.

Next I am using two seperate functions, an alpha() and beta() instead of a negamax() style function where they are combined and negated for each side. I do this for simplicity just to understand. So now that is explained my method for collecting the principal variation that I thought of is this: I read that
PV-Nodes
PV-nodes (Knuth's Type 1) are nodes that have a score that ends up being inside the window. So if the bounds passed are [a,b], with the score returned s, a<s<b. These nodes have all moves searched, and the value returned is exact (i.e., not a bound), which propagates up to the root along with a principal variation.

Root Node and the Leftmost Nodes are always PV-nodes
In Principal Variation Search, PV-nodes are defined by beta-alpha>1
All Siblings of a PV-node are expected Cut-nodes
So am I to understand that I can simply record the PV at any node where alpha < evaluation score < beta? That seems to be what it is saying to my understanding. So I am using this idea in my search. I simply pass a pointer to a move list structure to each node, and if the node is a PV-Node (ie a<s<b) I record the move in the current node, and append the deeper moves from down the tree to the end.

Sounds great in theory, yet it does not appear to work. The PV I get back is not connected. Seems like just some random moves. I have been trying to think about this but just can't reason as to why this would be. So my question is, is the method I am using good? Or is there something I am missing?

Code below for the alpha() function

Code: Select all

	for (int moveIndex = 0; moveIndex < moveList.count; moveIndex++)
	{
		// begin processing move list
		MakeMove(position, moveList.moves[moveIndex].moveType, moveList.moves[moveIndex].moveFrom, moveList.moves[moveIndex].moveTo, moveList.moves[moveIndex].promotion);
		val = Beta(depth - 1, position, alpha, beta, &localpV);

		UndoMove(position);

		if (val >= beta)
		{
			return beta; // fail hard beta-cutoff
		}
		if (val > alpha)
		{
                        // inside here is the PV-Node (ie a<s<b) yes?

			alpha = val; // alpha acts like max in minmax

			// save principal variation for current pv node
			pV->moves[position->currentDepth - depth - 1].score         = moveList.moves[moveIndex].score;
			pV->moves[position->currentDepth - depth - 1].moveType   = moveList.moves[moveIndex].moveType;
			pV->moves[position->currentDepth - depth - 1].moveFrom  = moveList.moves[moveIndex].moveFrom;
			pV->moves[position->currentDepth - depth - 1].moveTo      = moveList.moves[moveIndex].moveTo;
			pV->moves[position->currentDepth - depth - 1].promotion  = moveList.moves[moveIndex].promotion;
            
                        // append subtree nodes to end
			for (int i = position->currentDepth - depth; i < position->currentDepth; i++)
			{
				// save principal variation for current pv node
				pV->moves[i].score     = localpV.moves[i].score;
				pV->moves[i].moveType  = localpV.moves[i].moveType;
				pV->moves[i].moveFrom  = localpV.moves[i].moveFrom;
				pV->moves[i].moveTo    = localpV.moves[i].moveTo;
				pV->moves[i].promotion = localpV.moves[i].promotion;
			}
		}
	}

        return alpha;
Code below for the beta()

Code: Select all

	for (int moveIndex = 0; moveIndex < moveList.count; moveIndex++)
	{
		// begin processing move list
		MakeMove(position, moveList.moves[moveIndex].moveType, moveList.moves[moveIndex].moveFrom, moveList.moves[moveIndex].moveTo, moveList.moves[moveIndex].promotion);
		val = Alpha(depth - 1, position, alpha, beta, &localpV);
		UndoMove(position);

		if (val <= alpha)  
		{
			return alpha; // fail hard alpha-cutoff
		}
		if (val < beta)
		{
                       // inside here is the PV-Node (ie a<s<b) yes?

			beta = val; // beta acts like min in minmax

			// save principal variation for current pv node, this saves the current best move
			pV->moves[position->currentDepth - depth - 1].score     = moveList.moves[moveIndex].score;
			pV->moves[position->currentDepth - depth - 1].moveType  = moveList.moves[moveIndex].moveType;
			pV->moves[position->currentDepth - depth - 1].moveFrom  = moveList.moves[moveIndex].moveFrom;
			pV->moves[position->currentDepth - depth - 1].moveTo    = moveList.moves[moveIndex].moveTo;
			pV->moves[position->currentDepth - depth - 1].promotion = moveList.moves[moveIndex].promotion;

			for (int i = position->currentDepth - depth; i < position->currentDepth; i++)
			{
				// save principal variation for current pv node, this adds the pV from down in the tree to the end
				pV->moves[i].score     = localpV.moves[i].score;
				pV->moves[i].moveType  = localpV.moves[i].moveType;
				pV->moves[i].moveFrom  = localpV.moves[i].moveFrom;
				pV->moves[i].moveTo    = localpV.moves[i].moveTo;
				pV->moves[i].promotion = localpV.moves[i].promotion;
			}
		}
	}

	return beta;
of course this is not the complete code, just the for loop that iterates through the move list alone with the PV collecting idea. Well any advice or help would be appreciated. Thank you!

Re: Extracting Principal Variation during search

Posted: Fri Nov 18, 2016 12:02 pm
by H.G.Muller
I always do it i the following way: I maintain a global stack of moves (encoded as int, say) in an array, and a global variable pvPtr pointing to the first free entry in that array. All PVs stored in there are terminated by a 0 code (which is invalid as move).

At the start of each node I do

Code: Select all

int *pvStart = pvPtr; // this is where we will store the PV for this node
*pvPtr++ = 0; // create an empty PV
At the end of the node, just before returning, I restore the situation with

Code: Select all

pvPtr = pvStart;
This leaves pvPtr, which is also accessible from the parent node, pointing at the PV of the node that just returned.

Then, during the search of moves in the node, I update the PV every time a move scores between alpha and beta, (e.g. in the rarely executed code after increasing alpha and having established there is no beta cutoff) in the following way:

Code: Select all

int *p = pvPtr; // points to (first move of) PV returned by daughter
pvPtr = pvStart; // pop the previous PV of this node
*pvPtr++ = move; // new PV starts with the move just searched
while((*pvPtr++ = *p++)); // append PV of the daughter node behind it, copying it upto and including the 0 sentinel
That is all. After returning from the root the PV will be held in the array, starting at its beginning.

Re: Extracting Principal Variation during search

Posted: Sat Nov 19, 2016 5:55 am
by Dabil
Thanks for your reply and for validating that I have the right idea. Your description sounds a lot like What I am trying to do. Only instead of using global variables I am passing a pointer through. I thought that doing it that way would be easier and safer with less chance of side effects. I may try using the globals if I can't figure it out soon. I must just having something wrong with the code. I will have to try to trace it out on paper or with the debugger.

Thanks again.