DEEP BLUE


Motivation

The quest of a computer scientist is to build a machine that can match a human mind.

One of the central conundrums of the last 40 years in AI research is that problems we thought were hard turned out to be fairly easy, and that problems we thought were easy have turned out to be profoundly difficult.

One field where a major breakthrough has been achieved is chess playing.

Deep Blue is the culmination of a multi-year effort to build a world-class chess machine.

Rich history of cumulative ideas

Minimax search, evaluation function learning (1950).

Alpha-Beta search (1966).

Transposition Tables (1967).

Iterative deepening DFS (1975).

Endgame databases, singular extensions (1977, 1980)

Parallel search and evaluation (1983, 1985)

Circuitry (1987)

Chess

Chess as a game has long fascinated academia. Alan Turing is known to have developed a chess-playing algorithm (never implemented).

A vast collection of defences, attacks, gambits and other openings has been deeply studied and plays a major role in Deep Blue's strategy, called the "opening book".

Deep Blue's evaluation function uses around 8000 different features, each based on heuristics designed by players over years of experience.

The teams

The Deep Blue team consisted of Feng-hsiung Hsu and Murray Campbell, veterans from the ChipTest and Deep Thought days at Carnegie Mellon University, and IBM additions C.J. Tan, Joseph Hoane, and Jerry Brody.

Match History

The first match (a series of six games) between Deep Blue and Garry Kasparov was played in February 1996 in Philadelphia, Pennsylvania.

Result: 4-2 (Kasparov).

The rematch was held May 3-11, 1997, at the Equitable Center in downtown Manhattan.

Result: 3.5-2.5 (Deep Blue).

The rematch witnessed the shortest game between man and machine at this level.

Chinese wall effect

This is how humans excel at chess.

System overview

Deep Blue is a massively parallel system designed for carrying out chess game tree searches (smart brute force).

The system is composed of a 30-node (30-processor) IBM RS/6000 SP computer and 480 single-chip chess search engines, with 16 chess chips per SP processor.

Deep Blue is organized in three layers. One of the SP processors is designated as the master, and the remainder as workers.

The master searches the top levels of the chess game tree, and then distributes “leaf” positions to the workers for further examination. The workers carry out a few levels of additional search, and then distribute their leaf positions to the chess chips, which search the last few levels of the tree.

Prelude

All fixed-ply algorithms suffer from the "horizon effect"; techniques such as "singular extensions" help to overcome it.

Deep chess knowledge lies behind the evaluation function. It scores the difference (between player and opponent) of:

Material

Mobility

King position

Bishop pair

Rook pair

Open rook files

Control of central positions
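The idea of scoring a position as a difference of features can be sketched in a few lines of Python. This is only an illustration: the feature names, the weights and the sample numbers are assumptions made for the example, not Deep Blue's actual values.

```python
# Hypothetical feature weights (assumed for illustration only).
WEIGHTS = {"material": 1.0, "mobility": 0.1, "bishop_pair": 0.5}


def evaluate(own, opp, weights=WEIGHTS):
    """Weighted sum of (own - opponent) feature differences."""
    return sum(
        weight * (own.get(name, 0) - opp.get(name, 0))
        for name, weight in weights.items()
    )


# Made-up feature counts for the two sides.
white = {"material": 39, "mobility": 30, "bishop_pair": 1}
black = {"material": 36, "mobility": 25, "bishop_pair": 0}

print(evaluate(white, black))  # positive: White is ahead
print(evaluate(black, white))  # same magnitude, opposite sign
```

Because every term is a difference, swapping the two sides simply negates the score, which is the property the nega-max formulation relies on.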

Prelude

"Transpositions" in the game: the same board position may be reached by playing different sequences of moves.

This property can be used for optimization: a position that has already been searched does not need to be searched again.
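A minimal sketch of how transpositions can be exploited, using a toy game rather than chess. Everything here is an assumption made for illustration: a "position" is just the set of moves played so far, so different move orders reach the same position, and the table is a plain dictionary keyed by position and remaining depth.

```python
def search(position, depth, table, stats):
    """Toy maximizing search that caches results per (position, depth)."""
    key = (position, depth)
    if key in table:
        stats["hits"] += 1          # transposition: reuse the stored result
        return table[key]
    stats["nodes"] += 1
    moves = [m for m in (1, 2, 3, 4) if m not in position]
    if depth == 0 or not moves:
        value = sum(position)       # toy evaluation
    else:
        value = max(
            search(position | {m}, depth - 1, table, stats) for m in moves
        )
    table[key] = value
    return value


stats = {"nodes": 0, "hits": 0}
best = search(frozenset(), 2, {}, stats)
print(best, stats)
```

In this toy, playing move 1 then 2 reaches the same position as playing 2 then 1, so the second arrival is answered from the table instead of being searched again.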

Chess game tree


A game of chess can be considered as a large n-ary tree.

The chess tree is very bushy (usually about 35 branches from each position) and very deep.

One way to search the complete tree is to stop only at nodes where the game has ended (a win for either player, or a draw).

Obviously, searching each and every node recursively takes a great deal of time and space.

So Shannon proposed limiting how many moves ahead of the current position should be searched.

He used MinMax.
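To give a feel for why the full tree cannot be searched, the short sketch below counts positions at a given depth, assuming a uniform tree with the branching factor of about 35 quoted above. The uniform-tree assumption is a simplification; real positions vary.

```python
def leaf_count(branching, depth):
    """Number of positions exactly `depth` plies below the root of a uniform tree."""
    return branching ** depth


for depth in (2, 4, 6, 8):
    print(depth, leaf_count(35, depth))
```

Even a handful of plies already yields millions of positions, which is why a depth limit plus an evaluation function is needed.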


MinMax

Suppose that, at the root position, it is White's turn to move.

White wants a more positive score if possible, so the move with the largest score is selected as best. This is what the "Max" function does.

The "Min" function works in reverse. The "Min" function is called when it's Black's turn to move, and Black wants a more negative score, so the move with the most negative score is selected.

These functions are mutually recursive, meaning that they call each other until the desired search depth is reached. When the functions "bottom out", they return the result of the "Evaluate" function.


MinMax Algorithm
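A minimal sketch of the two mutually recursive functions, run on a toy tree rather than on chess positions. The tree encoding (nested lists with integer leaves scored from White's point of view) and the trivial `evaluate` are assumptions made for this example.

```python
def evaluate(node):
    """Toy static evaluation from White's point of view.

    Leaves carry their score; unfinished interior nodes score 0 (assumed).
    """
    return node if isinstance(node, int) else 0


def max_value(node, depth):
    """White to move: choose the child with the largest score."""
    if depth == 0 or isinstance(node, int):
        return evaluate(node)
    return max(min_value(child, depth - 1) for child in node)


def min_value(node, depth):
    """Black to move: choose the child with the most negative score."""
    if depth == 0 or isinstance(node, int):
        return evaluate(node)
    return min(max_value(child, depth - 1) for child in node)


# Root has three moves; each is answered by one of two replies.
TREE = [[3, 5], [2, 9], [0, 7]]
best = max_value(TREE, 2)
print(best)
```

White picks the first move: Black's best replies give 3, 2 and 0 respectively, and 3 is the largest of those.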


NegaMax

Nega-max is just min-max with a simplification: a single function plays the role of both "Max" and "Min".

The "Evaluate" function returns scores that are positive if the side to move at the current node is ahead, and everything else is also viewed from the perspective of the side to move.

When the value is returned, it is negated, because it is now being viewed from the perspective of the other side.

This function traverses the same nodes as "min-max" in the same order, and produces the same result.

It's much less code, which means that there is less opportunity to create a bug due to code replication, and the code is easier to maintain.
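The single-function form can be sketched as follows, again on a toy tree. The encoding is an assumption for illustration: nested lists, with integer leaves scored from the point of view of the side to move at that leaf.

```python
def evaluate(node):
    """Toy evaluation from the point of view of the side to move (assumed)."""
    return node if isinstance(node, int) else 0


def negamax(node, depth):
    """Best score for the side to move; child scores are negated on return."""
    if depth == 0 or isinstance(node, int):
        return evaluate(node)
    return max(-negamax(child, depth - 1) for child in node)


TREE = [[3, 5], [2, 9], [0, 7]]
best = negamax(TREE, 2)
print(best)
```

On this two-ply tree the side to move at the leaves is the same as at the root, so the result agrees with a min-max search of the same tree.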


Alpha-Beta

The AlphaBeta search procedure gets two additional arguments:

Alpha, the best score that can be forced by some means. Anything worth less than this is of no use, because there is a strategy that is known to result in a score of alpha. Anything less than or equal to alpha is no improvement.

Beta, worst-case scenario for the opponent. It's the worst thing that the opponent has to endure, because it's known that there is a way for the opponent to force a situation no worse than beta, from the opponent's point of view.


Alpha-Beta

Fail-low: If a move results in a score that is less than or equal to alpha, it was just a bad move and it can be forgotten about, since there is known to be a strategy that gets the moving side a position valued at alpha.

Fail-High: If a move results in a score that is greater than or equal to beta, this whole node is trash, since the opponent is not going to let the side to move achieve this position, because there is some choice the opponent can make that will avoid it.

If a move results in a score that is greater than alpha, but less than beta, this is the move that the side to move is going to plan to play, unless something changes later on. So alpha is increased to reflect this new value.


Alpha-Beta Algorithm
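A minimal sketch of alpha-beta in nega-max form, on a toy tree. The tree encoding, the `evaluate` function and the `stats` counter are assumptions added for illustration; the counter only exists to show how many nodes are visited.

```python
import math


def evaluate(node):
    """Toy evaluation from the point of view of the side to move (assumed)."""
    return node if isinstance(node, int) else 0


def alphabeta(node, depth, alpha, beta, stats):
    """Fail-hard alpha-beta: the result always lies within [alpha, beta]."""
    stats["nodes"] += 1
    if depth == 0 or isinstance(node, int):
        return evaluate(node)
    for child in node:
        score = -alphabeta(child, depth - 1, -beta, -alpha, stats)
        if score >= beta:
            return beta      # fail-high: the opponent will avoid this node
        if score > alpha:
            alpha = score    # new best move so far
    return alpha             # unchanged alpha means every move failed low


TREE = [[3, 5], [2, 9], [0, 7]]
stats = {"nodes": 0}
best = alphabeta(TREE, 2, -math.inf, math.inf, stats)
print(best, stats["nodes"])
```

The full tree has 13 nodes, but once the first move is known to be worth 3, a single refuting reply is enough to dismiss each of the other two moves, so the leaves 9 and 7 are never examined.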


Problem with fixed depth Searches:

If we only search n moves ahead, it may be possible to delay a catastrophe beyond the search depth by a sequence of moves that do not make any progress.

This also works in the other direction (good moves may not be found).


Horizon Effect

The problem with abruptly stopping a search at a fixed depth is called the 'horizon effect'

The negative horizon effect - MAX may try to avoid a bad situation which is actually inevitable.

The positive horizon effect - MAX may not realise that something good is going to be achievable.


Quiescence Search

This involves searching past the terminal search nodes (depth of 0) and testing all the non-quiescent or 'violent' moves until the situation becomes calm, and only then applying the evaluator.

Enables programs to detect long capture sequences and calculate whether or not they are worth initiating.

Expand searches to avoid evaluating a position where tactical disruption is in progress.


Quiescence Search

Which moves are likely to cause a drastic change in the balance of power on the board?

Material balance tends to be the overwhelming consideration in the evaluator, so anything that changes material is fair game: captures (especially those of major pieces) and pawn promotions certainly qualify, and checks may also be worth a look.

Quiescence search considers extremely narrow, but dangerous, lines.
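A minimal sketch of a capture-only quiescence search, on hand-built toy positions. The position encoding (a dictionary with a static `eval` from the side to move's point of view and a list of positions reachable by `captures`) is an assumption for illustration, and the "stand pat" rule used here is a common convention that the slides do not spell out.

```python
import math


def quiesce(pos, alpha, beta):
    """Search only captures until the position is calm, then trust the evaluator."""
    stand_pat = pos["eval"]          # score if the side to move captures nothing
    if stand_pat >= beta:
        return beta
    alpha = max(alpha, stand_pat)
    for child in pos.get("captures", []):
        score = -quiesce(child, -beta, -alpha)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha


# Queen takes pawn (+1 for us), but the queen is then recaptured (-8 for us).
bad_capture = {"eval": 0, "captures": [{"eval": -1, "captures": [{"eval": -8}]}]}
# Winning a pawn with no recapture available.
good_capture = {"eval": 0, "captures": [{"eval": -1}]}

print(quiesce(bad_capture, -math.inf, math.inf))
print(quiesce(good_capture, -math.inf, math.inf))
```

Stopping one ply after the first capture would rate both positions as a pawn up; following the capture sequence to its end shows that only the second capture is worth initiating.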


Quiescence Search explosion

If any capture is allowed, and captures are searched in any old order, you'll destroy the efficiency of the search and create a quiescent search explosion.

This will result in dramatically reduced depth and may cause a program crash.

A couple of ways of trying to avoid a quiescent explosion are:

MVV/LVA (Most Valuable Victim/Least Valuable Attacker): a move ordering technique that searches the best capture first.

SEE (Static Exchange Evaluation): improves move ordering and allows "bad" capturing moves to be pruned, without many important captures being pruned out erroneously.
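MVV/LVA ordering can be sketched as a sort key. The piece values below are the conventional textbook ones and are an assumption here, as is the representation of a capture as an (attacker, victim) pair of piece letters.

```python
# Conventional piece values (assumed); the king is given a large value so it
# is always the least preferred attacker.
PIECE_VALUE = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 100}


def mvv_lva_key(capture):
    """Sort key: most valuable victim first, then least valuable attacker."""
    attacker, victim = capture
    return (-PIECE_VALUE[victim], PIECE_VALUE[attacker])


captures = [("Q", "P"), ("P", "Q"), ("N", "R"), ("R", "Q")]
ordered = sorted(captures, key=mvv_lva_key)
print(ordered)
```

Pawn takes queen is tried first and queen takes pawn last, so the captures most likely to cause a cutoff are searched before the dubious ones.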


Forward Pruning


Forward Pruning: Null Move

Null-move forward pruning is a step performed prior to searching any of the moves.

You ask the question, "If I do nothing here, can the opponent do anything?"

To answer it, run a reduced-depth search with the opponent to move first.

If that search results in a score >= beta, simply return beta without searching any moves.
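A minimal sketch of null-move pruning inside a nega-max alpha-beta search. Everything about the game is an assumption made so the example can run: it is a toy in which each move adds 1 or 2 points to the mover, a position is the pair (points of side to move, points of opponent), and passing is never better than moving, so the pruning is safe by construction. The reduction `R = 2` is a commonly used value, not one taken from the slides.

```python
import math

R = 2  # depth reduction for the null-move search (assumed)


def evaluate(pos):
    mine, theirs = pos
    return mine - theirs


def children(pos):
    mine, theirs = pos
    # After a move the opponent is to move, so the pair is swapped.
    return [(theirs, mine + gain) for gain in (1, 2)]


def search(pos, depth, alpha, beta, use_null=True):
    if depth <= 0:
        return evaluate(pos)
    # Null move: hand the turn to the opponent and search to reduced depth.
    # It is skipped when beta is infinite, since no score can reach it.
    if use_null and depth > R and beta != math.inf:
        mine, theirs = pos
        passed = (theirs, mine)
        score = -search(passed, depth - 1 - R, -beta, -beta + 1, use_null)
        if score >= beta:
            return beta  # even doing nothing is good enough: prune
    for child in children(pos):
        score = -search(child, depth - 1, -beta, -alpha, use_null)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha


with_null = search((0, 0), 5, -math.inf, math.inf, use_null=True)
without_null = search((0, 0), 5, -math.inf, math.inf, use_null=False)
print(with_null, without_null)
```

In this toy both searches return the same value. In real chess the equivalence breaks down in positions where passing would actually be the best option, which is why engines restrict where the null move is tried.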


Forward Pruning: Null Move

Null-move forward pruning doesn't work in some cases:

in zugzwang

in endgames

Using the null move during search has several advantages related to speed and accuracy:

A null-move search may consume only 3% of the resources required by a full depth-N examination.

If, in a given position during quiescence search, it is revealed that the null move is better than any capture, this is a position where the evaluation function itself should be applied!

Overall, the null-move heuristic saves between 20% and 75% of the effort required by a given search.


Nega Scout


Order Evaluation

The order in which nodes are evaluated is crucial: good move ordering is essential for good search performance.


Heuristics

Capture moves first

Forward moves first

Remember the moves that produced the most cutoffs at each level of the search (Killer Heuristic)

Maintain a table of all possible moves, each with a history score (History Heuristic)
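The killer and history heuristics can be sketched as two small tables consulted when ordering moves. The details are assumptions for illustration: moves are plain strings, two killers are kept per ply, and the history bonus of depth squared is one common convention.

```python
from collections import defaultdict

killers = defaultdict(list)   # ply -> up to two moves that caused cutoffs there
history = defaultdict(int)    # move -> accumulated history score


def record_cutoff(move, ply, depth):
    """Call when `move` causes a beta cutoff at `ply` with `depth` remaining."""
    if move not in killers[ply]:
        killers[ply] = ([move] + killers[ply])[:2]
    history[move] += depth * depth


def order_moves(moves, ply):
    """Killer moves for this ply first, then moves by descending history score."""
    return sorted(moves, key=lambda m: (m not in killers[ply], -history[m]))


record_cutoff("Nf3", ply=2, depth=3)
record_cutoff("e4", ply=1, depth=4)
ordered = order_moves(["a3", "e4", "Nf3"], ply=2)
print(ordered)
```

"Nf3" comes first because it is a killer at ply 2, "e4" follows on the strength of its history score, and "a3", which has never caused a cutoff, is tried last.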


Minimal Window Search

If we have a good guess about the value of the position, we can further increase the efficiency of Alpha-Beta by starting with a narrower interval than [−∞, +∞].

Extreme case: Minimal Window, β = α + 1

Possible results:

FAIL HIGH: Value ≥ β = α + 1, so Value > α

FAIL LOW: Value ≤ α


NegaScout

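NegaScout assumes that the first move searched is the best one, and searches each remaining move with a minimal window.

If the minimal-window search of a move fails low (FAIL LOW), the move is no better than the current best and can be pruned.

If it fails high (FAIL HIGH), we need to re-search that subtree with a bigger window.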


NegaScout Example


Alpha Beta Nega-Max Formulation


NegaScout
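A minimal sketch of the idea, written in the common principal-variation-search form and run on a toy tree. The tree encoding and `evaluate` are assumptions for illustration, and the minimal window of width 1 relies on scores being integers.

```python
import math


def evaluate(node):
    """Toy evaluation from the point of view of the side to move (assumed)."""
    return node if isinstance(node, int) else 0


def negascout(node, depth, alpha, beta):
    if depth == 0 or isinstance(node, int):
        return evaluate(node)
    for index, child in enumerate(node):
        if index == 0:
            # The first move is assumed best: search it with the full window.
            score = -negascout(child, depth - 1, -beta, -alpha)
        else:
            # Minimal-window probe: is this move better than alpha at all?
            score = -negascout(child, depth - 1, -alpha - 1, -alpha)
            if alpha < score < beta:
                # Fail-high: the guess was wrong, re-search with a wider window.
                score = -negascout(child, depth - 1, -beta, -alpha)
        alpha = max(alpha, score)
        if alpha >= beta:
            break
    return alpha


well_ordered = [[3, 5], [2, 9], [0, 7]]    # first move really is best
badly_ordered = [[3, 5], [6, 9], [0, 7]]   # second move is better
print(negascout(well_ordered, 2, -math.inf, math.inf))
print(negascout(badly_ordered, 2, -math.inf, math.inf))
```

With good ordering every probe fails low and no re-search is needed; in the second tree the probe of the second move fails high and triggers one re-search, which is the price paid when the first-move assumption is wrong.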


Opening Game


Opening Book

Contains a set of positions, along with associated recommended moves.

The computer selects one of the recommended moves by way of some random mechanism, then plays it without further computation.
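The lookup described above can be sketched as a dictionary from positions to recommended moves. The book contents and the string keys are made up for illustration; a real book would key on an actual board representation.

```python
import random

# Hypothetical book: position key -> recommended moves (illustrative only).
OPENING_BOOK = {
    "start": ["e4", "d4", "c4", "Nf3"],
    "start e4": ["c5", "e5", "e6"],
}


def book_move(position_key, book=OPENING_BOOK, rng=random):
    """Return a randomly chosen book move, or None if the position is not in the book."""
    moves = book.get(position_key)
    return rng.choice(moves) if moves else None


print(book_move("start"))
print(book_move("some unknown position"))
```

A result of `None` is the signal to fall back to a normal search.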


Extended Book

It was derived automatically from a Grandmaster game database.

For each position arising in the first 30 moves, the system computes an evaluation for each move that has been played.

A move found in the extended book is searched with the alpha-beta window offset by the value of its bonus.


Opening Game

Deep Blue first checks whether a move is available from the opening book. If it finds a move, it plays it immediately.

Otherwise, it consults the extended book; if it finds the position there, it uses the evaluation information to award bonuses and penalties to a subset of the available moves. Deep Blue then carries out a search, with some preference for following successful Grandmaster moves.

Automatic extended-book: In some situations, where the bonus for a move is unusually large, Deep Blue can make a move without computation.


Deep Blue Hardware

System configuration

Was based on an IBM RS/6000 SP supercomputer, which could be viewed as a collection of IBM RS/6000 processors or workstations connected through a high-speed switching network.

Each processor in the system controlled up to 16 chess chips, distributed over two MicroChannel buses (a bus architecture, like ISA in the PC-AT).

The 1997 Deep Blue was a 30-way machine with 30 RS/6000 processors.

The Search

Occurs in parallel on two levels:

One over the IBM RS/6000 SP switching network.

Two over the MicroChannel bus inside a workstation node.

For a 12-ply search:

The master workstation node would search the first four plies in software.

All 30 nodes, including the master, then search these new positions (generated in the step above) for four more plies in software.

At this point, the chess chips jump in and finish the last four plies of the search, including quiescence search.

Chess Chips

The chess chip divides into four parts:

The move generator.

The smart-move stack.

The evaluation function.

The search control.

The chess chip's basic search algorithm

On entering a chess position after making a move, the chess machine processes two parallel paths: move generation and decision evaluation.

The move generation path:

It first checks the legality of the opponent's most recent move by checking whether we can capture the opponent's king. If so, the move was illegal and it returns immediately.

If the last move was legal, the move generation process is started.

If we cannot find a move (i.e. no legal moves exist or, in the case of quiescence search, no suitable forcing moves exist), we return to the parent position.

If we do have a move and the evaluation function says we cannot exit, we continue the search at the next level.

The evaluation path:

We first check whether it is a leaf position (usually by checking whether we have reached sufficient depth).

If not, we do not need to carry out evaluation, and merge with the move generation process.

The move generator

An 8x8 array of combinational logic.

Has a hard-wired finite state machine controlling move generation.

Can generate capturing, checking, check-evasion and attacking moves.

The basic move generation algorithm is like that of the Belle move generator.

The combinational logic array is effectively a silicon chess board. Each cell in the array has four major components, plus a piece register:

A Find-Victim Transmitter

A Find-Attacker Transmitter

A Receiver

A Distributed Arbiter

A four-bit piece register

The move generation algorithm

Move generation consists of 2 phases :

A Find Victim Phase

A Find Attacker Phase

Find Victim Cycle

The Find-Victim Transmitter radiates appropriate attacking signals for the resident piece.

If a square is vacant, incoming attack signals from a ray piece (bishop, rook or queen) pass through the cell.

The radiated attacking signals then reach the receiver, and a vote is taken to find the highest-value victim.

At the receiver, if some piece of the opposing colour is attacking the resident piece, the receiver asserts a priority signal based on the piece type.

Since we want to find victims, the priority rises for higher-value pieces, with the queen highest, then rook, bishop, knight, pawn and empty square in descending order.

The priority signals from all the squares go to the arbitration network to find the highest-valued victim.

Find Attacker Cycle

With the victim chosen, the find-attacker cycle executes.

The Find-Attacker Transmitter on the victim cell transmits reverse attacking signals as if it were a super-piece.

The receivers on all squares then detect whether an incoming reverse attacking signal matches the resident piece type.

If the resident is an appropriate attacker , the system asserts a priority signal.

Since we want to use the lowest valued attacker , the priority of pieces is reversed.

The priority signals then go through the arbitration network and with both the attacker and victim chosen , we have a move.

Generating Checking Moves

It activates all the find-victim transmitters as well as the opposing king's find-attacker transmitter.

When both sets of signals collide on the same square, we have a square from which we can issue a check.

When ray signals align properly on a square with a piece belonging to the moving side, the piece can give a discovered check.

Evaluation Function

Contains about 66,000 gates.

Has two parts:

Fast Evaluation: computes in a single cycle and contains all easily computed evaluation terms.

Slow Evaluation: scans the chess board one column at a time. It has a latency of 3 cycles per column and takes 11 cycles to compute.

Slow Evaluation

Single most complicated element on the chip, occupying close to half of the chip core.

It computes values for chess concepts such as:

Square control, pins, X-rays, king safety, pawn structure, passed pawns, ray control, rook on the 7th, blockade, trapped pieces, development, and so on.

This chess evaluation function is probably more complicated than anything ever described in the computer chess literature.

Conclusion

Using techniques that emerged out of years of research, it was possible for man to build a machine that could match the human brain in the game of chess.

Even grandmasters like Anand (current rank 1) rate these machines very highly.

Having said this, there are other games, such as Go, that have not yet been explored to the same extent, the reasons being a much larger board size (and thus greater combinatorial complexity) and a lack of abundant expert knowledge.

Feng-hsiung Hsu now claims that in another 10 years he can build a machine that will beat human Go players! A challenge has already been issued for the same.

http://www.spectrum.ieee.org/oct07/5552

References

Campbell, M. 1999. Knowledge discovery in deep blue. Commun. ACM 42, 11 (Nov. 1999), 65-67.

Feng-hsiung Hsu, "IBM's Deep Blue Chess Grandmaster Chips," IEEE Micro, vol. 19, no. 2, pp. 70-81, Mar/Apr, 1999

Boule, M.; Zilic, Z., "An FPGA based move generator for the game of chess," Custom Integrated Circuits Conference, 2002. Proceedings of the IEEE 2002 , vol., no., pp.71-74, 2002

Campbell, M., Hoane, A. J., and Hsu, F. 2002. Deep Blue. Artif. Intell. 134, 1-2 (Jan. 2002), 57-83.

www.seanet.com/~brucemo/~topis.html

A. Reinefeld, "An improvement of the SCOUT tree search algorithm," J. Int. Comput. Chess Assoc., vol. 6, no. 4, pp. 4-14, 1983.

Wikipedia

