Enhancements for Multi-Player Monte-Carlo Tree Search

J. (Pim) A.M. Nijssen, Mark H.M. Winands

29 September 2010

5 October 2010 Enhancements for Multi-Player Monte-Carlo Tree Search 2

Overview
• Introduction
• Progressive History
• MP-MCTS-Solver
• Test domains
• Experiments and Results
• Conclusions
• Future Research

Introduction
• Enhancements for Multi-Player Monte-Carlo Tree Search
  – More than 2 players
  – Techniques
    • maxn (Luckhardt and Irani, 1986)
    • Paranoid (Sturtevant and Korf, 2000)
  – Games
    • Chinese Checkers
    • Hearts

Introduction
• Enhancements for Multi-Player Monte-Carlo Tree Search
  – Best-first search technique
  – Monte Carlo simulations
  – Four phases
    • Selection (UCT)
    • Expansion (1 node per sample)
    • Playout (ε-greedy)
    • Backpropagation
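
The ε-greedy playout step can be sketched as below. This is a minimal illustration, not the authors' code: the function name, the `evaluate` callback (a move-rating heuristic) and the convention that a random move is taken with probability ε are assumptions, since the slides only name the policy.

```python
import random


def epsilon_greedy_move(moves, evaluate, epsilon=0.05, rng=random):
    """Pick a playout move: random with probability epsilon, else the best-rated one.

    `evaluate` is a hypothetical heuristic that maps a move to a number;
    higher means more promising.
    """
    if not moves:
        raise ValueError("no legal moves")
    if rng.random() < epsilon:
        # Exploration: ignore the heuristic and pick uniformly at random.
        return rng.choice(moves)
    # Exploitation: take the move the heuristic rates highest.
    return max(moves, key=evaluate)
```

With a small ε, most playout moves follow the heuristic, while the occasional random move keeps the simulations from being fully deterministic.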

Introduction
• Enhancements for Multi-Player Monte-Carlo Tree Search
  – Stores tuple of size N in nodes
  – Game returns tuple of size N
    • Winner gets a score of 1, losers get a score of 0
    • Score is split in case of multiple winners
      – e.g. [½, ½, 0] is returned if Players 1 and 2 both win
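
A small sketch of the result tuple and its backpropagation follows. The function names and the dictionary-based node layout are illustrative assumptions; players are indexed from 0 in the code, so Players 1 and 2 on the slide are indices 0 and 1.

```python
from fractions import Fraction


def result_tuple(winners, num_players):
    """Return one score per player; the winners share a total score of 1."""
    if not winners:
        raise ValueError("at least one winner is required")
    share = Fraction(1, len(winners))
    return [share if p in winners else Fraction(0) for p in range(num_players)]


def backpropagate(path, result):
    """Add the result tuple to the running totals of every node on the path.

    Each node is assumed to be a dict with a visit counter and a score
    tuple of size N (one entry per player).
    """
    for node in path:
        node["visits"] += 1
        node["scores"] = [s + r for s, r in zip(node["scores"], result)]
```

For three players, `result_tuple({0, 1}, 3)` yields the [½, ½, 0] tuple from the example above.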

Introduction
• Enhancements for Multi-Player Monte-Carlo Tree Search
  – Progressive History
  – Multi-Player Monte-Carlo Tree Search Solver

Progressive History
• Combination of Progressive Bias (Chaslot et al., 2008) and the history heuristic (Schaeffer, 1983)
• Move selection strategy uses action information
• More information available
• Information is less accurate
• Influence decreases over time

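Progressive History

$$
v_i = \frac{s_i}{n_i} + C \times \sqrt{\frac{\ln(n_p)}{n_i}} + \frac{s_a}{n_a} \times \frac{W}{n_i - s_i + 1}
$$

• UCT: $\frac{s_i}{n_i} + C \times \sqrt{\frac{\ln(n_p)}{n_i}}$
• History heuristic: $\frac{s_a}{n_a}$
• Progressive Bias: $\frac{W}{n_i - s_i + 1}$
• Divide by number of losses: $n_i - s_i$ in the denominator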
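
The selection value on this slide can be computed as in the sketch below. The reading of the symbols is an assumption, because the slide does not define them: `s_i` and `n_i` are taken to be the score and visit count of child i, `n_p` the visit count of its parent, and `s_a` and `n_a` the history score and play count of move a. The default `C = 0.2` is the setting reported in the experiments.

```python
import math


def progressive_history_value(s_i, n_i, n_p, s_a, n_a, W, C=0.2):
    """Return v_i = UCT term + history heuristic * progressive-bias weight.

    Assumes n_i > 0 and n_p >= 1. If move a has no history yet (n_a == 0),
    the history term is treated as 0; that fallback is an assumption.
    """
    uct = s_i / n_i + C * math.sqrt(math.log(n_p) / n_i)
    history = s_a / n_a if n_a > 0 else 0.0
    # n_i - s_i is the number of losses, so the bias fades slowly for
    # children that keep winning and quickly for children that lose.
    bias_weight = W / (n_i - s_i + 1)
    return uct + history * bias_weight
```

Setting `W = 0` switches the history term off and leaves plain UCT, which makes it easy to compare the two selection strategies in the same code path.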

MP-MCTS-Solver
• Multi-Player version of MCTS-Solver (Winands et al., 2008)
• Updating game-theoretical values
• Update rules
  – Standard (mate in one, one winner)
  – Paranoid
  – First winner
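
The slides name the update rules without spelling them out. The sketch below is one plausible reading of the standard rule only ("mate in one, one winner"), stated as an assumption: a node is proven for the player to move if any child is a proven win for that player, and it inherits a proven value only if every child is proven with the same value. The function name and the use of `None` for unproven children are illustrative.

```python
def standard_update(player_to_move, child_values, num_players):
    """Return the proven value tuple for a node, or None if it stays unproven.

    `child_values` holds one entry per child: a proven tuple such as
    [0, 1, 0], or None when the child has no game-theoretical value yet.
    """
    win = [1 if p == player_to_move else 0 for p in range(num_players)]
    # "Mate in one": a single winning child proves the node.
    if any(v == win for v in child_values):
        return win
    # "One winner": all children are proven and agree on the same value.
    if child_values and all(v is not None for v in child_values):
        first = child_values[0]
        if all(v == first for v in child_values):
            return first
    # Otherwise fall back to the ordinary simulation result.
    return None
```

When the children are proven but disagree on the winner, this reading leaves the node unproven; that is the case the paranoid and first-winner rules address.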

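MP-MCTS-Solver

[Figure: example search tree with root A, children B, C and D, and nodes E, F, G, H and I below them. Tree levels are labeled "Player 3" and "Player 1". Nodes are annotated with value tuples ([1,0,0], [0,1,0], […]); one node is marked [?], with the annotations "Paranoid: [0,1,0]" and "First winner: [1,0,0]".]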

Test domains
• Multi-player games
• Zero-sum
• Perfect information

• Focus
• Chinese Checkers

Focus
• Capturing pieces by creating stacks
• Goal
  – Total number of pieces captured
  – Number of pieces captured from each opponent

Focus
• Moving
  – Only stacks one owns
  – Orthogonally
  – Move as many squares as the number of pieces
  – Maximum stack size is 5
• Capture pieces by creating larger stacks

Chinese Checkers
• Goal: move pieces to other side of the board
• Move pieces to adjacent fields or jump over other pieces
  – Sequential jumps

Experiments and Results
• Processor: AMD64 2.4 GHz
• Programming language: Java 6
• MCTS settings: C = 0.2, ε = 0.05
• Time: 2.5 s per turn
• 3360 games per tournament
• All possible configurations
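
The tournament size fits a simple counting argument, sketched below under a stated assumption: "all possible configurations" is read as every way of seating two player types (for example, with and without an enhancement), excluding seatings where every seat holds the same type. The function name and the type labels "A" and "B" are illustrative.

```python
from itertools import product


def seat_configurations(num_players, player_types=("A", "B")):
    """List every seating of the player types that uses more than one type."""
    return [
        seats
        for seats in product(player_types, repeat=num_players)
        if len(set(seats)) > 1
    ]


for n in (2, 3, 4):
    configs = seat_configurations(n)
    # Number of seats, number of configurations, games per configuration.
    print(n, len(configs), 3360 // len(configs))
```

Under this reading there are 2, 6 and 14 configurations for 2, 3 and 4 players, and 3360 is divisible by all three, so each configuration can be played the same number of times (1680, 560 and 240 games respectively).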

Experiments and Results
• Progressive History in Focus

W       2 players   3 players   4 players
0       52.0%       51.2%       50.8%
0.5     59.0%       61.1%       57.5%
0.1     59.8%       63.0%       58.9%
0.25    61.3%       62.9%       59.4%
0.5     64.1%       65.5%       59.9%
1       66.0%       65.4%       58.2%
3       62.2%       65.2%       59.6%
5       57.9%       63.8%       59.6%
7.5     51.3%       60.6%       57.1%
10      47.4%       57.8%       56.9%

Experiments and Results
• Progressive History in Chinese Checkers

W       2 players   3 players   4 players
0.25    52.8%       59.0%       56.9%
0.5     58.2%       62.8%       58.3%
1       67.8%       63.5%       61.9%
3       79.9%       66.7%       66.4%
5       83.5%       65.8%       66.8%
10      83.2%       65.3%       69.6%
15      81.0%       65.0%       69.2%
20      60.8%       60.2%       63.2%

Experiments and Results
• Divide by number of losses

Game               2 players   3 players   4 players
Focus              64.8%       61.0%       52.0%
Chinese Checkers   57.6%       54.8%       53.9%

Experiments and Results
• MP-MCTS-Solver in Focus

Update rule    2 players   3 players   4 players
Standard       53.0%       54.9%       53.3%
Paranoid       51.9%       50.4%       44.9%
First winner   52.8%       51.5%       43.4%

Conclusions
• Progressive History
  – Significant enhancement in Chinese Checkers and Focus
  – Dividing by number of losses in Progressive Bias part increases performance
• MP-MCTS-Solver
  – Small but significant enhancement in Chinese Checkers
  – Standard update rule works best

Future Research
• Test Progressive History in other games
• Compare Progressive History with similar techniques, like RAVE, prior knowledge (Gelly and Silver, 2007), Gibbs Sampling (Björnsson and Finnsson, 2009), etc.
• Create new update rules for MP-MCTS-Solver

