Quantum Games
Seminar in Quantum Information Processing
Yaniv Carmeli 043356880
February 13, 2007
Introduction
Game theory is the study of decision making by competing agents in conflict situations.
Game theory was first formalized by von Neumann and Morgenstern in the 1940s, when
exploring economic problems. Besides economics, game theory is used in many disciplines,
such as diplomacy, social sciences, biology and engineering.
Recent advances in quantum computing and quantum information processing have led
several researchers in the area to explore whether and how quantum effects can shed new
light on game theory, namely, how quantum superposition, quantum interference
and entanglement can affect the agents’ optimal strategies. The main motivation behind
this attempt is the hope to understand the advantages of quantum computing better. In
spite of the fast-growing number of researchers in the field of quantum information, very
few quantum algorithms are discussed. They are largely limited to the Deutsch-Jozsa,
Simon, Shor and Grover algorithms, and we still lack an understanding of which problems
are more likely to have an efficient quantum solution. We also do not quite understand
the source of the power of quantum computing, and quantum games researchers hope
to get closer to the answers to these questions by investigating the effects
of applying quantum strategies to well-studied classical games, and by exploring the
advantages of a player utilizing quantum strategies over an opponent who can use classical
strategies alone.
It is interesting to note that every problem can be thought of as a game (although this
may be a bit artificial). For instance, quantum communication can be thought of as a game
in which the goal of the player is to maximize the effective communication. Quantum
cryptography can be thought of as a 3-player game, where the goal of Alice and Bob is to
transfer a message between them without Eve learning anything about the message, while
Eve’s goal is exactly the opposite. Quantum algorithms can be thought of as games where
the player wins if he manages to solve the problem (e.g. factor a number) successfully,
and loses if he fails. This observation further increases the motivation to explore the field
of quantum games.
Game Theory - Concepts and Terminology
In this section we will define some of the terms we will use when exploring quantum
games:
• Player - each of the competing agents in the game.
• Move - a possible course of action that the player may choose.
• Strategy - A full set of instructions to the agent, which tells him which move to take
at every possible game situation.
– A pure strategy - consists of always playing a given move.
– A mixed strategy - flip a coin to select between several possible moves. In
order to represent a mixed strategy we will use a vector (p1, . . . , pn) of the
probabilities of all the pure strategies (where ∀i : pi ≥ 0 and ∑ pi = 1).

        B1       B2
A1    (3,3)    (0,5)
A2    (5,0)    (1,1)

Figure 1: Example of a payoff matrix. Each cell lists the utilities for both players when
certain pure strategies are selected. For instance: if Alice chooses the strategy A1 and
Bob chooses the strategy B2 then the utility for Alice is 0 and the utility for Bob is 5.

• Utility - A numerical measure of the desirability of a certain outcome to a certain
player.
• Payoff matrix - Shows the utilities of all the players in the game, for each of the
possible game outcomes. The number of dimensions of the payoff matrix is the
number of players in the game, and the size of the matrix in each of the dimensions
is the number of pure strategies available to the corresponding player. An example
of a payoff matrix is displayed in Figure 1.
• Pareto Optimal Strategies (Outcome) - A combination of a strategy for each player,
where it is impossible to increase one player’s utility without decreasing the utility
of other players.
• Dominant Strategy - A strategy that does at least as well as any other strategy for
every move the opponent makes. In the game described by the payoff matrix of
Figure 1, A2 is a dominant strategy: Alice scores better than if she played A1 no
matter what strategy Bob picks.
• Zero-Sum Game - A game in which for each possible outcome of the game the sum
of all the utilities of the players is 0. It can be thought of as a game where money
only changes hands and no money can be introduced to the game or removed from
it.
• A Nash Equilibrium (NE) - A combination of strategies, in which no player can
improve his or her utility by unilaterally changing their strategy. For instance, in
the game described by the payoff matrix of Figure 1 the combination [A2, B2] is a
Nash Equilibrium in pure strategies, because if Alice changes her strategy to A1
her utility drops from 1 to 0, and if she goes with a mixed strategy of A1 with
probability 1 − p and A2 with probability p (where 0 ≤ p < 1), then her utility
drops from 1 to p. The same goes for Bob.
• Nash Theorem - States that every finite game (in particular, every zero-sum 2 player
game) has at least one NE in mixed strategies.
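The definitions of a dominant strategy and of the Nash equilibrium example can be checked mechanically on the payoff matrix of Figure 1. The following is a minimal Python sketch, not part of the original notes; the dictionary layout and the function names (`dominant_strategies`, `alice_utility_vs_b2`, `bob_utility_vs_a2`) are assumptions made for illustration.

```python
# Payoff matrix of Figure 1: PAYOFF[alice][bob] = (Alice's utility, Bob's utility).
PAYOFF = {
    "A1": {"B1": (3, 3), "B2": (0, 5)},
    "A2": {"B1": (5, 0), "B2": (1, 1)},
}
ALICE = list(PAYOFF)            # Alice's pure strategies
BOB = list(PAYOFF["A1"])        # Bob's pure strategies


def dominant_strategies():
    """Alice's strategies that do at least as well as every alternative,
    whatever Bob plays."""
    return [a for a in ALICE
            if all(PAYOFF[a][b][0] >= PAYOFF[other][b][0]
                   for other in ALICE for b in BOB)]


def alice_utility_vs_b2(p):
    """Alice's expected utility against B2 when she plays A2 with
    probability p and A1 with probability 1 - p."""
    return p * PAYOFF["A2"]["B2"][0] + (1 - p) * PAYOFF["A1"]["B2"][0]


def bob_utility_vs_a2(p):
    """Bob's expected utility against A2 when he plays B2 with
    probability p and B1 with probability 1 - p."""
    return p * PAYOFF["A2"]["B2"][1] + (1 - p) * PAYOFF["A2"]["B1"][1]


if __name__ == "__main__":
    print(dominant_strategies())
    print(alice_utility_vs_b2(0.25), bob_utility_vs_a2(0.25))
```

Since both utilities equal p, any deviation from [A2, B2] with p < 1 lowers the deviating player's utility, which is the argument made in the Nash Equilibrium item.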
PQ Coin Flip
Meyer [8] describes what he calls PQ games. PQ games are games between two players
where one of them is a quantum player applying quantum strategies (Q), while the other
is a classical player, applying probabilistic classical strategies only (P). The purpose of
exploring these games is to try to learn about the advantages of the quantum player
over the classical one, and maybe learn about the origins of the advantage of quantum
computing over classical computing. The names that are usually used in these PQ games
are Q for Q and Picard for P. The names are borrowed from the well-known TV series
Star Trek, in which Jean-Luc Picard, the human captain of the starship USS Enterprise,
often finds himself playing games against the omnipotent being Q, who tries to test him
and his inferior human abilities.
We will try to explore the game of Coin Flip because its concept is very simple, and we
will be able to use it to demonstrate the game theory concepts described in the previous
section. We will begin by describing the rules of the classical game: The game consists
of a coin which at first is lying head up in a closed box. First Q puts his hand inside the
box, and he can choose if he wishes to flip the coin, or leave it as is. Then it’s Picard’s
turn to do the same, and after he finishes Q gets another opportunity to flip or not to
flip. After they both carry out their choices they open the box. If the coin is head up Q
wins (Picard has to pay him $1), and if the coin is tail up Picard wins (and Q has to pay
him $1). The payoff matrix for this game is:
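        NN        NF        FN        FF
N    (-1,1)    (1,-1)    (1,-1)    (-1,1)
F    (1,-1)    (-1,1)    (-1,1)    (1,-1)

where F represents Flip and N represents Not Flip. The rows are Picard’s moves, the
columns are Q’s first and second moves, and each cell lists the payoffs as (Picard, Q).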
The first immediate observation we can make about the classical game is that the
result of the game depends on the parity of the number of flip operations the players have
decided to make during the game, and because of that the specific moves Q makes do not
matter, but only the exclusive-or of his moves.
The classical game has no NE in pure strategies (because for each combination of pure
strategies the losing player can unilaterally choose a different strategy and become the
winner). An example of a NE in mixed strategies is [(1/2, 1/2), (1/2, 1/2)] × [(1/2, 1/2)] (on the left
are the probabilities of Q’s choices for his first and second turns, and on the right the
probabilities of Picard’s choice; in this case, each of them flips with probability 1/2 on each turn).
The expected payoff for each of the players in this case is 0. This NE is not unique: for
instance, the strategies [N, (1/2, 1/2)] × [(1/2, 1/2)] also form a NE.
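The claims about the classical game (the outcome depends only on the parity of the flips, there is no NE in pure strategies, and uniform mixing gives an expected payoff of 0) can be checked by enumeration. This is a small Python sketch under the rules stated above; the function names are hypothetical and chosen only for this illustration.

```python
from itertools import product


def q_payoff(q1, picard, q2):
    """Q's payoff: +1 if the coin ends heads up, -1 otherwise.
    Each move is 'N' (no flip) or 'F' (flip); the coin starts heads up,
    so only the parity of the number of flips matters."""
    flips = [q1, picard, q2].count("F")
    return 1 if flips % 2 == 0 else -1


def has_pure_nash():
    """True if some pure-strategy pair leaves neither player a profitable
    unilateral deviation."""
    for q1, q2, p in product("NF", repeat=3):
        value = q_payoff(q1, p, q2)
        q_best = max(q_payoff(a, p, b) for a, b in product("NF", repeat=2))
        p_best = max(-q_payoff(q1, x, q2) for x in "NF")
        if value == q_best and -value == p_best:
            return True
    return False


def expected_q_payoff(f1, fp, f2):
    """Q's expected payoff when the three moves are flips with independent
    probabilities f1 (Q's first move), fp (Picard) and f2 (Q's second move)."""
    total = 0.0
    for q1, p, q2 in product("NF", repeat=3):
        weight = 1.0
        for move, f in ((q1, f1), (p, fp), (q2, f2)):
            weight *= f if move == "F" else 1 - f
        total += weight * q_payoff(q1, p, q2)
    return total


if __name__ == "__main__":
    print(has_pure_nash())
    print(expected_q_payoff(0.5, 0.5, 0.5))
    print(expected_q_payoff(0.0, 0.5, 0.5))   # Q plays N first, then mixes
```

Picard's payoff is the negative of Q's, so an expected payoff of 0 for Q is also 0 for Picard.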
After exploring this rather simple classical game, we will try to represent it as a
quantum game. The main guideline is that the classical game has to remain a sub-
game of the quantum game (i.e., by allowing only operations which represent classical
strategies we should be left with the classical game). In our quantum game the coin will
be represented by a qubit, and we will use the computational basis to represent the coin’s
position: heads will be represented by |0〉 and tails by |1〉. After all players carry out
their moves the qubit will be measured in the computational basis, and the winner will
be declared according to the result of the measurement.
Since we allow mixed strategies we will have to work with the density matrices of these
states, which are:
$$\rho_0 = |0\rangle\langle 0| = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad \rho_1 = |1\rangle\langle 1| = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$$
The analogues to the classical operations will then be:
$$F = \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad N = I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
And when applying F with probability p and N with probability (1 − p) the transformation
of the state ρ is:
$$\rho \mapsto pF\rho F^\dagger + (1-p)N\rho N^\dagger$$
For simplicity of notation we will denote this strategy as pF + (1 − p)N, but it is
not the operator itself (unless the state is represented by a vector).
Given a density matrix of a mixed state
$$\rho = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
the probability to measure $|0\rangle$ is a and the probability to measure $|1\rangle$ is d (the diagonal entries of ρ).
We will now examine what happens if Q is allowed to perform unitary strategies on
the qubit. Q’s first operation will then be:
$$U_1 = U(a, b) = \begin{pmatrix} a & b^* \\ b & -a^* \end{pmatrix}, \qquad aa^* + bb^* = 1$$
And the state of the qubit after the move will be:
$$\rho_1 = U_1\rho_0U_1^\dagger = \begin{pmatrix} aa^* & ab^* \\ ba^* & bb^* \end{pmatrix}$$
After Picard’s mixed strategy the state of the qubit will be:
$$\rho_2 = pF\rho_1F^\dagger + (1-p)N\rho_1N^\dagger = \begin{pmatrix} p\,bb^* + (1-p)\,aa^* & p\,ba^* + (1-p)\,ab^* \\ p\,ab^* + (1-p)\,ba^* & p\,aa^* + (1-p)\,bb^* \end{pmatrix}$$
Let us consider, for the sake of the discussion, a game that would end here. Picard’s
expected payoff is then the probability of tails minus the probability of heads, which
is $(2p-1)(aa^* - bb^*)$. If Q were to employ a strategy for which $aa^* \neq bb^*$, Picard
can get a payoff of $|aa^* - bb^*|$ by selecting p = 1 (if $aa^* > bb^*$) or p = 0 (if $aa^* < bb^*$).
If Picard were to choose any mixed strategy where p ≠ 1/2 (or a pure strategy), then Q
can get an expected payoff of |2p − 1| by selecting b = 1 (if p > 1/2) or a = 1 (if p < 1/2). Thus,
it is obvious that for any pair of strategies where $aa^* \neq bb^*$ or p ≠ 1/2 one of the players can
improve his payoff, and it is not a NE. The pairs ([1/2 F + 1/2 N], [U(a, b)]) with $aa^* = bb^* = 1/2$ are
a NE, because no player can obtain a better payoff by unilaterally changing his strategy.
These pairs actually represent the same results as the classical game. A quantum player
has no advantage if he has only one move, and there is no entanglement involved (his
move will have the same outcome as a mixed classical move with adequate probabilities).
Now we move on to the full 3-round game. Instead of exploring the exact cumbersome
mathematical expression for the state of the qubit after a second quantum move we will
show that Q does have an advantage, and he can win with probability 1 regardless of
the strategy Picard takes (of course, there can be no better outcome for him). If Q
chooses the Hadamard operator for both of his moves, $U_1 = U_3 = U(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}) = H$,
then the state of the qubit after Q’s first move will be
$$\rho_1 = U_1\rho_0U_1^\dagger = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},$$
which is the density matrix of an equal superposition of $|0\rangle$ and $|1\rangle$.
Picard’s move (for every general classical strategy he may choose) does not change the
state of the qubit:
$$\rho_2 = pF\rho_1F^\dagger + (1-p)N\rho_1N^\dagger = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$$
After Q applies his second move the state is:
$$\rho_3 = U_3\rho_2U_3^\dagger = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},$$
which is the density matrix of $|0\rangle$. When the measurement is applied the coin will be head up
with probability 1, and Q will win every time - an apparent advantage to the quantum
player. The formal representation of the equilibria:
$$\left([pF + (1-p)N],\ \left[U(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}),\ U(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}})\right]\right)$$
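The three-round game can be traced numerically with 2×2 density matrices. The sketch below is a plain-Python illustration under the conventions of this section (heads is |0〉, F = σx, N = I, H = U(1/√2, 1/√2)); the helper names `conj_by`, `picard` and `prob_heads` are assumptions made for this example.

```python
import math


def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(A):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]


def conj_by(U, rho):
    """Return U rho U^dagger."""
    return matmul(matmul(U, rho), dagger(U))


N = [[1, 0], [0, 1]]            # do not flip (identity)
F = [[0, 1], [1, 0]]            # flip (sigma_x)
s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]           # Hadamard = U(1/sqrt(2), 1/sqrt(2))
RHO0 = [[1, 0], [0, 0]]         # the coin starts heads up: |0><0|


def picard(rho, p):
    """Picard flips with probability p: p F rho F^dagger + (1 - p) N rho N^dagger."""
    f, n = conj_by(F, rho), conj_by(N, rho)
    return [[p * f[i][j] + (1 - p) * n[i][j] for j in range(2)] for i in range(2)]


def prob_heads(p, q_first=H, q_second=H):
    """Probability that the final measurement yields |0> (Q wins)."""
    rho = conj_by(q_first, RHO0)
    rho = picard(rho, p)
    rho = conj_by(q_second, rho)
    return rho[0][0].real


if __name__ == "__main__":
    for p in (0.0, 0.3, 0.5, 1.0):
        print(p, prob_heads(p))
```

Passing `N` or `F` for `q_first` and `q_second` restricts Q to classical moves, in which case Picard can again force tails (for example `prob_heads(1.0, N, N)` is 0).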
We may also want to consider the coin flip game where both the players are quantum
players. It turns out that there are no equilibria in pure strategies in this case. For every
pair of strategies ([U2], [U1, U3]):
• If $U_3U_2U_1|0\rangle \neq |0\rangle$ (meaning that Q does not win every time) Q can improve his
expected payoff by selecting $U_3 = U_1^{-1}U_2^{-1}$, and then he would win every time.
• If $U_3U_2U_1|0\rangle \neq |1\rangle$ (P does not win every time) P can improve his expected payoff
by selecting $U_2 = U_3^{-1}FU_1^{-1}$, which will cause him to win every time.
A NE in mixed quantum strategies, however, always exists (Theorem 3 in [9]). The proof
of this last theorem is based on the fact that the set of quantum operators forms a convex
subset of a finite-dimensional vector space, and the proof is a generalization of the proof
of the Nash Theorem of classical game theory.
But is the penny flip game a good example of the advantage of the quantum player
over the classical one? van Enk [10] claims it is not. The first reason he gives is that it is
not the same game anymore. By allowing Q to make his choice of strategy from a wider
variety of strategies, we have actually given him an advantage, which is not quantum.
It is no wonder that we have given Q more options and now he wins. van Enk’s second
reason for objecting to the PQ flip game as an example for quantum advantage is that
the game can be represented as a classical one.
The classical flipping of the coin can be thought of as rotation of the coin about one
predefined horizontal axis. Applying other quantum operators means the freedom to
choose another axis for the rotation, namely bringing the coin to stand on its edge while its
plane is normal to the "classical" rotation axis. This way "classical" rotations of the coin
will not change the position of the head side or the tail side with respect to this axis.
When the coin is then rotated back to its original alignment on the table the head
side will be up (though the orientation of the coin may change: if we have started with
a coin where the number is facing us, we may end up with the number facing another
direction). This happens because the coin has more degrees of freedom than are relevant to
this game.
Meyer replied to van Enk’s claims [7], arguing that classical implementations of the game are
not scalable, while quantum implementations are. If we consider several entangled qubits
we will see the advantage in a game that has no obvious classical analogue.
Non-Zero-Sum 2×2 Quantum Games
In this section we consider 2×2 games. 2×2 games are games with 2 players, where
each has the choice between two strategies only (they are called 2×2 games because the
size of the payoff matrix is 2×2). Each player has to choose his move before knowing his
opponent’s move, and without communicating with him. The payoff matrix is known to
both players and their goal is to maximize their expected payoff. Under the assumption
that the opponent is also rational, each player can estimate what considerations
his opponent is facing, and what his chosen move may be.
Examples of Non-Zero-Sum 2×2 Games
We will present several examples of 2×2 games: The Prisoners’ Dilemma (PD) and
Chicken will be presented in detail (as they are used later to demonstrate the quantum
versions of the games), while Stag Hunt and Battle of The Sexes will be briefly
mentioned (they are also presented by [5]). Other 2×2 games can be easily found in
internet resources.
• The Prisoners’ Dilemma
In the Prisoners’ Dilemma two players have to independently decide whether they
want to defect (D) or cooperate (C). The payoff is determined according to the following
payoff matrix.

              Bob : C    Bob : D
Alice : C     (3, 3)     (0, 5)
Alice : D     (5, 0)     (1, 1)

The dilemma is that D is a dominant strategy (it does better than C no matter
how the opponent plays) so rational reasoning causes the player to defect. [D,D] is
a Nash equilibrium in dominant strategies and the payoff is (1,1). However, [C,C]
is Pareto optimal with payoffs (3,3), and it is clear that the players
would be better off choosing [C,C]. By choosing C, the players risk a payoff of zero
(if the opponent chooses D) so rational reasoning makes them choose D and ensure
a payoff of at least 1.
The most common version of this game talks about 2 suspects of a major crime
held in separate cells. Since there is not enough evidence to convict them they
are informed of the following policy: If one of them decides to confess (that is,
defect (D) and betray his partner) while the other does not, then the confessor will
be released while his partner will be sentenced to 9 years in jail. However, if both
decide to betray each other (defect (D)), then the DA will show some leniency and
they will both get 6 years in jail, and if both decide to deny (cooperate (C) with
each other) they will only be convicted of a minor crime and be sentenced to 1 year
in jail.
It is important to note that an iterated version of the PD (where the game is played
several times) involves some different considerations. More information can be found in
[12]. [12] also presents several very interesting examples of situations where PD is
observed in real life (situations where mutual cooperation will benefit both players,
yet they still choose not to cooperate and thus get a worse outcome). Some of these
examples were presented in class.
• Chicken
In Chicken each player gets the best payoff if the other player backs down (Cooperate
- C) while they keep their stand (Defect - D). However, if they choose the same
strategy it is better for them if they both cooperate than if they both defect. A
typical payoff matrix for Chicken is:

              Bob : C    Bob : D
Alice : C     (3, 3)     (1, 4)
Alice : D     (4, 1)     (0, 0)

This problem arises in many military/diplomatic conflicts and negotiations, where
it is best for each of the players that the other one makes all the concessions. The
next best option is if both of them make some concessions but the deal still goes
through, and the worst outcome is when both stick to their original stand and the
deal is cancelled.
The game takes its name from the well-known street contest where two drivers are
driving towards each other along an empty road. They can each choose to swerve
(C) or not to swerve (D). If one swerves and the other does not, he is proclaimed a
’chicken’ and receives a poor payoff, while the other is proclaimed ’hero’ and receives
the best payoff. If both swerve they are neither heroes nor chickens, but at least they
are still alive. Unfortunately, this cannot be said in case they both decide not to
swerve.
The Nash equilibria in pure strategies in this game are [C,D] and [D,C], and they are
both Pareto optimal.
The problem the players are facing is: Which of the NE will be chosen? Who will
get the payoff of 4 and who will settle for the payoff of 1?
• Stag Hunt
Stag Hunt deals with two hunters who go out hunting. They can each choose if
they want to hunt a stag (S) or a hare (H). They have to make their decision before
knowing what the other hunter has decided. In order to catch a stag each hunter
needs the cooperation of his friend, while hares can be caught by a single hunter (but
are worth less than a stag).
A typical payoff matrix for Stag Hunt is:

              Bob : S    Bob : H
Alice : S     (4, 4)     (1, 3)
Alice : H     (3, 1)     (3, 3)
• Battle of The Sexes
Kelly and Chris were planning to go on a date, but forgot where they were supposed
to go. They each can choose if they want to go to the opera (O) - which is Kelly’s
favorite pastime, but a nightmare for poor Chris, or to the football field (F) - where
Chris’s team is playing, but is of no interest for Kelly. They have to make their
decisions without communicating with each other, and they both prefer being with
each other over doing their favorite thing alone.
A typical payoff matrix for Battle of The Sexes is:

              Chris : O    Chris : F
Kelly : O     (3, 2)       (0, 0)
Kelly : F     (0, 0)       (2, 3)

There are 2 Nash Equilibria in pure strategies, which are both Pareto optimal.
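The equilibria quoted for the four example games can be recomputed directly from their payoff matrices. The following Python sketch is an illustration only: the `GAMES` dictionary layout and the helper names are assumptions, and Pareto optimality is implemented as "no other outcome is at least as good for both players and different in payoff", which follows the definition given in the terminology section.

```python
# Each game maps (row move, column move) -> (row player's payoff, column player's payoff).
GAMES = {
    "PD": {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)},
    "Chicken": {("C", "C"): (3, 3), ("C", "D"): (1, 4),
                ("D", "C"): (4, 1), ("D", "D"): (0, 0)},
    "Stag Hunt": {("S", "S"): (4, 4), ("S", "H"): (1, 3),
                  ("H", "S"): (3, 1), ("H", "H"): (3, 3)},
    "Battle of the Sexes": {("O", "O"): (3, 2), ("O", "F"): (0, 0),
                            ("F", "O"): (0, 0), ("F", "F"): (2, 3)},
}


def pure_nash(game):
    """Cells from which neither player gains by deviating alone."""
    rows = {r for r, _ in game}
    cols = {c for _, c in game}
    return [(r, c) for (r, c), (ur, uc) in game.items()
            if all(game[(o, c)][0] <= ur for o in rows)
            and all(game[(r, o)][1] <= uc for o in cols)]


def pareto_optimal(game):
    """Cells whose payoff pair is not weakly improved upon by another payoff pair."""
    def dominated(u):
        return any(v[0] >= u[0] and v[1] >= u[1] and v != u
                   for v in game.values())
    return [cell for cell, u in game.items() if not dominated(u)]


if __name__ == "__main__":
    for name, game in GAMES.items():
        print(name, pure_nash(game), pareto_optimal(game))
```

For the Prisoners' Dilemma the sketch reports [D,D] as the only pure NE and shows that it is not Pareto optimal, which is the dilemma described above.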
2×2 Games - The Quantum Version
In order to consider a game where one player or both the players are quantum players,
we first have to move the game to the quantum realm. While doing so we must ensure
that the classical game is faithfully entailed in the quantum game.
In each practical implementation of a 2×2 game, the players have to announce their
chosen move in some manner, such that they don’t know the opponent’s move before
they announce their own. Plausible implementations can be writing the decision on notes
and revealing them only after both have finished writing, or using an arbiter who will
hear the choices of the players separately, and only after listening to both of them will
announce their choices (or payoffs). In the quantum version the choices will be conveyed
using a qubit. Each of the players will receive a qubit in a specific initial state, and
by manipulating the state of his qubit he will announce his choice. After both players
finish their manipulation the qubits are measured in a predefined basis, and the payoffs
are determined according to the result of the measurement and the payoff matrix. For
example, if we measure the state $|\psi_f\rangle$ in the computational basis, and the payoff matrix
is P, the expected payoffs are:
$$\langle \$_A \rangle = P^A_{CC}\,|\langle\psi_f|CC\rangle|^2 + P^A_{CD}\,|\langle\psi_f|CD\rangle|^2 + P^A_{DC}\,|\langle\psi_f|DC\rangle|^2 + P^A_{DD}\,|\langle\psi_f|DD\rangle|^2$$
$$\langle \$_B \rangle = P^B_{CC}\,|\langle\psi_f|CC\rangle|^2 + P^B_{CD}\,|\langle\psi_f|CD\rangle|^2 + P^B_{DC}\,|\langle\psi_f|DC\rangle|^2 + P^B_{DD}\,|\langle\psi_f|DD\rangle|^2$$
The Hilbert space of the two-qubit system is $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$, and the quantum strategies
are quantum operations acting on $\mathcal{H}_A$ or $\mathcal{H}_B$ respectively. If the initial state is ρ,
Alice chooses the unitary operation $U_A$ and Bob chooses $U_B$, the final state will be:
$$\sigma = (U_A \otimes U_B)\,\rho\,(U_A \otimes U_B)^\dagger$$
An important observation made by [4] is that using an initial state which is separable will
not give any advantage over the classical game. If the initial state is
$$|\psi_0\rangle = |\psi_A\rangle \otimes |\psi_B\rangle$$
then after applying the local quantum operators A and B of Alice and Bob the state will be:
$$|\psi_f\rangle = (A \otimes B)(|\psi_A\rangle \otimes |\psi_B\rangle) = A|\psi_A\rangle \otimes B|\psi_B\rangle$$
Since the state remains separable (local operations cannot transform a separable state into
an entangled one) the outcome of the measurement will have the same probabilities as in
applying a classical (mixed) strategy.
Quantum Prisoners
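Eisert and Wilkens [2] consider the Prisoners’ Dilemma with quantum players. For this
section the initial state of the system is the maximally entangled state:
$$|\psi_0\rangle = \frac{1}{\sqrt{2}}\left(|00\rangle + i|11\rangle\right)$$
The basis for the measurement of the qubits is:
$$|\psi_{CC}\rangle = \frac{1}{\sqrt{2}}\left(|00\rangle + i|11\rangle\right)$$
$$|\psi_{CD}\rangle = \frac{1}{\sqrt{2}}\left(|01\rangle - i|10\rangle\right)$$
$$|\psi_{DC}\rangle = \frac{1}{\sqrt{2}}\left(|10\rangle - i|01\rangle\right)$$
$$|\psi_{DD}\rangle = \frac{1}{\sqrt{2}}\left(|11\rangle + i|00\rangle\right)$$
And thus the expected payoff for the final state σ is:
$$\langle \$_A \rangle = P^A_{CC}\,\mathrm{tr}[\pi_{CC}\sigma] + P^A_{CD}\,\mathrm{tr}[\pi_{CD}\sigma] + P^A_{DC}\,\mathrm{tr}[\pi_{DC}\sigma] + P^A_{DD}\,\mathrm{tr}[\pi_{DD}\sigma]$$
$$\langle \$_B \rangle = P^B_{CC}\,\mathrm{tr}[\pi_{CC}\sigma] + P^B_{CD}\,\mathrm{tr}[\pi_{CD}\sigma] + P^B_{DC}\,\mathrm{tr}[\pi_{DC}\sigma] + P^B_{DD}\,\mathrm{tr}[\pi_{DD}\sigma]$$
where:
$$\pi_{CC} = |\psi_{CC}\rangle\langle\psi_{CC}|, \quad \pi_{CD} = |\psi_{CD}\rangle\langle\psi_{CD}|, \quad \pi_{DC} = |\psi_{DC}\rangle\langle\psi_{DC}|, \quad \pi_{DD} = |\psi_{DD}\rangle\langle\psi_{DD}|$$
The choice of this basis is indeed such that the classical game is entailed in the quantum
game. Particularly, the classical strategies C and D correspond to the following
operations:
$$C \sim \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad D \sim \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \tag{1}$$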
We will examine what happens to the equilibria of the game when instead of operating just
C and D, we allow the players to choose from a broader variety of quantum operations.
1. One Parameter Set of Strategies
We first consider a case where Alice and Bob can choose their strategy from S(CL),
the set of local rotations with one parameter:
$$U(\theta) = \begin{pmatrix} \cos\frac{\theta}{2} & \sin\frac{\theta}{2} \\ -\sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}, \qquad \theta \in [0, \pi]$$
In this case C ∼ U(0), D ∼ U(π). We will denote the strategies of Alice and Bob
by $A = U(\theta_A)$ and $B = U(\theta_B)$ respectively. The expected payoff in this case is:
$$P_A(\theta_A, \theta_B) = 3\left|\cos\frac{\theta_A}{2}\cos\frac{\theta_B}{2}\right|^2 + 5\left|\cos\frac{\theta_B}{2}\sin\frac{\theta_A}{2}\right|^2 + \left|\sin\frac{\theta_A}{2}\sin\frac{\theta_B}{2}\right|^2$$
$$P_B(\theta_A, \theta_B) = 3\left|\cos\frac{\theta_A}{2}\cos\frac{\theta_B}{2}\right|^2 + 5\left|\sin\frac{\theta_B}{2}\cos\frac{\theta_A}{2}\right|^2 + \left|\sin\frac{\theta_A}{2}\sin\frac{\theta_B}{2}\right|^2$$
From the equations we can see that we get the same results as in the
classical game where cooperation is chosen with probability $p = \cos^2\frac{\theta_A}{2}$ by Alice
and $p = \cos^2\frac{\theta_B}{2}$ by Bob. Therefore the NE is still [D,D].
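The equivalence with the classical mixed game can be checked by simulating the quantum protocol directly. The sketch below is plain Python under the conventions of this section (initial state, measurement basis and PD payoffs as given above); storing a two-qubit state as a 2×2 table of amplitudes and the names `outcome_probs`, `payoffs` and `classical_payoffs` are choices made for this illustration.

```python
import math

S = 1 / math.sqrt(2)
# A two-qubit state is stored as a 2x2 table W, where W[k][l] multiplies |kl>.
PSI0 = [[S, 0], [0, 1j * S]]                 # (|00> + i|11>)/sqrt(2)
BASIS = {
    "CC": [[S, 0], [0, 1j * S]],             # (|00> + i|11>)/sqrt(2)
    "CD": [[0, S], [-1j * S, 0]],            # (|01> - i|10>)/sqrt(2)
    "DC": [[0, -1j * S], [S, 0]],            # (|10> - i|01>)/sqrt(2)
    "DD": [[1j * S, 0], [0, S]],             # (|11> + i|00>)/sqrt(2)
}
PAYOFF = {"CC": (3, 3), "CD": (0, 5), "DC": (5, 0), "DD": (1, 1)}


def U(theta):
    """One-parameter rotation from S(CL)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, s], [-s, c]]


def outcome_probs(A, B):
    """Outcome probabilities after Alice applies A and Bob applies B to PSI0."""
    # (A tensor B)|psi0> has the amplitude table A * PSI0 * B^T.
    W = [[sum(A[k][i] * PSI0[i][j] * B[l][j] for i in range(2) for j in range(2))
          for l in range(2)] for k in range(2)]
    return {name: abs(sum(complex(V[i][j]).conjugate() * W[i][j]
                          for i in range(2) for j in range(2))) ** 2
            for name, V in BASIS.items()}


def payoffs(A, B):
    """Expected payoffs (Alice, Bob) of the quantum game."""
    probs = outcome_probs(A, B)
    return (sum(PAYOFF[k][0] * p for k, p in probs.items()),
            sum(PAYOFF[k][1] * p for k, p in probs.items()))


def classical_payoffs(p_a, p_b):
    """Expected payoffs when Alice and Bob cooperate with probabilities p_a, p_b."""
    w = {"CC": p_a * p_b, "CD": p_a * (1 - p_b),
         "DC": (1 - p_a) * p_b, "DD": (1 - p_a) * (1 - p_b)}
    return (sum(PAYOFF[k][0] * w[k] for k in w),
            sum(PAYOFF[k][1] * w[k] for k in w))


if __name__ == "__main__":
    ta, tb = 0.7, 2.1
    print(payoffs(U(ta), U(tb)))
    print(classical_payoffs(math.cos(ta / 2) ** 2, math.cos(tb / 2) ** 2))
```

The two printed pairs agree up to rounding, so restricting the players to S(CL) reproduces the classical game.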
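2. Two Parameter Set of Strategies
We now allow Alice and Bob to choose their strategy from S(TP), the set of
two-parameter operations:
$$U(\theta, \phi) = \begin{pmatrix} e^{i\phi}\cos\frac{\theta}{2} & \sin\frac{\theta}{2} \\ -\sin\frac{\theta}{2} & e^{-i\phi}\cos\frac{\theta}{2} \end{pmatrix}, \qquad \theta \in [0, \pi],\ \phi \in [0, \tfrac{\pi}{2}]$$
The classical PD game is still entailed in this game because now C ∼ U(0, 0),
D ∼ U(π, 0). We will denote the strategies of Alice and Bob by their chosen values
of θ and φ: $A = U(\theta_A, \phi_A)$, $B = U(\theta_B, \phi_B)$.
The expected payoff of Alice is:
$$P_A(\theta_A, \phi_A, \theta_B, \phi_B) = 3\left|\cos(\phi_A + \phi_B)\cos\frac{\theta_A}{2}\cos\frac{\theta_B}{2}\right|^2 + 5\left|\sin\phi_A\cos\frac{\theta_A}{2}\sin\frac{\theta_B}{2} - \cos\phi_B\cos\frac{\theta_B}{2}\sin\frac{\theta_A}{2}\right|^2 + \left|\sin(\phi_A + \phi_B)\cos\frac{\theta_A}{2}\cos\frac{\theta_B}{2} + \sin\frac{\theta_A}{2}\sin\frac{\theta_B}{2}\right|^2$$
Now [D,D] is no longer an equilibrium, as the players can benefit from deviating
and choosing
$$Q \sim U(0, \tfrac{\pi}{2}) = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$$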
Moreover, [Q,Q] is a new equilibrium, and $P_A(Q,Q) = P_B(Q,Q) = 3$. In order to
show that [Q,Q] is an equilibrium, we will show that when Bob plays Q Alice can’t
get a payoff greater than 3. Indeed, for every strategy U(θ, φ) (θ ∈ [0, π], φ ∈ [0, π/2])
of Alice:
$$P_A(\theta, \phi, 0, \tfrac{\pi}{2}) = 3\left|\cos(\phi + \tfrac{\pi}{2})\cos\frac{\theta}{2}\cos\frac{0}{2}\right|^2 + 5\left|\sin\phi\cos\frac{\theta}{2}\sin\frac{0}{2} - \cos\tfrac{\pi}{2}\cos\frac{0}{2}\sin\frac{\theta}{2}\right|^2 + \left|\sin(\phi + \tfrac{\pi}{2})\cos\frac{\theta}{2}\cos\frac{0}{2} + \sin\frac{\theta}{2}\sin\frac{0}{2}\right|^2 = \cos^2\frac{\theta}{2}\left(1 + 2\sin^2\phi\right) \le 3$$

Figure 2: The expected payoff of Alice ($A) for the different strategies UA and UB of the
game (each strategy axis marked at D, C and Q) when allowing operations from S(TP)

Figure 2 shows a plot of the payoff of Alice and it can also be seen there that [Q,Q]
is an equilibrium under these conditions (Alice can’t get a better payoff by deviating
from it, and the same applies to Bob).
Since $P_A(Q,Q) = P_B(Q,Q) = 3$, [Q,Q] is also a Pareto optimal solution (there is no
other possible solution which will not decrease the payoff of one of the players).
[Q,Q] is also a unique equilibrium, as shown in [3].
It seems as though we have solved the dilemma. Under S(TP) the players will choose
the strategies which will result in optimal payoffs. Unfortunately, there is no a-priori
reason to restrict the players to operations from S(TP). As Benjamin and Hayden [1]
point out, the set of two-parameter operations is not closed under composition (in
the sense that if Alice chooses Q, which is the Pauli matrix σz up to a phase, then Bob
needs to apply σx in order to counter it. But σx is not included in the allowed set of
operations, even though it is equal, up to a phase, to 2 consecutive allowed operations:
U(0, π/2)U(π, 0)).
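The statements about Q can be reproduced numerically: [Q,Q] pays 3 to each player, Q beats D, no reply to Q from S(TP) pays Alice more than 3, and the product U(0, π/2)U(π, 0) is proportional to σx. The Python sketch below repeats the small simulator so that it stands alone; the grid search in `best_reply_to_Q` and all helper names are assumptions of this illustration, and a grid search only samples S(TP) rather than proving the bound.

```python
import cmath
import math

S = 1 / math.sqrt(2)
PSI0 = [[S, 0], [0, 1j * S]]                 # (|00> + i|11>)/sqrt(2), W[k][l] ~ |kl>
BASIS = {
    "CC": [[S, 0], [0, 1j * S]],
    "CD": [[0, S], [-1j * S, 0]],
    "DC": [[0, -1j * S], [S, 0]],
    "DD": [[1j * S, 0], [0, S]],
}
PAYOFF = {"CC": (3, 3), "CD": (0, 5), "DC": (5, 0), "DD": (1, 1)}


def U(theta, phi):
    """Two-parameter operation from S(TP)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[cmath.exp(1j * phi) * c, s], [-s, cmath.exp(-1j * phi) * c]]


def payoffs(A, B):
    """Expected payoffs (Alice, Bob) when Alice applies A and Bob applies B."""
    W = [[sum(A[k][i] * PSI0[i][j] * B[l][j] for i in range(2) for j in range(2))
          for l in range(2)] for k in range(2)]
    pa = pb = 0.0
    for name, V in BASIS.items():
        prob = abs(sum(complex(V[i][j]).conjugate() * W[i][j]
                       for i in range(2) for j in range(2))) ** 2
        pa += PAYOFF[name][0] * prob
        pb += PAYOFF[name][1] * prob
    return pa, pb


def compose(X, Y):
    """Matrix product X * Y (apply Y first, then X)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


C, D, Q = U(0, 0), U(math.pi, 0), U(0, math.pi / 2)


def best_reply_to_Q(steps=60):
    """Largest payoff Alice finds on a grid over S(TP) when Bob plays Q."""
    return max(payoffs(U(math.pi * i / steps, math.pi / 2 * j / steps), Q)[0]
               for i in range(steps + 1) for j in range(steps + 1))


if __name__ == "__main__":
    print(payoffs(Q, Q), payoffs(Q, D), payoffs(D, D))
    print(best_reply_to_Q())
    print(compose(Q, D))     # approximately i * sigma_x
```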
3. General Unitary Operations
It was shown by [1] and [2] that in this case there is no equilibrium in pure quantum
strategies. The reason is that under S(GU), the set of general local unitary
operations, for any strategy Bob may choose, there exists a strategy for Alice that
will result in the final state $\sigma = |\psi_{DC}\rangle\langle\psi_{DC}|$, and payoffs will be 5 for Alice and 0
for Bob. For instance, if Bob chooses
$$B = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
where a, b, c, d are appropriate complex numbers, Alice can choose
$$A = \begin{pmatrix} b & ia \\ id & -c \end{pmatrix}$$
Because this consideration is symmetric, there can be no equilibrium in pure strategies.
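The existence of a winning reply to every unitary of Bob can be illustrated numerically. In the Python sketch below, the closed form used in `counter` is derived for this illustration from the initial state, the measurement basis and the operation D as written in this document (A = D·W·conj(B)·W⁻¹ with W = diag(1, i)); it is an assumption of the sketch and is not the matrix quoted from [1] and [2], whose conventions may differ. The helper `su2` is likewise hypothetical and only generates test unitaries.

```python
import cmath
import math

S = 1 / math.sqrt(2)
PSI0 = [[S, 0], [0, 1j * S]]                 # (|00> + i|11>)/sqrt(2), W[k][l] ~ |kl>
BASIS = {
    "CC": [[S, 0], [0, 1j * S]],
    "CD": [[0, S], [-1j * S, 0]],
    "DC": [[0, -1j * S], [S, 0]],
    "DD": [[1j * S, 0], [0, S]],
}
PAYOFF = {"CC": (3, 3), "CD": (0, 5), "DC": (5, 0), "DD": (1, 1)}


def payoffs(A, B):
    """Expected payoffs (Alice, Bob) when Alice applies A and Bob applies B."""
    W = [[sum(A[k][i] * PSI0[i][j] * B[l][j] for i in range(2) for j in range(2))
          for l in range(2)] for k in range(2)]
    pa = pb = 0.0
    for name, V in BASIS.items():
        prob = abs(sum(complex(V[i][j]).conjugate() * W[i][j]
                       for i in range(2) for j in range(2))) ** 2
        pa += PAYOFF[name][0] * prob
        pb += PAYOFF[name][1] * prob
    return pa, pb


def su2(alpha, beta, theta):
    """A generic special-unitary 2x2 matrix (hypothetical test helper)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[cmath.exp(1j * alpha) * c, cmath.exp(1j * beta) * s],
            [-cmath.exp(-1j * beta) * s, cmath.exp(-1j * alpha) * c]]


def counter(B):
    """Alice's reply that maps PSI0 to |psi_DC> (up to a phase) for unitary B.
    Derived for this sketch; see the assumptions in the lead-in."""
    (a, b), (c, d) = B
    conj = lambda z: complex(z).conjugate()
    return [[1j * conj(c), conj(d)], [-conj(a), 1j * conj(b)]]


if __name__ == "__main__":
    for params in ((0.0, 0.0, 0.0), (0.3, 1.2, 0.9), (1.5, 0.4, 2.8)):
        B = su2(*params)
        print(params, payoffs(counter(B), B))
```

For B = C the reply is D itself, and for B = Q it is i·σx, in line with the remark of Benjamin and Hayden quoted earlier.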
However, there is an equilibrium in mixed quantum strategies:
Alice chooses between $A_1$ and $A_2$ with probabilities $p_A^{(1)} = p_A^{(2)} = \frac{1}{2}$, and Bob
chooses between $B_1$ and $B_2$ with probabilities $p_B^{(1)} = p_B^{(2)} = \frac{1}{2}$, where:
$$A_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad A_2 = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}, \quad B_1 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad B_2 = \begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix} \tag{2}$$
The payoffs for the above strategies are:
$$P_A(A_1, B_1) = P_A(A_2, B_2) = 0, \qquad P_A(A_1, B_2) = P_A(A_2, B_1) = 5$$
$$P_B(A_1, B_1) = P_B(A_2, B_2) = 5, \qquad P_B(A_1, B_2) = P_B(A_2, B_1) = 0$$
And the expected payoff for each of the players is 2.5. In order to show that there
is no other strategy Bob is better off selecting, we will consider a general mixed
strategy which consists of the strategies $s_B^{(1)}, \ldots, s_B^{(N)}$, with probabilities $p_B^{(1)}, \ldots, p_B^{(N)}$.
The final state is given by the convex combination:
$$\sigma = \sum_{i=1,2}\sum_{j} p_A^{(i)} p_B^{(j)} \left(A_i \otimes s_B^{(j)}\right)\rho\left(A_i \otimes s_B^{(j)}\right)^\dagger$$
The convex combination cannot lead to a payoff greater than the payoff of the
optimal pure strategy amongst $s_B^{(1)}, \ldots, s_B^{(N)}$. Since $B_1$, $B_2$ are both optimal in the
sense that Bob gets a payoff of 2.5 when applying them, and there are no pure
strategies that will yield a better payoff, every mixed strategy Bob chooses will
get him a payoff of 2.5 or less. However, the pure strategies $B_1$, $B_2$ will not form
an equilibrium with Alice’s strategy (because she is better off choosing another
strategy). The mixed strategy that consists of applying both $B_1$ and $B_2$ with even
probabilities will form an equilibrium (it can be shown in the same way that Alice
cannot improve her payoff by deviating from the strategy in equation 2 when Bob
sticks to his strategy).
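The payoff table of equation 2 and the bound of 2.5 on Bob's payoff can be checked with the same kind of simulation. This Python sketch is only an illustration: the helper names are assumptions, `su2` is a hypothetical generator of sample unitaries, and sampling a few unitaries does not replace the convexity argument given above.

```python
import cmath
import math

S = 1 / math.sqrt(2)
PSI0 = [[S, 0], [0, 1j * S]]                 # (|00> + i|11>)/sqrt(2), W[k][l] ~ |kl>
BASIS = {
    "CC": [[S, 0], [0, 1j * S]],
    "CD": [[0, S], [-1j * S, 0]],
    "DC": [[0, -1j * S], [S, 0]],
    "DD": [[1j * S, 0], [0, S]],
}
PAYOFF = {"CC": (3, 3), "CD": (0, 5), "DC": (5, 0), "DD": (1, 1)}

A1, A2 = [[1, 0], [0, 1]], [[-1j, 0], [0, 1j]]       # Alice's operations in Eq. 2
B1, B2 = [[0, 1], [-1, 0]], [[0, -1j], [-1j, 0]]     # Bob's operations in Eq. 2


def payoffs(A, B):
    """Expected payoffs (Alice, Bob) when Alice applies A and Bob applies B."""
    W = [[sum(A[k][i] * PSI0[i][j] * B[l][j] for i in range(2) for j in range(2))
          for l in range(2)] for k in range(2)]
    pa = pb = 0.0
    for name, V in BASIS.items():
        prob = abs(sum(complex(V[i][j]).conjugate() * W[i][j]
                       for i in range(2) for j in range(2))) ** 2
        pa += PAYOFF[name][0] * prob
        pb += PAYOFF[name][1] * prob
    return pa, pb


def equilibrium_payoffs():
    """Expected payoffs when both players mix their two operations evenly."""
    pairs = [payoffs(A, B) for A in (A1, A2) for B in (B1, B2)]
    return (sum(p[0] for p in pairs) / 4, sum(p[1] for p in pairs) / 4)


def bob_payoff_vs_alice_mix(B):
    """Bob's expected payoff for a unitary B against Alice's even mixture."""
    return (payoffs(A1, B)[1] + payoffs(A2, B)[1]) / 2


def su2(alpha, beta, theta):
    """A generic special-unitary 2x2 matrix (hypothetical test helper)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[cmath.exp(1j * alpha) * c, cmath.exp(1j * beta) * s],
            [-cmath.exp(-1j * beta) * s, cmath.exp(-1j * alpha) * c]]


if __name__ == "__main__":
    print(payoffs(A1, B1), payoffs(A1, B2), payoffs(A2, B1), payoffs(A2, B2))
    print(equilibrium_payoffs())
    print(bob_payoff_vs_alice_mix(su2(0.4, 1.0, 2.0)))
```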
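The above NE is not unique. For instance, another possible NE is:
$$A'_1 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad A'_2 = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}, \quad B'_1 = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \quad B'_2 = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}, \quad p'^{(1)}_A = p'^{(2)}_A = p'^{(1)}_B = p'^{(2)}_B = \frac{1}{2} \tag{3}$$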
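This NE has the same quality as the previous one, with the roles of the players
exchanged ($A'_1 \otimes B'_1$, for example, acts as D ⊗ C up to a global phase, so Alice receives 5):
$$P_A(A'_1, B'_1) = P_A(A'_2, B'_2) = 5, \qquad P_A(A'_1, B'_2) = P_A(A'_2, B'_1) = 0$$
$$P_B(A'_1, B'_1) = P_B(A'_2, B'_2) = 0, \qquad P_B(A'_1, B'_2) = P_B(A'_2, B'_1) = 5$$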
and [2] shows how to obtain more tuples of mixed strategies that constitute Nash
equilibria which have this quality. Which of these equilibria will be chosen? In
order to answer this question we have to rely on the concept of focal points. Game
theory defines focal points as outcomes or strategies which are distinguished from
other outcomes or strategies on the basis of some characteristic not included in the
formalism of the model. Those characteristics may distinguish an outcome as a
result of some psychological or social process and may even seem trivial (such as
the names of the actions). If there is more than one Nash equilibrium and one of
them can be considered a focal equilibrium then it is the one that will be chosen.
In our case, there is one equilibrium that is different from the others, and it is the
only one that gives a payoff of 2.25 for both of the players. According to [6] for
every state ρ there exist probabilities $p_A^{(1)}, \ldots, p_A^{(N)}$ and unitary operations $U_A^{(1)}, \ldots, U_A^{(N)}$
s.t.:
$$\sum_i p_A^{(i)}\left(U_A^{(i)} \otimes \mathbb{1}\right)\rho\left(U_A^{(i)} \otimes \mathbb{1}\right)^\dagger = \frac{1}{4}\,\mathbb{1} \tag{4}$$
where the right-hand side is the totally mixed state of the two qubits.
If Alice has already selected this strategy then the system is in the totally mixed
state, and Bob’s unitary operation will not change it (and vice versa). Since this
is the only equilibrium that yields a payoff of 2.25, then this is the one that will be
chosen.
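The payoff of 2.25 is the average of the four PD payoffs, (3 + 0 + 5 + 1)/4, which is what each player receives when all four outcomes are equally likely. The Python sketch below illustrates one way Alice can reach the totally mixed state from the entangled initial state of this section: applying the identity or one of the three Pauli-type operations with probability 1/4 each. This particular set of operations is a standard choice assumed for the illustration; the source does not specify which operations realize equation 4, and `su2` is a hypothetical generator of sample unitaries for Bob.

```python
import cmath
import math

S = 1 / math.sqrt(2)
PSI0 = [[S, 0], [0, 1j * S]]                 # (|00> + i|11>)/sqrt(2), W[k][l] ~ |kl>
BASIS = {
    "CC": [[S, 0], [0, 1j * S]],
    "CD": [[0, S], [-1j * S, 0]],
    "DC": [[0, -1j * S], [S, 0]],
    "DD": [[1j * S, 0], [0, S]],
}
PAYOFF = {"CC": (3, 3), "CD": (0, 5), "DC": (5, 0), "DD": (1, 1)}

# Assumed mixing set for Alice: identity and the Pauli matrices times i.
ALICE_OPS = [
    [[1, 0], [0, 1]],        # identity
    [[0, 1j], [1j, 0]],      # i * sigma_x
    [[0, 1], [-1, 0]],       # i * sigma_y (the operation D)
    [[1j, 0], [0, -1j]],     # i * sigma_z (the operation Q)
]


def outcome_probs(A, B):
    """Outcome probabilities after Alice applies A and Bob applies B to PSI0."""
    W = [[sum(A[k][i] * PSI0[i][j] * B[l][j] for i in range(2) for j in range(2))
          for l in range(2)] for k in range(2)]
    return {name: abs(sum(complex(V[i][j]).conjugate() * W[i][j]
                          for i in range(2) for j in range(2))) ** 2
            for name, V in BASIS.items()}


def averaged_probs(B):
    """Outcome probabilities when Alice picks each operation with probability 1/4."""
    total = {name: 0.0 for name in BASIS}
    for A in ALICE_OPS:
        for name, p in outcome_probs(A, B).items():
            total[name] += p / 4
    return total


def averaged_payoffs(B):
    """Expected payoffs (Alice, Bob) against Bob's unitary B."""
    probs = averaged_probs(B)
    return (sum(PAYOFF[k][0] * p for k, p in probs.items()),
            sum(PAYOFF[k][1] * p for k, p in probs.items()))


def su2(alpha, beta, theta):
    """A generic special-unitary 2x2 matrix (hypothetical test helper)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[cmath.exp(1j * alpha) * c, cmath.exp(1j * beta) * s],
            [-cmath.exp(-1j * beta) * s, cmath.exp(-1j * alpha) * c]]


if __name__ == "__main__":
    B = su2(0.3, 1.1, 2.0)
    print(averaged_probs(B))
    print(averaged_payoffs(B))
```

Whatever unitary Bob applies, each outcome keeps probability 1/4 in this sketch, which is why his choice cannot change the payoffs.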
4. Completely Positive Trace-Preserving Maps Corresponding to Local Operations
We now allow Alice and Bob to select any operation from S(CP), the set of completely
positive trace-preserving maps. The difference between this set and S(GU) is that
now the remaining state after the application of Alice’s or Bob’s operation does not
have to be the totally mixed state. However the tuples of dual equilibria such as
the ones in Eq. 2 and Eq. 3 still hold.
Alice and Bob now have additional options; they can measure their qubit using the
measurement operators:
$$A_1 = |0\rangle\langle 0|, \quad A_2 = |1\rangle\langle 1|, \quad B_1 = D|0\rangle\langle 0|, \quad B_2 = D|1\rangle\langle 1|$$
where D is as in Eq. 1. This will result in the final mixed state $\sigma = \frac{1}{2}\left(|01\rangle\langle 01| + |10\rangle\langle 10|\right)$,
and an expected payoff of 2.5. However, just like the equilibria in Eq. 2 and Eq. 3,
equilibria of this type are symmetric and thus come in pairs (another measurement
that yields the same results uses the measurement operators
$A'_1 = D|0\rangle\langle 0|$, $A'_2 = D|1\rangle\langle 1|$, $B'_1 = |0\rangle\langle 0|$, $B'_2 = |1\rangle\langle 1|$).
Since the equilibrium of Eq. 4 still holds, and it is still the only equilibrium invariant
under the exchange of strategies between the players, then it is still the focal equilibrium,
and is the solution of the game in this case also. Although it is not the optimal outcome
of the game (where both players get a payoff of 3) it is still better than the equilibrium
in the classical game.
Planting Some Doubt
van Enk and Pike [11] raise two important questions regarding quantum game theory in
general and the results of [2] (presented in the previous section) in particular.
1. Does the quantum solution coincide with the spirit of the PD game?
The Prisoners’ Dilemma is a non-cooperative game, i.e., the players are not allowed
to have any communication between them, they cannot enter binding agreements,
and cannot correlate random variables. By giving the players qubits in an entangled
quantum state, we allow them to use the correlations present in such a state. This
goes against the spirit of non-cooperative games, and thus it is not right to compare
the classical version of the game to the quantum version that [2] suggests.
2. To what extent does the quantum solution solve the classical game?
In order to consider quantum strategies we had to suggest a quantum model for the
game. We made sure that the quantum game entails the classical one, but does that
really make it a quantum version of the original classical game? Does the solution
[Q,Q] which was found by [2] have any importance in the original classical game?
Let us consider another classical game, where each of the two players has the choice
between 3 pure strategies: C, D and Q. Q is in essence a superposition of C and
D, but it can also be described classically: If both players choose C or D then the
payoffs are just as in the original PD game. If Alice chooses Q and Bob chooses C
or D, then his choice changes to the one he didn’t pick (and Alice’s choice counts as
a D), and vice versa. If both choose Q then the payoff is as if both chose C. The
payoff matrix for this game is:
Alice/Bob    C         D         Q
C            (3, 3)    (0, 5)    (1, 1)
D            (5, 0)    (1, 1)    (0, 5)
Q            (1, 1)    (5, 0)    (3, 3)
The classical solution of this classical game is [Q,Q], and it is also the Pareto optimal
solution. The question being asked is: did [3] solve the PD game, or our new
classical game? Games have rules. In order to construct a quantum game the rules
have to be changed a bit so that the game complies with the broader model, but in doing so
we actually change the game, and solve a completely different one.
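The claim that [Q,Q] is both the equilibrium and the Pareto optimal outcome of the three-strategy classical game can be checked by brute force. The following is a minimal sketch in plain Python; the names STRATEGIES, PAYOFFS, pure_nash_equilibria and is_pareto_optimal are illustrative choices, not taken from [11], and only pure strategies are examined.

```python
# Payoffs (Alice, Bob) of the three-strategy classical game from the table above.
STRATEGIES = ["C", "D", "Q"]
PAYOFFS = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5), ("C", "Q"): (1, 1),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1), ("D", "Q"): (0, 5),
    ("Q", "C"): (1, 1), ("Q", "D"): (5, 0), ("Q", "Q"): (3, 3),
}


def pure_nash_equilibria():
    """Return every (Alice, Bob) pair where neither gains by deviating alone."""
    found = []
    for a in STRATEGIES:
        for b in STRATEGIES:
            alice, bob = PAYOFFS[(a, b)]
            # Alice varies her row while Bob's column stays fixed, and vice versa.
            alice_ok = all(PAYOFFS[(x, b)][0] <= alice for x in STRATEGIES)
            bob_ok = all(PAYOFFS[(a, y)][1] <= bob for y in STRATEGIES)
            if alice_ok and bob_ok:
                found.append((a, b))
    return found


def is_pareto_optimal(profile):
    """True if no outcome is at least as good for both and better for one."""
    pa, pb = PAYOFFS[profile]
    for qa, qb in PAYOFFS.values():
        if qa >= pa and qb >= pb and (qa > pa or qb > pb):
            return False
    return True


print(pure_nash_equilibria())
print(is_pareto_optimal(("Q", "Q")))
```

Under these assumptions the search reports [Q,Q] as the only pure-strategy equilibrium, which is the point of the objection: the "quantum" solution is reproduced by a purely classical game with one extra move.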
A Quantum Player vs. a Classical Player
This section reviews the work of Flitney and Abbott [5], who also consider 2×2 games,
but this time one of the players is a quantum player and the other one is classical. We
now take a slightly different approach to resolving the conflict than the one taken by [2]:
instead of looking for equilibria in the game, we ask what the best
strategy is when a player does not know what strategy the other player will choose.
[5] suggest a slightly different model for the quantum version of 2×2 games. In their
model the system starts in the separable state |ψi〉 = |00〉, and then an entangling operator
J(γ) is applied, where J(γ) can be written without loss of generality as:
$$J(\gamma) = \exp\left(i\,\tfrac{\gamma}{2}\,\sigma_x \otimes \sigma_x\right) = (I \otimes I)\cos\tfrac{\gamma}{2} + i\,(\sigma_x \otimes \sigma_x)\sin\tfrac{\gamma}{2}$$
After J(γ) is applied, Alice and Bob can manipulate their respective qubits using one of
the operations available to them, and then J†(γ) is applied. J must commute with the
direct product of any pair of classical strategies of the players (and thus the classical game
is entailed in the quantum one). The qubits are then measured in the basis {|0〉, |1〉},
and the payoff is determined according to the result of the measurement and the payoff
matrix of the game. The final state of the system is:
$$|\psi_f\rangle = J^{\dagger}\,(A \otimes B)\,J\,|\psi_i\rangle$$
The flow of information in this model is presented in Figure 3.
For a classical player, the classical pure strategies available are:
$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad F = i\sigma_x = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}$$
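Figure 3: The flow of information in a general two person quantum game. (Diagram: two
qubits, each starting in |0〉, pass through J, then through Alice’s operation A and Bob’s
operation B respectively, then through J†, giving |ψf〉; time runs from left to right.)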
For a quantum player the available strategies are the unitary operators in SU(2):
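$$U(\theta, \alpha, \beta) = \begin{pmatrix} e^{i\alpha}\cos\tfrac{\theta}{2} & i e^{i\beta}\sin\tfrac{\theta}{2} \\ i e^{-i\beta}\sin\tfrac{\theta}{2} & e^{-i\alpha}\cos\tfrac{\theta}{2} \end{pmatrix}, \qquad \theta \in [0, \pi], \quad \alpha, \beta \in [-\pi, \pi]$$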
In our case Alice is a classical player and Bob is a quantum player. A classical mixed
strategy (as seen in the previous section as well) is an operator from the set U(θ) =
U(θ, 0, 0) = cos(θ/2) I + sin(θ/2) F. It corresponds to playing I with probability cos²(θ/2)
and F with probability sin²(θ/2).
It is important to note that Alice’s strategies give the same result as a classical mixed
strategy only when both players use this kind of strategy; if Bob employs a quantum
strategy he can use the entanglement to his advantage.
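The relation between U(θ, 0, 0) and the classical moves I and F can be checked numerically. The following is a minimal sketch in plain Python; the helper names (U, mixed, close, det) are illustrative and not taken from [5].

```python
import cmath
import math

I2 = [[1, 0], [0, 1]]
F = [[0, 1j], [1j, 0]]  # F = i * sigma_x, the classical "flip" move


def U(theta, alpha=0.0, beta=0.0):
    """Quantum strategy U(theta, alpha, beta) as a 2x2 nested list."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [
        [cmath.exp(1j * alpha) * c, 1j * cmath.exp(1j * beta) * s],
        [1j * cmath.exp(-1j * beta) * s, cmath.exp(-1j * alpha) * c],
    ]


def mixed(theta):
    """cos(theta/2) * I + sin(theta/2) * F, built entry by entry."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c * I2[r][k] + s * F[r][k] for k in range(2)] for r in range(2)]


def close(m1, m2, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(m1[r][k] - m2[r][k]) < tol for r in range(2) for k in range(2))


def det(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


theta = 0.7  # arbitrary sample angle
print(close(U(theta), mixed(theta)))       # U(theta, 0, 0) is a mix of I and F
print(abs(det(U(theta, 0.3, -1.1)) - 1))   # determinant is 1 up to rounding
```

The amplitudes cos(θ/2) and sin(θ/2) square to probabilities that sum to one, which is why the classical mixing probabilities are the squared quantities.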
When choosing γ = π/2 (the maximally entangled state), Bob has an unfair advantage
because he can produce any desired final state by applying local operations on his qubit
(as discussed in the previous section). The operator
$$f = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$
has the property
$$(I \otimes f)\,\tfrac{1}{\sqrt{2}}\left(|00\rangle + i\,|11\rangle\right) = (F \otimes I)\,\tfrac{1}{\sqrt{2}}\left(|00\rangle + i\,|11\rangle\right),$$
and using it Bob can effectively flip Alice’s bit.
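The property of f stated above can be verified directly on the four amplitudes of the maximally entangled state. This is a small sketch in plain Python, assuming the basis order |00〉, |01〉, |10〉, |11〉 with Alice’s qubit first; kron and apply are hypothetical helper names.

```python
import math


def kron(a, b):
    """Kronecker product of two 2x2 matrices; the result acts on |alice, bob>."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def apply(m, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(row[n] * v[n] for n in range(len(v))) for row in m]


I2 = [[1, 0], [0, 1]]
F = [[0, 1j], [1j, 0]]   # Alice's classical flip, i * sigma_x
f = [[0, 1], [-1, 0]]    # Bob's local operator from the text

# J(pi/2)|00> = (|00> + i|11>) / sqrt(2)
state = [1 / math.sqrt(2), 0, 0, 1j / math.sqrt(2)]

bob_side = apply(kron(I2, f), state)    # f acting on Bob's qubit only
alice_side = apply(kron(F, I2), state)  # F acting on Alice's qubit only

print(bob_side)
print(alice_side)
```

Both products give the same vector, (−|01〉 + i|10〉)/√2, so a local operation by Bob has the same effect on this state as Alice flipping her own qubit.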
We will now try to find the minimal value of γ for which Bob still has an advantage
over a classical player in a classical game. We will use a general payoff matrix at first to
obtain an expression for the payoffs:
             Bob: C     Bob: D
Alice: C     (p, p′)    (q, q′)
Alice: D     (r, r′)    (s, s′)
where the unprimed values represent Alice’s payoffs and the primed values represent Bob’s.
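Since Bob doesn’t know the value of θ chosen by Alice, his best move is to assume she
has picked U(π/2), and to counter it by using
$$V = U\left(\tfrac{\pi}{2}, 0, \tfrac{\pi}{2}\right) = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$$
and then applying the move of his choice, according to the final state he desires:
$$\begin{aligned}
M_{00} &= V \\
M_{01} &= F\,V = \frac{i}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \\
M_{10} &= f\,V = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \\
M_{11} &= F\,f\,V = \frac{i}{\sqrt{2}}\begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix}
\end{aligned}$$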
[5] refer to these as Bob’s Miracle Moves.
If Alice uses U(θ) and the entangling operator is J(γ) then the expected values for
the payoff of Alice when Bob uses each of the miracle moves are:
< $00 >= p2(cos θ
2+ sin θ
2sin γ)2 + q
2cos2 θ
2cos2 γ
+ r2(sin θ
2− cos θ
2sin γ)2 + s
2sin2 θ
2cos2 γ
< $01 >= p2cos2 θ
2cos2 γ + q
2(cos θ
2+ sin θ
2sin γ)2
+ r2sin2 θ
2cos2 γ + s
2(sin θ
2− cos θ
2sin γ)2
< $10 >= p2(cos θ
2− sin θ
2sin γ)2 + q
2cos2 θ
2cos2 γ
+ r2(sin θ
2+ cos θ
2sin γ)2 + s
2sin2 θ
2cos2 γ
< $11 >= p2cos2 θ
2cos2 γ + q
2(cos θ
2− sin θ
2sin γ)2
+ r2sin2 θ
2cos2 γ + s
2(sin θ
2+ cos θ
2sin γ)2
(5)
In order to get Bob’s payoff we have to add primes to p, q, r and s in equation 5.
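Equation 5 can be cross-checked by simulating the protocol |ψf〉 = J†(A ⊗ B)J|00〉 and comparing the measurement probabilities with the closed-form weights of p, q, r and s. The sketch below uses plain Python with nested lists; the function names (kron, apply, matmul2, simulated_weights, eq5_weights) and the dictionary MIRACLE are illustrative assumptions, and the basis order is |00〉, |01〉, |10〉, |11〉 with Alice’s qubit first.

```python
import cmath
import math


def kron(a, b):
    """Kronecker product of two 2x2 matrices."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def apply(m, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(row[n] * v[n] for n in range(len(v))) for row in m]


def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][n] * b[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]


def U(theta, alpha=0.0, beta=0.0):
    """Strategy matrix U(theta, alpha, beta)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[cmath.exp(1j * alpha) * c, 1j * cmath.exp(1j * beta) * s],
            [1j * cmath.exp(-1j * beta) * s, cmath.exp(-1j * alpha) * c]]


def J(gamma, dagger=False):
    """cos(g/2) I(x)I + i sin(g/2) sx(x)sx, or its adjoint when dagger=True."""
    c = math.cos(gamma / 2)
    s = (-1j if dagger else 1j) * math.sin(gamma / 2)
    return [[c, 0, 0, s], [0, c, s, 0], [0, s, c, 0], [s, 0, 0, c]]


F = [[0, 1j], [1j, 0]]
f = [[0, 1], [-1, 0]]
V = U(math.pi / 2, 0.0, math.pi / 2)
MIRACLE = {"00": V, "01": matmul2(F, V), "10": matmul2(f, V),
           "11": matmul2(F, matmul2(f, V))}


def simulated_weights(theta, gamma, move):
    """Probabilities of outcomes 00, 01, 10, 11 when Alice plays U(theta)."""
    psi = apply(J(gamma), [1, 0, 0, 0])
    psi = apply(kron(U(theta), MIRACLE[move]), psi)
    psi = apply(J(gamma, dagger=True), psi)
    return [abs(x) ** 2 for x in psi]


def eq5_weights(theta, gamma, move):
    """Coefficients of p, q, r, s read off from equation 5."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    sg, cg2 = math.sin(gamma), math.cos(gamma) ** 2
    sign = 1 if move[0] == "0" else -1      # M00, M01 use +; M10, M11 use -
    big_c, big_s = (c + sign * s * sg) ** 2 / 2, (s - sign * c * sg) ** 2 / 2
    small_c, small_s = c * c * cg2 / 2, s * s * cg2 / 2
    if move[1] == "0":
        return [big_c, small_c, big_s, small_s]
    return [small_c, big_c, small_s, big_s]


def expected_payoff(weights, payoffs):
    """Expected payoff for payoffs given in the order (p, q, r, s)."""
    return sum(w * x for w, x in zip(weights, payoffs))


w = simulated_weights(math.pi / 2, math.pi / 2, "01")
print(expected_payoff(w, (3, 1, 4, 0)))  # Alice, sample payoffs (p, q, r, s)
```

The sample payoffs (3, 1, 4, 0) are only an example input; any (p, q, r, s) can be passed, and Bob’s payoff follows by passing the primed values instead.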
Quantum Chickens
We now apply the general results of the previous discussion to the game of chicken, and
try to see what values of γ give an advantage to Bob. We will use a general payoff matrix
of chicken:
             Bob: C     Bob: D
Alice: C     (b, b)     (c, a)
Alice: D     (a, c)     (d, d)
where a > b > c > d and 2b > a + c (the most common values used are a = 4, b = 3, c =
1, d = 0, but we will not restrict ourselves to this particular case). Bob’s payoffs at the
two equilibria are a and c. Since a > c, we will consider an outcome which is guaranteed to be
greater than c an advantage over the classical game.
Bob’s preferred outcome is [C,D], so he will apply the miracle move M01. If Alice
cooperates their payoffs are:
$$\langle \$_A \rangle = \tfrac{b-d}{2}\cos^2\gamma + \tfrac{c+d}{2}, \qquad \langle \$_B \rangle = \tfrac{b-d}{2}\cos^2\gamma + \tfrac{a+d}{2}$$
and it can be seen that increasing the entanglement is bad for both players. Bob does
better than his classical outcome (c) when:
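$$\begin{aligned}
\tfrac{b-d}{2}\cos^2\gamma + \tfrac{a+d}{2} &> c \\
\tfrac{b-d}{2}\cos^2\gamma &> \tfrac{2c-a-d}{2} \\
\cos^2\gamma &> \tfrac{2c-a-d}{b-d} \\
\sin^2\gamma &< \tfrac{a+b-2c}{b-d} \\
\sin\gamma &< \sqrt{\tfrac{a+b-2c}{b-d}}
\end{aligned}$$
For the payoffs (4, 3, 1, 0) the right-hand side is √(5/3) > 1, so γ can take any value.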
If Alice defects the payoffs are:
$$\langle \$_A \rangle = \tfrac{a-c}{2}\cos^2\gamma + \tfrac{c+d}{2}, \qquad \langle \$_B \rangle = \tfrac{a-c}{2}\sin^2\gamma + \tfrac{c+d}{2}$$
Now increasing the entanglement will improve Bob’s payoff, but worsen Alice’s. If γ < π/4,
Bob will have a lower payoff than Alice. Bob does better than his lower NE result (c)
when:
$$\begin{aligned}
\tfrac{a-c}{2}\sin^2\gamma + \tfrac{c+d}{2} &> c \\
\tfrac{a-c}{2}\sin^2\gamma &> \tfrac{c-d}{2} \\
\sin^2\gamma &> \tfrac{c-d}{a-c} \\
\sin\gamma &> \sqrt{\tfrac{c-d}{a-c}}
\end{aligned}$$
For the payoffs (4, 3, 1, 0) this means that sin γ > 1/√3.
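The two thresholds derived above are easy to evaluate numerically. The following sketch, in plain Python, computes Bob’s expected payoff from M01 in both cases and the critical entanglement for a defecting Alice; the function names are hypothetical, and critical_gamma returns None when (c − d)/(a − c) exceeds 1, a case the text does not discuss.

```python
import math


def bob_payoff_m01(gamma, alice_defects, a, b, c, d):
    """Bob's expected payoff from M01 against a cooperating or defecting Alice."""
    if alice_defects:
        return (a - c) / 2 * math.sin(gamma) ** 2 + (c + d) / 2
    return (b - d) / 2 * math.cos(gamma) ** 2 + (a + d) / 2


def critical_gamma(a, b, c, d):
    """Smallest gamma with sin(gamma) = sqrt((c - d) / (a - c)), if it exists."""
    ratio = (c - d) / (a - c)
    if ratio > 1:
        return None  # no entanglement level reaches the threshold (assumption)
    return math.asin(math.sqrt(ratio))


a, b, c, d = 4, 3, 1, 0  # the commonly used chicken payoffs
g_crit = critical_gamma(a, b, c, d)

print(g_crit)                                      # arcsin(1/sqrt(3)) in radians
print(bob_payoff_m01(g_crit, True, a, b, c, d))    # equals c at the threshold
print(bob_payoff_m01(math.pi / 2, False, a, b, c, d))  # worst case if Alice cooperates
```

For these payoffs Bob’s payoff against a cooperating Alice never drops below (a + d)/2 = 2, so only the defecting case constrains γ.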
It remains to show that Alice does not have a better strategy to play when Bob plays
M01, and that it is in fact an equilibrium. Differentiating the expression for ⟨$01⟩ in
equation 5 with respect to θ and setting the derivative to zero, we get that Alice’s optimal
strategy is given by
$$\tan\theta = \frac{2(c-d)}{b+c-a-d}\,\frac{\sin\gamma}{\cos^2\gamma}$$
For the default payoffs (4, 3, 1, 0) the denominator b + c − a − d vanishes, so the optimal
value is indeed θ = π/2, and this is a Nash equilibrium. We can see in Figure 4 that if Bob
chooses to play the quantum miracle move, Alice does best by choosing θ = π/2 for every
level of entanglement. Against this strategy Bob scores between two and four, an
improvement for all γ > 0 over the payoff he could expect playing a classical strategy.
In Figure 5 we see that if the level of entanglement drops below arcsin(1/√3), then Bob
should not use his miracle move M01 and should choose cooperation instead.

Figure 4: The expected payoffs for (a) Alice and (b) Bob when Bob plays M01, as a
function of Alice’s strategy θ (from C at θ = 0 to D at θ = π) and the degree of
entanglement γ (from 0 to π/2), for payoffs (a, b, c, d) = (4, 3, 1, 0).
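The formula for Alice’s optimal θ can be compared with a brute-force search over her payoff ⟨$01⟩. This is a sketch in plain Python under the chicken substitution p = b, q = c, r = a, s = d; the function names are hypothetical, and math.atan2 is used so that a vanishing denominator gives θ = π/2 instead of a division by zero.

```python
import math


def alice_payoff_m01(theta, gamma, a, b, c, d):
    """Alice's expected payoff <$01> from equation 5 with chicken payoffs."""
    ch, sh = math.cos(theta / 2), math.sin(theta / 2)
    sg, cg2 = math.sin(gamma), math.cos(gamma) ** 2
    return (b / 2 * ch * ch * cg2 + c / 2 * (ch + sh * sg) ** 2
            + a / 2 * sh * sh * cg2 + d / 2 * (sh - ch * sg) ** 2)


def optimal_theta(gamma, a, b, c, d):
    """Angle with tan(theta) = 2(c-d) sin(gamma) / ((b+c-a-d) cos^2(gamma))."""
    num = 2 * (c - d) * math.sin(gamma)
    den = (b + c - a - d) * math.cos(gamma) ** 2
    return math.atan2(num, den)  # lies in [0, pi] because num >= 0


def brute_force_theta(gamma, a, b, c, d, steps=2000):
    """Grid search for the theta in [0, pi] that maximizes Alice's payoff."""
    grid = [n * math.pi / steps for n in range(steps + 1)]
    return max(grid, key=lambda t: alice_payoff_m01(t, gamma, a, b, c, d))


gamma = 0.8  # arbitrary sample entanglement
print(optimal_theta(gamma, 4, 3, 1, 0))      # default payoffs
print(brute_force_theta(gamma, 4, 3, 1, 0))  # should agree up to the grid spacing
```

Using atan2 also selects the branch of the tangent that lies in [0, π], which matters when b + c − a − d is negative.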
[5] also gives final results for the critical entanglement in the Prisoners’ Dilemma,
Deadlock, Stag Hunt and Battle of the Sexes (all with a general payoff matrix).

Figure 5: The payoffs for Alice and Bob versus the level of entanglement (γ) when Alice
defects for (a, b, c, d) = (4, 3, 1, 0). The solid lines correspond to the results when Bob
plays the quantum move M01 and the dashed line gives Bob’s payoff when he cooperates.
(Curve labels: A, B, and "B cooperates"; axes: γ from 0 to 0.5π versus payoff.)
Summary
We have taken a look at quantum games from several different angles. First, we saw
the PQ Penny Flip game, in which a quantum player
can beat a classical player with probability 1. We also explored how allowing different
quantum strategies to both players affects their payoffs and the equilibria obtained in
the quantum version of the Prisoners’ Dilemma. Last, we explored the game of chicken
and saw that the quantum player has an advantage over the classical player for
different levels of entanglement, but below a critical level his advantage disappears.
While the conclusions we have drawn are correct with regard to the quantum games we
have explored, it is hard to determine whether these games are indeed extended versions of
the classical games, or rather different games with new rules (in which case we have little
interest in them, as they do not raise a dilemma for the players).
Bibliography
[1] Simon C. Benjamin and Patrick M. Hayden. Comment on ‘quantum games
and quantum strategies’. Physical Review Letters, 87:069801, 2001.
[2] J. Eisert and M. Wilkens. Quantum games. Journal of Modern Optics, 47:2543–
2556, January 2000.
[3] Jens Eisert, Martin Wilkens, and Maciej Lewenstein. Quantum games and
quantum strategies. Phys. Rev. Lett., 83(15):3077–3080, Oct 1999.
[4] A. P. Flitney and D. Abbott. An introduction to quantum game theory. ArXiv
Quantum Physics e-prints, August 2002.
[5] A. P. Flitney and D. Abbott. Advantage of a quantum player over a classical
one in 2×2 quantum games. Royal Society of London Proceedings Series A,
459:2463–2474, October 2003.
[6] Albert W. Marshall and Ingram Olkin. Inequalities: Theory of Majorization
and Its Application. Academic Press, 1979.
[7] D. A. Meyer. Meyer Replies:. Physical Review Letters, 84:790–+, January 2000.
[8] D. A. Meyer. Quantum games and quantum algorithms. ArXiv Quantum
Physics e-prints, April 2000.
[9] David A. Meyer. Quantum strategies. Phys. Rev. Lett., 82(5):1052–1055, Feb
1999.
[10] S. J. van Enk. Quantum and classical game strategies. Phys. Rev. Lett.,
84(4):789, Jan 2000.
[11] S. J. van Enk and R. Pike. Classical rules in quantum games. Phys. Rev. A,
66(2):024306, Aug 2002.
[12] Wikipedia, The Free Encyclopedia. Prisoner’s Dilemma.
http://en.wikipedia.org/wiki/Prisoner’s dilemma.