Joseph Malkevitch: Nash equilibria Worked Example: 2018

Worked Example: Nash Equilibrium (2018)

Prepared by:

Joseph Malkevitch
Department of Mathematics
York College (CUNY)
Jamaica, New York 11451

email:

malkevitch@york.cuny.edu

web page:

http://york.cuny.edu/~malk

The 2x2 game matrix below (Figure 1) is a typical example of a non-zero-sum matrix game involving two players, Row and Column. Each of the players can take one of two actions and gets a payoff as in Figure 1. We will consider the case where the game is to be played many times, though many might say the analysis would still hold for only a single play of the game. The pair (-1, 4) means that Row would lose 1 and Column would win 4 when Row plays Row 1 and Column plays Column II. Note that the payoffs for players don't show any symmetry. We will also assume the players are interacting in a "non-cooperating" way.

            Column I    Column II

Row 1       (5, 0)      (-1, 4)

Row 2       (3, 2)      (2, 1)

Figure 1

It is easy to see that there is no pure strategy equilibrium for this game. For example, if Row has been playing Row 1 for several rounds and Column has been playing Column II for a while, Row might be well advised to move to Row 2, since if Column continues to play Column II, Row will not only change his/her loss of 1 to a gain of 2 but also change Column's outcome from winning 4 to winning only 1. From cell Row 2, Column II, Column will want to move to cell Row 2, Column I. From this cell, Row will want to move to Row 1, Column I, and from there Column will want to move back to Row 1, Column II. Thus, if one constructs a "motion diagram" one sees there is no pure strategy equilibrium. Note that while the payoffs to Column are never negative, Row can lose 1 unit, but the sum of what he/she can sometimes win (10) is larger than the sum of what Column can win (7).
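The motion-diagram argument can also be checked mechanically. The following is a minimal sketch, not part of the original example: the names PAYOFFS and pure_equilibria are illustrative assumptions, and the check simply asks, for each cell of Figure 1, whether either player gains by switching alone.

```python
# Payoffs from Figure 1: PAYOFFS[r][c] = (payoff to Row, payoff to Column).
# Index 0 is Row 1 / Column I, index 1 is Row 2 / Column II.
# PAYOFFS and pure_equilibria are illustrative names, not from the text.
PAYOFFS = [
    [(5, 0), (-1, 4)],
    [(3, 2), (2, 1)],
]


def pure_equilibria(payoffs):
    """Return every cell from which neither player gains by switching alone."""
    cells = []
    for r in range(2):
        for c in range(2):
            row_now, col_now = payoffs[r][c]
            row_if_switch = payoffs[1 - r][c][0]  # Row changes rows
            col_if_switch = payoffs[r][1 - c][1]  # Column changes columns
            if row_now >= row_if_switch and col_now >= col_if_switch:
                cells.append((r, c))
    return cells


print(pure_equilibria(PAYOFFS))  # prints [] for the game in Figure 1
```

An empty list means that in every cell at least one player wants to move, which is what the motion diagram shows.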

However, Nash's Theorem says that a very general class of games always has at least one equilibrium. A particular game may have:

a. Pure strategy equilibria only

b. Mixed strategy equilibria only

c. Both pure and mixed strategy equilibria.

So the matrix game above must have a mixed strategy equilibrium.

Intuitively, a Nash equilibrium (pure or mixed) is a way of playing such that if either player alone deviates from the equilibrium strategy, he/she cannot do better and may do worse, so there is no incentive to move away from a Nash equilibrium even if other outcomes may seem more attractive.

We can compute a Nash equilibrium for 2x2 games without pure strategy equilibria by looking at what will be called Row's game, the 2x2 game below:

            Col I    Col II

Row 1       5        -1

Row 2       3        2

Figure 2 (Row's Game; payoffs to Row)

and the 2x2 game shown below called Column's game.

            Col I    Col II

Row 1       0        4

Row 2       2        1

Figure 3 (Column's Game; payoffs to Column)

It turns out that when Row plays a mixture of Row 1 and Row 2 so that Column gets the same payoff amount from playing Col I or from playing Col II in Column's game, and that when Column plays a mixture of Col I and Col II so that Row earns the same payoff amount from playing Row 1 or Row 2 in Row's game, the result will be an equilibrium, the Nash equilibrium we are seeking for the game in Figure 1.

For Row to equalize Column's payoffs in Column's game, what "spinner" (mixture of Row 1 and Row 2) should Row play? If Row plays Row 1 with probability p and Row 2 with probability 1-p, we have:

0p + 2(1-p) = 4p + 1(1-p)

Solving for p we get p = 1/5.

What will Column's payoff be? Substituting p = 1/5 into 0p + 2(1-p) yields the payoff of 8/5 = 1.6 for Column. You can double check your arithmetic by showing that 4(1/5) + 1(1-1/5) is also 8/5.

For Column to equalize Row's payoffs in Row's game, what spinner (mixture of Column I and Column II) should Column play? If Column plays Column I with probability q and Column II with probability 1-q we have:

5q -1(1-q) = 3q + 2(1-q)

Solving for q we get q = 3/5.

What will Row's payoff be? Substituting q = 3/5 into 5q -1(1-q) gives 13/5 = 2.6 for Row. You can double check your arithmetic by showing that 3(3/5) + 2( 1 - 3/5) = 13/5.

Thus, when both players play their Nash equilibrium mixed strategies Row gets 13/5 = 2.6 and Column gets 8/5 = 1.6 as payoffs.
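The two equalizing calculations above can be reproduced with exact fractions. This is only a sketch under the assumption that the game has no pure strategy equilibrium (otherwise the denominators below may be zero or the result may fall outside 0 to 1); the function names are illustrative and not taken from the text.

```python
from fractions import Fraction

# Row's game (Figure 2) and Column's game (Figure 3).
# Rows are Row 1, Row 2; columns are Col I, Col II.
ROW_GAME = [[5, -1], [3, 2]]
COL_GAME = [[0, 4], [2, 1]]


def equalizing_row_mix(col_game):
    """Probability p of Row 1 giving Column the same payoff from either column."""
    (a, b), (c, d) = col_game
    # a*p + c*(1-p) = b*p + d*(1-p)  =>  p = (d - c) / (a - b - c + d)
    return Fraction(d - c, a - b - c + d)


def equalizing_col_mix(row_game):
    """Probability q of Col I giving Row the same payoff from either row."""
    (a, b), (c, d) = row_game
    # a*q + b*(1-q) = c*q + d*(1-q)  =>  q = (d - b) / (a - b - c + d)
    return Fraction(d - b, a - b - c + d)


def expected(game, p, q):
    """Expected payoff in `game` when Row 1 has probability p and Col I has probability q."""
    return (p * q * game[0][0] + p * (1 - q) * game[0][1]
            + (1 - p) * q * game[1][0] + (1 - p) * (1 - q) * game[1][1])


p = equalizing_row_mix(COL_GAME)
q = equalizing_col_mix(ROW_GAME)
print(p, q)                                                # 1/5 3/5
print(expected(ROW_GAME, p, q), expected(COL_GAME, p, q))  # 13/5 8/5
```

Using Fraction rather than floating point keeps the results in the same form as the hand calculation.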

Now Row might say: for this analysis I paid no attention to my own payoffs. Surely if I play so that my payoff in my own game (Row's game) is optimal, I am better off. Doing this means Row plays what has come to be called his/her prudential strategy. What is Row's optimal play for the game in Figure 2? The answer, found in this example using dominating strategy analysis, is that Row should always play Row 2, which guarantees Row at least 2. If Column plays his/her Nash equilibrium strategy against Row always playing Row 2, the payoff to Column is 2(3/5) + 1(2/5) = 8/5. No surprise here: in this game Column's Nash equilibrium mixture (Col I with probability 3/5) is also the mixture that equalizes Column's own payoffs against Row 1 and Row 2, so Column still gets 8/5 when Row deviates. And with Row playing Row 2 all of the time against Column's Nash strategy, Row earns 3(3/5) + 2(2/5) = 13/5, exactly what Row earned at the Nash equilibrium, because Column's mixture was chosen to make Row indifferent between Row 1 and Row 2.
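Row's prudential strategy can be checked with a short calculation. The sketch below assumes, as the use of Figure 2 suggests, that Row's game is treated as a zero-sum game in which Column tries to hold Row's payoff down; the variable and function names are illustrative.

```python
ROW_GAME = [[5, -1], [3, 2]]  # Figure 2: payoffs to Row


def pure_security_levels(game):
    """Worst-case payoff to Row for each pure row of `game`."""
    return [min(row) for row in game]


levels = pure_security_levels(ROW_GAME)            # [-1, 2]
best_row = max(range(2), key=lambda r: levels[r])  # index 1, i.e. Row 2

# Largest payoff Row can get in each column (what an opponent holding Row
# down would have to concede by playing that column).
col_caps = [max(ROW_GAME[0][c], ROW_GAME[1][c]) for c in range(2)]  # [5, 2]

# When the best worst case for Row equals the smallest column cap, the game
# has a saddle point and a pure strategy is optimal for Row.
has_saddle = max(levels) == min(col_caps)
print(best_row, max(levels), has_saddle)  # 1 2 True
```

Under that assumption the saddle point sits at Row 2, Col II with value 2, which agrees with the conclusion that Row should always play Row 2.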

Now suppose Column, anticipating that Row might play his/her prudential strategy, asks what his/her best response is, perhaps doing better than with his/her Nash equilibrium mixture. Column's best response to Row playing this way would be to always play Column I. This strategy, the best response to Row's prudential strategy, is called Column's counter prudential strategy. Column, knowing that Row will always play Row 2, should always play Column I, earning 2 as a result. Thus when Row plays his/her prudential strategy and Column plays his/her counter prudential strategy, Row gets 3 and Column gets 2. Note that both players are better off than when they play their Nash equilibrium strategies (2.6 for Row and 1.6 for Column). However, this is not an equilibrium: if Column plays Col I all of the time, Row can improve his/her outcome from 3 to 5 by playing Row 1.

Now, Column also has a prudential strategy, involving his/her best play in Column's game (Figure 3). The game in Figure 3 (payoffs are payoffs to Column) does not simplify using dominating strategy analysis. To find the best mixture of columns, Column plays Col I with probability q and Col II with probability 1-q and chooses q so that his/her payoff is the same whether Row plays Row 1 or Row 2:

0q + 4(1-q) = 2q + 1(1-q)

So q = 3/5, and Column would earn 8/5.
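One way to see why equalizing gives Column's best guaranteed payoff is to compute the worst case for a range of mixtures. This is a rough sketch, assuming a grid of q values in steps of 1/20 (an arbitrary choice that happens to contain 3/5); column_guarantee is an illustrative name.

```python
from fractions import Fraction

COL_GAME = [[0, 4], [2, 1]]  # Figure 3: payoffs to Column


def column_guarantee(q):
    """Column's worst-case expected payoff when Col I is played with probability q."""
    vs_row1 = COL_GAME[0][0] * q + COL_GAME[0][1] * (1 - q)
    vs_row2 = COL_GAME[1][0] * q + COL_GAME[1][1] * (1 - q)
    return min(vs_row1, vs_row2)


# Assumed grid: q = 0, 1/20, 2/20, ..., 1.
grid = [Fraction(k, 20) for k in range(21)]
best_q = max(grid, key=column_guarantee)
print(best_q, column_guarantee(best_q))  # 3/5 8/5
```

Against Row 1 Column's payoff falls as q grows, while against Row 2 it rises, so the worst case is largest where the two meet.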

What is the best response of Row to Column playing Col I 3/5 of the time and Column II 2/5 of the time? This is what is called Row's counter prudential strategy.

Here is the necessary calculation, where p and 1-p are the probabilities with which Row plays Row 1 and Row 2.

Value to Row: 5(3/5)p + 3(3/5)(1-p) - 1(2/5)p + 2(2/5)(1-p)

Simplifying we see that the coefficient of p is zero and Row earns 13/5 whatever mix of Row 1 and Row 2 he/she uses.
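The claim that Row's payoff does not depend on p can be spot-checked numerically. The sketch below evaluates the same four-term expression for several values of p; row_value is an illustrative name and the sample values of p are arbitrary.

```python
from fractions import Fraction

q = Fraction(3, 5)  # Column's prudential mixture: probability of Col I


def row_value(p):
    """Row's expected payoff when Row 1 has probability p and Col I has probability q."""
    return (5 * q * p              # Row 1 against Col I
            + 3 * q * (1 - p)      # Row 2 against Col I
            - 1 * (1 - q) * p      # Row 1 against Col II
            + 2 * (1 - q) * (1 - p))  # Row 2 against Col II


# Collect the distinct payoffs over p = 0, 1/10, ..., 1.
values = {row_value(Fraction(k, 10)) for k in range(11)}
print(values)  # {Fraction(13, 5)}
```

A single value in the set means every sampled mix earns Row the same 13/5.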

Summarizing:

Row can play his/her:

a. Nash equilibrium strategy

b. Prudential strategy

c. Counter prudential strategy (best choice for Row when Column plays his/her prudential strategy)

Column can play his/her:

a. Nash equilibrium strategy

b. Prudential strategy

c. Counter prudential strategy (best choice for Column when Row plays his/her prudential strategy)

One can look at the payoff pairs from the 9 possible cases!
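As a starting point for that comparison, here is a hedged sketch that tabulates the payoff pairs for the strategies whose mixtures are pinned down above. Row's counter prudential strategy is deliberately left out, since the calculation above shows that every mix of Row 1 and Row 2 is a best response for Row there; a reader who wants all 9 cases would have to pick a value of p for it. The dictionary and function names are illustrative.

```python
from fractions import Fraction

# Figure 1: PAYOFFS[r][c] = (payoff to Row, payoff to Column)
PAYOFFS = [[(5, 0), (-1, 4)], [(3, 2), (2, 1)]]

# p = probability of Row 1, q = probability of Col I, as found in the text.
# Row's counter prudential strategy is omitted: any p is a best response.
ROW_STRATEGIES = {"Nash": Fraction(1, 5), "prudential": Fraction(0)}
COL_STRATEGIES = {
    "Nash": Fraction(3, 5),
    "prudential": Fraction(3, 5),
    "counter prudential": Fraction(1),
}


def payoff_pair(p, q):
    """Expected (Row, Column) payoffs for the mixtures p and q."""
    weights = [[p * q, p * (1 - q)], [(1 - p) * q, (1 - p) * (1 - q)]]
    row = sum(weights[r][c] * PAYOFFS[r][c][0] for r in range(2) for c in range(2))
    col = sum(weights[r][c] * PAYOFFS[r][c][1] for r in range(2) for c in range(2))
    return row, col


for r_name, p in ROW_STRATEGIES.items():
    for c_name, q in COL_STRATEGIES.items():
        row, col = payoff_pair(p, q)
        print(f"Row {r_name} vs Column {c_name}: Row {row}, Column {col}")
```

For instance, Row's prudential strategy against Column's counter prudential strategy gives the pair (3, 2) discussed earlier.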

In general, the only equilibria that arise when playing a 2x2 non-zero-sum matrix game are the Nash equilibrium (or equilibria), at least one of which always exists. Unfortunately, in many games the outcomes given by a Nash equilibrium are not particularly appealing. In Prisoner's Dilemma, for example, the equilibrium can deliver negative payoffs to both players even though outcomes with positive payoffs for both players are available. Each individual game must be studied to try to give good advice about how to play the game.