Joseph Malkevitch: 2016-tc-non-zero-sum games

Games Against Nature and Non-Zero-Sum Games (Spring, 2016)

Prepared by:

Joseph Malkevitch
Department of Mathematics
York College (CUNY)
Jamaica, New York 11451

email:

malkevitch@york.cuny.edu

web page:

http://york.cuny.edu/~malk

Having looked at the situation of having two "competitive" players, each of whom tries separately to get the best payoff for themselves, it turns out that one can "specialize" this environment to study "games against nature." Here one has a player, say a farmer (Row), whose payoffs will depend on things that are to some extent beyond his/her control - e.g. the weather. So the farmer can take various actions, plant wheat, plant soybeans, etc. and the payoff will depend on which of a finite number of states "nature" (Column) is in. Thus, the payoffs for the different actions will depend on whether the weather is wet, dry, etc. Nature does not "actively conspire" to make matters hard for the farmer, unlike a "real opponent." So the question is what action should a player who has various actions to choose from "against" nature pick?

The issue arises whether the active player can assign probabilities to the possible states of nature. There are examples where it might be very hard to do this (nuclear power plant accidents), but some theorists argue that using subjective probabilities in such cases would be better than not using any "probabilities." Others disagree. (There are many different approaches to probability theory, with different schools of thought about subjective probability as well as more "traditional" approaches, e.g. frequentist views.) This is a very complex and subtle area of mathematical modeling. For more information read up on Bayesian vs. frequentist viewpoints.

Frequentist probability

http://en.wikipedia.org/wiki/Frequentist_probability

Bayesian Probability

http://en.wikipedia.org/wiki/Bayesian_probability

What approaches can be used for games against nature? We can apply the idea of dominating rows to this case (dominating column analysis does not apply because nature does not try to make things harder for the decision maker). We can also apply things like choosing a best-worst (maximin) approach. Another approach was developed by Leonard Savage, a pioneer of subjective probability.
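As a rough illustration of the first two ideas, here is a minimal Python sketch of a dominated-row check and a best-worst (maximin) choice. The 3x3 payoff matrix and the function names are invented for this illustration and are not part of the notes; "dominates" is taken here to mean at least as good in every column and strictly better in at least one.

```python
# Hypothetical payoffs for a farmer (rows = crops, columns = weather states).
# These numbers are invented for illustration only.
payoffs = [
    [3, 2, 4],  # crop A
    [2, 1, 3],  # crop B
    [0, 5, 2],  # crop C
]

def dominated_rows(matrix):
    """Return indices of rows that some other row dominates.

    Row r dominates row s if r is at least as good in every column
    and strictly better in at least one column.
    """
    dominated = []
    for s, row_s in enumerate(matrix):
        for r, row_r in enumerate(matrix):
            if r == s:
                continue
            at_least_as_good = all(a >= b for a, b in zip(row_r, row_s))
            strictly_better = any(a > b for a, b in zip(row_r, row_s))
            if at_least_as_good and strictly_better:
                dominated.append(s)
                break
    return dominated

def maximin_row(matrix):
    """Return (row index, guaranteed payoff) for the best-worst choice."""
    row_minima = [min(row) for row in matrix]
    best = max(range(len(matrix)), key=lambda i: row_minima[i])
    return best, row_minima[best]

print(dominated_rows(payoffs))  # [1]: crop B is never better than crop A
print(maximin_row(payoffs))     # (0, 2): crop A guarantees at least 2
```

Indices are 0-based, so row index 0 is the first action. If several rows tie for the best worst case, this sketch simply returns the first one.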

Suppose the decision maker chooses Row i as his/her action. Suppose that the state of nature turns out to be Column j. So Row gets the payoff in cell (i, j). Now there may be higher payoffs in Column j associated with other row choices (not chosen). So Row will have a "regret" because he/she chose a row which was not as good as was possible. Thus, one can compute the regret matrix associated with a given decision matrix. Regret is the difference between the largest payoff one could have gotten versus the payoff one does get by choosing a particular row.

Here is an example of a Payoff Matrix for a game against nature:

You-Row\Nature	I	II	III	IV
1	4	4	0	2
2	2	2	2	2
3	0	8	0	0
4	2	6	0	0

and the associated Regret Matrix. For example, if Row played Row 4 and Column II occurred there would be a regret of 2 (8-6).

You-Row\Nature	I	II	III	IV
1	0	4	2	0
2	2	6	0	0
3	4	0	2	2
4	2	2	2	2
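The computation of the Regret Matrix can be sketched in a few lines of Python: find the largest entry in each column, then subtract each entry from its column maximum. The function name regret_matrix is just a label chosen for this sketch.

```python
# Payoff matrix from the example (rows = actions, columns = states of nature).
payoffs = [
    [4, 4, 0, 2],
    [2, 2, 2, 2],
    [0, 8, 0, 0],
    [2, 6, 0, 0],
]

def regret_matrix(matrix):
    """Regret in cell (i, j) = (largest payoff in column j) - (payoff in cell (i, j))."""
    column_maxima = [max(column) for column in zip(*matrix)]
    return [
        [best - payoff for payoff, best in zip(row, column_maxima)]
        for row in matrix
    ]

regrets = regret_matrix(payoffs)
for row in regrets:
    print(row)
# [0, 4, 2, 0]
# [2, 6, 0, 0]
# [4, 0, 2, 2]
# [2, 2, 2, 2]
```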

Presumably, one does not want to have high regret. So for each row one computes the row maximum of the Regret Matrix, and then chooses as one's action the row whose maximum regret is smallest. This is a minimax approach: one minimizes the maximum regret.
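For the example above, the regret-based choice can be compared with the best-worst choice on the original payoffs. The following is only a sketch; the helper names are made up here, and ties are broken by taking the first qualifying row.

```python
payoffs = [
    [4, 4, 0, 2],
    [2, 2, 2, 2],
    [0, 8, 0, 0],
    [2, 6, 0, 0],
]
regrets = [
    [0, 4, 2, 0],
    [2, 6, 0, 0],
    [4, 0, 2, 2],
    [2, 2, 2, 2],
]

def smallest_max_regret_row(regret):
    """Return (row index, list of row maxima) minimizing the largest regret."""
    row_maxima = [max(row) for row in regret]
    best = min(range(len(regret)), key=lambda i: row_maxima[i])
    return best, row_maxima

def best_worst_row(matrix):
    """Return (row index, list of row minima) maximizing the smallest payoff."""
    row_minima = [min(row) for row in matrix]
    best = max(range(len(matrix)), key=lambda i: row_minima[i])
    return best, row_minima

regret_choice, regret_maxima = smallest_max_regret_row(regrets)
payoff_choice, payoff_minima = best_worst_row(payoffs)

print(regret_maxima, "-> Row", regret_choice + 1)  # [4, 6, 4, 2] -> Row 4
print(payoff_minima, "-> Row", payoff_choice + 1)  # [0, 2, 0, 0] -> Row 2
```

Note that the two criteria recommend different actions here: the regret approach picks Row 4, while the best-worst approach applied to the payoffs picks Row 2.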

Even if one does not have probabilities for the columns, one might as a modeling assumption choose to assign the columns equal probability. Now one can choose as one's action the row which maximizes the expected value among the different rows. Operationally, a moment's thought will show that this is equivalent to selecting the row where the sum of the entries in that row is as large as possible, since every row sum is divided by the same number of columns.
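A short Python sketch of the equal-probability calculation for the example payoff matrix follows. The function name is a label chosen for this sketch, and the code also checks that ranking by row sums gives the same answer as ranking by expected value.

```python
payoffs = [
    [4, 4, 0, 2],
    [2, 2, 2, 2],
    [0, 8, 0, 0],
    [2, 6, 0, 0],
]

def equal_probability_choice(matrix):
    """Return (row index, expected values) when every column has probability 1/n."""
    n = len(matrix[0])
    expected = [sum(row) / n for row in matrix]
    best = max(range(len(matrix)), key=lambda i: expected[i])
    return best, expected

best, expected = equal_probability_choice(payoffs)
row_sums = [sum(row) for row in payoffs]
best_by_sum = max(range(len(payoffs)), key=lambda i: row_sums[i])

print(expected)         # [2.5, 2.0, 2.0, 2.0]
print(row_sums)         # [10, 8, 8, 8]
print("Row", best + 1)  # Row 1
```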

Having studied the very important and elegant theory of 2-player zero-sum games, we will move on to discuss 2-person non-zero-sum games. While the mathematics of zero-sum games is very appealing (it reduces in the "general case" to solving linear programming problems), there are not that many important realistic applications. In linear programming one is interested in maximizing or minimizing the value of a linear function subject to linear constraints. This means one is interested in finding the maximum or minimum of a linear function over the points on the boundary or in the interior of a bounded convex polyhedron. (Bounded means that the polyhedron can be enclosed in a large "ball.") A famous theorem says that the optimal value is taken on at a "corner" point of such a polyhedron. However, the number of real world problems where one has zero-sum games is very limited. So the natural way to generalize the situation is to look at non-zero-sum games, for simplicity in the 2-person case.

Now the situation changes, in that we can find many reasonably "realistic" settings for such games. Over the past 75 years such games have been looked at in great detail, and though we know a lot, there are still many avenues to explore.

Two of the most famous non-zero-sum games are known as Prisoner's Dilemma and Chicken. Their game matrices are shown below using "cardinal" payoffs, say in dollars. In each cell the first number is Row's payoff and the second is Column's. (The "umpire" collects the amounts from the players or pays them off when they both win. This plays the role of a modeling assumption.)

Prisoner's Dilemma A

	Column I	Column II
Row 1	(6, 6)	(-10, 20)
Row 2	(20, -10)	(-3, -3)

Prisoner's Dilemma B

	Column I	Column II
Row 1	(6, 6)	(-10, 30)
Row 2	(30, -10)	(-3, -3)

Chicken

	Column I	Column II
Row 1	(3, 3)	(-6, 4)
Row 2	(4, -6)	(-7, -7)
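One way to compare these three games is to apply the dominance idea from earlier separately to each player's own payoffs. The Python sketch below does this; the data layout and function names are assumptions of the sketch, and the first number in each cell is taken to be Row's payoff and the second Column's.

```python
# Each game is a 2x2 table of (Row payoff, Column payoff) pairs.
games = {
    "Prisoner's Dilemma A": [[(6, 6), (-10, 20)], [(20, -10), (-3, -3)]],
    "Prisoner's Dilemma B": [[(6, 6), (-10, 30)], [(30, -10), (-3, -3)]],
    "Chicken": [[(3, 3), (-6, 4)], [(4, -6), (-7, -7)]],
}

def dominant_row(game):
    """Index of a row strictly better for Row against every column, else None."""
    n_rows, n_cols = len(game), len(game[0])
    for r in range(n_rows):
        if all(game[r][c][0] > game[s][c][0]
               for s in range(n_rows) if s != r
               for c in range(n_cols)):
            return r
    return None

def dominant_column(game):
    """Index of a column strictly better for Column against every row, else None."""
    n_rows, n_cols = len(game), len(game[0])
    for c in range(n_cols):
        if all(game[r][c][1] > game[r][d][1]
               for d in range(n_cols) if d != c
               for r in range(n_rows)):
            return c
    return None

results = {name: (dominant_row(g), dominant_column(g)) for name, g in games.items()}
for name, (r, c) in results.items():
    print(name, "dominant row:", r, "dominant column:", c)
# Prisoner's Dilemma A dominant row: 1 dominant column: 1
# Prisoner's Dilemma B dominant row: 1 dominant column: 1
# Chicken dominant row: None dominant column: None
```

Indices are 0-based, so index 1 means Row 2 or Column II. In both versions of Prisoner's Dilemma each player has a dominant choice, yet playing Row 2 and Column II yields (-3, -3), which is worse for both players than the (6, 6) available at Row 1 and Column I. In Chicken neither player has a dominant choice.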

One reason these games have achieved such notoriety is that they are models for "real world" games that regularly recur (unions vs. management; U.S. vs. former Soviet Union, etc.) and have dramatic contrasts to the "rationality" that is involved with the analysis of zero-sum games. These games are "paradoxical" whether one thinks of them as non-cooperative games (players can't communicate and act on their own) or as cooperative games (with or without seemingly binding "agreements"). Prisoner's Dilemma was "named" by Albert Tucker (a Chairman of the Mathematics Department of Princeton and well known for his work in mathematical programming), though the game seems to have been devised by Merrill Flood and Melvin Dresher. Tucker provided a "story" to explain the payoffs in the game, involving the negotiations between 2 criminal suspects for a crime where the district attorney needed a confession from one or both suspects to make his/her case. Chicken is named for a "game" made famous in the James Dean movie Rebel Without a Cause. It can be used to model the Cuban Missile Crisis.

Further reading:

Luce, R. D. and H. Raiffa, Games and Decisions, John Wiley, New York, 1957. (There is a Dover Press reprint.)

Poundstone, W., Prisoner's Dilemma, Doubleday, New York, 1992.

Rapoport, A. and A. Chammah, Prisoner's Dilemma, U. Michigan Press, Ann Arbor, 1970.

Rapoport, A., M. Guyer, and D. Gordon, The 2x2 Game, U. Michigan Press, Ann Arbor, 1976.