This is a very rough first draft of notes on the Prisoner's Dilemma that I am hoping to expand into a self-contained teaching module at some point. My plan is to develop the module using Joshua Epstein's computational model and ASCAPE software as the major teaching tools. Robert Axelrod's The Evolution of Cooperation: Revised Edition is also a major influence.

I think it would help if the initial presentation did not focus so much on the one-shot version of the game. Prisoner's Dilemma games can be divided into four distinct classes:

1. **The One-Shot Version:** This is the version emphasized in introductory-level courses. Defection is a dominant strategy for each player, so mutual defection is the unique Nash equilibrium.
2. **Repeated Game Version #1:** In this version of the Prisoner's Dilemma the players play the game for a finite number of rounds that is known in advance. Since the last round is in essence a one-shot game, both players defect in it. Since both players know this, nothing they do in the next-to-last round can change what happens in the last round, so they find it optimal to defect there as well, and this logic unravels all the way back to the first round. Mutual defection in every round is the unique subgame-perfect equilibrium outcome, so defection still rules in this version.
3. **Repeated Game Version #2:** In this version of the Prisoner's Dilemma the players play the game for an infinite number of rounds. Here mutual cooperation can be sustained as an equilibrium as long as the players care enough about future rounds of the game (i.e., their discount factors are not too low).
4. **Repeated Game Version #3:** In this version of the Prisoner's Dilemma the players play the game for an unknown but finite number of rounds. In this world, which to me is the one that most closely resembles reality, the backward induction argument used in Repeated Game Version #1 doesn't work, because there is no known last round from which to start. Computational simulations can be used to explore which ingredients are necessary or sufficient for cooperation to be sustained in the population, and in which situations defectors are able to invade the population.
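To make the classes above a little more concrete, here is a minimal Python sketch. The payoff numbers (T=5, R=3, P=1, S=0) are the conventional textbook values and are my assumption, not something taken from these notes. The function names and the two strategies (tit-for-tat and always-defect) are likewise just illustrative choices. The sketch checks that defection is dominant in the one-shot game, computes the smallest discount factor at which a grim-trigger strategy would sustain cooperation in the infinitely repeated game, and plays two strategies against each other for a fixed number of rounds.

```python
# Illustrative payoffs (assumed, not from the notes): T > R > P > S.
T, R, P, S = 5, 3, 1, 0

# PAYOFF[(my_move, their_move)] -> my payoff; "C" = cooperate, "D" = defect.
PAYOFF = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}


def defect_is_dominant():
    """True if D pays strictly more than C against every opponent move."""
    return all(PAYOFF[("D", other)] > PAYOFF[("C", other)] for other in "CD")


def grim_trigger_threshold():
    """Smallest discount factor d at which grim trigger sustains cooperation.

    Cooperating forever is worth R / (1 - d). Deviating once and then being
    punished forever is worth T + d * P / (1 - d). Cooperation is at least
    as good when d >= (T - R) / (T - P).
    """
    return (T - R) / (T - P)


def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return their_history[-1] if their_history else "C"


def always_defect(my_history, their_history):
    """Defect in every round regardless of history."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play a fixed number of rounds and return total payoffs (a, b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print(defect_is_dominant())
    print(grim_trigger_threshold())
    print(play(tit_for_tat, tit_for_tat, 10))
    print(play(tit_for_tat, always_defect, 10))
```

With these assumed payoffs, two tit-for-tat players cooperate throughout, while tit-for-tat loses only the first round to an always-defector. That is the kind of pairwise result a population-level simulation would build on.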

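A simple case in which 4. (the unknown but finite horizon) reduces to the infinitely repeated game (Repeated Game Version #2) is when the probability of the game ending in period t, given that it hasn't ended by period t-1, is some constant, say delta. Then if agents' true discount factor is beta, they will behave as if they were playing an infinitely repeated game with discount factor beta*(1-delta), since each further round is reached only with probability 1-delta.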
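The reduction can be written out in one line. This is only a sketch under stated assumptions: $u_t$ (the stage payoff in round $t$) and $p$ (the constant probability that the game continues for one more round, i.e. one minus the per-round ending probability) are my own notation, and I assume the first round, $t = 0$, is always played, so round $t$ is reached with probability $p^t$.

$$
\mathbb{E}\left[\sum_{t=0}^{\infty} \beta^{t} u_t \, \mathbf{1}\{\text{round } t \text{ is reached}\}\right]
= \sum_{t=0}^{\infty} \beta^{t} p^{t} u_t
= \sum_{t=0}^{\infty} (\beta p)^{t} u_t
$$

Under these assumptions the agent's objective is the same as in an infinitely repeated game with discount factor $\beta p$, so a higher chance of the game ending works exactly like greater impatience.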
