Agent Strategies and the Shadow of the Future

Tit-for-Tat vs. All-Defection

Note: to run the simulation referred to in this slide, use the Java Applet version of the tutorial. You will be directed to download the latest version of the Java plug-in.

In the graphic to the left, we set up two agents that are engaged in a series of Iterated Prisoner's Dilemma games. Press "Go" several times to advance the Society by several moves. During each move several things happen:

1. Each agent chooses to either cooperate or defect, and sends the corresponding signal to her partner.
2. Each agent receives the cooperation/defection signal from the partner, and records it in her history of moves.
3. The total payoff of each agent gets updated. The payoff is indicated both by the size of the agent's circle, and by the number under it.
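The three steps above can be sketched in a few lines of Python. This is only an illustration, not the applet's actual code: the names Agent, play_move and PAYOFF are hypothetical, and the mutual-cooperation payoff of 3 is an assumed, conventional value (the payoffs 5, 1 and 0 are the ones used later in this slide).

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Points earned for (my_move, partner_move). "C" = cooperate, "D" = defect.
# The value 3 for mutual cooperation is an assumption, not taken from this slide.
PAYOFF = {("D", "C"): 5, ("C", "C"): 3, ("D", "D"): 1, ("C", "D"): 0}


@dataclass
class Agent:
    # A strategy maps the partner's recorded moves to "C" or "D".
    strategy: Callable[[List[str]], str]
    history: List[str] = field(default_factory=list)  # partner's past moves
    payoff: int = 0


def play_move(a: Agent, b: Agent) -> None:
    # Step 1: each agent chooses a signal from what she has seen so far.
    move_a = a.strategy(a.history)
    move_b = b.strategy(b.history)
    # Step 2: each agent records the signal received from her partner.
    a.history.append(move_b)
    b.history.append(move_a)
    # Step 3: each agent's total payoff is updated.
    a.payoff += PAYOFF[(move_a, move_b)]
    b.payoff += PAYOFF[(move_b, move_a)]


# Placeholder strategies, used only to exercise the mechanics of one move.
cooperator = Agent(strategy=lambda history: "C")
defector = Agent(strategy=lambda history: "D")
play_move(cooperator, defector)
print(cooperator.payoff, defector.payoff)
```

Keeping the strategy as a plain function of the recorded history means the move mechanics stay the same no matter which strategy an agent uses.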

One of the agents employs the All-Defection (All-D) strategy function, which is indicated by her red color. The other one, however, uses the Tit-for-Tat strategy function, shown by her blue color. Throughout the rest of this tutorial we will continue to associate a particular strategy used by the agent with a particular color.

While the two agents are connected by a cyan line, they are interacting within a single instance of an Iterated Prisoner's Dilemma. When the cyan line breaks off, it means that the current instance of Iterated PD is over, and the next move will start a new instance. Here we have set the "shadow of the future" parameter δ to 1/2. Recall from the previous slides that δ is the probability that a given instance of Iterated PD continues after any given move. Indeed, if you were to track the fraction of instances in which the cyan line breaks off right after the first move, you would find that it approaches 1 − δ = 50%.
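The 50% figure can be checked with a small Monte Carlo sketch. It assumes only what is stated above, namely that an instance continues after each move with probability δ; the function names and the fixed random seed are illustrative choices, not part of the applet.

```python
import random


def instance_length(delta: float, rng: random.Random) -> int:
    """Number of moves in one instance of Iterated PD."""
    moves = 1  # the first move is always played
    # After every move, the instance continues with probability delta.
    while rng.random() < delta:
        moves += 1
    return moves


def summarize(delta: float, instances: int, seed: int = 0):
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    lengths = [instance_length(delta, rng) for _ in range(instances)]
    ended_after_first = sum(1 for n in lengths if n == 1) / instances
    mean_length = sum(lengths) / instances
    return ended_after_first, mean_length


fraction, mean_length = summarize(delta=0.5, instances=100_000)
print(f"ended after first move: {fraction:.3f}, mean length: {mean_length:.3f}")
```

With δ = 1/2 the first number should settle near 0.5 and the mean length near 1/(1 − δ) = 2 moves, which is why a new instance starts "every couple of moves".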

The two strategies are non-probabilistic, and thus the sequence of moves and associated payoffs is the same within each instance of Iterated PD:

1. On the first move, Tit-for-Tat cooperates, but All-D defects. As a result, All-D gets the maximal possible payoff of 5 points, while Tit-for-Tat gets the "sucker's payoff" of 0 points.
2. On the second move, Tit-for-Tat retaliates, and defects in response to All-D's last move. All-D defects again. As a result, both agents get only one point each, the punishment for mutual defection.
3. All subsequent moves and payoffs are the same as those of step 2 above: both agents keep defecting.
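The sequence above can be reproduced with a short sketch of the two strategy functions. The function names are illustrative; the payoffs 5, 0 and 1 are the ones quoted in the list, while the mutual-cooperation entry (3) is an assumed value that never comes up in this pairing.

```python
# Points earned for (my_move, partner_move). "C" = cooperate, "D" = defect.
# The ("C", "C") entry is an assumption; it is never reached in this match.
PAYOFF = {("D", "C"): 5, ("C", "C"): 3, ("D", "D"): 1, ("C", "D"): 0}


def tit_for_tat(partner_history):
    # Cooperate when there is no prior information, otherwise copy the
    # partner's most recent move.
    return partner_history[-1] if partner_history else "C"


def all_d(partner_history):
    # Defect no matter what the partner has done.
    return "D"


def play_instance(strategy_a, strategy_b, moves):
    """Return the per-move payoffs (a, b) for one instance of Iterated PD."""
    seen_by_a, seen_by_b = [], []
    payoffs = []
    for _ in range(moves):
        a = strategy_a(seen_by_a)
        b = strategy_b(seen_by_b)
        seen_by_a.append(b)
        seen_by_b.append(a)
        payoffs.append((PAYOFF[(a, b)], PAYOFF[(b, a)]))
    return payoffs


print(play_instance(tit_for_tat, all_d, 4))
# [(0, 5), (1, 1), (1, 1), (1, 1)]
```

Each pair lists Tit-for-Tat's payoff first. Because both histories start empty in every new instance, the same sequence repeats each time the cyan line breaks off.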

In a sense, every couple of moves Tit-for-Tat forgets just how badly All-D has treated her in the past: a new instance of Iterated PD starts, and All-D gets another chance to take advantage of Tit-for-Tat's propensity to cooperate when she doesn't have any prior information. Tit-for-Tat could definitely benefit from a better memory in this case, that is, from a higher δ, which would make each instance last longer.
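To see how much a higher δ would help, here is a rough calculation under the assumptions of this slide: the first move of every instance pays 0 to Tit-for-Tat and 5 to All-D, every later move pays 1 to each, and the expected number of later moves is δ/(1 − δ). The function name is hypothetical, and the per-move figure is a long-run average over many instances.

```python
from fractions import Fraction


def expected_payoffs(delta):
    """Expected payoffs of Tit-for-Tat and All-D for a given delta < 1."""
    delta = Fraction(delta)
    if not 0 <= delta < 1:
        raise ValueError("delta must satisfy 0 <= delta < 1")
    later_moves = delta / (1 - delta)  # expected moves after the first one
    moves = 1 + later_moves            # expected length of an instance
    tft_total = 0 + 1 * later_moves    # sucker's payoff, then 1 per move
    all_d_total = 5 + 1 * later_moves  # temptation payoff, then 1 per move
    return {
        "per_instance": (tft_total, all_d_total),
        "per_move": (tft_total / moves, all_d_total / moves),
    }


for delta in (Fraction(1, 2), Fraction(9, 10), Fraction(99, 100)):
    result = expected_payoffs(delta)
    print(delta, result["per_instance"], result["per_move"])
```

At δ = 1/2 this gives 1 point per instance for Tit-for-Tat against 6 for All-D, or 1/2 against 3 per move. As δ approaches 1, both per-move averages approach 1, so the single exploited first move matters less and less.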