The Iterated Prisoner's Dilemma

Introduction to the Prisoner's Dilemma

If we are to discuss the iterated Prisoner's Dilemma, we must first discuss the game with a single iteration. In a single iteration, you and another convict each have 2 choices available, and there are 4 possible outcomes. For simplicity, we will name the prisoners Cain and Abel, and the options Betray and Cooperate. If Cain betrays and Abel cooperates, Cain goes free and Abel serves 3 years (and the reverse if Abel betrays while Cain cooperates). If both betray, they both serve 2 years. If both cooperate, they both serve a year. I will present this with the tree below, where the numbers are years served.


Cain             Abel
--------------------------+
          /--> Cooperates | C=1, A=1
Cooperates                |
          \--> Betrays    | C=3, A=0
                          |
     /-------> Cooperates | C=0, A=3
Betray                    |
     \-------> Betrays    | C=2, A=2
--------------------------+

From the above, we can see that whatever Abel does, Cain gets a lower score (lower is better in this instance) by betraying. If Abel cooperates, Cain serves 1 year by cooperating but 0 by betraying. If Abel betrays, Cain serves 3 years by cooperating but only 2 by betraying. As betrayal is the better option in both cases, it is what is referred to as a dominant strategy when taken from a purely rational perspective. Of course, a real situation would usually be a more complex moral and ethical problem (like the trolley problem), and the context of who is who does come up, but this is beside the point of this discussion.
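The dominance argument can be checked mechanically. What follows is only a minimal sketch: the names `YEARS`, `cain_years` and `betray_dominates` are made up for illustration, and the numbers are copied from the tree above ("C" for Cooperate, "B" for Betray).

```python
# Years served, indexed as YEARS[(cain_move, abel_move)] -> (cain_years, abel_years).
# "C" = Cooperate, "B" = Betray; lower is better. Values copied from the tree.
YEARS = {
    ("C", "C"): (1, 1),
    ("C", "B"): (3, 0),
    ("B", "C"): (0, 3),
    ("B", "B"): (2, 2),
}


def cain_years(cain_move, abel_move):
    """Years Cain serves for one pair of moves."""
    return YEARS[(cain_move, abel_move)][0]


def betray_dominates():
    """True if betraying is strictly better for Cain whatever Abel does."""
    return all(
        cain_years("B", abel_move) < cain_years("C", abel_move)
        for abel_move in ("C", "B")
    )


print(betray_dominates())  # True
```

Because the table is symmetric, the same check holds for Abel.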

Introduction to the Iterated Prisoner's Dilemma

In the iterated Prisoner's Dilemma, we play this game multiple times, with both Cain and Abel able to remember all the outcomes from before. The final outcome is the sum of the scores across all iterations. If Cain and Abel always cooperate, this is how the running totals will look. I will play 3 iterations here (i = 0 to 2); you are free to extend it yourself:

i | Cain | Abel
--+------+-----
0 | 1    | 1
1 | 2    | 2
2 | 3    | 3

As we can see, they both end up with 3. Now let's compare this with always betraying.

i | Cain | Abel
--+------+-----
0 | 2    | 2
1 | 4    | 4
2 | 6    | 6

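As we can clearly see, in an iterated Prisoner's Dilemma it pays to be nice: mutual cooperation leaves each player with 3 years instead of 6. However, note what happens when one always cooperates and one always betrays (here Cain betrays and Abel cooperates).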

i | Cain | Abel
--+------+-----
0 | 0    | 3
1 | 0    | 6
2 | 0    | 9
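The three tables above can be reproduced with a few lines of Python. This is a rough sketch rather than a full simulator: `YEARS` and `running_totals` are illustrative names, and each player's moves are simply written out as a string of "C" (Cooperate) and "B" (Betray).

```python
from itertools import accumulate

# Years served per round: YEARS[(cain_move, abel_move)] -> (cain_years, abel_years).
YEARS = {
    ("C", "C"): (1, 1),
    ("C", "B"): (3, 0),
    ("B", "C"): (0, 3),
    ("B", "B"): (2, 2),
}


def running_totals(cain_moves, abel_moves):
    """Cumulative years served by each player after every iteration."""
    per_round = [YEARS[pair] for pair in zip(cain_moves, abel_moves)]
    cain = list(accumulate(c for c, _ in per_round))
    abel = list(accumulate(a for _, a in per_round))
    return cain, abel


print(running_totals("CCC", "CCC"))  # ([1, 2, 3], [1, 2, 3])
print(running_totals("BBB", "BBB"))  # ([2, 4, 6], [2, 4, 6])
print(running_totals("BBB", "CCC"))  # ([0, 0, 0], [3, 6, 9])
```

Lengthening the move strings extends the game past iteration 2.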

It appears that blind optimism that your opponent will always cooperate is faulty. Thus we can conclude a few things about behaving in an iterated dilemma.

There are several more complex strategies a player can apply: Tit-For-Tat (originally called Tip-For-Tap), Tit-For-Tat with forgiveness, Tit-For-Two-Tats and Grim Trigger. All 4 of these fall under the same umbrella of "Trigger Strategies": if a certain trigger is observed, they change behaviour. The simplest to demonstrate is Grim Trigger, which cooperates until the other player betrays once and then betrays for the rest of the game. Cain will demonstrate. On iteration 2 Abel will betray, and from then on Cain will betray for the rest of the game, even if Abel cooperates afterwards.

i | Cain | Abel
--+------+-----
0 | 1    | 1
1 | 2    | 2
2 | 5    | 2    <= Abel betrays, Cain still cooperates
3 | 7    | 4    <= Cain is triggered, both betray
4 | 7    | 7    <= Abel cooperates, note that Cain still betrays
5 | 9    | 9    <= Abel returns to betraying, Cain continues betraying
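Here is a minimal sketch of Grim Trigger played against a scripted Abel, scored with the payoffs from the tree at the start (cooperating against a betrayal costs 3 years). The names `grim_trigger` and `play_against_script` are hypothetical, and Abel's script "CCBBCB" is one reading of the arrows in the table: betray on iterations 2 and 3, cooperate on 4, betray on 5.

```python
# Years served per round: YEARS[(cain_move, abel_move)] -> (cain_years, abel_years).
YEARS = {
    ("C", "C"): (1, 1),
    ("C", "B"): (3, 0),
    ("B", "C"): (0, 3),
    ("B", "B"): (2, 2),
}


def grim_trigger(opponent_history):
    """Cooperate until the opponent has betrayed once, then betray forever."""
    return "B" if "B" in opponent_history else "C"


def play_against_script(strategy, abel_script):
    """Play `strategy` as Cain against a fixed string of Abel's moves."""
    cain_moves, totals = [], []
    cain_total = abel_total = 0
    for i, abel_move in enumerate(abel_script):
        cain_move = strategy(abel_script[:i])  # Cain only sees earlier rounds
        cain_years, abel_years = YEARS[(cain_move, abel_move)]
        cain_total += cain_years
        abel_total += abel_years
        cain_moves.append(cain_move)
        totals.append((cain_total, abel_total))
    return "".join(cain_moves), totals


moves, totals = play_against_script(grim_trigger, "CCBBCB")
print(moves)   # CCCBBB
print(totals)  # [(1, 1), (2, 2), (5, 2), (7, 4), (7, 7), (9, 9)]
```

Cain's moves never return to "C" after iteration 2, which is the "grim" part.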

Notice one thing about Grim Trigger. Between 2 people, once one betrays and the Grim Trigger is triggered, that player has no reason to cooperate again because, if we recall, against a betrayer it is a choice between adding 3 (cooperating) or adding 2 (betraying) to their own score.

I will now demonstrate Tit-For-Tat. The strategy is simple: initially cooperate, and then repeat your partner's last move. We will repeat the above game with both Cain and Abel playing Tit-For-Tat, except that on the 3rd iteration (i = 2) I will make Abel betray, to demonstrate the behaviour.

i | Cain | Abel
--+------+-----
0 | 1    | 1
1 | 2    | 2
2 | 5    | 2    <= Abel Betrays
3 | 5    | 5    <= Cain betrays, Abel Cooperates
4 | 8    | 5    <= Cain Cooperates, Abel Betrays

We can observe that a sort of loop begins to emerge when two Tit-For-Tat players face each other. From this we can see that, whether it was an honest mistake or a genuine attack, the retaliation ends up going back and forth, doing neither side any good (they each gain 3 score over every 2 iterations, compared with always cooperating, which would gain 2 score over 2 iterations).
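The echo between two Tit-For-Tat players can be reproduced with a small sketch. The helper names (`tit_for_tat`, `play`) and the `abel_forced` argument, which imposes a move on Abel at a given iteration, are assumptions made for this illustration.

```python
# Years served per round: YEARS[(cain_move, abel_move)] -> (cain_years, abel_years).
YEARS = {
    ("C", "C"): (1, 1),
    ("C", "B"): (3, 0),
    ("B", "C"): (0, 3),
    ("B", "B"): (2, 2),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def play(cain_strategy, abel_strategy, rounds, abel_forced=None):
    """Play two strategies; `abel_forced` maps iteration -> move imposed on Abel."""
    abel_forced = abel_forced or {}
    cain_hist, abel_hist = "", ""
    cain_total = abel_total = 0
    for i in range(rounds):
        cain_move = cain_strategy(abel_hist)
        abel_move = abel_forced.get(i, abel_strategy(cain_hist))
        cain_years, abel_years = YEARS[(cain_move, abel_move)]
        cain_total += cain_years
        abel_total += abel_years
        cain_hist += cain_move
        abel_hist += abel_move
    return cain_hist, abel_hist, (cain_total, abel_total)


print(play(tit_for_tat, tit_for_tat, 5, abel_forced={2: "B"}))
# ('CCCBC', 'CCBCB', (8, 5))
```

From iteration 2 onward the two move strings alternate out of phase, which is the loop seen in the table.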

This leads us to the 3rd principle in the iterated dilemma

The way we introduce this forgiveness can vary. Some people model it as a random percentage chance that a Tit-For-Tat player who would retaliate instead chooses to forgive and cooperate. Others simply model it as Tit-For-Two-Tats: 2 betrayals in a row must be made before the player begins betraying. Both break the cycle of betrayal, replacing it with more cooperation, but they fare poorly against always betraying.
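To make the trade-off concrete, here is a sketch of the Tit-For-Two-Tats variant only (the random-forgiveness variant is left out so that the output stays deterministic). All function names are illustrative, and `abel_forced` is the same assumed device for imposing a single betrayal on Abel.

```python
# Years served per round: YEARS[(cain_move, abel_move)] -> (cain_years, abel_years).
YEARS = {
    ("C", "C"): (1, 1),
    ("C", "B"): (3, 0),
    ("B", "C"): (0, 3),
    ("B", "B"): (2, 2),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def tit_for_two_tats(opponent_history):
    """Betray only after the opponent has betrayed twice in a row."""
    return "B" if opponent_history[-2:] == "BB" else "C"


def always_betray(opponent_history):
    """Betray no matter what has happened so far."""
    return "B"


def play(cain_strategy, abel_strategy, rounds, abel_forced=None):
    """Play two strategies; `abel_forced` maps iteration -> move imposed on Abel."""
    abel_forced = abel_forced or {}
    cain_hist, abel_hist = "", ""
    cain_total = abel_total = 0
    for i in range(rounds):
        cain_move = cain_strategy(abel_hist)
        abel_move = abel_forced.get(i, abel_strategy(cain_hist))
        cain_years, abel_years = YEARS[(cain_move, abel_move)]
        cain_total += cain_years
        abel_total += abel_years
        cain_hist += cain_move
        abel_hist += abel_move
    return cain_hist, abel_hist, (cain_total, abel_total)


# A single betrayal by Abel is absorbed instead of echoing back and forth.
print(play(tit_for_two_tats, tit_for_tat, 5, abel_forced={2: "B"}))
# ('CCCCC', 'CCBCC', (7, 4))

# Against a constant betrayer, the extra patience costs Cain a year.
print(play(tit_for_two_tats, always_betray, 5))  # ('CCBBB', 'BBBBB', (12, 6))
print(play(tit_for_tat, always_betray, 5))       # ('CBBBB', 'BBBBB', (11, 8))
```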

There is a 4th principle.

This can be demonstrated with Suspicious Tit-For-Tat, which starts out betraying and otherwise plays Tit-For-Tat. You can work through it yourself, but you will see that a betrayal on the first move is enough to trigger bad responses. Two of these playing each other betray forever, and against a Tit-For-Tat player the betrayal loops back and forth.
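For anyone who would rather run it than work through it by hand, a short sketch follows. The names `suspicious_tit_for_tat` and `play` are invented for this example, and the payoffs are the ones from the tree at the start.

```python
# Years served per round: YEARS[(cain_move, abel_move)] -> (cain_years, abel_years).
YEARS = {
    ("C", "C"): (1, 1),
    ("C", "B"): (3, 0),
    ("B", "C"): (0, 3),
    ("B", "B"): (2, 2),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def suspicious_tit_for_tat(opponent_history):
    """Betray first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "B"


def play(cain_strategy, abel_strategy, rounds):
    """Return both move histories and the final (cain, abel) totals."""
    cain_hist, abel_hist = "", ""
    cain_total = abel_total = 0
    for _ in range(rounds):
        cain_move = cain_strategy(abel_hist)
        abel_move = abel_strategy(cain_hist)
        cain_years, abel_years = YEARS[(cain_move, abel_move)]
        cain_total += cain_years
        abel_total += abel_years
        cain_hist += cain_move
        abel_hist += abel_move
    return cain_hist, abel_hist, (cain_total, abel_total)


print(play(suspicious_tit_for_tat, suspicious_tit_for_tat, 4))
# ('BBBB', 'BBBB', (8, 8))
print(play(suspicious_tit_for_tat, tit_for_tat, 4))
# ('BCBC', 'CBCB', (6, 6))
print(play(tit_for_tat, tit_for_tat, 4))
# ('CCCC', 'CCCC', (4, 4))
```

Over 4 iterations the opening betrayal costs each player 2 to 4 extra years compared with two plain Tit-For-Tat players.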

Damnant quod non intellegunt (They condemn what they do not understand)

From the above one would think that Tit-For-Tat is the optimal strategy. However, if we introduce a population in which every unit follows its own strategy and they all face off against each other, we will see it isn't always the optimal strategy. I won't cover this here because it is more in-depth and detailed, but look at the references section if you want more information. In one such tournament, held for the 20th anniversary, a Southampton team showed that although Tit-For-Tat works and is robust, it is not always the optimal strategy. How? They... colluded. The rules of the dilemma disallow communication, but the different strategies entered by that team used a ten-turn song and dance as communication. (One can think of this much like binary, where betrayals are 1 and cooperations are 0. You could probably embed other kinds of communication this way, now that I think about it.) Regardless, despite being an illegal action, beating Tit-For-Tat technically proves it's not always the optimal strategy. Thus, I come to a 5th key principle.
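The binary analogy can be sketched in a few lines. To be clear, this is not the Southampton team's actual protocol, which is not described here; `encode_moves` and `decode_moves` are hypothetical helpers that only show how a number could be hidden in a run of ten moves, with "B" (Betray) as 1 and "C" (Cooperate) as 0.

```python
def encode_moves(value, turns=10):
    """Write `value` as `turns` moves, most significant bit first."""
    if not 0 <= value < 2 ** turns:
        raise ValueError("value does not fit in the given number of turns")
    bits = format(value, f"0{turns}b")  # zero-padded binary string
    return bits.replace("1", "B").replace("0", "C")


def decode_moves(moves):
    """Recover the number from a string of 'B' and 'C' moves."""
    return int(moves.replace("B", "1").replace("C", "0"), 2)


signal = encode_moves(5)
print(signal)                # CCCCCCCBCB
print(decode_moves(signal))  # 5
```

Ten moves give 1024 distinct signals, which is plenty for two colluding players to recognise each other.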

What I have listed above will probably strike a reader as relatively obvious and stemming from first principles. However, I think it's worth remembering the benefits of what I have listed, together with Tit-For-Tat. I would argue that, for robustness, it tends to help to follow Tit-For-Tat in your actions and to communicate, and unfortunately this is exactly what we don't see in the modern world: communication is breaking down, ironically enough, because of communication devices. Perhaps that is a discussion for another time, about how social control from social media tends to lead people to follow poor strategies. I would argue the strategies followed by the masses these days are based on envy-based triggers, driven by measurements of how liked or disliked a person or their comments are.

Because this topic is non-trivial, there are many more strategies, such as Win-Stay, Lose-Shift and Gradual Tit-For-Tat. I have thus provided references. I hope this brief look at Game Theory and the Prisoner's Dilemma has been of use to some people.

P.S.

It's also worth noting that the final iteration of a given game is effectively the single-iteration dilemma, since no later round is left in which to retaliate. This is worth considering. There is also a continuous iterated Prisoner's Dilemma, where the choice is not a discrete yes or no but something more complicated, as a player can contribute as much as they want.

References:

The Evolution of Trust
The Prisoner's Dilemma on the Stanford Encyclopedia of Philosophy
The Prisoner's Dilemma on Wikipedia
Trigger Strategies on Wikipedia