Tit For Tat

Rahul Soans
10 min read · Jan 30, 2024

We are all Strategists

Humans are messy. We are not the paragons of reason we think we are. We are driven by emotion, make terrible choices and (unfortunately for economists and strategists) don’t conform to neat, manicured prescriptions. The field of strategy has tried valiantly to predict behaviour, with whole libraries dedicated to theories, methods and templates, with varying degrees of success. Strategy, if nothing else, is a vector for decision making, a way of seeing the forest for the trees. And our lives are a constant stream of decisions. Never mind military campaigns, political initiatives or corporate boardrooms; we constantly grapple with whether to buy a house, start a business or change jobs. The common denominator in most decisions is that we don’t live in a vacuum. We are surrounded by other pesky humans whose equally pesky decisions and choices interact with ours. Strategic thinking, since its origins, has come into play when interests collide and there is potential for conflict. It has been described as ‘the art of outdoing an adversary, knowing that the adversary is trying to do the same to you’. However, our civilisation is based on our ability to cooperate; as a species we have come a long way because of it. Cooperation is easy if people act unselfishly or work together to further their own tribe. The problem of cooperation is getting people to cooperate when they have an incentive to be selfish. This essay is about how a political scientist, Robert Axelrod, took the tools initially developed to vanquish an adversary and built a framework for cooperation.

Game Theory and the Prisoner’s Dilemma

Game Theory was pioneered by the polymath John von Neumann, regarded as one of the fathers of computer science. Von Neumann was rational; according to his daughter, his lifelong desire was to impose order and rationality on an inherently disorderly and irrational world. Game theory sprang from von Neumann’s urge to find neat mathematical solutions to knotty real-world problems during one of the most ‘disorderly and irrational’ periods in human history: WWII. After the publication of his landmark paper ‘On the Theory of Parlour Games’, he succeeded in establishing game theory as the discipline framing human cooperation and conflict in truly mathematical terms. As such, the answers of game theory could seem cold, insipid and devoid of the colour of human emotion, but effective nonetheless. However, the limitation of von Neumann’s game theory was that its analysis focused on two players and ‘zero-sum’ payoffs, meaning that what one player won the other must lose. What was needed was to move beyond this limitation and explore non-zero-sum games, in which the players could all gain or all lose depending on how the game was played. Enter the Prisoner’s Dilemma: a devious little game that set the standard for cooperation theory for decades, as well as for countless police procedurals. Here’s how it works:

Two members of a gang, A and B, are arrested. Prosecutors lack evidence to convict them of a major crime but can get them on a lesser charge for which they will each serve a year in prison. A and B can’t communicate with each other. A and B have two choices: Cooperate (refuse to inform) or Defect (be a snitch). Each must make the choice without knowing what the other will do. Prosecutors offer each a deal: inform on the other and your sentence is reduced. There are four possible outcomes:

Both A and B refuse to inform on each other: each serves one year

Both A and B inform on each other: each serves two years

A informs on B, who remains silent: A walks free and B serves three years

B informs on A, who remains silent: B walks free and A serves three years

The prisoner’s dilemma is whether to be loyal to your partner (cooperate) or betray him (defect).

If you play the Prisoner’s Dilemma once, there is a rational choice. Whatever the other prisoner does, you are better off defecting: if he cooperates, defecting gets you zero years instead of one; if he defects, defecting gets you two years instead of three. Averaged over the other player’s choices, cooperating costs you two years and defecting costs you one. So you should always defect. In a single-round version of the Prisoner’s Dilemma it is always optimal to defect. Not exactly a glowing view of human nature.
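The arithmetic can be checked with a tiny sketch. This is just the sentence table from the story written out as a Python dictionary (the names `SENTENCE`, `"C"` and `"D"` are illustrative, not from any library):

```python
# A's sentence in years, indexed by (A's move, B's move):
# "C" = cooperate (stay silent), "D" = defect (inform).
SENTENCE = {
    ("C", "C"): 1,  # both stay silent: one year each
    ("C", "D"): 3,  # A silent, B informs: A serves three years
    ("D", "C"): 0,  # A informs, B silent: A walks free
    ("D", "D"): 2,  # both inform: two years each
}

# Whichever move B makes, A's sentence is strictly shorter if A defects:
for b_move in ("C", "D"):
    assert SENTENCE[("D", b_move)] < SENTENCE[("C", b_move)]

print("Defection strictly dominates cooperation in the one-shot game.")
```

This is what game theorists mean by a dominant strategy: defecting is better row by row, not just on average.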

How about if the game is played twice? It is still optimal to defect in each round. Three times? Still defect. In fact, if there is a known number of rounds then there is no incentive for players to cooperate. This is certainly true of the last round, as there isn’t a future to influence; and once the last round is bound to be mutual defection, the same logic unravels the second-to-last round, and so on all the way back to the first.

But what if the number of rounds is unknown (an ‘iterated’ Prisoner’s Dilemma)? Things get interesting.

The Tournament

Enter Robert Axelrod, a political scientist at the University of Michigan. In 1980, he explained to his colleagues how the Prisoner’s Dilemma worked and asked them what strategy they would use in a game with an unknown number of rounds. He then organized a tournament and solicited programs from around the world. Fourteen entries came in for the first round, submitted by leading game theorists and experts across disciplines ranging from psychology and mathematics to political science. The strategies varied enormously, some breathtakingly complicated. Axelrod programmed the various strategies and pitted them against each other in a massive round-robin tournament.

So which strategy won? Drumroll….

It was submitted by a mathematician at the University of Toronto named Anatol Rapoport. And like a true underdog-comes-from-behind mythic tale, it was the simplest strategy submitted, consisting of just four lines of code. The name of the program was Tit For Tat.

As mentioned, the way it worked was simple: always cooperate in the first round; after that, do whatever the other player did in the previous round. So you cooperate in the first round, and if the other player cooperates you hold hands and cooperate till death do you part. Suppose the other player cooperates, then in the 7th round gets cold feet and defects; you take a hit, so you Tit for Tat her and punish her in the next round. If in the following rounds she starts to cooperate again, so do you, and peace and rainbows return. Suppose you play against someone who always defects: you cooperate in the first round and then defect until the end.
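The rule above really does fit in a few lines. Here is a minimal Python sketch; the `play` helper and the `always_defect` opponent are illustrative additions of mine, not part of Rapoport’s original program:

```python
def tit_for_tat(my_history, their_history):
    """Cooperate first; after that, copy the opponent's previous move."""
    if not their_history:
        return "C"               # always cooperate in the first round
    return their_history[-1]     # then mirror whatever they did last

def play(strategy_a, strategy_b, rounds):
    """Run two strategies against each other; return their move histories."""
    hist_a, hist_b = [], []
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        hist_a.append(move_a)
        hist_b.append(move_b)
    return hist_a, hist_b

always_defect = lambda mine, theirs: "D"

# Against a relentless defector, Tit For Tat cooperates once, then retaliates:
moves, _ = play(tit_for_tat, always_defect, 5)
print(moves)  # ['C', 'D', 'D', 'D', 'D']
```

Note that Tit For Tat never strikes first; every defection it plays is an echo of one it just received.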

A peculiarity is that Tit for Tat can never beat an opponent head to head. Playing against a similar strategy, the best it can do is a draw; otherwise it always loses by a small margin. But the other strategies, playing against each other, eventually racked up catastrophic losses, and at the end of the day, when everything was summed, Tit for Tat came out on top. In other words, it lost every battle but won the war. It has four things going for it:

Avoidance of unnecessary conflict by cooperating as long as the other player does

Provocability in the face of an uncalled for defection by another

Forgiveness after responding to a provocation

Clarity of behaviour so the other player can adapt to your pattern of action
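A toy tournament makes the ‘lost every battle, won the war’ result concrete. The sketch below is not Axelrod’s actual tournament: the five strategies are my own illustrative picks, matches are a fixed 200 rounds, and only the standard payoff values from Axelrod’s setup (3 for mutual cooperation, 1 for mutual defection, 5 for a lone defector, 0 for the sucker) are taken from the source material:

```python
from itertools import combinations

# Payoffs (my score, their score): mutual cooperation 3 each, mutual
# defection 1 each, a lone defector gets 5 and the lone cooperator 0.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def match(strat_a, strat_b, rounds=200):
    """Play two strategies against each other; return their total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strat_a(hist_a, hist_b)
        b = strat_b(hist_b, hist_a)
        pa, pb = PAYOFF[(a, b)]
        score_a += pa
        score_b += pb
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

strategies = {
    "TitForTat":     lambda mine, theirs: theirs[-1] if theirs else "C",
    "SuspiciousTFT": lambda mine, theirs: theirs[-1] if theirs else "D",
    "Grudger":       lambda mine, theirs: "D" if "D" in theirs else "C",
    "AlwaysDefect":  lambda mine, theirs: "D",
    "Alternator":    lambda mine, theirs: "C" if len(mine) % 2 == 0 else "D",
}

# Round robin: every pairing plays once; totals accumulate across matches.
totals = {name: 0 for name in strategies}
for name_a, name_b in combinations(strategies, 2):
    score_a, score_b = match(strategies[name_a], strategies[name_b])
    totals[name_a] += score_a
    totals[name_b] += score_b

for name, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{name:>13} {total}")
```

In this particular lineup Tit For Tat tops the summed totals even though it never out-scores any single opponent in their head-to-head match; it merely ties the nice strategies and loses narrowly to the nasty ones, which then bleed points against each other.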

A second round of the tournament was conducted with many more entries from amateurs and professionals alike, all of whom were aware of the results of the first round. The result was another victory for Tit For Tat. What this tournament proved, at least analytically, was that under suitable conditions cooperation can emerge in a world without central authority, even if everyone is just looking out for themselves.

That’s great for the world of computer programs and games, but what about real life?

Live and Let Live in the Trenches

Trench warfare during WWI was perhaps one of the more ghastly situations men have been subjected to. French (and then British) and German armies faced each other along 500 miles of fortified trenches for years without making significant advances. All they faced was endless cold, wet and bloody months. However, in the midst of this ridiculous conflict, peculiar behaviours started to emerge.

What made trench warfare so different from most other combat was that the same small units faced each other in immobile sectors for extended periods of time. This changed the game from a one-move Prisoner’s Dilemma, in which ‘defection’ is the dominant choice, to an iterated Prisoner’s Dilemma, in which other strategies are possible. So if you are in a trench for months at a time, facing an enemy in the exact same situation as you, what do you do?

Well, it’s a war, so doing damage to the enemy is important, and it would be suicide to hold your fire when the enemy is firing at you. However, given the conditions of trench warfare (this isn’t a one-time battle; the same units will face each other for months at a time), the reward of mutual restraint (staying alive) is preferable to mutual punishment, since with mutual punishment both units suffer for little or no relative gain.

The actual result fell in lockstep with the theory’s predictions: front-line soldiers on both sides often refrained from shooting to kill. In fact, both sides made a big show of ‘engaging’ the enemy without causing any harm. They would shoot at targets way off; German and British units would shoot at specific times of day and refrain from firing during meal times. So regular were the Germans in their choice of targets, times of shooting and number of rounds fired that, after being in the front line for a day or two, the Brits had discovered their system and knew to a minute where the next shell would fall. These rituals of perfunctory and routine firing sent a double message: to the high command they conveyed aggression, but to the enemy they conveyed peace. The men pretended to be implementing an aggressive policy, but were not. In short, they violated orders from their own high commands in order to achieve tacit cooperation with each other. This ‘live and let live’ system flourished despite the best efforts of senior officers to stop it, despite the passions aroused by combat, despite the military logic of kill or be killed, and despite the ease with which the high command was able to repress any local efforts to arrange a direct truce.

However when a defection actually occurred, the retaliation was often more than would be called for by TIT FOR TAT. Two-for-one or three-for-one was a common response to an act that went beyond what was considered acceptable.

Cooperation got a foothold through exploratory actions at the front-line level, was able to sustain itself because of the duration of contact between small units facing each other, and was eventually undermined when these small units lost their freedom of action (i.e. when the top brass ordered raids into enemy camps). The soldiers understood the indirect consequences of their acts, as embodied in what Axelrod calls the echo principle: “To provide discomfort for the other is but a roundabout way of providing it for themselves.”

So both cooperation and defection were self-reinforcing. And revenge evoked revenge.

Prisoner’s Dilemma in the Workplace

The prisoner’s dilemma has some interesting workplace implications. Firstly, within organizations there are situations where it pays to be cooperative rather than competitive. This may seem obvious, but leaders assume that because healthy competition between companies drives a market economy, the same ethos should be reflected within companies. Leaders have been notorious for playing God (or Darwin), pitting individuals or whole departments against each other and hoping the best would win. For example, GE’s Jack Welch (labelled Manager of the Century by Fortune magazine in 1999) introduced the ‘stack ranking’ system, in which employees constantly saw themselves assessed relative to their colleagues, an approach that trickled down to leaders in other industries. If competition within an organization is the norm, then a department may feel that sharing information works against its own interests (i.e. it pays to defect). What if a ‘rival’ department didn’t reciprocate?

This had real consequences for General Motors, which in 2014 had to recall 800,000 vehicles because of a faulty ignition switch. The switch could disable the engine while the car was in motion, which in turn prevented the airbags from inflating. Understanding and correcting the issue would have been simple, were it not for the fact that airbags and ignition systems were overseen by two different teams who did not share information with each other. The silos and competitiveness within GM resulted in a failure that ended up costing lives.

Conclusion — Default to Trust

Going back to the tournament run by Robert Axelrod: the reason Tit For Tat was the dominant strategy was that it defaulted to trusting, cooperative behaviour and punished selfish behaviour. And it did not hold a grudge; the punishment lasted only as long as the selfish behaviour did. TIT FOR TAT won the tournament not by beating the other player, but by eliciting behaviour from the other player which allowed both to do well.

So in a non-zero-sum world you do not have to do better than the other player to do well for yourself. On a personal level this is especially true when you are interacting with many different people. Letting each person do the same or a little better than you is fine, as long as you tend to do well yourself. There is no point in being envious of the success of the other people, since in an iterated Prisoner’s Dilemma of long duration the other’s success is virtually a prerequisite of your doing well for yourself.

In a zero sum game like chess it pays to keep your opponent guessing. You do not want them to guess your next move. But in a non-zero-sum setting, you benefit from the other player’s cooperation. The trick is to encourage that cooperation. A good way to do it is to make it clear that you will reciprocate. Lay your cards on the table. Make your ‘strategy’ explicit. Words can help here, but as everyone knows, actions speak louder than words. That is why the easily understood actions of TIT FOR TAT are so effective.

And going back to the trenches of WWI not only did preferences affect behaviour and outcomes, but behaviour and outcomes also affected preferences. In other words when cooperation passes a certain threshold it becomes the norm.

References:

The Evolution of Cooperation by Robert Axelrod

The Art of Strategy by Dixit & Nalebuff

The Man from the Future by Ananyo Bhattacharya

Behave by Robert Sapolsky

Rahul Soans is the founder of the Disruptive Business Network, a community exploring how hard questions and bold changes lead to meaningful work. Join us here

