1. Our first agent-based evolutionary model

# 1.0. Our very first model

# 1. Goal

The goal of this section is to create our first agent-based evolutionary model in NetLogo. Being our first model, we will keep it simple; nonetheless, the model will already contain the four building blocks that define most models in agent-based evolutionary game theory, namely:

- a population of agents,
- a game that is recurrently played by the agents,
- an assignment rule, which determines how revision opportunities are assigned to agents, and
- a revision protocol, which specifies how individual agents update their (pure) strategies when they are given the opportunity to revise.

In particular, in our model the number of (individually-represented) agents in the population will be chosen by the user. These agents will repeatedly play a symmetric 2-player 2-strategy game, each time with a randomly chosen counterpart. The payoffs of the game will be determined by the user. Agents will revise their strategy with a certain probability, also to be chosen by the user. The revision protocol these agents will use is called “imitate-if-better”, which dictates that a revising agent imitates the strategy of a randomly chosen player, if this player obtained a payoff greater than the revising agent’s.

This fairly general model will allow us to explore a variety of specific questions, like the one we outline next.

# 2. Motivation. Cooperation in social dilemmas

There are many situations in life where we have the option to make a personal effort that will benefit others beyond the personal cost incurred. This type of behavior is often termed “to cooperate”, and can take myriad forms: from paying your taxes, to inviting your friends over for a home-made dinner. All these situations, where cooperating involves a personal cost but creates net social value, exhibit the somewhat paradoxical feature that individuals would prefer not to pay the cost of cooperation, but everyone prefers the situation where everybody cooperates to the situation where no one does. This counterintuitive characteristic is the defining feature of social dilemmas, and life is full of them (Dawes, 1980).

The essence of many social dilemmas can be captured by a simple 2-person game called the Prisoner’s Dilemma. In this game, the payoffs for the players are: if both cooperate, R (Reward); if both defect, P (Punishment); if one cooperates and the other defects, the cooperator obtains S (Sucker) and the defector obtains T (Temptation). The payoffs satisfy the condition T > R > P > S. Thus, in a Prisoner’s Dilemma, both players prefer mutual cooperation to mutual defection (R > P), but two motivations may drive players to behave uncooperatively: the temptation to exploit (T > R), and the fear of being exploited (P > S).

Let us see a concrete example of a Prisoner’s Dilemma. Imagine that you have $1000, which you may keep for yourself, or transfer to another person’s account. This other person faces the same decision: she can transfer her $1000 to you, or else keep it. Crucially, whenever money is transferred, the money doubles, i.e. the recipient gets $2000.

## Try to formalize this situation as a game, assuming you and the other person only care about money.

The game can be summarized using the payoff matrix in Fig. 1. To see that this game is indeed a Prisoner’s Dilemma, note that transferring the money would be what is often called “to cooperate”, and keeping the money would be “to defect”.

| | Player 2: Keep | Player 2: Transfer |
|---|---|---|
| **Player 1: Keep** | 1000 , 1000 | 3000 , 0 |
| **Player 1: Transfer** | 0 , 3000 | 2000 , 2000 |
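The entries of the payoff matrix can be rederived from the story itself. The following Python sketch (not part of the NetLogo model; the names `ENDOWMENT`, `MULTIPLIER` and `payoff` are illustrative assumptions) computes how much money each person ends up with and checks the Prisoner’s Dilemma condition T > R > P > S, taking "transfer" as cooperating and "keep" as defecting.

```python
# Illustrative sketch of the money-transfer game (names are assumptions).
ENDOWMENT = 1000   # dollars each person starts with
MULTIPLIER = 2     # transferred money doubles on arrival


def payoff(my_action, other_action):
    """Money I end up with, given both actions ('keep' or 'transfer')."""
    kept = ENDOWMENT if my_action == "keep" else 0
    received = MULTIPLIER * ENDOWMENT if other_action == "transfer" else 0
    return kept + received


actions = ["keep", "transfer"]
# Each cell holds (payoff to player 1, payoff to player 2).
table = {(a, b): (payoff(a, b), payoff(b, a)) for a in actions for b in actions}

# Prisoner's Dilemma labels: transfer = cooperate, keep = defect.
T = payoff("keep", "transfer")      # temptation
R = payoff("transfer", "transfer")  # reward
P = payoff("keep", "keep")          # punishment
S = payoff("transfer", "keep")      # sucker
is_prisoners_dilemma = T > R > P > S

for pair, cell in table.items():
    print(pair, cell)
print("T, R, P, S =", T, R, P, S, "PD:", is_prisoners_dilemma)
```

Any other endowment or multiplier greater than 1 would give different numbers but, under these assumptions, the same ordering of the four payoffs.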

To explore whether cooperation may be sustained in a simple evolutionary context, we can model a population of agents who are repeatedly matched to play the Prisoner’s Dilemma. Agents are either cooperators or defectors, but they can occasionally revise their strategy. A revising agent looks at another agent in the population and, if the observed agent’s payoff is greater than the revising agent’s payoff, the revising agent copies the observed agent’s strategy. Do you think that cooperation will be sustained in this setting? Here we are going to build a model that will allow us to investigate this question… and many others!

# 3. Description of the model

In this model, there is a population of n-of-players agents who repeatedly play a symmetric 2-player 2-strategy game. The two possible strategies are labeled 0 and 1. The payoffs of the game are determined by the user in the form of a matrix [[$A_{00}$ $A_{01}$] [$A_{10}$ $A_{11}$]], where $A_{ij}$ is the payoff that an agent playing strategy $i$ obtains when meeting an agent playing strategy $j$ ($i, j \in \{0, 1\}$).

Initially, the number of agents playing strategy 1 is a (uniformly distributed) random number between 0 and the number of players in the population. From then onwards, the following sequence of events –which defines a tick– is repeatedly executed:

- Every agent obtains a payoff by selecting another agent at random and playing the game.
- With probability prob-revision, individual agents are given the opportunity to revise their strategies. The revision rule, called **“imitate if better”**, reads as follows: Look at another (randomly selected) agent and adopt her strategy if and only if her payoff was greater than yours.

The model shows the evolution of the number of agents choosing each of the two possible strategies at the end of every tick.

# 4. Interface design

The interface (see figure 2) includes:

- Three buttons:
  - One button that runs the procedure to setup.
  - One button that runs the procedure to go once.
  - One button that runs the procedure to go indefinitely.

In the Code tab, write the procedures to setup and to go, without including any code inside for now.

    to setup
      ;; empty for now
    end

    to go
      ;; empty for now
    end

In the Interface tab, create a button and write setup in the “commands” box. This will make the procedure to setup run whenever the button is pressed.

Create another button for the procedure to go (i.e., write go in the commands box), and give it a display name that emphasizes that pressing the button will run the procedure to go just once.

Finally, create another button for the procedure to go, but this time tick the “forever” option. When pressed, this button will make the procedure to go run repeatedly until the button is pressed again.

- A slider to let the user select the number of players.

  Create a slider for global variable n-of-players. You can choose limit values 2 (as the minimum) and 1000 (as the maximum), and an increment of 1.

- An input box where the user can write a string of the form [ [$A_{00}$ $A_{01}$] [$A_{10}$ $A_{11}$] ] containing the payoffs $A_{ij}$ that an agent playing strategy $i$ obtains when meeting an agent playing strategy $j$ ($i, j \in \{0, 1\}$).

  Create an input box with associated global variable payoffs. Set the input box type to “String (reporter)”. Note that the content of payoffs will be a string (i.e. a sequence of characters) from which we will need to extract the payoff numeric values.

- A slider to let the user select the probability of revision.

  Create a slider with associated global variable prob-revision. Choose limit values 0 and 1, and an increment of 0.01.

- A plot that will show the evolution of the number of agents playing each strategy.

  Create a plot and name it Strategy Distribution. Since we are not going to use the 2D view (i.e. the large black square in the interface) in this model, you may want to overlay it with the newly created plot.

# 5. Code

## 5.1. Global variables and individually-owned variables

First we declare the global variables that we are going to use and that we have not already declared in the interface. We will be using a global variable named payoff-matrix to store the payoff values in a list, so the first line of code in the Code tab will be:

globals [payoff-matrix]

Next we declare a breed of agents called “players”. If we did not do this, we would have to use the default name “turtles”, which may be confusing to newcomers.

breed [players player]

Individual players have their own strategy (which can be different from the other agents’ strategy) and their own payoff, so we need to declare these *individually-owned variables* as follows:

players-own [ strategy payoff ]

## 5.2. Setup procedures

In the setup procedure we want:

- To clear everything up. We initialize the model afresh using the primitive clear-all:
`clear-all`

- To transform the string of characters the user has written in the payoffs input box (e.g. “[[1 2][3 4]]”) into a list (of 2 lists) that we can use in the code (e.g. [[1 2][3 4]]). This list of lists will be stored in the global variable named payoff-matrix. To do this transformation (from string to list, in this case), we can use the primitive read-from-string as follows:
`set payoff-matrix read-from-string payoffs`

- To create n-of-players players and set their individually-owned variables to an appropriate initial value. At first, we set the value of payoff and strategy to 0:^{[1]}
`create-players n-of-players [ set payoff 0 set strategy 0 ]`

Note that the primitive create-players does not appear in the NetLogo dictionary; it has been automatically created after defining the breed “players”. Had we not defined the breed “players”, we would have had to use the primitive create-turtles instead.

Now we will ask a random number of players (between 0 and n-of-players) to set their strategy to 1, using one of the most important primitives in NetLogo, namely ask. The instruction will be of the form:

`ask AGENTSET [set strategy 1]`

where `AGENTSET` should be a random subset of players. To randomly select a certain number of agents from an agentset (such as players), we can use the primitive n-of (which reports another, usually smaller, agentset):

`ask (n-of SIZE players) [set strategy 1]`

where `SIZE` is the number of players we would like to select. Finally, to generate a random integer between 0 and n-of-players we can use the primitive random:

`random (n-of-players + 1)`

The resulting instruction will be:

`ask n-of (random (n-of-players + 1)) players [set strategy 1]`

- To initialize the tick counter. At the end of the setup procedure, we should include the primitive reset-ticks, which resets the tick counter to zero (and also runs the “plot setup commands”, the “plot update commands” and the “pen update commands” in every plot, so the initial state of the model is plotted):
`reset-ticks`

Thus, the code up to this point should be as follows:

    globals [
      payoff-matrix
    ]

    breed [players player]

    players-own [
      strategy
      payoff
    ]

    to setup
      clear-all
      set payoff-matrix read-from-string payoffs
      create-players n-of-players [
        set payoff 0
        set strategy 0
      ]
      ask n-of (random (n-of-players + 1)) players [set strategy 1]
      reset-ticks
    end

    to go
    end
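For readers who want to see what the string-to-list step in setup amounts to, the following Python sketch mimics it outside NetLogo. It is only an analogy, not how read-from-string is implemented: the function name `read_payoffs` is an assumption, and the parser handles just brackets and plain integer or decimal numbers, with no error checking for unbalanced brackets.

```python
import re


def read_payoffs(text):
    """Parse a NetLogo-style list string such as "[[1 2][3 4]]" into nested lists."""
    tokens = re.findall(r"\[|\]|-?\d+(?:\.\d+)?", text)
    stack = [[]]                      # stack[0] collects the top-level result
    for token in tokens:
        if token == "[":
            stack.append([])          # open a new sublist
        elif token == "]":
            finished = stack.pop()    # close the current sublist
            stack[-1].append(finished)
        else:
            number = float(token) if "." in token else int(token)
            stack[-1].append(number)
    return stack[0][0]


payoff_matrix = read_payoffs("[[1 2][3 4]]")
print(payoff_matrix)
```

The point of the analogy is that, after this step, the payoffs are no longer characters but numbers arranged in a list of two lists, which the rest of the code can index.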

## 5.3. Go procedure

The procedure to go contains all the instructions that will be executed in every tick. In this particular model, these instructions include *a)* asking all players to interact with another (randomly selected) player to obtain a payoff and *b)* asking all players to revise their strategy with probability prob-revision.

To keep things nice and modular, we will create two separate procedures *to be run by players* named to play and to update-strategy. Writing short procedures with meaningful names will make our code elegant, easy to understand, easy to debug, and easy to extend… so we should definitely aim for that. Following this modular design, the procedure to go is particularly easy to code and understand:

    ask players [play]
    ask players [
      if (random-float 1 < prob-revision) [update-strategy]
    ]

Note that the condition `(random-float 1 < prob-revision)` will be true with probability prob-revision.

Having the agents go once through the code above will mark an evolution step (or generation), so, to keep track of these cycles and have the plots in the interface automatically updated at the end of each cycle, we include the primitive tick at the end of to go.

`tick`
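The claim that the condition `(random-float 1 < prob-revision)` holds with probability prob-revision can be checked empirically. The Python sketch below is an analogy rather than NetLogo code: it assumes that Python's `random()` plays the role of `random-float 1` (both report a number in the interval [0, 1)), and the function name `revision_frequency` is illustrative.

```python
import random


def revision_frequency(prob_revision, draws, seed=0):
    """Fraction of draws in which a uniform number in [0, 1) falls below prob_revision."""
    rng = random.Random(seed)
    hits = sum(rng.random() < prob_revision for _ in range(draws))
    return hits / draws


# With many draws, the observed frequency should be close to prob_revision.
print(revision_frequency(0.1, 100_000))
```

The boundary cases behave as expected: with a probability of 0 the condition is never true, and with a probability of 1 it is always true, because the random number is strictly less than 1.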

## 5.4. Other procedures

### to play

Importantly, note that the procedure to play *will be run by a particular player*. Thus, within the code of this procedure, we can access and set the value of player-owned variables strategy and payoff.

Here we want the player running this procedure (let us call her the running player) to play with some other player and get the corresponding payoff. First, we will (randomly) select a counterpart and store it in a local variable named mate:

let mate one-of other players

Now we need to compute the payoff that the running player will obtain when she plays the game with her mate. This payoff is an element of the payoff-matrix list, which is made up of two sublists (e.g., [[1 2][3 4]]).

Note that the first sublist (i.e., item 0 payoff-matrix) corresponds to the case in which the running player plays strategy 0. We want to consider the sublist corresponding to the player’s strategy, so we type:

item strategy payoff-matrix

In a similar fashion, the payoff to extract from this sublist is determined by the strategy of the running player’s mate (i.e., [strategy] of mate). Thus, the payoff obtained by the running agent is:

item ([strategy] of mate) (item strategy payoff-matrix)

Finally, to make the running agent store her payoff, we can write:

set payoff item ([strategy] of mate) (item strategy payoff-matrix)

This line of code concludes the definition of the procedure to play.
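The double use of item in to play is a lookup in a list of lists: first the row given by the running player's strategy, then the element given by the mate's strategy. A small Python sketch of the same indexing, using the example matrix [[1 2][3 4]] (the function name `payoff_for` is an assumption):

```python
payoff_matrix = [[1, 2], [3, 4]]   # example values from the text


def payoff_for(strategy, mate_strategy):
    """Payoff to a player using `strategy` against a mate using `mate_strategy`."""
    row = payoff_matrix[strategy]    # analogous to: item strategy payoff-matrix
    return row[mate_strategy]        # analogous to: item ([strategy] of mate) row


for own in (0, 1):
    for mate in (0, 1):
        print(f"strategy {own} against strategy {mate}: payoff {payoff_for(own, mate)}")
```

Note that the order matters: swapping the two indices would give the mate's payoff rather than the running player's.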

### to update-strategy

In this procedure, which is also *to be run by individual players*, we want the running player to look at some other random player (which we will call the observed-agent) and, if the payoff of the observed-agent is greater than her own payoff, adopt the observed-agent’s strategy.

To select a random player and store it in the local variable observed-agent, we can write:

let observed-agent one-of other players

To compare the payoffs and, if appropriate, adopt the observed-agent’s strategy, we can write:

    if ([payoff] of observed-agent) > payoff [
      set strategy ([strategy] of observed-agent)
    ]

This concludes the definition of the procedure to update-strategy and, actually, of all the code in the Code tab, which by now should look as shown below.

## 5.5. Complete code in the Code tab

    globals [
      payoff-matrix    ;; list of two lists with the payoffs
    ]

    breed [players player]

    players-own [
      strategy         ;; 0 or 1
      payoff           ;; payoff obtained in the last game played
    ]

    to setup
      clear-all
      set payoff-matrix read-from-string payoffs
      create-players n-of-players [
        set payoff 0
        set strategy 0
      ]
      ;; a random number of players (between 0 and n-of-players) start with strategy 1
      ask n-of (random (n-of-players + 1)) players [set strategy 1]
      reset-ticks
    end

    to go
      ask players [play]
      ask players [
        if (random-float 1 < prob-revision) [update-strategy]
      ]
      tick
    end

    to play
      let mate one-of other players
      set payoff item ([strategy] of mate) (item strategy payoff-matrix)
    end

    to update-strategy
      let observed-agent one-of other players
      if ([payoff] of observed-agent) > payoff [
        set strategy ([strategy] of observed-agent)
      ]
    end

## 5.6. Code in the plots

Finally, let us set up the plot to show the number of agents playing each strategy. This is something that can be done directly on the plot, in the Interface tab.

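Edit the plot by right-clicking on it, choose a color and a name for the pen showing the number of agents with strategy 0, and in the “pen update commands” area write:

`plot count players with [strategy = 0]`

Since the plot should show the number of agents playing each strategy, add a second pen for strategy 1 in the same way, with the pen update command `plot count players with [strategy = 1]`.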

# 6. Sample runs

Now that we have the model, we can investigate the question we posed in the motivation section above. Let strategy 0 be “Defect” and let strategy 1 be “Cooperate”. We can use payoffs [[1 3][0 2]]. Note that we could choose any other numbers (as long as they satisfy the conditions that define a Prisoner’s Dilemma), since our revision protocol only depends on ordinal properties of payoffs. Let us set n-of-players = 100 and prob-revision = 0.1, but feel free to change these values.

If you run the model with these settings, you will see that in nearly all runs all agents end up defecting in very little time.^{[2]} The video below shows some representative runs.

Note that at any population state, defectors will tend to obtain a greater payoff than cooperators, so they will be preferentially imitated. Sadly, this drives the dynamics of the process towards overall defection.
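The statement that defectors tend to obtain a greater payoff at any population state can be checked with a short calculation. The Python sketch below (an illustration, not part of the NetLogo model; the function name `expected_payoffs` is an assumption) computes the expected payoff of a defector and of a cooperator when each meets a uniformly random other agent, for payoffs [[1 3][0 2]] and every mixed state of a population of 100 agents.

```python
from fractions import Fraction


def expected_payoffs(payoff_matrix, n, cooperators):
    """Expected payoffs (defector, cooperator) against a random other agent.

    Strategy 0 = defect, strategy 1 = cooperate, as in the sample runs.
    """
    others = n - 1
    defectors = n - cooperators
    # A defector may meet any cooperator or any of the other defectors.
    defect = Fraction(payoff_matrix[0][1] * cooperators
                      + payoff_matrix[0][0] * (defectors - 1), others)
    # A cooperator may meet any of the other cooperators or any defector.
    cooperate = Fraction(payoff_matrix[1][1] * (cooperators - 1)
                         + payoff_matrix[1][0] * defectors, others)
    return defect, cooperate


n = 100
pd = [[1, 3], [0, 2]]
# Expected payoff advantage of defectors in every state with both strategies present.
gaps = {k: expected_payoffs(pd, n, k)[0] - expected_payoffs(pd, n, k)[1]
        for k in range(1, n)}
print(min(gaps.values()), max(gaps.values()))
```

With these payoffs the expected advantage of defectors is positive and the same in every mixed state. This is a statement about expected values only: in a single tick a particular cooperator can still obtain more than a particular defector, which is why the text says "tend to".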

# 7. Exercises

You can use the following link to download the complete NetLogo model: 2×2-imitate-if-better.

Exercise 1. Consider a coordination game with payoffs [[3 0][0 2]] such that both players are better off if they coordinate in one of the actions (0 or 1) than if they play different actions. Run several simulations with 1000 players and probability of revision 0.1. (You can easily do that by leaving the button that runs go indefinitely pressed down and clicking the setup button every time you want to start again from random initial conditions.)

Do simulations end up with all players choosing the same action? Does the strategy with a greater initial presence tend to displace the other strategy? How does changing the payoff matrix to [[30 0][0 2]] make a difference on whether agents coordinate on strategy 0 or strategy 1?

P.S. You can explore this model’s (deterministic) mean dynamic approximation with this program.

Exercise 2. Consider a Stag hunt game with payoffs [[3 0][2 1]] where strategy 0 is “Stag” and strategy 1 is “Hare”. Does the strategy with greater initial presence tend to displace the other strategy?

P.S. You can explore this model’s (deterministic) mean dynamic approximation with this program.

Exercise 3. Consider a Hawk-Dove game with payoffs [[0 3][1 2]] where strategy 0 is “Hawk” and strategy 1 is “Dove”. Do all players tend to choose the same strategy? Reduce the number of players to 100 and observe the difference in behavior (press the setup button after changing the number of players). Reduce the number of players to 10 and observe the difference.

P.S. You can explore this model’s (deterministic) mean dynamic approximation with this program.

Exercise 4. Reimplement the procedure to update-strategy so the revising agent uses the imitative pairwise-difference protocol that we saw in section 0.1.

Exercise 5. Reimplement the procedure to update-strategy so the revising agent uses the best experienced payoff protocol that we saw in section 0.1.

[1] By default, user-defined variables in NetLogo are initialized with the value 0, so there is no actual need to explicitly set the initial value of individually-owned variables to 0, but it does no harm either.

[2] All simulations will necessarily end up in one of the two absorbing states where all agents are using the same strategy. The absorbing state where everyone defects (henceforth D-state) can be reached from any state other than the absorbing state where everyone cooperates (henceforth C-state). The C-state can be reached from any state with at least two cooperators, so, in principle, any simulation with at least two agents using each strategy could end up in either absorbing state. However, it is overwhelmingly more likely that the final state will be the D-state. As a matter of fact, one single defector is extremely likely to be able to invade a whole population of cooperators, regardless of the size of the population.