Wednesday, June 28, 2017

Dave Computes Trouble Strategy

Let's get into Trouble--a simple board game with a lot of luck and just enough strategy to make things interesting. We'll focus on the two-player variant. All data is based on trials with 1 million games, unless otherwise noted. I'll be referring to some simpler strategies by name, including:

Random = chooses a valid move at random
Capture = captures when possible, but otherwise plays at random
Capture+Start = captures when possible, otherwise starts a new peg, otherwise random
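As a sketch of how these strategies can be encoded (the Move record and its fields are my own hypothetical representation, not from any particular implementation), Capture+Start is just a priority filter over the legal moves:

```python
import random
from collections import namedtuple

# Hypothetical move record; a real implementation would derive these
# flags from the board state.
Move = namedtuple("Move", ["peg", "is_capture", "is_start"])

def capture_start(moves):
    """Capture+Start: capture if possible, else start a new peg, else random."""
    for is_preferred in (lambda m: m.is_capture, lambda m: m.is_start):
        preferred = [m for m in moves if is_preferred(m)]
        if preferred:
            return random.choice(preferred)
    return random.choice(moves)
```

Capture and Random fall out of the same shape by dropping one or both filters.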


Statistics

With purely random play, player 1 has a slight advantage, winning 51% of games. Here are some additional stats. The Capture+Start column (for example) represents games in which both players use the Capture+Start strategy described above.

Statistic          Random  Capture  Capture+Start
Turns per game     55      58       62
Captures per game  1.1     1.4      2.7
Starts per game    4.7     5.0      6.3
Passes per game    15      16       14
Choices per game   16      16       26
No-Choice games    1.9%    2.2%     2.7%

On a given turn, a player may be forced to pass. This occurs about 25% of the time. Alternatively, a player might be forced to make a particular move, because only one of its pegs can move the number of spaces rolled (or start when a 6 is rolled). On other turns, a player has a choice of which peg to move in response to a roll of the die. These choices occur on roughly 30-40% of turns. Astonishingly, in about 2% of games a player is never given a single choice over the course of the entire game (nearly always resulting in a loss).


Simple Strategies

I simulated various strategies playing against a purely random strategy ("Random"), and here are the win rates I found.

Strategy                   Win Rate against Random
Behind                     44%
Ahead                      46%
Random                     50%
Capture                    55%
Capture+Start              62%
Capture+Start+Ahead        64%
Capture+Start+SafestAhead  68%

The Ahead strategy always moves the peg that's furthest ahead (closest to the finish, or perhaps already in it), while the Behind strategy always moves the peg that is furthest behind (closest to the start, or perhaps starting a new peg). Surprisingly, both of these strategies lose to Random (which explains why my 4-year-old kept beating me).

The Capture strategy takes us to a 55% win rate. (Note that Capture outperforms Start, which starts a new peg whenever possible.)

We obtain a 62% win rate with Capture+Start. We get to 64% by playing Capture+Start+Ahead, which always moves the peg that's furthest ahead when it can't capture or start a new peg.

Any further improvement defies simple rules, and generally involves choosing to move pegs to escape threats and to threaten opponents. The Capture+Start+SafestAhead strategy listed above moves pegs to minimize the number of threats (situations in which an opponent's peg could capture the player's peg with a single roll of the die, including rolling a 6 to capture a peg at the opponent's entrance). If two moves lead to the same number of threats, the peg that is further ahead is moved. An even better strategy would also choose to threaten opponent pegs. It's a good bet that even optimal play still loses to Random about 30% of the time.


Elements of an Optimal Strategy

Initially, I was unable to write a program to find the optimal Trouble strategy. I therefore began by writing a program to approximate optimal play. (I'll come back to optimal play later in this post.) Each time my program is faced with a choice of which move to make, it tries each possible move and simulates a thousand games from that point. In those simulations, it plays the Capture+Start strategy for both sides, which can be computed quickly and is a good (but imperfect) approximation of optimal play. Here are the rules I extracted from my program's results:
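The evaluation loop just described is an ordinary Monte Carlo rollout. Here's a minimal, game-agnostic sketch (the function-based interface is mine, not the author's code; in the post's setting, legal_moves and apply_move would come from a Trouble engine, and rollout would play Capture+Start for both sides, returning 1 when the moving player goes on to win):

```python
def best_move(state, legal_moves, apply_move, rollout, trials=1000):
    """Try each legal move, estimate its win rate by simulating many
    games from the resulting position, and return the best move."""
    best, best_rate = None, -1.0
    for move in legal_moves(state):
        next_state = apply_move(state, move)
        rate = sum(rollout(next_state) for _ in range(trials)) / trials
        if rate > best_rate:
            best, best_rate = move, rate
    return best
```

The quality of the answer depends entirely on how well the rollout policy approximates real play, which is exactly the weakness discussed later in this post.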


Rule #1:  Capture

Always capture an opponent's peg when you can. There are exceptions to this rule, but they are extremely rare. In the game below, the blue player has a roughly 68% chance of winning if it moves the peg at the bottom right, which would escape a pursuer and get the peg to the finish. Nonetheless, the correct move is to capture with the peg at the top left, resulting in a 74% chance of victory.



Rule #2:  Start New Pegs

Prefer to start a new peg when you can. This rule and the ones that follow are really guidelines. They are usually correct, but exceptions are common.

In the game below, the blue player has rolled a 6, and has a 46% chance of winning by starting a new peg. This is a significant advantage over the 39% chance of winning by moving the peg already on the board--even though that move would escape a pursuer.



Rule #3:  Move the Furthest Ahead

In general, move whichever peg is closest to the finish. In the game below, the blue player should move the peg marked 39 because it is closer to the finish, even though moving the peg marked 34 would escape a pursuer and open up the possibility of starting a new peg.



Rule #4:  Avoid Danger

Most exceptions to the Furthest Ahead rule come down to avoiding danger. A dangerous space is one that could lead to a capture on the opponent's next roll. That means any space 1 to 6 spaces ahead of an opponent's peg. It also means the opponent's start space (if the opponent has any pegs left to start).

In the game below, we've rolled a 4. Moving the further ahead peg (marked 47) would place it in danger, so it's significantly better to move the peg marked 51, placing that peg in danger instead.



In the game below, we've rolled a 5. If we move the top-left peg, it will land on the opponent's start space, leaving it vulnerable to being captured on the next turn. Therefore, we should move the peg marked 47 instead.



Note that it's ok to land on the opponent's start space if the opponent has no more pegs to start. It's also ok to land on the opponent's start with a 6, since we'll be able to roll again to get off that space before our opponent has a chance to roll.

Note that it's often helpful to forget about getting the further behind pegs to the finish. Instead, think of those pegs as ones you can move any time a roll would place the further-ahead peg in danger. In general, of all those pegs that can be moved without entering danger, move the one that's furthest ahead.


Rule #5: Play Dead

You've got two pegs on the board, and the one that's further ahead finds itself in danger. In this situation, it's often best to move the peg that's behind. The peg that's furthest ahead "plays dead" until it's safe to move again, either because the threat has leapt over us, or because a subsequent roll lets us move to safety.

In the game below, the further ahead peg is threatened and we've rolled a 2. Moving that peg 2 spaces still leaves it vulnerable to capture on the opponent's next roll. (We would need to have rolled at least a 3 to move our peg beyond our opponent's reach.) Therefore, we'll have it play dead, and we'll choose to move peg 39 instead.


There are multiple rationales for this choice. Advancing the further ahead peg often increases the number of opportunities for an opponent to capture it. In this situation, our opponent has only one active peg. If peg 36 plays dead, it's likely our opponent will be forced to step over our peg on the next couple of turns. Moving peg 39 also increases the chance that it will go on to catch up to and capture the opponent. Finally, it's possible that peg 36 is doomed no matter what. In that case, we're better off giving up on it and using our rolls to benefit peg 39. Of course, we'll re-examine the situation on our next roll.


Rule #6:  Desperate Times

Desperate times call for desperate measures. Often, when our opponent has 3 pegs in the finish, we need to stop whatever we're doing to make sure that last peg doesn't reach the finish. If it's unlikely we'll have time to get our remaining pegs to the finish, we're going to have to devote all our energy to capturing our opponent's last peg. That means starting as many pegs as we can and assembling an army to attack it.

In the game below, we should leave peg 11 to threaten our opponent. That's true even if we only rolled a 1. This peg is close enough to attack already. Now let's send in the reinforcements by advancing peg 22. (If we had more pegs left to start, then it'd be even more critical to free up our start space.) As desperate as our situation is, notice that we'll still pull off a win 22% of the time.



Steps Toward a Truly Optimal Strategy

It eventually occurred to me that my program to approximate optimal strategy suffered from a critical weakness. It assumed that all future moves would be played essentially at random (the Capture+Start rollouts play randomly whenever they can't capture or start). That meant a correct move now might be made to look incorrect by random play in the future, while an incorrect move now might prove safer in the face of random play. Specifically, I began to notice that the computer rarely suggested I chase my opponent, presumably because I might randomly pass it and get captured.

I therefore set out to write a program to solve for the true optimal strategy. What makes Trouble so tricky is that its "game tree" is really a "game graph." Because of captures, an earlier game state can be reached again later in a game, resulting in a cyclic game graph. My attempts to find an existing algorithm for solving such cyclic games were unsuccessful, so I invented my own, inspired by Google's PageRank algorithm. First, I identified all game states, where a state consists of whose turn it is and how the board is arranged. (I did not include the roll of the die.) I associated each game state with a number, indicating the probability that player 1 would win, and initialized all probabilities to 0.5. Then I visited each state exactly once, in no particular order, updating each state's probability according to the following rules.

if player 1 won, record 1.0 and return
if player 2 won, record 0.0 and return
total = 0.0
for each roll 1 to 6:
  if no valid moves:
    if roll is 6: total += prob(this state)
    otherwise: total += 1.0 - prob(other player + this board)
  otherwise:
    bestValue = -1.0
    for each valid move:
      nextBoard = board after move is executed
      if roll is 6: moveValue = prob(this player + nextBoard)
      otherwise: moveValue = 1.0 - prob(other player + nextBoard)
      if moveValue > bestValue: bestValue = moveValue
    total += bestValue
record total / 6.0

Note that these rules are not recursive. In computing any one state's probability, they assume all other states are already associated with correct probabilities. I updated each state's probability, and then I repeated the process again and again. The resulting probabilities converge surprisingly quickly, to the limits of decimal precision.
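The sweep-until-convergence idea is easy to see on a toy cyclic game (my example, not the Trouble solver): suppose each player, on their turn, wins outright with probability 1/2 and otherwise passes the turn. The two states ("player 1 to move" and "player 2 to move") reference each other, so the game graph is cyclic, yet repeated sweeps converge to the exact answer (player 1 wins 2/3 of the time):

```python
def solve_toy_game(sweeps=60):
    # prob[p] = probability that player 1 wins when it is player p's turn
    prob = {1: 0.5, 2: 0.5}
    for _ in range(sweeps):
        # On your turn: win outright with probability 1/2,
        # otherwise the other player gets to move.
        prob[1] = 0.5 * 1.0 + 0.5 * prob[2]
        prob[2] = 0.5 * 0.0 + 0.5 * prob[1]
    return prob

# prob[1] converges to 2/3 and prob[2] to 1/3 (the exact fixed point)
```

Each update uses the other state's current estimate, exactly like the per-state probability updates above.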

Unfortunately, there are an awful lot of possible states in a game of Trouble. Each peg can occupy one of roughly 30 possible locations. Let's consider a game where each player already has 3 pegs in the finish and one peg yet to start. I call this a 1-1-game, since each player has one free peg. There are roughly 2 × 30 × 30 = 1,800 possible states in such a game--2 possible players whose turn it could be, roughly 30 possible locations for the first player's peg, and roughly 30 for the second player's peg. If each player has 2 finished pegs and 2 yet to start (a 2-2-game), there are now roughly 2 × 30 × 30 × 30 × 30 = 1,620,000 possible states. The situation is actually a bit better than this: my program finds only 381,936 such states. This proved to be highly manageable, and so my program had no trouble solving the 2-2-game optimally. In fact, it was able to solve m-n-games where m + n ≤ 5. But a full game would require roughly a million times more states than a 2-2-game. The rest of this post describes my understanding of the correct strategy, as gleaned from observing optimal 2-2-games.


More Elements of Optimal Play


All the elements I shared earlier remain true for optimal play. Here are a few more.


  • Once a peg is within a single roll of an open place in the finish, and that peg is unthreatened, you should usually wait for a roll that can enter the finish, instead of just moving closer to the finish.
  • Generally avoid moving a peg that has already finished, unless it would be detrimental to move other pegs.
  • Generally move a lagging peg off of your opponent's start when your leading peg is safe.
  • Here's how to play when no opponent pegs are on the board (all are in start or finish). If you have any pegs beyond your opponent's start, get them to finish. If all of your pegs are before your opponent's start, you should often set up a situation where the leading peg can leap clear across the opponent's start (with a roll of 5 or 6, from just before the opponent's start). Alternatively, set up an ambush, by getting a peg into the space just before your opponent's start, and then leaving it there, allowing your next peg to step over and proceed to finish. Generally abandon any such leap/ambush plans once your opponent starts a new peg.
  • A trailing peg can often chase an opponent peg (that is closer to its finish than the trailing peg is) when the leading peg is likely to remain unthreatened.

Whenever you are faced with a decision, consider each possible move separately and identify what type of move it is. Most moves fall into the following categories, sorted by strength of move.

1.  Capture your opponent's peg.

2.  Start a new peg.

3.  Chase your opponent's last peg.

4.  Strong moves:  enter finish; escape a threat with leading peg (so that it's no longer a single roll from capture); chase an opponent's peg (that is closer to its finish than the chasing peg is to your finish). I could identify no clear rule for situations when you are faced with a decision between two strong moves. There's probably no more critical decision than determining whether to chase an opponent.

5.  Set up a leap over start or an ambush if the only pegs on the board are yours, and they haven't reached your opponent's start yet.

6.  Move unthreatened leading peg until it could enter any open position in finish with a single roll.

7.  Move a trailing peg.

8.  Move closer to finish with a peg that is already a single roll from any open position in finish.

9.  Move within threat instead of playing dead, if the threatening peg is not likely to follow (because it's not your opponent's leader).

10.  Move finished peg within finish. This is a neutral move generally made to avoid making a detrimental move.

11.  Any other move, other than the following.

12.  Move into a threat with your leading peg.

13.  Allow opponent's last peg to pass your last defender.












Tuesday, June 27, 2017

Dave Computes Slow-To-Terminate Programs

Exponential Time

I'm interested in simple rewrite rules that terminate after a large number of steps. For example:
..., 0 --> ...
..., x --> ..., x - 1, x - 1
Read as: If a sequence of nonnegative integers ends in 0, then remove the 0. Otherwise, remove the last integer x and append x - 1 twice.

Starting from 3, we get:
3
2, 2
2, 1, 1
2, 1, 0, 0
2, 1, 0
2, 1
2, 0, 0
2, 0
2
1, 1
1, 0, 0
1, 0
1
0, 0
0
Starting from n, these rewrite rules terminate in 2^(n+1) - 1 steps, so they require exponential time. But exponential time is just the tip of the iceberg.
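A direct simulation (a sketch in Python; the encoding of a sequence as a list is mine) reproduces these counts:

```python
def rewrite_steps(seq):
    """Apply the two rewrite rules until the sequence is empty;
    return the number of rewrites performed."""
    seq, steps = list(seq), 0
    while seq:
        x = seq.pop()              # remove the last integer
        if x != 0:
            seq += [x - 1, x - 1]  # if nonzero, append x - 1 twice
        steps += 1
    return steps

# rewrite_steps([3]) → 15, and in general [n] takes 2^(n+1) - 1 steps
```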


Ackermann's Function

Here's a more interesting set of rewrite rules.
..., 0, x --> ..., x + 1
..., x, 0 --> ..., x + 1
..., x, y --> ..., x - 1, x, y - 1
Starting from [2, 2], we get:
2, 2
1, 2, 1
1, 1, 2, 0
1, 1, 3
1, 0, 1, 2
1, 0, 0, 1, 1
1, 0, 0, 0, 1, 0
1, 0, 0, 0, 2
1, 0, 0, 3
1, 0, 4
1, 5
0, 1, 4
0, 0, 1, 3
0, 0, 0, 1, 2
0, 0, 0, 0, 1, 1
0, 0, 0, 0, 0, 1, 0
0, 0, 0, 0, 0, 2
0, 0, 0, 0, 3
0, 0, 0, 4
0, 0, 5
0, 6
7
Starting from [0, 0] requires 2 steps (and results in a value of 1).
Starting from [1, 1] requires 4 steps (and results in a value of 3).
Starting from [2, 2] requires 22 steps (and results in a value of 7).
Starting from [3, 3] requires 1,772 steps (and results in a value of 53).
Starting from [4, 4] does not finish in a reasonable amount of time.
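Here's a sketch of these rules in Python. Judging from the [2, 2] trace above, the step counts include the initial sequence as a step, so the counter starts at 1 (that counting convention is my inference, not stated in the post):

```python
def ackermann_steps(seq):
    """Run the three rewrite rules until one integer remains.
    Returns (final value, number of states visited)."""
    seq, states = list(seq), 1   # the counts above include the initial state
    while len(seq) > 1:
        x, y = seq[-2], seq[-1]
        if x == 0:
            seq[-2:] = [y + 1]            # ..., 0, x --> ..., x + 1
        elif y == 0:
            seq[-2:] = [x + 1]            # ..., x, 0 --> ..., x + 1
        else:
            seq[-2:] = [x - 1, x, y - 1]  # ..., x, y --> ..., x - 1, x, y - 1
        states += 1
    return seq[0], states

# ackermann_steps([2, 2]) → (7, 22)
```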

These rules are my version of Ackermann's Function, which is traditionally defined recursively. Altering the base case rules can give us resulting values that are simpler to understand, but they don't substantially change the extraordinary running times.
..., 1, 0 --> ..., 2
..., 2, 0 --> ..., 0
..., x, 0 --> ..., 1
Now, we can summarize the values resulting from each operation as:
0, x = x + 1
1, x = x + 2
2, x = 2x
3, x = 2^x
4, x = 2^^x
5, x = 2^^^x
Thus, 2^^4 = 2^2^2^2 = 2^16.


My Rewrite Rules for Sequences

I started to wonder whether other rewrite rules for sequences led to such interesting behavior. I wrote a program to generate and test rewrite rules. Here's what I found.
..., 0, x --> ..., 0
..., x, y --> ..., y, x - 1, y + 1
Here's a sample run of these rules.
2, 2
2, 1, 3
2, 3, 0, 4
2, 3, 0
2, 0, 2, 1
2, 0, 1, 1, 2
2, 0, 1, 2, 0, 3
2, 0, 1, 2, 0
2, 0, 1, 0, 1, 1
2, 0, 1, 0, 1, 0, 2
2, 0, 1, 0, 1, 0
2, 0, 1, 0, 0, 0, 1
2, 0, 1, 0, 0, 0
2, 0, 1, 0, 0
2, 0, 1, 0
2, 0, 0, 0, 1
2, 0, 0, 0
2, 0, 0
2, 0
0, 1, 1
0, 1, 0, 2
0, 1, 0
0, 0, 0, 1
0, 0, 0
0, 0
0
It is fairly easy to show these rules always terminate with a value of 0. Here are the running times for these rules.

[0, 0] terminates in 1 step
[1, 1] terminates in 5 steps
[2, 2] terminates in 25 steps
[3, 3] terminates in 113 steps
[4, 4] terminates in 481 steps
[5, 5] terminates in 1,985 steps
[10, 10] terminates in 2,095,105 steps
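A sketch of these rules in Python (the list encoding is mine; here a step is one rewrite, which matches the counts above):

```python
def my_rule_steps(seq):
    """Apply the two rules until a single value (always 0) remains;
    return the number of rewrites performed."""
    seq, steps = list(seq), 0
    while len(seq) > 1:
        x, y = seq[-2], seq[-1]
        if x == 0:
            seq[-2:] = [0]                # ..., 0, x --> ..., 0
        else:
            seq[-2:] = [y, x - 1, y + 1]  # ..., x, y --> ..., y, x - 1, y + 1
        steps += 1
    return steps

# my_rule_steps([2, 2]) → 25, matching the run shown above
```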


If we allow our rules to append on either end of the sequence, we get even more complex behavior. Here's one that behaves more chaotically.
x, y, ... --> x, ...  (if x = 0 or y = 0)
x, y, ... --> ..., y + 1, x - 1, x + 1  (otherwise)
[1, 1] terminates in 307 steps.
[3, 3] terminates in 4,479 steps.
[10, 10] terminates in 189,295 steps.

When run, it appears to be blowing up. But once a 0 gets into the first position, it quickly swallows up the remainder of the sequence.

Starting from [0, 0], the following simple rules take 25 steps to terminate (but they aren't particularly interesting for other starting values).
x, 0, ... --> x + 1, ..., x, 0
x, y, ... --> ..., y - 1
Starting from [0, 0], the following more complex rules take a remarkable 7,129 steps to terminate.
0, 0, ... --> 1, ..., 0
0, y, ... --> ..., y
x, 0, ... --> x + 1, ..., 0, x
x, y, ... --> x - 1, y - 1, ...

Busy Beavers

A Busy Beaver program is the program of a given length that terminates after the greatest number of steps. We know that there is no algorithm for generating busy beaver programs (because otherwise it could be used to solve the halting problem).

Nonetheless, I set out to find candidate busy beaver programs--or, at least, short programs that take a long time to terminate. Here are some of my favorites.


Brainf**k

For this esoteric programming language, I assumed that programs begin at the leftmost of an infinite array of cells initialized to zero, that cell values cannot be negative, and that programs crash (instead of ignoring or terminating) upon encountering any error condition. Here's a sample run:



+[->+<]  0, 0
+[->+<]  1, 0
+[->+<]  0, 0
+[->+<]  0, 0
+[->+<]  0, 1
+[->+<]  0, 1
+[->+<]  0, 1
+[->+<]  0, 1
+[->+<]  0, 1

And here are some of my favorite slow-to-terminate programs:

+ (1 step)
++ (2 steps)
+[-] (5 steps)
++[-] (9 steps)
+++[-] (13 steps)
++++[-] (17 steps)
+++++[-] (21 steps)
++++++[-] (25 steps)
+++++[+--] (31 steps)
++++++[+--] (37 steps)
+++++[[->]<] (46 steps)
>++++[-[>]+<] (66 steps)
>+[>+++[-<]>>] (134 steps)
>+[>++++[-<]>>] (185 steps)
+[->++++++[-<]>] (630 steps)
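These counts can be reproduced with a tiny interpreter under one particular stepping convention, which is my reading of the numbers rather than anything stated above: every executed instruction is a step, '[' tests the current cell (skipping the loop if it's zero), and ']' always jumps back to its matching '['. The crash-on-error conditions are omitted for brevity, since none of the tested programs trigger them:

```python
def bf_steps(prog, cells=1000):
    """Run a Brainf**k program with no I/O and count executed
    instructions. Convention: '[' and ']' each count as a step,
    and ']' jumps back unconditionally."""
    match, stack = {}, []
    for i, c in enumerate(prog):       # precompute matching brackets
        if c == '[':
            stack.append(i)
        elif c == ']':
            j = stack.pop()
            match[i], match[j] = j, i
    tape, ptr, ip, steps = [0] * cells, 0, 0, 0
    while ip < len(prog):
        c = prog[ip]
        steps += 1
        if c == '+':
            tape[ptr] += 1
        elif c == '-':
            tape[ptr] -= 1
        elif c == '>':
            ptr += 1
        elif c == '<':
            ptr -= 1
        elif c == '[' and tape[ptr] == 0:
            ip = match[ip]   # skip the loop; ip += 1 below lands past ']'
        elif c == ']':
            ip = match[ip]   # jump back unconditionally; '[' executes next
            continue
        ip += 1
    return steps

# bf_steps("+[-]") → 5
```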


Fractran

A Fractran program repeatedly multiplies the current value by the first fraction in the program that gives an integer result, until no such multiplication is possible. I assumed that all programs start in the simplest nontrivial state:  2. Thus, the program [1/36, 3/2, 4/3] produces the states:

2 (times 3/2)
3 (times 4/3)
4 (times 3/2)
6 (times 3/2)
9 (times 4/3)
12 (times 3/2)
18 (times 3/2)
27 (times 4/3)
36 (times 1/36)
1

Here are some of my favorite slow-to-terminate Fractran programs:

[1/3, 9/2] (3 steps)
[1/36, 3/2, 4/3] (9 steps)
[1/75, 45/2, 50/3] (12 steps)
[1/216, 4/3, 27/2] (20 steps)
[25/8, 4/15, 135/2] (23 steps)
[1/42, 21/5, 35/2, 30/7, 14/3] (26 steps)
[154/15, 11/42, 77/2, 3/77, 70/3] (48 steps)
[125/148176, 648/2401, 8/1125, 972405/2] (670 steps)
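Fractran's rule fits in a few lines of Python (a sketch; I give each fraction as a (numerator, denominator) pair):

```python
from fractions import Fraction

def fractran_steps(program, n=2):
    """Repeatedly multiply n by the first fraction in the program that
    yields an integer; return the number of multiplications performed."""
    fracs = [Fraction(a, b) for a, b in program]
    steps = 0
    while True:
        for f in fracs:
            if (n * f).denominator == 1:  # integer result?
                n = int(n * f)
                steps += 1
                break
        else:
            return steps  # no fraction applies: halt

# fractran_steps([(1, 36), (3, 2), (4, 3)]) → 9, matching the run above
```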


Lambda Calculus

I'll use Scheme syntax for my Lambda Calculus programs. Here's a sample run:

((lambda (x0) (x0 (x0 x0))) (lambda (x0) x0))
((lambda (x0) x0) ((lambda (x0) x0) (lambda (x0) x0)))
((lambda (x0) x0) (lambda (x0) x0))
(lambda (x0) x0)

And here are some slow-to-terminate programs:

(lambda (x0) x0) [0 steps]
((lambda (x0) x0) (lambda (x0) x0)) [1 step]
((lambda (x0) (x0 x0)) (lambda (x0) x0)) [2 steps]
((lambda (x0) (x0 (x0 x0))) (lambda (x0) x0)) [3 steps]
((lambda (x0) ((x0 (x0 x0)) x0)) (lambda (x0) x0)) [4 steps]
((lambda (x0) (x0 x0)) (lambda (x0) (x0 (x0 (lambda (x1) x1))))) [8 steps]
((lambda (x0) (x0 x0)) (lambda (x0) (x0 (x0 (x0 (lambda (x1) x1)))))) [14 steps]
((lambda (x0) (x0 x0)) (lambda (x0) (x0 (x0 (lambda (x1) (lambda (x2) (x0 x1))))))) [26 steps]
((lambda (x0) (x0 x0)) (lambda (x0) (x0 (lambda (x1) (x1 (x0 (x0 (lambda (x2) x2)))))))) [58 steps]
((lambda (x0) ((x0 x0) x0)) (lambda (x0) (((x0 (lambda (x1) x0)) (lambda (x1) x0)) (lambda (x1) x0)))) [99 steps]


Tag Systems

These 2-tag systems begin on the starting string 10. At each step, the rule matching the first symbol removes the first 2 symbols and appends the given symbols to the end. Here is a sample run of the 2-tag system 0x -> 0, 1x -> 1000.

10
1000
001000
10000
0001000
010000
00000
0000
000
00
0
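An m-tag step is just a lookup on the word's first symbol (a sketch; halting when the word is shorter than m symbols or begins with a symbol that has no rule, which also covers the halting-symbol systems below):

```python
def tag_steps(rules, word, m=2):
    """Run an m-tag system: while the word has at least m symbols and a
    rule matches its first symbol, delete the first m symbols and
    append that rule's right-hand side. Returns the step count."""
    steps = 0
    while len(word) >= m and word[0] in rules:
        word = word[m:] + rules[word[0]]
        steps += 1
    return steps

# tag_steps({'0': '0', '1': '1000'}, '10') → 10, matching the run above
```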

Here are some slow-to-terminate 2-tag systems using 2 symbols.

0x -> 0, 1x -> 100 (5 steps)
0x -> 0, 1x -> 1000 (10 steps)
0x -> 0, 1x -> 00100 (13 steps)
0x -> 0, 1x -> 100000 (21 steps)
0x -> 0, 1x -> 0010000 (25 steps)
0x -> 0, 1x -> 00001010 (50 steps)
0x -> 0, 1x -> 0000101010 (118 steps)
0x -> 0, 1x -> 000010101010 (232 steps)

And ones with a halting symbol:

0x -> 100, 1x -> 020 (21 steps)
0x -> 01121, 1x -> 11011 (70 steps)

And some with 3 rules:

0x -> 0, 1x -> 221, 2x -> 001 (779 steps)
0x -> 220, 1x -> 1, 2x -> 110 (779 steps)
0x -> 12, 1x -> 1, 2x -> 01210 (915 steps)
0x -> 0, 1x -> 02, 2x -> 10201 (915 steps)

And these favorite 3-tag systems, which start on the string 100:

0xx -> 0, 1xx -> 11001 (587 steps)
0xx -> 00, 1xx -> 0101100 (1,766 steps)


Thue

These Thue programs start from a single 0. At each step, they find the first rule whose left side appears in the string and replace its leftmost occurrence with the rule's right side. Here is a sample run of the program 10->, 1->00, 0->111.

0
111
0011
00001
000000
11100000
110000
1000
00
1110
11
001
0000
111000
1100
10
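A sketch of the evaluator (it scans the rules in order and replaces the leftmost occurrence of the first left side that appears in the string, which is the convention the run above follows):

```python
def thue_steps(rules, word='0'):
    """rules: list of (lhs, rhs) pairs. Apply the first rule whose left
    side occurs in the word, replacing its leftmost occurrence, until
    no rule matches. Returns the number of replacements."""
    steps = 0
    while True:
        for lhs, rhs in rules:
            i = word.find(lhs)
            if i >= 0:
                word = word[:i] + rhs + word[i + len(lhs):]
                steps += 1
                break
        else:
            return steps  # no rule matched: halt

# thue_steps([('10', ''), ('1', '00'), ('0', '111')]) → 16
```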

And here are some slow-to-terminate Thue programs:

[0->, 0->] 1 step
[0->1, 1->] 2 steps
[0->11, 1->] 3 steps
[0->111, 1->] 4 steps
[0->1111, 1->] 5 steps
[0->11111, 1->] 6 steps


[0->1, 111->, 1->00] 8 steps


[10->, 1->00, 0->111] 16 steps

[100->, 1->00, 0->111] 28 steps
[100->, 1->00, 0->1111] 45 steps
[100->, 1->00, 0->11111] 66 steps
[100->, 1->00, 0->111111] 91 steps