Remember Mastermind? In this game, there is a hidden secret code consisting of four pegs, each of which may be one of six colors (1, 2, 3, 4, 5, or 6).
Suppose the secret code is 5512.
Of course, we don't know that yet. So we might guess that the code is 1122.
In response, we're given the feedback RW. The R tells us that one of our numbers is correct and in the right place, and the W tells us that one of our numbers is correct but in the wrong place. (As it turns out, the last 2 earned the R, and one of the 1s earned the W, but we don't know that yet.)
Next, we might guess 1134, which will result in the feedback of W (one number is correct but in the wrong place.) Here is our complete game, from our first guess (1122) to the winning guess (5512):
1122 RW
1134 W
2352 RW
6242 R
5512 RRRR
Awarding Feedback
There are 14 possible feedback strings: [empty], W, R, WW, RW, RR, WWW, RWW, RRW, RRR, WWWW, RWWW, RRWW, RRRR. Given a guess and a secret code, determining the appropriate feedback is not so straightforward. The algorithm turns out to be symmetrical, in that it doesn't matter which code is the guess and which is the secret code. Suppose the two codes are:
1123
2321
First, award an R for any number that appears at the same position in both codes. (This will require comparing the first value in each code, then the second value in each code, etc.) Indicate that we have already used that number in both codes, so that we don't look at it again. Here, we'll cross out any used numbers.
11
23
R
After all R's have been awarded, give a W for any unmarked number that appears in both strings (at any position). (This will require comparing each number in the first code with each number in the second code.) Again, mark any such matching numbers as used, so that we don't look at them again.
23
RW
And finally:
2
RWW
There are 6 * 6 * 6 * 6 = 1,296 possible secret codes. Let's return to the game we saw earlier, which began with:
1122 RW
At this point, only 208 of the original 1296 codes are consistent with this feedback. In other words, any of those 208 codes could turn out to be the secret code. None of the other 1,296 - 208 = 1,088 codes could possibly be the secret code. Our game continued with:
1134 W
Guessing 1134 was inconsistent with the feedback we had received so far. It could not possibly have been the secret code. (If 1134 were the secret code, then our initial guess of 1122 would have been awarded the feedback RR.) Nonetheless, it is sometimes to our advantage to make such an inconsistent guess. We've now received feedback for two guesses. Only 38 codes are consistent with this feedback. Our game continued with:
2352 RW
This guess was consistent with all previous feedback. After receiving feedback for this guess, only 7 codes are consistent with all the feedback we've received so far (2426, 4242, 4262, 5512, 5612, 6242, and 6512). Our game continued with:
6242 R
This guess was consistent, as it was one of the 7 remaining codes possible. And now only one code is consistent with all this feedback. We therefore made that guess and won.
5512 RRRR
And now we're ready to talk strategy.
RC: The Random-Consistent Strategy
The first strategy we'll consider is to simply make random consistent guesses. If 38 codes are consistent with all feedback so far, then randomly choose one of those 38 codes to be the next guess. We'll refer to this as the RC strategy.
RC actually performs surprisingly well. After simulating 1 million rounds of Mastermind, this strategy resulted in an average of only 4.64 guesses per round. On the other hand, it often used 8 guesses, and on extremely rare occasions (that proved difficult to reproduce), it used 9 guesses. Here's an example of a 9-guess round using RC:
1111
2235 RR
3233 R
5255 R
4635 W
2244 RR
2266 RRR
2226 RRWW
2262 RRRR
We can judge all other strategies against this simple RC approach. Any other strategy we consider had better have a better average (lower than 4.64), have a better worst case (lower than 9), or be easier to apply. Although it's easy to talk about picking a random consistent code, in practice it is very difficult for a human to pick a random guess. Humans are in danger of falling into patterns of play, and as we'll see, many patterns lead to higher averages.
The First Guess
We might be overwhelmed by the 1,296 different codes we can choose for our initial guess. In practice, however, there really is no compelling reason to choose, for example, 6133 instead of 5451. Both of these guesses have the same qualities: one number appears twice and two numbers appear once. They are isomorphic to each other. Since we have not seen any feedback yet, the 6 numbers are equivalent to us, as are the 4 positions. In this light, there are really only 5 codes we need to consider for our initial guess:
1111 (6)
1112 (120)
1122 (90)
1123 (720)
1234 (360)
The value in parentheses indicates the number of different initial guesses with the same characteristics. For example, there are 6 single-digit codes (1111, 2222, 3333, 4444, 5555, and 6666), while there are 360 codes like 1234 (with four different numbers). So which code should we guess first?
The X+RC Strategy
What happens if we always make the same initial guess, and then follow it with random consistent guesses? For example, a 1111+RC strategy would mean always guessing 1111 first, and then making random consistent guesses. Such a strategy would be slightly simpler to apply than pure RC. I simulated a million rounds of this strategy for each of the following initial guesses, and here are the results.
1111+RC (average = 5.13)
1112+RC (average = 4.74)
1122+RC (average = 4.64)
1123+RC (average = 4.61)
1234+RC (average = 4.67)
As we can see, 1111+RC is a considerably worse strategy than any other. The best strategy is 1123+RC, which requires an average of 4.61 guesses--a slight improvement over the 4.64 guesses of pure RC. All other initial guesses fare no better than pure RC.
The 1123+X+RC Strategy
This begs the question, what if we always begin by guessing 1123, then guess some predetermined code X as our second guess (regardless of the feedback we receive for 1123), and then play randomly consistent? What code should we use for X, and is this 1123+X+RC strategy an improvement over 1123+RC?
First, there are 129 possible codes to consider for X:
1111, 1112, 1113, 1114, 1122, 1123, 1124, 1132, 1134, 1144, 1145, 1211, 1212, 1213, 1214, 1221, 1222, 1223, 1224, 1231, 1232, 1233, 1234, 1241, 1242, 1243, 1244, 1245, 1411, 1412, 1413, 1414, 1415, 1422, 1423, 1424, 1425, 1432, 1434, 1435, 1444, 1445, 1455, 1456, 2211, 2212, 2213, 2214, 2221, 2222, 2223, 2224, 2231, 2232, 2233, 2234, 2241, 2242, 2243, 2244, 2245, 2311, 2312, 2313, 2314, 2322, 2323, 2324, 2332, 2334, 2344, 2345, 2411, 2412, 2413, 2414, 2415, 2421, 2422, 2423, 2424, 2425, 2431, 2432, 2433, 2434, 2435, 2441, 2442, 2443, 2444, 2445, 2451, 2452, 2453, 2454, 2455, 2456, 4411, 4412, 4413, 4414, 4415, 4422, 4423, 4424, 4425, 4432, 4434, 4435, 4444, 4445, 4455, 4456, 4511, 4512, 4513, 4514, 4516, 4522, 4523, 4524, 4526, 4532, 4534, 4536, 4544, 4546, 4566
For example, having already played 1123, we should consider playing 1111, 2222, or 4444 next, but we don't need to test 3333, 5555, or 6666. After much simulation, the best second guesses turn out to be 2245 or 4415. These were each tested for a million rounds. Here are the results:
1123+2245+RC (average = 4.592)
1123+4415+RC (average = 4.593)
Let's assume 2245's slight edge is statistically significant. We have now discovered 3 simple strategies based on random consistent guessing:
1123+RC (average = 4.61)
1123+2245+RC (average = 4.59)
Note, by the way, that the order of the first two guesses is irrelevant. 1123+2245+RC will perform exactly as well as 2245+1123+RC.
Is there a code we should always use as our third guess? In other words, can we improve on these numbers with a 1123+2245+X+RC strategy? The answer is no. Even the best third guesses require an average of 4.75 guesses. It makes sense that this approach of hardwired guesses can't keep paying off. At some point, it must be better to select later guesses based on earlier feedback.
LC: The Lowest-Consistent Strategy
In guessing randomly, we might find that we fall into a system. The simplest system is the lowest-consistent strategy, which we'll refer to as LC. LC is very similar to RC. The difference is that, in LC, instead of picking a random consistent code, we pick the first consistent code (according to numerical order). For example, if [5436, 5463, 5634, 6354, 6435, 6543] are the only codes that are consistent with all feedback so far, then our next guess will be 5436--the lowest of the consistent codes. Here is a sample game played using LC.
1111
2222
3333 R
3444 RW
5345 RWW
5436 RRWW
5463 RWWW
5634 RWWW
6435 RRRR
What about X+LC? Is there some initial guess that will improve our chances? Yes, there is, but note that it's going to be a lot harder to determine for LC than it was for RC. With RC, there were only 5 initial guesses to consider: 1111, 1112, 1122, 1123, and 1234. All other guesses were isomorphic to one of these. Not so for LC. In LC, we prefer making guesses with lower numbers in earlier positions, and that breaks the symmetry. We must now consider every possible initial guess separately.
When we do this, here's what we learn. By starting by guessing either 5463 or 5466, and then proceeding to play LC, we only need 6,021 to guess all possible codes, or an average of about 4.65 guesses per round (still slightly worse than pure RC).
What about 5463+X+LC or 5466+X+LC? Here's what my simulations found.
LC (total = 7,471; average = 5.76)
5463+LC (total = 6,021; average = 4.65)
5466+LC (total = 6,021; average = 4.65)
5463+4322+LC (total = 5,917; average = 4.57)
5466+4322+LC (total = 5,885; average = 4.54)
Notice that these last two strategies perform better than any RC strategy we considered! In fact, we can do even better. The best LC strategies I stumbled on are:
6564+4233+LC (total = 5,869; average = 4.53)
6564+4332+LC (total = 5,869; average = 4.53)
I suspect even these are not optimal. Unfortunately, an exhaustive search of X+Y+LC strategies would take too long to compute. It is clear, however, that using a third hardwired guess is counterproductive, as it was for RC. In the end, if we want a better strategy, we're going to need to consider approaches where the second guess depends on the feedback we receive from the first guess.
Knuth's Strategy
Let's look again at those five distinct initial guesses we might make.
1111: =625, R =500, RR=150, RRR= 20, RRRR= 1
1112: R =317, W =308, =256, RW =156, RR =123, ...
1122: =256, R =256, W =256, RW =208, RR =114, ...
1123: W =276, RW=230, WW=222, R =182, RWW = 84, ...
1234: WW=312, RW=252, W =152, WWW=136, RWW =132, ...
Here we can see which feedback we'll see most often for a given initial guess. For example, if our first guess is 1123, then we'll receive a feedback of W for 276 of the 1,296 possible secret codes; we'll receive RW for 230 secret codes, and so on. On the other hand, if we guess 1111, then we might receive a blank feedback corresponding to 625 possible secret codes. Clearly a guess of 1111 may fail to narrow down the possible secret codes as much as we'd like.
If we're concerned about these worst case scenarios, then the safest first guess is 1122, which will guarantee narrowing our search down to 276 or fewer possible secret codes. Now suppose that we receive a blank feedback (no Rs or Ws) in response to guessing 1122. Although there are 276 secret codes consistent with this feedback, they are each isomorphic to one of only twelve possible guesses.
1111: =256
1113: W =111, =81, R =64
1133: = 81, R =54, W =54, RW =42, RR =16, ...
1134: W = 76, R =54, WW =52, RW =42, =16, ...
1333: = 81, R =81, RR =27, RW =27, W =27, ...
1334: W = 54, RW =49, R =43, WW =33, RR =27, ...
1345: WW = 63, RW =60, RR =27, R =24, RWW =24, ...
3333: R =108, =81, RR =54, RRR=12, RRRR= 1
3334: R = 51, W =46, RW =42, RR =39, WW =19, ...
3344: RW = 56, RR =34, R =32, W =32, WW =24, ...
3345: RW = 46, WW =42, RWW=40, RR =29, RRW =20, ...
3456: RWW= 60, WWW=48, RW =36, RR =24, RRW =24, ...
If we continue to play in this paranoid fashion, we'll guess 3345 next, because that guarantees narrowing our search to 46 or fewer possible secret codes. And so on.
To summarize, we'll always make a guess that guarantees narrowing our search as much as possible, by picking the guess whose most common feedback occurs for the fewest number of secret codes. This is called a minimax strategy. It's as if we're playing against an adversary who is giving us the worst possible feedback, and we're trying to beat that adversary. This particular approach is a greedy approach, meaning that we're not looking several guesses ahead. We therefore shouldn't expect our strategy to be optimal.
This strategy was developed by Donald Knuth in 1977. Specifically, whenever several possible guesses would narrow the search down to the same number of secret codes, Knuth's algorithm picks the lowest consistent code. However, it is possible that none of these equally strong guesses is consistent with the feedback received. In this case, the lowest such guess is chosen. One of the rounds shown earlier is an example of using Knuth's minimax strategy.
1122 RW
1134 W
2352 RW
6242 R
5512 RRRR
Knuth's strategy requires a total of 5,801 guesses to uncover all 1,296 codes, with an average of about 4.48 guesses per round. Astoundingly, each of the 1,296 codes can be solved in 5 or fewer guesses! This worst-case scenario occurs for 694 of the codes--more than half. Some of this strategy is reproduced below.
History Guess Count Total Avg Max
1122 1,296 5,801 4.48 5
| 3345 256 1,175 4.59 5
|R 1344 256 1,179 4.61 5
|W 2344 256 1,176 4.59 5
|RW 1134 208 938 4.51 5
|RR 1234 114 500 4.39 5
|WW 2344 96 407 4.24 5
||RW 3636 46 217 4.72 5
|W|W 3516 44 209 4.75 5
|R|W 3526 44 209 4.75 5
|R|RW 4524 42 198 4.71 5
|W|RW 4514 42 198 4.71 5
||WW 6634 42 199 4.74 5
|R|WW 3135 41 195 4.76 5
|W|WW 3235 41 195 4.76 5
||RWW 3454 40 189 4.73 5
|RW|W 2352 38 176 4.63 5
|RWW 1213 36 145 4.03 5
|R|R 3325 34 157 4.62 5
|RW|RWW 1315 34 160 4.71 5
|W|R 3315 34 157 4.62 5
|RW|RW 1516 32 147 4.59 5
|RRW 1223 32 124 3.88 4
||RR 3636 29 135 4.66 5
The complete table would have 1,296 entries. But in many of those scenarios, there is only a single code consistent with all given feedback. In cases where there are only two possible codes, we can do no better than guessing the first and then (if necessary) the second. We therefore don't need to memorize situations with 1 or 2 consistent codes. That means we only need to memorize a table of 269 entries. (Actually, we can do even better if we identify those scenarios where there are three possible codes and the order of guessing them doesn't matter.) Sadly, this is still a lot to memorize.
Consistent Knuth
One thing that makes Knuth's strategy so difficult to memorize is its use of guesses that are inconsistent with all previous feedback. What happens if we modify Knuth's strategy to require consistent guesses only? Now we need a total of 5,828 guesses (just 21 more than before), and an average of about 4.50 guesses per round. That sounds fine, except that we must give up the maximum of 5 guesses. There will now be 54 secret codes that require 6 guesses. The new table contains 286 entries, and is only slightly easier to memorize. Here's an excerpt:
History Guess Count Total Avg Max
1122 1,296 5,828 4.50 6
| 3345 256 1,175 4.59 6
|R 1344 256 1,178 4.60 6
|W 2344 256 1,178 4.60 6
|RW 1314 208 940 4.52 6
|RR 1134 114 512 4.49 6
|WW 2314 96 408 4.25 5
||RW 3636 46 217 4.72 6
|W|W 3516 44 209 4.75 6
|R|W 3526 44 209 4.75 6
|R|RW 4524 42 198 4.71 6
|W|RW 4514 42 198 4.71 6
||WW 6634 42 199 4.74 6
|R|WW 3135 41 196 4.78 6
|W|WW 3235 41 196 4.78 6
||RWW 3454 40 189 4.73 6
|RW|W 2452 39 181 4.64 5
|RWW 1213 36 147 4.08 6
|R|R 3325 34 157 4.62 5
|W|R 3315 34 157 4.62 5
|RRW 1223 32 125 3.91 5
|RW|RW 5115 32 151 4.72 6
||RR 3366 29 136 4.69 6
|RR|RW 1352 28 128 4.57 5
Exhaustive Search
Are there strategies that guarantee a win in 5 consistent guesses? Answering this question requires an exhaustive search. The Knuth algorithm is greedy, in that it assumes a locally optimal choice will prove to be globally optimal. An exhaustive search must consider scenarios where a choice that appears to be suboptimal may win out when we consider the full ramifications of the choice.
Using an exhaustive search, I proved that no such consistent-five-guess strategy exists.
Another question we can answer with an exhaustive search is what consistent-guess strategy results in the lowest average number of guesses? The resulting strategy requires 5,671 total guesses, with an amazing average of about 4.38 guesses per round--a considerable improvement over Knuth! There are 19 secret codes that require the worst-case of 6 guesses to solve: 1126, 1163, 1623, 1662, 2453, 2545, 3223, 4314, 4421, 4423, 5162, 5452, 5523, 5533, 6123, 6343, 6351, 6523, 6636. The full strategy requires memorizing another large table of game states. Here's an excerpt of this average consistent (AC) strategy:
History Guess Count Total Avg Max
1123 1,296 5,671 4.38 6
|W 4534 276 1,254 4.54 6
|RW 1415 230 1,035 4.50 6
|WW 2345 222 997 4.49 6
|R 1445 182 802 4.41 6
|RR 1425 105 456 4.34 6
|RWW 1234 84 349 4.15 5
| 4456 81 328 4.05 5
|W|RW 3566 53 252 4.75 6
|WWW 2314 44 173 3.93 5
|WW|WW 5631 43 201 4.67 5
|W|W 6615 43 201 4.67 5
|R|W 6563 42 198 4.71 6
|W|WW 3665 42 196 4.67 6
|RW|W 3522 41 192 4.68 5
|RRW 1243 40 158 3.95 5
|WW|RW 2614 40 188 4.70 6
|RW|RW 3135 39 185 4.74 6
To manage the enormous computation necessary to produce this table, I came up with a couple of tricks in the way I wrote the code. I originally determined the best guess given a particular guessing/feedback history. I later improved it to determine the best guess given only a set of possible secret codes. I also originally wrote code to find the best response to each possible consistent guess. Later I made a dramatic improvement by only testing distinct guesses (ones that were not isomorphic to already-tested guesses). For this, I found the possible feedback for each guess (e.g. W=111, =81, R=64) and then ignored the guess if I had already tested a guess with the same feedback profile.
I was disappointed that the AC strategy table does not appear to exhibit patterns that would lend themselves to rules-of-thumb. I had hoped to find some rules-of-thumb so I would only need to memorize exceptions, but it seems like every situation is an exception.
The AC strategy begs the question: What strategy would lead to the lowest average number of guesses, regardless of whether those guesses are consistent or not? Unfortunately, even my most efficient code was not up to this challenge. We do know, however, that inconsistent guesses allow us to get the average down to 5,625/1,296 = about 4.34 guesses, thanks to work by Koyama and Lai in 1993. Like AC, their solution requires a worst-case of 6 guesses.
The Easy Strategy
How can we play a successful game of Mastermind without memorizing a massive table or hurting our brain to play a strategy like RC or X+LC?
The easiest strategy I know is largely equivalent to LC. The idea is to first determine all the 1s, then the 2s, then 3s, and so on, dealing with one color at a time, limiting ourselves to consistent guesses. This way, it is generally easy to determine what the feedback is telling us. In fact, we'll always start by assuming that the lowest numbers in our guesses earn the Rs, higher numbers earn the Ws, and the highest ones earn blanks. We'll only give up those assumptions when they remain untenable.
We always begin by guessing 1111. The only possible feedback we might receive is blank feedback, R, RR, RRR, or RRRR. Suppose we receive blank feedback in response.
1111
That tells us that there are no 1s in the solution. Next, we guess 2222, and we receive an R.
2222 R
Now we know there's exactly one 2, but we don't know where it belongs yet. We'll assume it goes at the beginning, and fill the remaining holes with 3s.
2333 W
Because we already know there's a 2 in the solution, we know the W applies to the 2, and therefore there aren't any 3s. We now try 2 in the next position, filling the remaining holes with 4s.
4244 RW
Things are a bit more confusing now, because it's not immediately obvious whether 2 or 4 earned the R. But if the 2 earned the R, then a 4 must have earned the W, which a little thought shows is impossible. The 2 must therefore have earned the W, so we'll move it into the next hole: xx2x. One of the 4s must have earned the R, so we'll assume it was the first 4: 4x2x. Then we'll fill the rest with 5s.
4525 RWW
This is tricky. Again, we'll begin by assuming the lowest value is correct, and see how that plays out. If the 2 is correct, we have xx2x. That means that the 4 and a 5 are wrong. We'll always start with the lower value (which is usually the most constrained). If 4 is wrong, it can't go in the first position. It also can't go in the second position, because we know a 4 earned an R in our 4244 guess. We already have a 2 in the third position, so the 4 must be last: xx24. Assuming the 5 is wrong, we'll move it into the first position to get 5x24, and fill the hole with a 6.
5624 RWWW
Tricky again. Last time we assumed that the 2 was correct, and that forced us to guess 5624. Since this wasn't correct, the 2 must have been wrong. We're now forced to move the 2 into the last position: xxx2. From our 4244 guess, we know the 4 must be in the first or third position now. If we put it in the first position, we get 4xx2. Looking back at our 4525 guess, we see that 4 earns the R, so 5 must earn a W, which is only possible if it goes in the third position: 4x52. With the 6, that gives us 4652. We see that makes the 6 correct on our last guess, which is consistent with earning RWWW. In fact, 4652 turns out to be the only consistent guess, and therefore the secret code itself.
4652 RRRR
Note that this isn't quite the LC strategy, because LC would have guessed 4652 before 5624. But this Easy strategy performs identially to LC. This 7-guess example is pretty typical of the Easy strategy (although i deliberately picked an example that required some especially tricky reasoning).
The Advanced Strategy
We already know that 1111 is a terrible first guess, so maybe we can improve on our Easy strategy by eliminating single-color guesses. Specifically, we'll guess 1122 instead of 1111 (and 3344 instead of 3333). Of course, this will complicate the reasoning a little. We'll call this the Advanced strategy, although it's really only slightly more advanced than the Easy strategy. Here's a sample game.
1122 R
We'll assume the first 1 is correct.
1333
Blank feedback. There must be no 1s or 3s, so one of the 2s in our original guess must have been correct. We'll assume the first 2 was correct, and fill the remaining holes with 4s.
4424 RW
2 can't be correct here, so we'll move it over again. We'll assume the first 4 was correct, and we'll fill the remaining holes with 5s.
4552 RRR
We'll assume the 2 and 4 are correct (as it turns out they must be), and that the first 5 is also correct. We fill the remaining hole with a 6.
4562 RRWW
Now we assume we picked the wrong 5, and swap the 5 and 6 to win the game.
4652 RRRR
This is the same secret code we found with the Easy strategy in 7 guesses, but notice that the Advanced strategy paid off here by solving it in 6 guesses.
Here's one more example of the Advanced strategy in action.
1122
Blank feedback means no 1s or 2s. We now guess 3344.
3344 RW
We'll assume the first 3 is right and the second is wrong, and fill remaining holes with 5s.
3535 W
Only one W, so there can't be two 3s. We'll move one 3 into the next available spot to get x3xx. We'll put 4 is the first consistent hole to get 43xx, and we'll fill the rest with 6s.
4366 R
Only one R. We tried two 3s and that was incorrect. We tried a 3 and a 4 and that was incorrect. At this point, only two 4s would explain our feedback from guessing 3344. We know the first position must have a 4: 4xxx. From 3344, we know there must be a 4 in one of the last two positions. Let's try the third position: 4x4x. One of the 5s in 3535 must have been wrong, but that's only possible if we change to 4xx4, placing a 5 to give us 4x54. Finally, we know there can't be any 1s, 2s, 3s, or 6s, and there can't be any more 5s. The only possibility for the remaining spot is another 4.
4454 RRRR
So, how does this strategy perform? It requires 6,210 guesses to find all 1,296 codes, with an average of 4.79 guesses per round. That's a bit worse than RC, but a lot easier to execute, and a big improvement over Easy (nearly a full guess). The worst case scenario requires 8 guesses, but this only occurs for a single secret code: 6543. Here is that worst case game.
1122
3344 RW
3535 RW
3663 RW
3456 WWWW
4365 WWWW
5634 WWWW
6543 RRRR
Final Analysis
Here is a summary of all the strategies we considered and how they compare to each other.
Strategy Avg Worst
LC / Easy 5.76 9
Advanced 4.79 8
X+LC 4.65
RC 4.64 9
X+RC 4.61
X+Y+RC 4.59
X+Y+LC 4.53
Knuth 4.48 5
Consistent Knuth 4.50 6
Avg Consistent 4.38 6
Koyama/Lai 4.34 6
1122: =256, R =256, W =256, RW =208, RR =114, ...
1123: W =276, RW=230, WW=222, R =182, RWW = 84, ...
1234: WW=312, RW=252, W =152, WWW=136, RWW =132, ...
Here we can see which feedback we'll see most often for a given initial guess. For example, if our first guess is 1123, then we'll receive a feedback of W for 276 of the 1,296 possible secret codes; we'll receive RW for 230 secret codes, and so on. On the other hand, if we guess 1111, then we might receive a blank feedback corresponding to 625 possible secret codes. Clearly a guess of 1111 may fail to narrow down the possible secret codes as much as we'd like.
If we're concerned about these worst case scenarios, then the safest first guess is 1122, which will guarantee narrowing our search down to 276 or fewer possible secret codes. Now suppose that we receive a blank feedback (no Rs or Ws) in response to guessing 1122. Although there are 276 secret codes consistent with this feedback, they are each isomorphic to one of only twelve possible guesses.
1111: =256
1113: W =111, =81, R =64
1133: = 81, R =54, W =54, RW =42, RR =16, ...
1134: W = 76, R =54, WW =52, RW =42, =16, ...
1333: = 81, R =81, RR =27, RW =27, W =27, ...
1334: W = 54, RW =49, R =43, WW =33, RR =27, ...
1345: WW = 63, RW =60, RR =27, R =24, RWW =24, ...
3333: R =108, =81, RR =54, RRR=12, RRRR= 1
3334: R = 51, W =46, RW =42, RR =39, WW =19, ...
3344: RW = 56, RR =34, R =32, W =32, WW =24, ...
3345: RW = 46, WW =42, RWW=40, RR =29, RRW =20, ...
3456: RWW= 60, WWW=48, RW =36, RR =24, RRW =24, ...
To summarize, we'll always make a guess that guarantees narrowing our search as much as possible, by picking the guess whose most common feedback occurs for the fewest number of secret codes. This is called a minimax strategy. It's as if we're playing against an adversary who is giving us the worst possible feedback, and we're trying to beat that adversary. This particular approach is a greedy approach, meaning that we're not looking several guesses ahead. We therefore shouldn't expect our strategy to be optimal.
This strategy was developed by Donald Knuth in 1977. Specifically, whenever several possible guesses would narrow the search down to the same number of secret codes, Knuth's algorithm picks the lowest consistent code. However, it is possible that none of these equally strong guesses is consistent with the feedback received. In this case, the lowest such guess is chosen. One of the rounds shown earlier is an example of using Knuth's minimax strategy.
1122 RW
1134 W
2352 RW
6242 R
5512 RRRR
Knuth's strategy requires a total of 5,801 guesses to uncover all 1,296 codes, with an average of about 4.48 guesses per round. Astoundingly, each of the 1,296 codes can be solved in 5 or fewer guesses! This worst-case scenario occurs for 694 of the codes--more than half. Some of this strategy is reproduced below.
History Guess Count Total Avg Max
1122 1,296 5,801 4.48 5
| 3345 256 1,175 4.59 5
|R 1344 256 1,179 4.61 5
|W 2344 256 1,176 4.59 5
|RW 1134 208 938 4.51 5
|RR 1234 114 500 4.39 5
|WW 2344 96 407 4.24 5
||RW 3636 46 217 4.72 5
|W|W 3516 44 209 4.75 5
|R|W 3526 44 209 4.75 5
|R|RW 4524 42 198 4.71 5
|W|RW 4514 42 198 4.71 5
||WW 6634 42 199 4.74 5
|R|WW 3135 41 195 4.76 5
|W|WW 3235 41 195 4.76 5
||RWW 3454 40 189 4.73 5
|RW|W 2352 38 176 4.63 5
|RWW 1213 36 145 4.03 5
|R|R 3325 34 157 4.62 5
|RW|RWW 1315 34 160 4.71 5
|W|R 3315 34 157 4.62 5
|RW|RW 1516 32 147 4.59 5
|RRW 1223 32 124 3.88 4
||RR 3636 29 135 4.66 5
The complete table would have 1,296 entries. But in many of those scenarios, there is only a single code consistent with all given feedback. In cases where there are only two possible codes, we can do no better than guessing the first and then (if necessary) the second. We therefore don't need to memorize situations with 1 or 2 consistent codes. That means we only need to memorize a table of 269 entries. (Actually, we can do even better if we identify those scenarios where there are three possible codes and the order of guessing them doesn't matter.) Sadly, this is still a lot to memorize.
Consistent Knuth
One thing that makes Knuth's strategy so difficult to memorize is its use of guesses that are inconsistent with all previous feedback. What happens if we modify Knuth's strategy to require consistent guesses only? Now we need a total of 5,828 guesses (just 21 more than before), and an average of about 4.50 guesses per round. That sounds fine, except that we must give up the maximum of 5 guesses. There will now be 54 secret codes that require 6 guesses. The new table contains 286 entries, and is only slightly easier to memorize. Here's an excerpt:
History Guess Count Total Avg Max
1122 1,296 5,828 4.50 6
| 3345 256 1,175 4.59 6
|R 1344 256 1,178 4.60 6
|W 2344 256 1,178 4.60 6
|RW 1314 208 940 4.52 6
|RR 1134 114 512 4.49 6
|WW 2314 96 408 4.25 5
||RW 3636 46 217 4.72 6
|W|W 3516 44 209 4.75 6
|R|W 3526 44 209 4.75 6
|R|RW 4524 42 198 4.71 6
|W|RW 4514 42 198 4.71 6
||WW 6634 42 199 4.74 6
|R|WW 3135 41 196 4.78 6
|W|WW 3235 41 196 4.78 6
||RWW 3454 40 189 4.73 6
|RW|W 2452 39 181 4.64 5
|RWW 1213 36 147 4.08 6
|R|R 3325 34 157 4.62 5
|W|R 3315 34 157 4.62 5
|RRW 1223 32 125 3.91 5
|RW|RW 5115 32 151 4.72 6
||RR 3366 29 136 4.69 6
|RR|RW 1352 28 128 4.57 5
Exhaustive Search
Are there strategies that guarantee a win in 5 consistent guesses? Answering this question requires an exhaustive search. The Knuth algorithm is greedy, in that it assumes a locally optimal choice will prove to be globally optimal. An exhaustive search must consider scenarios where a choice that appears to be suboptimal may win out when we consider the full ramifications of the choice.
Using an exhaustive search, I proved that no such consistent-five-guess strategy exists.
Another question we can answer with an exhaustive search is what consistent-guess strategy results in the lowest average number of guesses? The resulting strategy requires 5,671 total guesses, with an amazing average of about 4.38 guesses per round--a considerable improvement over Knuth! There are 19 secret codes that require the worst-case of 6 guesses to solve: 1126, 1163, 1623, 1662, 2453, 2545, 3223, 4314, 4421, 4423, 5162, 5452, 5523, 5533, 6123, 6343, 6351, 6523, 6636. The full strategy requires memorizing another large table of game states. Here's an excerpt of this average consistent (AC) strategy:
History Guess Count Total Avg Max
1123 1,296 5,671 4.38 6
|W 4534 276 1,254 4.54 6
|RW 1415 230 1,035 4.50 6
|WW 2345 222 997 4.49 6
|R 1445 182 802 4.41 6
|RR 1425 105 456 4.34 6
|RWW 1234 84 349 4.15 5
| 4456 81 328 4.05 5
|W|RW 3566 53 252 4.75 6
|WWW 2314 44 173 3.93 5
|WW|WW 5631 43 201 4.67 5
|W|W 6615 43 201 4.67 5
|R|W 6563 42 198 4.71 6
|W|WW 3665 42 196 4.67 6
|RW|W 3522 41 192 4.68 5
|RRW 1243 40 158 3.95 5
|WW|RW 2614 40 188 4.70 6
|RW|RW 3135 39 185 4.74 6
To manage the enormous computation necessary to produce this table, I came up with a couple of tricks in the way I wrote the code. I originally determined the best guess given a particular guessing/feedback history. I later improved it to determine the best guess given only a set of possible secret codes. I also originally wrote code to find the best response to each possible consistent guess. Later I made a dramatic improvement by only testing distinct guesses (ones that were not isomorphic to already-tested guesses). For this, I found the possible feedback for each guess (e.g. W=111, =81, R=64) and then ignored the guess if I had already tested a guess with the same feedback profile.
I was disappointed that the AC strategy table does not appear to exhibit patterns that would lend themselves to rules-of-thumb. I had hoped to find some rules-of-thumb so I would only need to memorize exceptions, but it seems like every situation is an exception.
The Easy Strategy
How can we play a successful game of Mastermind without memorizing a massive table or hurting our brain to play a strategy like RC or X+LC?
The easiest strategy I know is largely equivalent to LC. The idea is to first determine all the 1s, then the 2s, then 3s, and so on, dealing with one color at a time, limiting ourselves to consistent guesses. This way, it is generally easy to determine what the feedback is telling us. In fact, we'll always start by assuming that the lowest numbers in our guesses earn the Rs, higher numbers earn the Ws, and the highest ones earn blanks. We'll only give up those assumptions when they remain untenable.
We always begin by guessing 1111. The only possible feedback we might receive is blank feedback, R, RR, RRR, or RRRR. Suppose we receive blank feedback in response.
1111
That tells us that there are no 1s in the solution. Next, we guess 2222, and we receive an R.
2222 R
Now we know there's exactly one 2, but we don't know where it belongs yet. We'll assume it goes at the beginning, and fill the remaining holes with 3s.
2333 W
Because we already know there's a 2 in the solution, we know the W applies to the 2, and therefore there aren't any 3s. We now try 2 in the next position, filling the remaining holes with 4s.
4244 RW
Things are a bit more confusing now, because it's not immediately obvious whether 2 or 4 earned the R. But if the 2 earned the R, then a 4 must have earned the W, which a little thought shows is impossible. The 2 must therefore have earned the W, so we'll move it into the next hole: xx2x. One of the 4s must have earned the R, so we'll assume it was the first 4: 4x2x. Then we'll fill the rest with 5s.
4525 RWW
This is tricky. Again, we'll begin by assuming the lowest value is correct, and see how that plays out. If the 2 is correct, we have xx2x. That means that the 4 and a 5 are wrong. We'll always start with the lower value (which is usually the most constrained). If 4 is wrong, it can't go in the first position. It also can't go in the second position, because we know a 4 earned an R in our 4244 guess. We already have a 2 in the third position, so the 4 must be last: xx24. Assuming the 5 is wrong, we'll move it into the first position to get 5x24, and fill the hole with a 6.
5624 RWWW
Tricky again. Last time we assumed that the 2 was correct, and that forced us to guess 5624. Since this wasn't correct, the 2 must have been wrong. We're now forced to move the 2 into the last position: xxx2. From our 4244 guess, we know the 4 must be in the first or third position now. If we put it in the first position, we get 4xx2. Looking back at our 4525 guess, we see that 4 earns the R, so 5 must earn a W, which is only possible if it goes in the third position: 4x52. With the 6, that gives us 4652. We see that makes the 6 correct on our last guess, which is consistent with earning RWWW. In fact, 4652 turns out to be the only consistent guess, and therefore the secret code itself.
4652 RRRR
Note that this isn't quite the LC strategy, because LC would have guessed 4652 before 5624. But this Easy strategy performs identially to LC. This 7-guess example is pretty typical of the Easy strategy (although i deliberately picked an example that required some especially tricky reasoning).
The Advanced Strategy
We already know that 1111 is a terrible first guess, so maybe we can improve on our Easy strategy by eliminating single-color guesses. Specifically, we'll guess 1122 instead of 1111 (and 3344 instead of 3333). Of course, this will complicate the reasoning a little. We'll call this the Advanced strategy, although it's really only slightly more advanced than the Easy strategy. Here's a sample game.
1122 R
We'll assume the first 1 is correct.
1333
Blank feedback. There must be no 1s or 3s, so one of the 2s in our original guess must have been correct. We'll assume the first 2 was correct, and fill the remaining holes with 4s.
4424 RW
2 can't be correct here, so we'll move it over again. We'll assume the first 4 was correct, and we'll fill the remaining holes with 5s.
4552 RRR
We'll assume the 2 and 4 are correct (as it turns out they must be), and that the first 5 is also correct. We fill the remaining hole with a 6.
4562 RRWW
Now we assume we picked the wrong 5, and swap the 5 and 6 to win the game.
4652 RRRR
This is the same secret code we found with the Easy strategy in 7 guesses, but notice that the Advanced strategy paid off here by solving it in 6 guesses.
Here's one more example of the Advanced strategy in action.
1122
Blank feedback means no 1s or 2s. We now guess 3344.
3344 RW
We'll assume the first 3 is right and the second is wrong, and fill remaining holes with 5s.
3535 W
Only one W, so there can't be two 3s. We'll move one 3 into the next available spot to get x3xx. We'll put 4 is the first consistent hole to get 43xx, and we'll fill the rest with 6s.
4366 R
Only one R. We tried two 3s and that was incorrect. We tried a 3 and a 4 and that was incorrect. At this point, only two 4s would explain our feedback from guessing 3344. We know the first position must have a 4: 4xxx. From 3344, we know there must be a 4 in one of the last two positions. Let's try the third position: 4x4x. One of the 5s in 3535 must have been wrong, but that's only possible if we change to 4xx4, placing a 5 to give us 4x54. Finally, we know there can't be any 1s, 2s, 3s, or 6s, and there can't be any more 5s. The only possibility for the remaining spot is another 4.
4454 RRRR
So, how does this strategy perform? It requires 6,210 guesses to find all 1,296 codes, with an average of 4.79 guesses per round. That's a bit worse than RC, but a lot easier to execute, and a big improvement over Easy (nearly a full guess). The worst case scenario requires 8 guesses, but this only occurs for a single secret code: 6543. Here is that worst case game.
1122
3344 RW
3535 RW
3663 RW
3456 WWWW
4365 WWWW
5634 WWWW
6543 RRRR
Final Analysis
Here is a summary of all the strategies we considered and how they compare to each other.
Strategy Avg Worst
LC / Easy 5.76 9
Advanced 4.79 8
X+LC 4.65
RC 4.64 9
X+RC 4.61
X+Y+RC 4.59
X+Y+LC 4.53
Knuth 4.48 5
Consistent Knuth 4.50 6
Avg Consistent 4.38 6
Koyama/Lai 4.34 6
No comments:
Post a Comment