Wednesday, December 30, 2015

Dave Computes Optimal Hold'em Poker Play

Computers are beginning to play hold'em poker as well as the top human players.  Such computer players really consist of three computer programs run in three separate phases.

Phase 1:  Buckets
Because of the astronomical number of possible poker situations, programs are needed to assign a wide range of situations to a limited number of buckets.  For example, maybe K8s falls into preflop bucket 5, Q5 with a flop of QTT falls into flop bucket 7, TT with a turn of 9742 falls into turn bucket 8, and 94 with a river of Q9862 falls into river bucket 6.  The computer also calculates the probability that a player will be dealt a hand assigned to preflop bucket 6 (for example), and the probability that such a hand will move into flop bucket 4, then into turn bucket 7, and finally into river bucket 7, when it will beat hands in river bucket 6 and lose to hands in river bucket 8.

Phase 2:  Learning the Abstract Game
With the precomputed buckets and probabilities, a second program can now simulate abstract poker games at great speed.  In this phase, there are no playing cards at all--just bucket numbers.  The output of this program is an enormous table indicating how to play any hand (e.g. flop bucket 7) in any situation (after a given history of folds, calls, and raises of varying amounts).  Computing this table can require hundreds of processors running for months on end.  Various approximations are needed to ensure that results are computed in a reasonable amount of time, such as limiting the number of players, number of raises allowed, possible sizes of raises, number of buckets, size of betting history stored, and skipping betting altogether on one or more streets.

Phase 3:  Playing Poker by Numbers
After phase 2 does all the hard work, we finally get to play poker again in phase 3.  Here, a program simply converts any hand dealt to a bucket number and then makes the correct play for that bucket and betting history as found in the enormous table computed in phase 2.
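
As a minimal sketch of what phase 3 amounts to, the lookup below maps a (bucket, betting history) pair to action probabilities and samples a play.  The table layout, the history strings, and the probabilities are all hypothetical placeholders for illustration, not values from any real poker program.

```python
import random

# Hypothetical phase-2 output: (bucket, betting history) -> action probabilities.
# These numbers are placeholders, not computed results.
STRATEGY = {
    (7, "check"): {"bet": 0.8, "check": 0.2},
    (2, "check"): {"bet": 0.3, "check": 0.7},
}

def choose_action(bucket, history, rng=random):
    """Sample an action for this bucket and history from the strategy table."""
    probs = STRATEGY[(bucket, history)]
    actions = list(probs)
    weights = [probs[a] for a in actions]
    return rng.choices(actions, weights=weights)[0]
```

A real table would have an entry for every bucket and every betting history that survives the phase 2 approximations.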

Dreams of a Human-Usable Poker System
Sadly, such computer players have so far focused almost exclusively on heads-up poker, a variation rarely played in practice.  And more importantly, we humans have not benefited from all these calculations.  Our poker books still tell us how to play based only on the authors' intuition and experience.  These books are also incomplete, providing no insight into how to play in a multitude of situations.  Such books also stress that how you play a hand depends on how your opponents have been playing, but don't provide a foundation for how to play when you first sit down at a table--your ABC poker.  I want to see a book based on computed tables describing optimal play.  Of course, I can't memorize the large tables used by poker programs, so I also want to see these tables distilled into useful rules of thumb I can easily keep in my head and apply in real time.

My Summer Poker Project
To that end, this summer I coded the first two of these phases, with the goal of developing a system to help me perform Phase 3 as a human player.  I used my Easy Formula to assign preflop hands to buckets numbered from 0 (J2 and worse) to 14 (AA).  On the flop, turn, and river, I used buckets numbered from 0 to 9, where bucket 7 (for example) has a 70-79% chance of winning in a heads-up showdown on the river.  (In practice, no hands fall into flop bucket 0.)  See my Flop Formula to get a feel for the different flop buckets.
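
The post-flop bucket numbering can be sketched as a one-line conversion from showdown win probability to bucket number.  The function name equity_to_bucket is my own for illustration, and treating exactly 100% as bucket 9 is an assumption that follows from the 90-100% range described for 9-point hands.

```python
def equity_to_bucket(win_probability):
    """Map a heads-up showdown win probability (0.0 to 1.0) to a bucket 0-9."""
    if not 0.0 <= win_probability <= 1.0:
        raise ValueError("win_probability must be between 0 and 1")
    # Bucket 7 covers 70-79%; a certain win (100%) stays in bucket 9.
    return min(9, int(win_probability * 10))
```

For example, a hand that wins 75% of heads-up showdowns lands in bucket 7.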

For Phase 2, I used a counterfactual regret minimization algorithm, as explained beautifully by Neller and Lanctot in their 2013 paper entitled An Introduction to Counterfactual Regret Minimization.  In this algorithm, players begin by making each move at random.  After each play, the computer makes a note of how much better it would have been to make a different move.  These regret values are accumulated for each information set (e.g. bucket number and betting history for this deal).  Future plays are weighted in proportion to the positive accumulated regrets.  Over a long (and seemingly unpredictable) period of time, the average of these strategies converges to a Nash equilibrium.
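
The weighting step at the heart of this algorithm is usually called regret matching.  Here is a minimal sketch of that one step (not the full algorithm, and not the program used for these results).  The function names are my own.  Actions with positive accumulated regret are played in proportion to that regret, and play is uniform when no action has positive regret.

```python
def regret_matching(cumulative_regrets):
    """Turn accumulated regrets for one information set into a mixed strategy."""
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No action has positive regret yet: play uniformly at random.
    n = len(cumulative_regrets)
    return [1.0 / n] * n

def average_strategy(strategy_sum):
    """Normalize the running sum of strategies; this average is what converges."""
    total = sum(strategy_sum)
    n = len(strategy_sum)
    if total > 0:
        return [s / total for s in strategy_sum]
    return [1.0 / n] * n
```

With regrets of 2, -1, and 2 for fold, call, and raise, regret_matching plays fold and raise half the time each and never calls.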

There were three reasons I was forced to make approximations in my program.

1.  Limited computation time
2.  Limited computer memory
3.  Limited human memory (I wanted to keep the resulting tables small and memorizable.)

I found that doubling the number of buckets would merely double the time, but my limited human memory held me to 15 preflop buckets, 9 flop buckets, 10 turn buckets, and 10 river buckets.  Increasing the number of players, number of raises per street, or number of possible raise sizes increased the computation time exponentially.  Cutting out a round of betting saved enormous time, and limiting the history stored was critical to keeping the output data to a manageable size for my limited brain.

All my computations are based only on pot-sized raises.  Experiments with larger raises revealed that they were almost never chosen.  Smaller raises were helpful (though not often preferred to pot-sized raises).  Such smaller raises led to complex betting strategies that would be challenging to memorize.  (In short, whenever it'd be preferable to make a smaller raise with a weaker hand, it becomes necessary to make smaller raises with some of the strongest hands, too, to prevent exploitation.)  My results are primarily based on 2 raises per street, which I suspect leads to a strategy that reraises too many hands (unafraid of a third raise).

Summary of What I Learned
1.  If you're the first to enter a pot preflop (and you're not one of the blinds), enter with a raise.

2.  You should not often find yourself laying down hands that you bet for strength.  If you're raised, plan to call often.  Likewise, if you limped in, you should probably call a raise.

3.  Any time you try to get away with limping in with a weak hand, you must compensate by limping in with your strongest hands (or you can be exploited by an opponent who bets/raises when you show weakness).

4.  You must bluff sometimes (or you can be exploited by an opponent who folds to all your shows of strength).

5.  On the button, you should generally only bet with your strongest hands and bluff with your weakest hands--the ones that don't have a chance of winning otherwise.  This means that you should often accept free cards with marginal hands that could still improve, instead of semibluffing them or using continuation bets.  Of the hands in the middle that you check, you'll check the worst with the intention of folding, and the best with the intention of calling.

6.  Under the gun, you should often check your strongest hands with the intention of raising.  This is especially true on the flop and turn when you weren't the aggressor in the previous street.  If you were the aggressor in the previous street, you should come out betting with anything decent as a semibluff or continuation bet.  You generally don't want to check-raise here or bluff your weakest hands.  If you're called, you'd rather have a chance of improving.

Heads Up Preflop Details
This data is based on a multi-street game with 40 million trials (9 hours).

The button (Btn) should raise with 3+ and limp in with the rest (calling with 2+ if raised).
If the button limps, the big blind (BB) should raise 4+.  If raised, BB should call 3s and 4s, and reraise 5+.
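
These heads-up preflop thresholds can be written down as a small sketch.  The function names are my own, and two details are assumptions that the rules above imply but do not state:  BB checks buckets below 4 when the button limps, and BB folds buckets below 3 when the button raises.

```python
def button_open(bucket):
    """Button's first action: raise with bucket 3+, limp with the rest."""
    return "raise" if bucket >= 3 else "limp"

def button_after_limp_is_raised(bucket):
    """Button limped and BB raised: call with bucket 2+, otherwise fold."""
    return "call" if bucket >= 2 else "fold"

def big_blind(bucket, button_action):
    """BB's response to the button's 'limp' or 'raise'."""
    if button_action == "limp":
        # Assumption: buckets below 4 simply check.
        return "raise" if bucket >= 4 else "check"
    if bucket >= 5:
        return "reraise"
    if bucket >= 3:
        return "call"
    # Assumption: anything weaker than bucket 3 folds to a raise.
    return "fold"
```

For example, big_blind(4, "raise") calls, while big_blind(4, "limp") raises.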

Multi-Player Preflop Details
This data is based on a 5-player preflop-only game (i.e. no betting on flop/turn/river) with 10 million trials (6 hours).  As a result, position in post-flop play did not factor into these results.  The summary you see here has been approximated and simplified a bit to keep the list memorizable.  I'm naming the seats as:  Btn, CO, UTG, BB, SB.  UTG (under the gun) is first to act preflop.  I'm also assigning numbers to the non-blind positions based on the hands they open with:  UTG = 6, CO = 5, Btn = 4.

Opening
Seat n opens by raising n (and folding the rest).
SB opens by limping with 1, raising 4, and limping with 8.
BB opens by raising 3 (when SB limps and all else fold).

Defending Against a Raise
When seat n raises, all (except BB) call n+1 and reraise n+2.
(This also holds when seat n limps and someone else raises.)
If SB raised, BB calls 1 and reraises 6.
If Btn/CO raised, BB calls 2 and reraises 7.
If UTG raised, BB calls 3 and reraises 8.

Defending Against 2 Raises
When seat n raises and someone reraises, call with n+3.

When Opponents Limp In
When seat n opens by limping in, non-blinds call n and blinds call any.
All can raise with 8.

When You Are Reraised
Call any.

Heads Up Post-Flop Details
This data is based on a multi-street game with 40 million trials (9 hours).  These results are highly simplified for ease of memorization.  (I have doubts about the validity of some of these thresholds.)

Flop/Turn, BB:  Bluff some and bet 7+ (on flop after raising preflop) or check any (otherwise).

Btn, after BB checked:  Bluff often and bet 7+ (on flop) or 8+ (on turn/river).

Flop/Turn, facing bet:  Call 5, raise 6+ (on flop with no preflop raise) or 8+ (otherwise).

River, BB:  Bluff often and bet 9.

River, facing bet:  Call 8, raise 9.

Thursday, July 30, 2015

Dave Computes Players-Per-Street Statistics in Hold'em Poker

This data is based on a simulation of 100,000 hold'em poker deals with 5 players, with a cap of one pot-sized raise per round (to allow a large number of hands to complete in 8 hours).  Although the player strategies were probably far from optimal, I believe that the play frequencies computed at the end of 100,000 deals are probably fairly accurate.

19% of deals are decided preflop.
32% of deals are decided on the flop.
18% of deals are decided on the turn.
31% of deals are decided on the river.

On the flop, 47% of deals are heads-up, 27% are 3-way, 6% are 4-way, and 1% are 5-way.

On the turn, 38% of deals are heads-up, 9% are 3-way, 2% are 4-way, and 0% are 5-way.

On the river, 24% of deals are heads-up, 6% are 3-way, 1% are 4-way, and 0% are 5-way.
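
As a quick arithmetic check, the two sets of percentages agree with each other.  The share of deals that reach a street is the sum of its heads-up through 5-way percentages, and the share decided on a street is the drop from that street to the next.  The variable names below are my own.

```python
# Percent of deals seen heads-up, 3-way, 4-way, and 5-way on each street.
players_per_street = {
    "flop": [47, 27, 6, 1],
    "turn": [38, 9, 2, 0],
    "river": [24, 6, 1, 0],
}

# Percent of deals that reach each street at all.
reaching = {street: sum(shares) for street, shares in players_per_street.items()}

# Percent of deals decided on each street.
decided = {
    "preflop": 100 - reaching["flop"],
    "flop": reaching["flop"] - reaching["turn"],
    "turn": reaching["turn"] - reaching["river"],
    "river": reaching["river"],
}
```

This gives 81% of deals reaching the flop, 49% the turn, and 31% the river, which reproduces the 19/32/18/31 split above.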

Since 100% of deals are multi-way preflop, it's important to know preflop play cold.
The next most common situations are a heads-up flop (47%), a heads-up turn (38%), a 3-way flop (27%), a heads-up river (24%), a 3-way turn (9%), and a 3-way river or a 4-way flop (6% each).

In other words, improve your heads-up flop/turn play before worrying about 3-way play, and 4-way play ought to be very rare.

Dave Computes a System for Evaluating Flops in Hold'em Poker

Here is a simple but effective system for giving a rough estimate of how good your Hold'em poker hand is on the flop.  We'll group hands by how likely they are to win a showdown in a heads-up game.  A hand that scores a 9 should win a heads-up showdown 90-100% of the time.  A hand that scores an 8 should win a heads-up showdown 80-89% of the time, and so on.  When a pair is made, this system awards more points for higher pairs--regardless of the presence of overcards in the flop or the strength of kickers.  This surprising aspect of the system is nonetheless consistent with probabilities of winning a hand, as determined empirically.  Of course, overcards and kickers do matter, but they don't often move a hand from winning 75% of the time (say) to winning 65% or 85% of the time.

The Flop Scoring System
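+9 points for trips
+8 points for a pair of kings or better (KK)
+7 points for a pair of nines or better (99)
+6 points for a pair of sixes or better (66)
+5 points for a pair of twos or better (22), 4-flush, open-ended straight, A on a paired flop
+4 points for high cards, gutshot
-1 point for a single-suit flop with no flush draw
-1 point for a no-gap (e.g. 987) or one-gap (e.g. 976) flop with no straight draw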
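
Here is a minimal sketch of the pair and penalty parts of this system, not a full hand evaluator.  The function names are my own, and the caller must say whether the hand holds a flush draw or straight draw.  Three things are assumptions:  each listed pair is a floor (so QQ scores 7 like 99), the two penalties can stack, and aces count only as high when checking for connected flops.

```python
RANKS = "23456789TJQKA"  # index 0 is a deuce, index 12 is an ace

def pair_points(rank):
    """Points for one pair of the given rank, e.g. 'K' for a pair of kings."""
    value = RANKS.index(rank)
    for floor_rank, points in (("K", 8), ("9", 7), ("6", 6), ("2", 5)):
        if value >= RANKS.index(floor_rank):
            return points

def flop_penalty(flop, has_flush_draw=False, has_straight_draw=False):
    """Penalty (0, -1, or -2) for a flop given as cards like ['Ks', '6s', '3s']."""
    penalty = 0
    # Single-suit flop with no flush draw.
    if len({card[1] for card in flop}) == 1 and not has_flush_draw:
        penalty -= 1
    # No-gap (987) or one-gap (976) flop: three distinct ranks spanning at most 4.
    values = sorted(RANKS.index(card[0]) for card in flop)
    connected = len(set(values)) == 3 and values[2] - values[0] <= 3
    if connected and not has_straight_draw:
        penalty -= 1
    return penalty
```

For example, Th6h on Ks6s3s scores pair_points("6") + flop_penalty(["Ks", "6s", "3s"]), which is 6 - 1 = 5.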

9-Point Hands
These are made hands, like A5 with a flop of 432.  Most 9-point hands are trips, like 33 with a flop of K73, or 52 with a flop of 554.

8-Point Hands
J6 with a flop of JT6 scores 8 points for making 2 pairs.  AA with a flop of KT7 scores 8 for making a pair of aces, as does A2 with a flop of AQK (despite the weak kicker and threatening board).  So does K4 with a flop of K53 for making a pair of kings.

7-Point Hands
Q5 with a flop of QTT scores 7 for making a pair of queens (no credit awarded for making 2-pair on a paired board).  Likewise, J6 with a flop of AJ5 scores 7 for making a pair of jacks.

6-Point Hands
83 with a flop of A82 scores 6, as does 77 with a flop of 954.

5-Point Hands
54 with a flop of QT5 scores 5, as does J3 with a flop of 763.  Jc8h with a flop of Qh9h5h scores 5 for a 4-flush.  Likewise, KT with a flop of QJ4 scores 5 for an open-ended straight.  A5 with a flop of KK4 scores 5 for holding an ace on a paired flop.  Th6h with a flop of Ks6s3s scores 5 points:  6 for a pair of sixes and -1 for a single-suit flop.  Likewise, Q6 with a flop of 764 scores 5 points:  6 for a pair of sixes and -1 for a single-gap flop.

4-Point Hands
J2 with a flop of 432 scores 4 points:  5 for a pair of twos and -1 for a no-gap flop.  Nearly any other vaguely playable hand scores 4 points.  These are generally hands with an ace or a couple of high cards, like KJ with a flop of Q32, or A2 with a flop of 974.  Gutshots also score 4 points, like T4 with a flop of KQ9.

4-and-Lower
4-point hands are generally worthless.  These are hands you hope will improve for free, but you should probably not invest any money in them (even as bluffs).  Because of this, there's no sense in developing a rule to distinguish 4-point hands from 3-point hands.  The very worst hands score 1 point.  (All hands have at least a 10% chance on the flop of winning a heads-up showdown.)  1-point and 2-point hands have virtually no possibility of improving.  These hands are probably worth a single bluff given the right circumstances, but should otherwise be discarded.

Turn Scores
Regarding the dream of a scoring system for turn and river holdings, turn evaluation is very subtle.  It appears to depend about equally on both the strength of the hand you made and the number of single cards that an opponent could hold that would beat your hand.  Maybe the best way to evaluate the turn is simply to start with your flop score and then decide whether the turn card itself ought to bump the value of your hand up or down.  A formula would certainly be welcome, as evaluating your hand at the turn is quite critical.

River Scores
Evaluating at the river is probably less critical, as few hands ought to reach the river, and you can usually tell whether you made your hand or not and whether your opponent is likely to have made a better hand.  The single most important aspect of evaluating a hand at the river is the number of single cards that your opponent might hold that would beat your hand.