Making Money with GTO Strategy (2+2 Magazine)

The University of Alberta's Computer Poker Research Group (CPRG) recently achieved a significant milestone in the evolution of poker strategy by solving heads-up limit poker. The resulting discussion within the poker community has exposed some significant misunderstandings about the nature of game theory and optimal play. I'm no game theory expert, and to be honest I misunderstood some of the same things myself until recently.

One of the most significant misunderstandings, and the one that I want to address in this article, is that game theoretically optimal (GTO) play is merely breakeven poker, which after accounting for the rake would actually yield a negative expected value for such a strategy. Although it's true that two players employing equilibrium strategies would simply push money back and forth until it was all raked away, such strategies can and do profit from opponents who employ unbalanced strategies.

What an equilibrium strategy actually does is guarantee you a certain minimum expected value, regardless of your opponent’s strategy.

To see how this works, think of a heads-up limit hold 'em game with blinds of $1 and $2. First, imagine that instead of betting, the player in the small blind always calls and then both players check it down all the way, turn over the cards, and award the pot to the player with the better hand. Clearly, in this game neither player has an advantage. If there is a rake then both will slowly lose money to the house.

The introduction of betting makes position important, and now the player on the button, who enjoys the additional advantage of posting a smaller blind, has a clear theoretical advantage. If the blinds do not rotate, this player will profit over time unless the rake is very large or his strategy is very bad.

The CPRG's most successful heads-up limit hold 'em bot, named Cepheus, has a win rate of approximately 0.044 big bets (nearly $0.18, in our example, where a big bet is $4) per hand when playing from the button against itself. That means that if you play against it from out of position, you can expect to lose at least 0.044 big bets per hand no matter how good your strategy is. You could easily do worse, but you won't do better.

It also means that, if you play against it from in position, you won't win more than 0.044 big bets per hand. Again, you could easily win less, and you might even lose, but you won't win more. If you folded 100% of hands on the button, then you would lose 0.25 big bets per hand in that seat. Hopefully you are a bit better at poker than that, but the point remains: a game theoretically optimal strategy is not a breakeven strategy and it can profit from an opponent’s mistakes, although in many cases it will not profit as much as a well-crafted exploitive strategy would.

In The Mathematics of Poker, Bill Chen and Jerrod Ankenman investigate a toy game called the AKQ game that is easy to solve using game theory. In the version of this game we will consider, each player antes $15. Each is dealt an Ace, King, or Queen from a three-card deck (meaning that if you are dealt a King, you know your opponent has either an Ace or a Queen). There is one round of betting with a fixed $10 bet and no raising. If neither player folds, there is a showdown and the player with the higher card is awarded the pot.

If the out of position player (OP) checks, the in position player (IP) will obviously want to bet all of his Aces. It's also clear that he accomplishes nothing by betting a King, as an Ace should never fold and a Queen should never call. The interesting question is whether he should attempt to bluff his opponent off of a King when he holds a Queen.

Optimal strategy in this game revolves around the concept of indifference. A GTO strategy would aim to bluff in such a way that, whether OP calls always, sometimes, or never with a King, he can't take advantage of IP's bluffing strategy. In this case, IP's bet will be $10 into a $30 pot. When OP calls with a King, he loses $10 if IP has an Ace but wins $40 if IP has a Queen. If IP bets his Queens exactly ¼ as often as he bets his Aces, then OP will have an expected value of 0 for calling with a King, which is the same as the expected value of folding. Thus, calling is no better or worse for him than folding, which is why we say he is indifferent.

It turns out that the optimal strategy for OP is to bet 100% of the time that he has an Ace, bet 25% of the time that he has a Queen, check-call 75% of the time that he has a King, and check-fold otherwise. IP's optimal strategy if OP bets is to call with all of his Aces and 50% of his Kings. If OP checks, IP checks behind with all of his Kings and 75% of his Queens. He bets the other 25% of his Queens and all of his Aces. In the absence of rake, this game has an expected value of $0.42 per hand for IP and -$0.42 for OP. Neither player is exploiting the other, but position alone is valuable.

These are equilibrium strategies, which means that neither player can unilaterally improve his expected value. If OP were to change his strategy and start check-folding 100% of his Kings while IP kept playing his equilibrium strategy, the expected value of the game would remain the same. Of course if IP also changed his strategy, he could exploit OP by bluffing with all of his Queens, and that would cause him to have a higher expected value. If IP does not know what OP's strategy will be, though, he can guarantee himself a theoretical profit of at least $0.42. These GTO strategies enable IP to “lock up” the value of his position, and they enable OP to keep his losses to a minimum.

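The numbers above can be checked by walking through the six possible deals. The following is only a rough sketch in Python: the function names and parameter names are made up for illustration, and the hands whose play is never in doubt (Aces always bet or call, Queens never call, IP never bets a King) are hard-coded rather than modeled.

```python
ANTE, BET = 15, 10  # stakes used in the article's version of the AKQ game


def king_call_ev(bluffs_per_value_bet):
    """OP's EV of calling with a King, relative to folding (which is 0).

    Calling loses BET against an Ace and wins the $30 pot plus IP's bet
    against a Queen. The argument is how often IP bets a Queen relative
    to how often he bets an Ace.
    """
    win = 2 * ANTE + BET
    return (-BET + win * bluffs_per_value_bet) / (1 + bluffs_per_value_bet)


def op_ev(op_bluff_q, op_call_k, ip_call_k, ip_bluff_q):
    """OP's average net result per hand over the six equally likely deals.

    op_bluff_q: how often OP bets a Queen
    op_call_k:  how often OP calls with a King after checking
    ip_call_k:  how often IP calls OP's bet with a King
    ip_bluff_q: how often IP bets a Queen after OP checks
    """
    small, big = ANTE, ANTE + BET
    total = 0.0
    # OP has A, IP has K: OP bets, IP sometimes calls.
    total += ip_call_k * big + (1 - ip_call_k) * small
    # OP has A, IP has Q: OP bets, IP folds.
    total += small
    # OP has K, IP has A: OP checks, IP bets, OP sometimes calls.
    total += op_call_k * -big + (1 - op_call_k) * -small
    # OP has K, IP has Q: IP sometimes bluffs; otherwise OP wins a showdown.
    facing_bluff = op_call_k * big + (1 - op_call_k) * -small
    total += ip_bluff_q * facing_bluff + (1 - ip_bluff_q) * small
    # OP has Q, IP has A: a bluff gets called; a check ends in bet and fold.
    total += op_bluff_q * -big + (1 - op_bluff_q) * -small
    # OP has Q, IP has K: a bluff works when IP folds; a check loses at showdown.
    bluff = ip_call_k * -big + (1 - ip_call_k) * small
    total += op_bluff_q * bluff + (1 - op_bluff_q) * -small
    return total / 6


print(king_call_ev(0.25))                       # 0.0: OP is indifferent
print(round(op_ev(0.25, 0.75, 0.5, 0.25), 2))   # -0.42: both at equilibrium
print(round(op_ev(0.25, 0.0, 0.5, 0.25), 2))    # -0.42: OP folds every King
print(round(op_ev(0.25, 0.0, 0.5, 1.0), 2))     # -4.17: IP now bluffs every Queen
```

Since the game is zero-sum without rake, IP's result is simply the negative of each figure. The last line shows the point about exploitation: OP's habit of folding every King costs him nothing until IP actually changes his own strategy to take advantage of it.
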
If these players alternate positions and play an even number of hands, then neither will have an advantage. Even if OP bluffs too much or not enough with his Queens, or folds too much or not enough with his Kings, he will not lose anything unless his opponent actively exploits his error. The GTO strategy does not profit from these mistakes.

However, there are some mistakes it will profit from. If OP makes the grievous error of calling a bet with a Queen or folding an Ace, then IP profits from these mistakes even when he does not actively exploit them. IP's bluffing strategy is designed to make OP indifferent to calling with a King. There is nothing IP can do to make his opponent indifferent to calling with a Queen or an Ace, but if OP makes a blatant error with these hands, then IP profits from that error.

The simplicity of the AKQ game makes it hard to imagine anyone making such a mistake. We all know that in real poker games, though, many players really do call with nearly hopeless hands, bluff with hands they'd be better off checking, and sometimes even fold absurdly strong hands (I used to play regularly with someone who swore that he always lost with pocket Kings and routinely folded them pre-flop even in unopened pots). A GTO strategy may not be the ideal way to profit from these errors, but it will profit from them.

The belief that a GTO strategy is necessarily a break-even strategy (or a money loser, in a raked game) is simply not true. It is true that two players employing near-optimal strategies against each other in a raked game will both lose money, but the nature of near-optimal strategies is such that an exploitive strategy used against such a player will fare no better. It simply isn't possible to beat such a game, no matter how you play.

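The difference between the two kinds of mistakes described above, those made with indifferent hands and those made with hands that have a clearly correct play, can also be checked numerically. What follows is a minimal sketch in Python, not anything taken from the book: `ip_profit` and the strategy dictionaries are hypothetical names, and each dictionary maps a card to how often that action is taken with it.

```python
from itertools import permutations

ANTE, BET = 15, 10
RANK = {"Q": 0, "K": 1, "A": 2}

# Equilibrium frequencies quoted in the article.
OP_BET = {"A": 1.0, "K": 0.0, "Q": 0.25}   # OP bets first
OP_CALL = {"A": 1.0, "K": 0.75, "Q": 0.0}  # OP checks, then calls a bet
IP_CALL = {"A": 1.0, "K": 0.5, "Q": 0.0}   # IP calls OP's bet
IP_BET = {"A": 1.0, "K": 0.0, "Q": 0.25}   # IP bets after OP checks


def ip_profit(op_bet, op_call, ip_call, ip_bet):
    """IP's average net win per hand over all six deals."""
    total = 0.0
    for op, ip in permutations("AKQ", 2):
        sign = 1 if RANK[ip] > RANK[op] else -1  # +1 when IP wins a showdown
        small, big = sign * ANTE, sign * (ANTE + BET)
        # OP bets: IP calls (big showdown) or folds (IP loses his ante).
        after_bet = ip_call[ip] * big + (1 - ip_call[ip]) * -ANTE
        # OP checks and IP bets: OP calls (big showdown) or folds.
        after_ip_bet = op_call[op] * big + (1 - op_call[op]) * ANTE
        # OP checks: IP either bets or checks behind (small showdown).
        after_check = ip_bet[ip] * after_ip_bet + (1 - ip_bet[ip]) * small
        total += op_bet[op] * after_bet + (1 - op_bet[op]) * after_check
    return total / 6


baseline = ip_profit(OP_BET, OP_CALL, IP_CALL, IP_BET)
# Mistake with an indifferent hand: OP bluffs every Queen.
overbluff = ip_profit({**OP_BET, "Q": 1.0}, OP_CALL, IP_CALL, IP_BET)
# Mistake with a hopeless hand: OP calls half the time with a Queen.
queen_call = ip_profit(OP_BET, {**OP_CALL, "Q": 0.5}, IP_CALL, IP_BET)

print(round(baseline, 2))    # 0.42
print(round(overbluff, 2))   # 0.42
print(round(queen_call, 2))  # 1.04
```

IP plays exactly the same strategy in all three runs. Over-bluffing costs OP nothing against it, while calling with a Queen roughly two and a half times IP's profit without IP adjusting at all.
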
It is also true that, if you can predict specific mistakes that your opponent will make, you can craft a maximally exploitive strategy that will fare better than a GTO strategy would. Most poker players are not as good at this as they think they are, and many end up playing very exploitably in situations where they don't explicitly intend to do so.

Though computers and the very smart people who work with them have a way of surprising us, we seem still to be a long way from finding GTO solutions to multiplayer or no-limit poker games. Indeed, such solutions may not exist for multiplayer games at all.

Computers are, however, quite close to being competitive against the vast majority of poker players in these games. I'm not saying that everyone should try to learn and use near-optimal strategies in the games they play regularly. I do think, however, that understanding the game theory that underlies poker strategy can help you to identify leaks in your opponents' play to exploit as well as leaks in your own play that you didn't even know could be exploited. It can also help in situations where you don't know how to expect your opponents to play or when you are playing against opponents with a decided skill advantage over you.

GTO poker does not have to mean break-even poker. Then again, if you ever find yourself contesting a pot against Phil Ivey, you'll be glad to have the tools to help you come anywhere close to breaking even.