Texas Hold-Em -- An Evolutionary Approach Part 7

© 2006 Richard P. Ten Dyke

Six months ago, we started an adventure in Poker, Darwinian Evolution, and Computer Programming. The adventure is now over, and it was a success. This is the final chapter and these are the results.

For programming, we used an object-oriented programming language called REALbasic, which contains all of the old capabilities of the BASIC language with object capability added. We created separate objects for the components of the game, including the deck, the deal, the player, and the game. The development environment that comes with the compiler already contained objects for building the human interface, including windows, buttons, and images.

We found that the built-in objects were essential to building the application, but the objects which we added were not. The logic of the game became cumbersome to debug, and we reverted to old habits. The deck, deal, player, and game objects were replaced with straight code, using properties (global variables) and methods (subroutines). Debugging the game logic became a challenging and time-consuming part of the process, as expected. I do not rule out adding one’s own objects to the system, but they should be added in larger chunks rather than in small pieces, for example by packaging an entire poker simulation as a single object. That is for the future.

As for the game, we chose Texas Hold-Em, a variation of poker in which each player holds two cards of his own and shares up to five community cards. This makes the game quite different from seven-card stud, where each player holds three hidden cards and four cards of the hand are exposed. In Hold-Em, a player has to assess his chances by how well the community cards support his own hand, compared with how well they might support the hands of the other players.

The first challenge was to find a means to place a value on a particular hand. We solved the valuation problem by estimating the probability that a particular hand will beat a random opponent. If you start by holding a pair of aces, it might seem that you are a near-certain winner. You are not. If the community cards show four spades, for example, and neither of your aces is the ace of spades, then another player with an ace, king, queen, or jack of spades is going to win the pot with a high flush and you end up with zilch. But if you hold those aces, and the community cards show a mixture of suits and ranks, you have a very high probability of winning the pot. So the lay of the community cards is essential to placing a value, and a probability, on a particular hand.
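The idea of valuing a hand by its probability of beating a random opponent can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the REALbasic code used in the study: it estimates the probability by random sampling (Monte Carlo), and the names score5, best7, and win_probability are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [r + s for r in RANKS for s in SUITS]  # e.g. "Ah" is the ace of hearts

def score5(cards):
    """Return a tuple that compares higher for a stronger five-card hand."""
    ranks = sorted((RANKS.index(c[0]) for c in cards), reverse=True)
    counts = Counter(ranks)
    # Ranks ordered by how often they occur, then by rank (trips before pair, etc.)
    groups = sorted(counts, key=lambda r: (counts[r], r), reverse=True)
    shape = sorted(counts.values(), reverse=True)
    flush = len({c[1] for c in cards}) == 1
    straight = len(counts) == 5 and ranks[0] - ranks[4] == 4
    if ranks == [12, 3, 2, 1, 0]:  # A-2-3-4-5: the ace plays low
        straight, groups = True, [3, 2, 1, 0, -1]
    if straight and flush:
        category = 8
    elif shape == [4, 1]:
        category = 7
    elif shape == [3, 2]:
        category = 6
    elif flush:
        category = 5
    elif straight:
        category = 4
    elif shape == [3, 1, 1]:
        category = 3
    elif shape == [2, 2, 1]:
        category = 2
    elif shape == [2, 1, 1, 1]:
        category = 1
    else:
        category = 0
    return (category, groups)

def best7(cards):
    """Best five-card score that can be made from up to seven cards."""
    return max(score5(five) for five in combinations(cards, 5))

def win_probability(hole, board, trials=2000, seed=0):
    """Estimate the chance that `hole` beats one random opponent (ties count half)."""
    rng = random.Random(seed)
    remaining = [c for c in DECK if c not in hole and c not in board]
    missing = 5 - len(board)  # community cards still to come
    wins = 0.0
    for _ in range(trials):
        draw = rng.sample(remaining, 2 + missing)
        opponent, full_board = draw[:2], list(board) + draw[2:]
        mine = best7(list(hole) + full_board)
        theirs = best7(opponent + full_board)
        wins += 1.0 if mine > theirs else 0.5 if mine == theirs else 0.0
    return wins / trials
```

Calling win_probability(["Ah", "Ad"], ["2c", "7d", "9h", "Ks", "4c"]) and then the same pair of aces against a board such as ["2s", "7s", "9s", "Ks", "4d"] shows the effect described above: the estimate drops sharply once four spades are on the table.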

The second challenge for the game is betting strategy, which should rather be called “betting policy.” A policy is a set of guidelines that a player uses to decide when to raise, bet, call, or fold. The decision is based partly on the quality of the player’s hand (those two cards and how they mesh with the community cards) and partly on the current betting situation. The hand quality is easy to assess. The betting situation is not. One can identify clues, which include the betting behavior of the other players, how much money you have, how many players started, and how many remain who have not folded. But the player needs to generalize from all this information to find patterns that can be used as templates for decision making. I have been less successful in this department, and will have to do more to analyze this game logic. I am uncertain how important this is, and it may be less important than other factors that we will see in a moment.
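To make the notion of a policy concrete, here is a minimal Python sketch. It assumes that a policy can be reduced to two thresholds on the hand's estimated win probability, which is far simpler than the policies discussed in this series and ignores most of the betting situation. The names Policy, raise_at, call_at, and decide are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Policy:
    raise_at: float  # assumed rule: raise when win probability is at least this
    call_at: float   # assumed rule: call when win probability is at least this

    def decide(self, win_prob: float, to_call: int) -> str:
        """Map hand quality and the amount owed to the pot onto an action."""
        if win_prob >= self.raise_at:
            return "raise"
        if win_prob >= self.call_at or to_call == 0:
            return "call"  # with nothing owed, this amounts to a check
        return "fold"

conservative = Policy(raise_at=0.85, call_at=0.60)
aggressive = Policy(raise_at=0.55, call_at=0.30)
```

With the same hand, say a win probability of 0.58 and 10 chips to call, the conservative policy folds while the aggressive one raises. A fuller policy would also take in the clues mentioned above, such as the number of players remaining.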

The third important part of the study was the use of Darwinian evolution to develop the betting policies. It is well known for poker that a conservative betting policy is “better.” The question is whether there is a single “best” betting policy or whether there are several. The answer turns out to be more subtle than the question that was asked.

Before we answer that, we can report that the evolution process works well. To refresh your memory, players come and go in the game. Poorer players are replaced by near-clones of the better ones. Over time, the player pool evolves to conservative play. As the betting becomes more conservative, the games get shorter. Frequently the game is over before all of the community cards are shown, because all players but one have folded. This process is confirmed by observation of real games with real players. The failure of aggressive players—those not playing conservatively—occurred consistently, although they were “tolerated” in some simulated tournaments before being eliminated.
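The replacement step can be sketched in Python as follows. This is a toy model under loud assumptions, not the simulation used in this series: each policy is a single "tightness" threshold, a hand is a random strength between 0 and 1, and the stakes (ANTE, BET) and the function names play_hand and evolve are invented for illustration.

```python
import random

ANTE, BET = 1, 4  # hypothetical stakes for the toy game

def play_hand(thresholds, bankrolls, rng):
    """Play one toy hand: players whose strength meets their threshold stay in."""
    strengths = [rng.random() for _ in thresholds]
    stayers = [i for i, (s, t) in enumerate(zip(strengths, thresholds)) if s >= t]
    if not stayers:
        return  # everyone folded: assume the antes are simply returned
    for i in range(len(thresholds)):
        bankrolls[i] -= ANTE
    for i in stayers:
        bankrolls[i] -= BET
    pot = ANTE * len(thresholds) + BET * len(stayers)
    winner = max(stayers, key=lambda i: strengths[i])
    bankrolls[winner] += pot

def evolve(thresholds, generations=300, hands=300, sigma=0.05, seed=1):
    """Repeatedly replace the poorest player with a near-clone of the best."""
    rng = random.Random(seed)
    pool = list(thresholds)
    for _ in range(generations):
        bankrolls = [0] * len(pool)
        for _ in range(hands):
            play_hand(pool, bankrolls, rng)
        best = max(range(len(pool)), key=bankrolls.__getitem__)
        worst = min(range(len(pool)), key=bankrolls.__getitem__)
        clone = pool[best] + rng.gauss(0.0, sigma)  # small random mutation
        pool[worst] = min(1.0, max(0.0, clone))     # keep threshold in [0, 1]
    return pool
```

Starting from a loose pool such as evolve([0.1, 0.15, 0.2, 0.1, 0.2, 0.15]), the thresholds in this toy model tend to drift upward, since staying in with a weak hand costs the bet far more often than it wins the pot. How closely that mirrors the full simulation is not something this sketch can establish.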

An anticipated bifurcation of players’ policies did not occur. We can’t be too disappointed, however, because it probably points up an important truth: poker is a simple game. If the game were more complicated, for example by introducing a high-low split of the pot, then we could expect to see some players become adept at going for the high hands, and others adept at going for the low. The players exist in an environment, and they respond to it. If the environment rewards one group of actions, then those are the actions which will evolve. In the real world, the environment is much more complex, and many survival strategies are possible, so there is more opportunity for bifurcation. We will need a more complex game than poker to test these theories.

But wait, there’s more.

Let us get back to the question of whether there is a single best betting policy.

The environment in the game of poker consists of more than the hands that are dealt and the rules of the game. It also includes the actions of all of the players. And here is the subtle part. The environment is not stable, since it evolves along with the players as they themselves evolve. So a betting policy that is optimum playing with one set of players will not be optimum as opposing players change over time.

Here is the experiment that demonstrates this. After running a simulation until it has stabilized, we stop it. At that point, we pick a winning policy and make it fixed. That is, you, as a player, stay with that particular policy in the game no matter what. Now we start the simulation again with new players. Other players in the game come and go. They are not as good as you are, since you have already evolved. How will you do? The answer is that you will win, at first. However, the other players will evolve, and gradually you will win less. Then the other players will evolve further, and soon you will be eaten alive. This happens because you have become a fixed target in the environment, the food that the other players use to survive.

This is reminiscent of the observations that led Gerald M. Edelman to describe how the immune system works in the human body (and to win a Nobel Prize in the process). Consider that you, as a player, are a virus invading the host organism (the game) and that the antibodies (the other players) are waiting for something to happen. There is not much for them to do until you show up, but when you do, they go into action. Over a period of time, the antibodies themselves evolve, using you as the target, until they find just the right combination of features to destroy you.

To what degree this analogy works I don’t know, but it illustrates that while there may be a perfect policy for playing a game, that policy depends on how the other players are playing. Your policy must be flexible enough to take that into consideration. Like a virus, a policy must evolve along with the game itself or be destroyed. The fact that a virus can evolve and defeat an immune system, or that bacteria can evolve and defeat antibiotics, is well known in medicine. Similarly, in poker, a perfect policy is determined not only by the rules of the game but also by how the game is being played at the moment. A perfect policy is a flexible policy, tuned to the moment. I am sure that really good poker players know this instinctively. We learned it by Darwinian simulation.

Richard Ten Dyke has previously contributed to this newsletter on the topic of Digital Photography and computer creativity. He is retired from IBM and can be reached at tendyke@bedfordny.com. All opinions are his own, and he welcomes comments.

© Danbury Area Computer Society, Inc. All Rights Reserved.