Danbury Area Computer Society

Texas Hold-Em -- An Evolutionary Approach Part 5

"It ain't the cards you git, it's how you plays 'em that counts," said Jeb as he pulled the chips from the center of the table and started stacking. "I had a purdy good hand too," said Louie, who had just tossed in three kings as he folded. "I know," replied Jeb, "that's why I raised."

Now, that dialog, which I made up, was based on actual events. Once I was Jeb, and bluffed the shirt off a guy who had a really good hand, and once I was Louie and someone did the same to me. That is the weak hand beating the strong hand, and in this game it happens quite frequently. Well, at least sometimes. Louie's last name is Loser, and Louie is in every game. If you don't know who he is, then he's you.

One of the ironies of this game is that a good hand can damage your finances. It seems to be a contradiction, but is really common sense. With a good hand you are more likely to be pulled into betting large amounts, but you can still lose to a better hand. Because you know this, you might also fold, as Louie did.

A poker player who only plays his hand, which is to say he bets on good hands and folds on bad, can become Louie the Loser in any game. His opponents can learn that he can be bluffed out of his socks by raising against him until he drops out. OK, there might be a Louie the Loser who stays in no matter what, always to the bitter end—to "keep 'em honest" as they say, but that is a different syndrome. Reminder: there is more than one way to lose at poker.

The task before us is to create a machine that plays poker not like a machine, one that would be difficult to predict and difficult to win against. We are not there yet, but there is progress.

Earlier, we described a poker simulator that allows a human to play a simplified game against a computer. The odds are stacked against the human, but the human could beat the odds by using good judgment.

Then we looked at a mechanized way of evaluating a hand, a necessary step for a machine to play a good game. We developed a method to determine the probability that any given hand would win against a random opponent. This turns out to be an effective and relatively simple task, but it is compute intensive.
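As a rough illustration of what such an evaluation can look like, here is a minimal Python sketch that estimates hand quality by random sampling. It is not the routine used in this series: the sampling approach, the function names (rank5, best7, quality), the card notation ("As" for the ace of spades), and the choice to count a tie as half a win are all assumptions made for the example.

```python
import random
from collections import Counter
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [r + s for r in RANKS for s in SUITS]

def rank5(cards):
    """Return a comparable score for a 5-card hand (higher is better)."""
    vals = sorted((RANKS.index(c[0]) for c in cards), reverse=True)
    counts = Counter(vals)
    # Group by how often each rank occurs, biggest group first.
    groups = sorted(counts.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)
    ordered = [v for v, _ in groups]
    shape = [n for _, n in groups]
    flush = len({c[1] for c in cards}) == 1
    straight = len(counts) == 5 and vals[0] - vals[4] == 4
    if vals == [12, 3, 2, 1, 0]:          # A-2-3-4-5, the ace plays low
        straight, ordered = True, [3, 2, 1, 0, -1]
    if straight and flush:
        cat = 8
    elif shape == [4, 1]:
        cat = 7
    elif shape == [3, 2]:
        cat = 6
    elif flush:
        cat = 5
    elif straight:
        cat = 4
    elif shape == [3, 1, 1]:
        cat = 3
    elif shape == [2, 2, 1]:
        cat = 2
    elif shape == [2, 1, 1, 1]:
        cat = 1
    else:
        cat = 0
    return (cat, ordered)

def best7(cards):
    """Best 5-card score that can be made from the given cards."""
    return max(rank5(c) for c in combinations(cards, 5))

def quality(hole, board, trials=2000, rng=random):
    """Estimate, on a 0 to 100 scale, the chance of beating one random hand."""
    seen = set(hole) | set(board)
    rest = [c for c in DECK if c not in seen]
    score = 0.0
    for _ in range(trials):
        # Deal the opponent two cards and fill in any missing board cards.
        draw = rng.sample(rest, 2 + (5 - len(board)))
        opp, runout = draw[:2], draw[2:]
        full = list(board) + runout
        mine, theirs = best7(list(hole) + full), best7(opp + full)
        score += 1.0 if mine > theirs else 0.5 if mine == theirs else 0.0
    return 100.0 * score / trials
```

Calling quality(["As", "Ah"], []) gives a pre-flop estimate, and passing three, four, or five board cards gives the estimate for the flop, turn, or river. Each call scores thousands of five-card combinations, which gives a feel for why a routine of this kind is compute intensive.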

The current goal is to use that technique to build a new game simulator (without a human interface) that plays poker tournaments using machine-generated players and allows for variation in playing strategies. So far, we can report on a simulator that only allows the players to play their own cards; that is, they do not take the opponents into account. This may be unsatisfactory as a final product, but it is a necessary step to get the programs debugged and to see if anything interesting develops.

The results of this phase are now complete, and they are in line with expectations.

Each of the simulations that we have completed so far is the result of playing about 500,000 games. Actually, 500,000 may not be enough to get consistent and reliable results. With my computer, one complete run of 500,000 games takes several hours because the routine that calculates the quality of the hand is compute intensive.

The results confirm that there is little to be learned in pre-flop betting. Much more is learned when the flop is on the table. The quality of the information improves with the next card (the turn) and with the final card (the river). The simulation results reflect this.

The quality of a hand in each betting round is specified by a number from 0 to 100. If a hand has a quality score of 75, then there is a 75 percent chance that the hand would win against another hand drawn at random. In playing the game, each player will compare his hand's quality score "q" with a chosen threshold value "x" at each round of betting. If q is greater than x, the player will bet. If q and x are approximately the same, the player will stay. If q is much less than x, the player will fold. Each player has his own set of four threshold values, one for each of the four rounds of betting: pre-flop, flop, turn, and river.
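The comparison of q with x can be sketched in a few lines of Python. The article does not say how close "approximately the same" is, so the margin parameter below, its default of 5 points, and the function name decide are assumptions made for illustration.

```python
ROUNDS = ("pre-flop", "flop", "turn", "river")

def decide(q, x, margin=5.0):
    """Return 'bet', 'stay', or 'fold' for quality q against threshold x.

    margin is an assumed width for "approximately the same".
    """
    if q > x + margin:
        return "bet"
    if q < x - margin:
        return "fold"
    return "stay"

# A player is just four thresholds, one per betting round (made-up values).
player = dict(zip(ROUNDS, (40, 60, 55, 50)))
print(decide(72, player["flop"]))   # quality 72 against a flop threshold of 60
```

A margin of zero would reduce the rule to a plain bet-or-fold cutoff, so the width of the "stay" band is itself something a simulation could vary.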

In a simulation, threshold values distinguish better players from poorer ones. However, players do not actually choose their own threshold values. Rather, they are inherited. This is the "evolution" aspect of the game. Initially, in a simulation, all players are given threshold values at random, but these change as the simulation proceeds.

The simulation draws from a pool of 32 players. A tournament consists of an evening with many tables playing. In play, each table starts with 7 players drawn at random from the pool. Each table will play 21 hands in an evening, and each evening will consist of several tables. In an evening, players can play at more than one table. Players at a table will play each hand, and bet, stay, or fold. One player, in a position which rotates around the table, must "bet blind" to start off the betting. At each of seven rounds of betting, the size of the bet doubles. This is my rule, not one drawn from the game itself, and I use it to achieve some regularity and ease of programming. Sometimes the players will bet to the limit, and sometimes all players but the last bettor will fold, and the game is over before the betting limit is reached.
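The bookkeeping for one table can be sketched as follows. The numbers (32, 7, 21, and seven doubling rounds) come from the description above; the function names, the starting bet of one unit, and the assumption that the blind moves one seat per hand are mine.

```python
import random

POOL_SIZE = 32        # players in the pool
SEATS = 7             # players drawn for each table
HANDS_PER_TABLE = 21  # hands played at a table in one evening
BETTING_ROUNDS = 7    # the bet doubles at each of these

def seat_table(pool, rng):
    """Draw seven distinct players at random from the pool."""
    return rng.sample(pool, SEATS)

def blind_seat(hand_index):
    """Seat that must bet blind, assuming it moves one seat per hand."""
    return hand_index % SEATS

def bet_sizes(first_bet=1):
    """Bet size at each round, doubling from an assumed first bet."""
    return [first_bet * 2 ** i for i in range(BETTING_ROUNDS)]

rng = random.Random(2024)
table = seat_table(list(range(POOL_SIZE)), rng)
print(table)
print(bet_sizes())    # [1, 2, 4, 8, 16, 32, 64]
```

Under that rotation assumption, 21 hands at a table of 7 means each seat bets blind exactly three times, so no player is penalized by position over the evening.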

After each evening, a best player and a worst player of the tournament are identified by the amount of winnings or losses. The worst player will leave the tournament and be replaced in the pool by a descendant of the best player. The descendant results from an imperfect cloning process. If the copy were perfect, no evolution could take place. In that sense it is more like sexual reproduction than cloning. Eventually, after many tournaments, all of the original players in the pool are replaced by better players. After the completion of a large number of tournaments, we can view the profiles of the players that have evolved and which are now in the game.
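One way to picture the replacement step is the Python sketch below. The article does not say how the copy is made imperfect, so the small random nudge applied to each threshold, the sigma value, and the function name evolve are assumptions for illustration only.

```python
import random

def evolve(pool, winnings, rng, sigma=5.0):
    """Replace the evening's worst player with an imperfect copy of the best.

    pool     -- list of players, each a list of four thresholds (0 to 100)
    winnings -- net winnings for the evening, one number per player
    sigma    -- size of the copying error (an assumed value)
    Returns the indexes (best, worst) so the caller can inspect them.
    """
    best = max(range(len(pool)), key=lambda i: winnings[i])
    worst = min(range(len(pool)), key=lambda i: winnings[i])
    # Copy the best player's thresholds, nudging each one a little
    # and keeping the result inside the 0 to 100 range.
    child = [min(100.0, max(0.0, t + rng.gauss(0.0, sigma))) for t in pool[best]]
    pool[worst] = child
    return best, worst

rng = random.Random(7)
pool = [[rng.uniform(0, 100) for _ in range(4)] for _ in range(32)]
winnings = [rng.randint(-500, 500) for _ in pool]
print(evolve(pool, winnings, rng))
```

The pool stays at 32 players, and the best player remains in it unchanged alongside the new descendant.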

This is one example of threshold values for a winning player:

Threshold: Pre-flop (35), Flop (89), Turn (34), River (10)

This tells us that our player stays in at the Pre-Flop 65 percent (100 - 35) of the time. That is, the quality value of his hand must be greater than 35 in order to stay in the game. That is not a big hurdle to overcome. At the Flop, the player folds if the quality of the hand is less than 89. That means that he needs a very good hand to stay in. After that, if the player is still in the game, he is likely to stay to the end no matter what cards are played on the table. For this player the Flop is the main decision point of the game.

Other betting patterns also emerge, but this or similar patterns have been repeated in several simulation runs, though not in all of them. In one run, many of the players made the decision at the Turn instead of the Flop. In another run, most of the players made the big decision at the Flop, but a few remained who were very aggressive and had low thresholds at each point of the game. It is not surprising to find variations, and we can expect to see more of them as we proceed to more complex strategies.

As you can see, the players’ thresholds do not reflect other factors in the game which would be important in a realistic simulation. A strategy has to take into account more than just the quality of a hand. However, we do not yet know what these factors should be or how important they may become.

We have two challenges ahead. One will be to find a way to get more games completed in a simulation run within a reasonable amount of time. The second will be to design player strategies that take into account more factors in the game, particularly bluffing. These are some of the additional factors (besides hand quality) to be considered:

About the Player:

How much money do I have left?

Should I bluff?

About the Opposing players:

Is there a strong player in the game? A weak player?

How much money does each opposing player have?

What are their betting histories?

About the Game itself:

How many players started in the game and how many remain?

These factors should not just be used singly, but also in combination. For example, a bluffing strategy might be desirable only if there is one opposing player in the game, and only if that player has a history of folding. Of course, I am only speculating. The simulation will tell us which factors and which combinations are important and which are not. We will run simulations adding new factors, one or two at a time, to see the cumulative results of each.
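To make the idea of combined factors concrete, here is a small Python sketch of the bluffing example just given. It is speculation in code form: the Opponent record, the way a "history of folding" is measured, and the 60 percent cutoff are all hypothetical, and in an evolutionary simulation the cutoff would presumably be one more inherited value.

```python
from dataclasses import dataclass

@dataclass
class Opponent:
    """Hypothetical betting history for one opposing player."""
    raises_faced: int = 0     # times this player was raised against
    folds_to_raise: int = 0   # times he folded in response

    def fold_rate(self):
        """Fraction of raises this player has folded to (0.0 with no history)."""
        if self.raises_faced == 0:
            return 0.0
        return self.folds_to_raise / self.raises_faced

def should_bluff(opponents_left, min_fold_rate=0.6):
    """Bluff only against a single remaining opponent who tends to fold."""
    if len(opponents_left) != 1:
        return False
    return opponents_left[0].fold_rate() >= min_fold_rate

louie = Opponent(raises_faced=10, folds_to_raise=8)
print(should_bluff([louie]))              # one opponent with a folding history
print(should_bluff([louie, Opponent()]))  # two opponents still in the hand
```

Neither factor triggers a bluff on its own here; only the combination does, which is the kind of interaction the later simulations are meant to test.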

A formidable task? We shall see.