It is no mystery why poker is such a popular pastime: the dynamic card game produces drama in spades as players are locked in a complicated tango of acting and reacting that becomes increasingly tense with each escalating bet. The same elements that make poker so entertaining have also created a complex problem for artificial intelligence (AI). A study published today in Science describes an AI system called DeepStack that recently defeated professional human players in heads-up, no-limit Texas hold’em poker, an achievement that represents a leap forward in the types of problems AI systems can solve.
DeepStack, developed by researchers at the University of Alberta, relies on the use of artificial neural networks that researchers trained ahead of time to develop poker intuition. During play, DeepStack uses its poker smarts to break down a complicated game into smaller, more manageable pieces that it can then work through on the fly. Using this strategy allowed it to defeat its human opponents.
For decades scientists developing artificial intelligence have used games to test the capabilities of their systems and benchmark their progress. Twenty years ago game-playing AI had a breakthrough when IBM’s chess-playing supercomputer Deep Blue defeated World Chess Champion Garry Kasparov. Last year Google DeepMind’s AlphaGo program shocked the world when it beat top human pros in the game of go. Yet there is a fundamental difference between games such as chess and go and those like poker in the amount of information available to players. “Games of chess and go are ‘perfect information’ games, [where] you get to see everything you need right in front of you to make your decision,” says Murray Campbell, a computer scientist at IBM who was on the Deep Blue team but not involved in the new study. “In poker and other imperfect-information games, there’s hidden information—private information that only one player knows, and that makes the games much, much harder.”
Artificial intelligence researchers have been working on poker for a long time. In fact, AI programs from all over the world have squared off against humans in poker tournaments, including the Annual Computer Poker Competition, now in its 10th year. Heads-up, no-limit Texas hold’em presents a particularly daunting AI challenge: As with all imperfect-information games, it requires a system to make decisions without having key information. Yet it is also a two-person version of poker with no limit on bet size, resulting in a massive number of possible game scenarios (roughly 10^160, on par with the 10^170 possible board positions in go). Until now poker-playing AIs have attempted to compute how to play in every possible situation before the game begins. For really complex games like heads-up, no-limit, they have relied on a strategy called abstraction in which different scenarios are lumped together and treated the same way. (For example, a system might not differentiate between aces and kings.) Abstraction simplifies the game, but it also leaves holes that opponents can find and exploit.
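To make the idea of abstraction concrete, here is a minimal sketch in Python. It is an illustration only, not the method used by any system mentioned in this article: the three rank buckets, the tiny strategy table, and the function names are all hypothetical. It shows how lumping aces and kings together means both hands receive the same precomputed action.

```python
# Hypothetical illustration of card abstraction: ranks are merged into coarse
# buckets, so hands that differ only within a bucket share one strategy entry.
RANKS = "23456789TJQKA"


def rank_bucket(rank):
    """Assumed bucketing (not from the study): low, mid, and high ranks."""
    idx = RANKS.index(rank)
    if idx >= 11:  # K, A
        return "high"
    if idx >= 7:  # 9, T, J, Q
        return "mid"
    return "low"  # 2 through 8


def abstract_hand(hand):
    """Map a two-card starting hand (ranks only) to a sorted pair of buckets."""
    return tuple(sorted(rank_bucket(r) for r in hand))


# One precomputed entry per abstract hand instead of one per real hand.
# The actions here are made up for the example.
strategy = {("high", "high"): "raise", ("high", "low"): "call"}


def act(hand):
    """Look up the action for a hand; unlisted abstract hands default to fold."""
    return strategy.get(abstract_hand(hand), "fold")


if __name__ == "__main__":
    # Aces and kings collapse into the same bucket, so they are played alike.
    print(abstract_hand(("A", "A")) == abstract_hand(("K", "K")))
    print(act(("A", "A")), act(("K", "K")))
```

Because the table cannot tell a pair of aces from a pair of kings, an opponent who notices this can adjust accordingly, which is the kind of hole the paragraph above describes.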
With DeepStack, study author Michael Bowling, a professor of machine learning, games and robotics, and colleagues took a different approach, adapting the AI strategies used for perfect-information games like go to the unique challenges of heads-up, no-limit. Before ever playing a real game DeepStack went through an intensive training period involving deep learning (a type of machine learning that uses algorithms to model higher-level concepts) in which it played millions of randomly generated poker scenarios against itself and calculated how beneficial each was. The answers allowed DeepStack’s neural networks (complex networks of computations that can “learn” over time) to develop general poker intuition that it could apply even in situations it had never encountered before. Then, DeepStack, which runs on a gaming laptop, played actual online poker games against 11 human players. (Each player completed 3,000 matches over a four-week period.)
DeepStack used its neural network to break up each game into smaller pieces—at a given time, it was only thinking between two and 10 steps ahead. The AI solved each mini game on the fly, working through millions of possible scenarios in about three seconds and using the outcomes to choose the best move. “In some sense this is probably a lot closer to what humans do,” Bowling says. “Humans certainly don’t, before they sit down and play, precompute how they’re going to play in every situation. And at the same time, humans can’t reason through all the ways the poker game would play out all the way to the end.” DeepStack beat all 11 professional players, 10 of them by statistically significant margins.
Campbell was impressed by DeepStack’s results. “They’re showing what appears to be quite a general approach [for] dealing with these imperfect-information games,” he says, “and demonstrating them in a pretty spectacular way.” In his view DeepStack is an important step in AI toward tackling messy, real-world problems such as designing security systems or performing negotiations. He adds, however, that even an imperfect-info game like poker is still much simpler than the real world, where conditions are continuously changing and our goals are not always clear.
DeepStack is not the only AI system that has enjoyed recent poker success. In January a system called Libratus, developed by a team at Carnegie Mellon University, beat four professional poker players (the results have not been published in a scientific journal). Unlike DeepStack, Libratus does not employ neural networks. Instead, the program, which runs off a supercomputer, relies on a sophisticated abstraction technique early in the game and shifts to an on-the-fly reasoning strategy similar to that used by DeepStack in the game’s later stages. Campbell, who is familiar with both technologies, says it is not clear which is superior, pointing out that whereas Libratus played more elite professionals, DeepStack won by larger margins. Michael Wellman, a computer scientist at the University of Michigan who was also not involved in the work, considers both successes “significant milestone[s] in game computation.”
Bowling sees many possible directions for future AI research, some related to poker (such as systems that can compete in six-player tournaments) and others that extend beyond it. “I think the interesting problems start to move into what happens if we’re playing a game where we don’t even know the rules,” he says. “We often have to make decisions where we’re not exactly sure how things actually work,” he adds, which will involve “building agents that can cope with that and learn to play those games, getting better as they interact with the world.”