The Ancient Game of Go Succumbs to the Monte Carlo Method

goboardwikipedia.jpg
from Wikipedia

An article in the Seattle Times paints a vivid picture of the fascination of computer programs that play the ancient game of Go. One evening in 2006, Peter Drake, an assistant professor of Computer Science at Portland’s Lewis & Clark College stays late in his office and begins adjusting the algorithms in his computer Go program, known as Orego. The next morning, he is still staring at the computer in his office.

What is so exciting about this particular computer program that it motivates a grown man to put in an all-nighter? It turns out that Drake’s most recent strategy, known as the Monte Carlo method, coupled with his latest algorithmic tactics, are finally allowing Orego to beat a tough competing program in a challenge over the Internet.

Go

The game of Go originated more than 2,000 years ago in China. One player uses black stones, the other white. Players lay stones on a grid, traditionally 19 x 19 lines, in an attempt to capture the most territory. The rules are simple, but mastering the game is not.

Writing a computer program to play Go is especially challenging. Although recent computer versions of chess can beat even world champions, writing a computer Go program that can compete against expert players is still one of the grand challenges in the field of artificial intelligence.

Why are computer Go programs so hard to write?

A traditional game program chooses the next move by constructing a tree representing all possible next moves, followed by all possible countermoves, and then all possible countermoves to the countermoves, and so on. Although the tree broadens very rapidly, it’s often possible to “prune the tree” by evaluating potential intermediate game states without going all the way to the end of the game. (For example, in chess a non-end game state might be evaluated by simply counting the pieces on both sides.)

Go has two problems with this approach. First, because of the large size of the board and the huge number of allowed moves, the branching factor in a game tree would much larger than in chess and its size would soon grow out of control. Also, in Go it is much less feasible to stop building the tree before the endgame, because it is much more difficult to evaluate non-end game states. (I’m not a Go player, but I’ve read that this has something to do with effectively “dead” stones remaining on the board for many moves.)

The superiority of humans over computers in Go might also be explained in more humanistic terms. Go challenges computer algorithms because it requires broad pattern matching skills and a relatively intuitive approach, while chess favors an algorithmic solution because the focus is more localized (on the capture of a single piece) and the approach is more logical. The game of Go can demand such a whole-body, visceral level of concentration that the fictional Go master in Trevanian’s novel Shibumi eventually develops stomach cancer due to the constant strain of it.

goboardmattryall.jpg
“My Go Board” by Matt Ryall

Enter Monte Carlo

A recent and surprisingly effective breakthrough in computer Go is the use of the clever and conceptually simple Monte Carlo method. On small-sized boards (9 x 9 lines), Monte Carlo based Go programs can now challenge human experts. (Although Monte Carlo Go on a 19 x 19 board is not yet a strong contender against either computer or human opponents.)

The most famous early use of the Monte Carlo method was for running simulations during the development of the atom bomb in the Manhattan Project at Los Alamos, New Mexico. According to Stanislaw Ulam, a Manhattan Project researcher and famous mathematician, the Monte Carlo method was named in honor of his uncle, an inveterate gambler. (See the fascinating autobiography Adventures of a Mathematician by Stanislaw Ulam.)

How does the Monte Carlo method work?

Sometimes it’s easy to calculate the probability of a given event. For example, the probability of getting two heads with two flips of a coin is simply 1/2 * 1/2 = 1/4. Other times, however, it’s difficult or impossible to calculate the probability of a particular outcome of an experiment. In this case, we can use the Monte Carlo method to actually do the experiment many times (or at least simulate it) and find out the probability directly.

An example of a probability problem that is somewhat cumbersome to calculate is the classic puzzle known as Galileo’s Dice. According to legend, a gambler once ask Galileo to find the probability of getting a sum of 9 when three dice are rolled. To solve this problem using the Monte Carlo approach, we can write a program that uses a random number generator to simulate throwing three dice, and repeat that simulation maybe 1000 times. The number of times we get 9, divided by 1000, is the observed relative frequency of getting 9. If the simulated experiment is repeated a large number of times (1000 is a good choice), the observed relative frequency is an accurate point estimate for the actual probability of 9.

When a Monte Carlo Go program chooses the next play, it uses the Monte Carlo method to estimate, for each possible move, the probability that the move will result in a winning game. It then simply chooses the move that has the highest estimated probability of leading to a win. The probability estimate for each possible move works just like the dice example—but instead of simulating three dice throws to estimate the probability of getting a 9, it simulates a sequence of subsequent Go moves until the hypothetical game ends, to estimate the probability of winning with that move. The simulation might be repeated 100 or 1000 times. (The more times the better, but the rules of play might limit the permissible response time.)

A big advantage of a pure Monte Carlo approach is that it is based simply on random numbers and simulations, and the program does not need to use complex if-then rules based on expert knowledge specific to the game of Go. However, more effective Monte Carlo Go programs do incorporate a limited amount of expert Go knowledge in the form of heuristics (i.e. rules-of-thumb). Using heuristics is time consuming so a program can’t use too many.

The Monte Carlo method makes heavy use of memory and processing cycles. However, according to Remi Coulom, a pioneer in the development of Monte Carlo Go, the algorithms are easy to parallelize and can therefore take good advantage of today’s multi-core processors.

The Significance of Computer Go Research

It seems that serious computer-science types involved in game programming often feel a need to justify their fixation by describing the real-world problems that game research might solve. In the case of Go, however, it’s likely that the challenges encountered in game programming might truly resemble problems that are found in a high-entropy, complex, non-black-and-white world (more so than the challenges encountered in programming a more neatly logical game such as chess). Examples of these real-world problems suggested by computer Go researchers include designing electronic circuits, building self-driving cars, targeting web advertising, optimizing channel allocation in cellular systems, and determining the best sites for industrial plants. The techniques developed in computer Go programming might also transfer to other academic fields—such as cognitive science, pattern recognition, and machine learning—to a greater degree than chess programming techniques.

So creating a winning Go program is not only a grand challenge in pure artificial intelligence, but might also have real-world and academic payoffs.

Links

“What’s black, white and can stump computers?”

“AI Invades Go Territory”

“Orego”

“Computer Go”

“Monte Carlo Method” 

Comments are closed.