Towards Explainable AI for Chess
AlphaZero has been memorably described as playing “like a human on fire.” But while its fluid, enterprising play impresses many observers as human-like, the process it uses to get there is anything but human. Humans learn from other humans. We read books, play over old games, and engage our opponents in postmortem discussions. According to legend, Paul Morphy learned the game by watching his father and uncle play.
AlphaZero does none of these things; it learns only by playing itself. This is part of what makes it so exciting. It’s a true tabula rasa, a completely fresh perspective. It’s like teaching a super-intelligent alien the rules of chess and asking, “Well, what do you think?” In that sense, it’s surprising that AlphaZero’s play seems so human-like. For instance, it was not a foregone conclusion that it would mostly agree with modern grandmasters about what the best openings are.
While AlphaZero needed only nine hours of training to surpass human performance at chess, that figure is misleading: by one estimate, it would cost around $35 million in computing power to replicate AlphaGo Zero (the Go version of AlphaZero). Part of the reason training demands such enormous resources is that the system starts with no chess knowledge beyond the rules. This is by design.
Rich Sutton wrote in his essay “The Bitter Lesson”: “We have to learn the bitter lesson that building in how we think we think does not work in the long run.” Earlier in the same essay, he recounts the history of computer chess:

In computer chess, the methods that defeated the world champion, Kasparov, in 1997, were based on massive, deep search. At the time, this was looked upon with dismay by the majority of computer-chess researchers who had pursued methods that leveraged human understanding of the special structure of chess. When a simpler, search-based approach with special hardware and software proved vastly more effective, these human-knowledge-based chess researchers were not good losers. They said that “brute force” search may have won this time, but it was not a general strategy, and anyway it was not how people played chess. These researchers wanted methods based on human input to win and were disappointed when they did not.
What does work, he says, is taking advantage of Moore’s Law by building systems that can scale with increases in computing power. This is the philosophy behind AlphaZero. (The leader of the AlphaZero team, David Silver, studied under Sutton.)
It may seem like a paradox that taking out human knowledge led to something that plays more like a human. However, perhaps the key factor was not removing human knowledge, but enabling learning. Maia - a neural network-based engine that learns by imitating human games rather than by playing against itself - is arguably the most human engine yet. People seem to enjoy playing against it: the Maia bots already have hundreds of thousands of games on lichess. You can play against one of them right now.
Still, we are only scratching the surface of what computers can do as training tools. For example, why not evaluate positions in a way that’s more realistic for human games? Current computer evaluations reflect only the engine’s main line. Even if best play for one side requires finding five “only moves” in a row, the evaluation you see assumes every one of them will be found, with no allowance for the possibility of a mistake.
One way to craft a more human evaluation would be to use Monte Carlo simulations. The idea is very simple: to evaluate a position, simply play it out. With the help of computers you can play it out many times and take the average result. So to predict how a position will play out in a game between 1500 players, you can do a Monte Carlo simulation using the 1500 Maia bot (a Maia Carlo?). If you’re around 1500 strength yourself, you might find this evaluation a lot more relevant than the Stockfish evaluation.
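The evaluation loop itself is only a few lines. Here is a minimal sketch in Python; the `simulated_playout` function is a hypothetical stand-in, since wiring up an actual Maia bot (for example, two rating-matched bots playing the position out over UCI) is beyond the scope of this sketch:

```python
import random

def monte_carlo_eval(position, playout, n=1000):
    """Estimate a position's practical value by averaging n playouts.
    `playout` plays the position to the end and returns 1.0 for a
    White win, 0.5 for a draw, and 0.0 for a White loss."""
    return sum(playout(position) for _ in range(n)) / n

# Stand-in playout for illustration only: in practice this would pit two
# copies of a rating-matched engine (e.g. Maia 1500) against each other.
def simulated_playout(position):
    return random.choice([1.0, 0.5, 0.0])

score = monte_carlo_eval("some position (e.g. a FEN string)",
                         simulated_playout, n=2000)
print(round(score, 2))  # expected score for White, between 0 and 1
```

The averaged score is exactly the “expected points” a 1500 player could hope for, which is arguably more actionable than a centipawn number resting on a line of only moves.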
Another good step in this direction is the DecodeChess app, which interprets engine evaluations in terms of concepts like threats, undefended pieces, and plans. This translates engine moves into concepts we’re already familiar with, but I keep coming back to the idea of AlphaZero as a super-intelligent alien and wondering what it might have figured out that we don’t know yet.
According to Sutton, we shouldn’t try to put our insights into machine learning models, but can we pull insights out of them? This is what the Circuits thread on distill.pub is trying to do for computer vision. The core idea is that trained neural networks contain human-readable algorithms (a claim that is still somewhat controversial in the machine learning community). To demonstrate it, the authors go through a trained image model neuron by neuron and circuit by circuit, trying to understand what each piece is doing.
In the classic interpretation of a computer vision model, successive layers in the neural network detect increasingly more complex features. The first layer detects lines and curves, the next layer detects shapes like rectangles or circles, and so on up to eyebrows and firetrucks. Examining the neurons, they do find intelligible algorithms, some expected and some unexpected.
In the expected group are edge detectors - filters that look for vertical or horizontal lines. These were widely used and well understood in computer vision before machine learning, much like the handcrafted evaluation functions of old-school chess engines.
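A handcrafted vertical-edge detector of this kind is easy to write down. The sketch below applies a Prewitt-style kernel by plain cross-correlation (pure Python for illustration; real vision pipelines use optimized convolution libraries):

```python
# Prewitt-style vertical-edge kernel: responds when pixels to the right
# of a point are brighter than pixels to the left.
KERNEL = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

def cross_correlate(image, kernel):
    """Slide `kernel` over `image` and sum the elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(kernel[a][b] * image[i + a][j + b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

# A tiny image with a dark-to-bright vertical boundary in the middle.
img = [[0, 0, 0, 9, 9, 9] for _ in range(4)]
edges = cross_correlate(img, KERNEL)
print(edges)  # responses peak at the boundary, zero in the flat regions
```

This is exactly the kind of filter that, before deep learning, a human would design by hand; the Circuits finding is that trained networks rediscover such filters on their own in their early layers.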
(In fact, early vision research on cats found that the visual cortex responds strongly to straight lines. This suggested that biological vision operates along a hierarchy of complexity, which was part of the inspiration for the layered structure of neural networks.)
One of the unexpected features was the high-low frequency detector:
High-low frequency detectors are an example of a less intuitive type of feature. We find them in early vision, and once you understand what they’re doing, they’re quite simple. They look for low-frequency patterns on one side of their receptive field, and high-frequency patterns on the other side. Like curve detectors, high-low frequency detectors are found in families of features that look for the same thing in different orientations.
In a previous post I described how LeelaZero, the open-source replication of AlphaZero, seems to value piece activity more highly than traditional engines. Still, piece activity is something we’ve always known about, even if we didn’t prioritize it highly enough. Could there be something like a frequency detector for chess - a simple, powerful idea we never even thought of? What would it look like?