In one science fiction story, a spaceship sets out from Earth on a long journey to a distant planet. Due to the length of the trip, many generations of passengers are born and die in transit. When they finally arrive at their destination, the descendants of those early pioneers find on the alien world … their cousins from Earth already there, in long-established colonies. These other settlers had left Earth centuries later, but had access to much faster space travel, which let them complete the journey in far less time, even accounting for the pioneers' massive head start.
This is how I think of the quiet revolution Google ignited with its AlphaZero learning algorithm, which combines a neural network with general-purpose reinforcement learning. On December 6, 2017, it was declared that “Chess changed forever”: AlphaZero had crushed Stockfish, the 2017 Chess.com Computer Chess Champion:
“the programmers of AlphaZero, housed within the DeepMind division of Google, had it use a type of ‘machine learning,’ specifically reinforcement learning. Put more plainly, AlphaZero was not ‘taught’ the game in the traditional sense. That means no opening book, no endgame tables, and apparently no complicated algorithms dissecting minute differences between center pawns and side pawns.”
Humans have been trying to improve Chess strategy for about 1,500 years, and computer programs have been refined for decades. Starting with ZERO knowledge of chess beyond the rules, AlphaZero became the most powerful chess-playing entity in history after training against itself for… 4 hours!
AlphaZero dispensed entirely with the information chess engines usually rely on: simple human heuristics (e.g. “a bishop is worth three pawns” or “try to develop your pieces”), databases of openings and endgames, and even move lists from previous games. “Essentially, AlphaZero acquired 1,400 years of human chess knowledge—and then some—on its own, and in a ludicrously short amount of time.” Unlike with previous programs, this is not a case of methodically tweaking and optimizing a specialized algorithm until it outperformed older versions of itself. Instead, this is a new kind of flexible artificial intelligence that has the potential to be put to use on a huge array of problems in science, healthcare, or finance.
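To make “trained against itself” concrete, here is a minimal sketch of learning by self-play. Everything about it is a stand-in: a toy one-pile game of Nim replaces chess, and tabular Q-learning replaces AlphaZero's deep network and tree search. The point is only that the program starts knowing nothing but the rules and improves solely by playing itself.

```python
import random

# Toy game: a pile of 10 stones; each player removes 1-3 stones;
# whoever takes the LAST stone wins. The agent knows only these rules.
random.seed(0)
Q = {}  # Q[(stones, action)] -> estimated value for the player to move

def legal(stones):
    return [a for a in (1, 2, 3) if a <= stones]

def choose(stones, eps):
    # Epsilon-greedy: mostly pick the best-known move, sometimes explore.
    if random.random() < eps:
        return random.choice(legal(stones))
    return max(legal(stones), key=lambda a: Q.get((stones, a), 0.0))

def train(episodes=20000, alpha=0.2, eps=0.2):
    for _ in range(episodes):
        stones, history = 10, []          # record (state, action) per move
        while stones > 0:
            a = choose(stones, eps)
            history.append((stones, a))
            stones -= a
        # The player who made the LAST move won (+1); the other lost (-1).
        reward = 1.0
        for state, action in reversed(history):
            old = Q.get((state, action), 0.0)
            Q[(state, action)] = old + alpha * (reward - old)
            reward = -reward              # alternate perspective each ply

train()
# In this game, optimal play leaves the opponent a multiple of 4 stones,
# so the best move from 6 stones is to take 2.
best = max(legal(6), key=lambda a: Q.get((6, a), 0.0))
print(best)
```

After training, the agent rediscovers the winning strategy for this game without ever being told it, which is the same flavor of knowledge acquisition, at vastly smaller scale, as AlphaZero rediscovering opening theory.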
For a brief primer on how neural networks work, see this excellent video:
Since it is untainted by puny human minds, AlphaZero’s “alien” style can be shocking at times:
What’s also remarkable, though, Hassabis explained, is that it sometimes makes seemingly crazy sacrifices, like offering up a bishop and queen to exploit a positional advantage that led to victory. Such sacrifices of high-value pieces are normally rare. In another case the program moved its queen to the corner of the board, a very bizarre trick with a surprising positional value. “It’s like chess from another dimension,” Hassabis said.
Hassabis speculates that because AlphaZero teaches itself, it benefits from not following the usual approach of assigning value to pieces and trying to minimize losses. “Maybe our conception of chess has been too limited,” he said.
On the one hand, AlphaZero took an “arguably more human-like approach,” since it could only process about 80,000 positions per second, compared with 70 million for Stockfish. On the other hand, since there are no identifiable “rules” AlphaZero is playing by, we risk the “black-boxification” of our tools, in which we know that something works but have no access to the internal workings.
One trend I did notice in the games released by Google was that AlphaZero would often sacrifice material to obtain positional advantages over Stockfish. Over and over, Stockfish’s pieces would get blocked in and almost completely negated, while AlphaZero’s pieces would command a huge range of potential moves while coordinating with each other. Simply counting up the “point value” of the pieces remaining on the board is easy and tempting for human players, but perhaps AlphaZero has “discovered” that a blocked-in bishop might as well not be there, while a “good” knight can turn the tide of the game.
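The contrast between material counting and piece activity can be shown with a toy evaluation. The piece values are the classical ones; the “mobility” numbers are made up for illustration, standing in for the number of legal moves each piece has in some hypothetical position.

```python
# Classical point values for chess pieces.
PIECE_VALUE = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def material(pieces):
    # The "count up the points" evaluation humans find easy and tempting.
    return sum(PIECE_VALUE[name] for name, _ in pieces)

def mobility(pieces):
    # A crude activity measure: total legal moves available to the pieces.
    return sum(moves for _, moves in pieces)

# Two hypothetical armies with IDENTICAL material but very different freedom:
active  = [("bishop", 11), ("knight", 8), ("rook", 10)]  # coordinated pieces
blocked = [("bishop", 0),  ("knight", 2), ("rook", 3)]   # boxed-in pieces

print(material(active), material(blocked))  # equal on "point value"
print(mobility(active), mobility(blocked))  # wildly unequal on activity
```

Both sides score 11 points of material, yet one side can barely move. A material-only evaluation sees the positions as equal; an evaluation that has “discovered” the value of activity does not.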
In a similar way, TensorTip is a medical sensor that tries to dispense with human programming. Using non-invasive monitoring of the temporal color distribution of the tissue, it tries to measure “…a wide range of physiological and bio-parameters such as blood glucose, hemoglobin and hematocrit, blood PH, oxygen saturation SpO2, blood carbon dioxide, blood pressure, peripheral pulse and more…” based on a huge database of previously accumulated data.
As described by the FDA filing:
The Tensor Tip is based on real time color image sensor, real time photographing the fingertip tissue. A color image sensor of the type used in the Tensor Tip enables a wide range of information in the spectral, resolution, dynamic range and time domains enabling further investigation of the blood chromatic changes as a function body physiology. The concept behind this investigation reflects the idea that a change in human physiology condition would temporarily change the blood pigmentation.
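The general idea behind such a device can be sketched as a learned black-box mapping from image-derived color features to a physiological parameter. To be clear, this is NOT TensorTip's actual algorithm, which is not public; it is ordinary least-squares regression on synthetic data, purely to illustrate the "fit a model to accumulated data" approach.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Fake "color features" per reading (e.g. mean R/G/B intensity of the
# fingertip image). Real features would be far richer.
X = rng.uniform(0, 1, size=(n, 3))

# Made-up hidden relation between color and some parameter, plus noise.
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 4.0 + rng.normal(0, 0.05, size=n)

# "Training": fit the mapping on the accumulated database.
A = np.hstack([X, np.ones((n, 1))])      # add an intercept column
w, *_ = np.linalg.lstsq(A, y, rcond=None)

# "Measurement": predict the parameter for a brand-new reading.
new_reading = np.array([0.5, 0.5, 0.5, 1.0])
pred = float(new_reading @ w)
print(round(pred, 2))
```

The fitted weights recover the hidden relation well, but nothing in the model explains *why* a given color pattern maps to a given value: the same black-box concern raised about AlphaZero applies here.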
There is no small degree of irony in the fact that recent trends like evidence-based medicine, which tries to “optimize decision-making by emphasizing the use of evidence from well-designed and well-conducted research” rather than age-old “professional opinion” or “expert intuition,” might themselves be superseded by black-box neural networks like AlphaZero or Watson. So the debate will continue about how much we really need to understand about what is going on under the hood of our best decision-making tools.