Nate Silver, head mathemagician of the 538 blog, has won a great deal of attention for his perfect 51-for-51 (including DC) state-by-state predictions of the most recent presidential election. This has led to a spike in sales for his book, which covers a wide range of topics related to the business of forecasting and the standards we should require for statistical evidence before we take predictions seriously. One of the issues covered in the book, but generally overlooked, is a subtle but important shift occurring in science: the reduction in emphasis on statistical significance and the rise to prominence of Bayesian analysis. The webcomic xkcd nails it in two comics:
This second one came out recently:
The point is that predictions for rare events (like the sun exploding) can easily get swamped by false positives. This is a famous problem in statistics called the base-rate fallacy, which has ensnared many people. For example, a medical test that correctly indicates the presence of a disease 99% of the time (sensitivity) and also correctly returns a negative result for 99% of healthy people (specificity) will NOT be as definitive as most assume if the disease is rare enough. Let’s suppose the base rate of the disease (the “prior probability”) is one in a thousand. If 100,000 people are tested (100 of whom are actually sick, and 99,900 healthy), 99 will be correctly labeled as sick, but so will 999 healthy people (99,900 x 0.01) who receive false positives. Therefore, the chance that someone is really sick given that he or she tested positive is still only 9% [= # of True Positives/# of all Positives = 99/(99 + 999)]. In modern life, we test many propositions that have only a small chance of being correct: pseudo-science, novel scientific theories, new particles created by a collider. The solution is to use Bayesian reasoning, which takes into account the prior probability that a proposition is true. Extraordinary claims require extraordinary evidence, precisely because the chance that so many established principles are wrong, as accepting the new claim would require, is so low to start with.
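To make the arithmetic of the medical-test example concrete, here is a minimal Python sketch of Bayes' rule applied to a positive test result. The function name and its parameters are made up for illustration; the numbers are the ones from the example above (a 1-in-1,000 base rate, 99% sensitivity, 99% specificity).

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """Probability of actually being sick, given a positive test (Bayes' rule)."""
    # Fraction of the whole population that is sick AND tests positive.
    true_positives = prior * sensitivity
    # Fraction of the whole population that is healthy AND tests positive.
    false_positives = (1 - prior) * (1 - specificity)
    # True positives divided by all positives.
    return true_positives / (true_positives + false_positives)


# Numbers from the example: base rate 1 in 1,000, 99% sensitivity, 99% specificity.
p_sick = posterior_given_positive(prior=0.001, sensitivity=0.99, specificity=0.99)
print(f"Chance of being sick after a positive test: {p_sick:.1%}")
```

Raising the prior to 0.5 in the same call pushes the answer up to 99%, which shows how much the conclusion depends on the base rate rather than on the test alone.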
The established method, however, which held almost dogmatic stature in science until recently, is the use of “statistical significance”: an arbitrary threshold (usually 5%) is chosen, and a proposition is accepted (or, strictly speaking, the “null hypothesis” of pure chance is rejected) when the probability that data “at least as good” as what the experiment obtained could have arisen simply by chance is less than the threshold.
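As a rough illustration of the significance test just described, the sketch below computes a one-sided p-value for a hypothetical coin-flipping experiment. The experiment (60 heads in 100 flips), the function name, and the choice of a one-sided exact binomial test are all assumptions made for the example; none of them come from Silver's book.

```python
from math import comb


def one_sided_p_value(heads, flips, p_null=0.5):
    """Chance of seeing at least `heads` heads in `flips` flips of a fair coin."""
    # Sum the binomial probabilities of every outcome at least as extreme
    # as the one observed, assuming the null hypothesis (a fair coin) is true.
    return sum(
        comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k)
        for k in range(heads, flips + 1)
    )


ALPHA = 0.05  # the conventional 5% threshold
p_value = one_sided_p_value(heads=60, flips=100)
print(f"p-value = {p_value:.4f}, significant at 5%: {p_value < ALPHA}")
```

The result falls below the 5% threshold, so the coin would be declared biased. Nothing in the calculation asks how plausible a biased coin was before the experiment began.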
As Silver points out in the book, one of the reasons “statistical significance” gained such a stronghold (so much so that scientific results are much more likely to be submitted for publication if they could have been obtained by chance 4.9% of the time rather than 5.1%) is that it doesn’t require an estimate of the prior probability. This was seen as somehow more objective, especially when there is no obvious baseline rate to use. However, as the xkcd comics show, this way of thinking can lead to absurd results. Better to use Bayesian methods, as Nate shows.