Press Release

Autocorrect for your DNA
Information Theory and the Genetic Code

New research at NSU has revealed the information content associated with each letter of DNA. This work may improve our understanding of how the genetic code resists the effects of mutations that can cause cancer or inherited diseases. The same genetic code is used by almost all living organisms to translate three-letter “words,” or codons, of DNA into amino acids, which are strung together to form proteins. Assistant Professor Louis Nemzer, a biophysicist at the Halmos College of Natural Sciences and Oceanography, used methods from the field of information theory to calculate the “Shannon entropy” of each letter of DNA, depending on the type of base (A, T, G, or C) and its position in the codon.

Although many people have never heard of Claude Shannon, his pioneering work at Bell Labs on measuring the maximum amount of information contained in messages remains crucial today for digital communication technologies, including text messaging, WiFi, and mobile data transmission. So why did Shannon choose to call his measure of information “entropy,” a word more associated with the physics of an ideal gas? “There are very close connections between thermodynamics and information theory,” said Dr. Nemzer. “Entropy in physics really just measures how much information about a system you are missing.”

Using the equations originally developed for thermodynamics and adapted by information theory, he calculated how “determinative” each letter, or nucleotide, is for the properties of the amino acid it codes for. Changing the properties of even a single amino acid too much may cause the entire protein that contains it to lose its ability to function, with potentially negative health outcomes. Fortunately, the genetic code has a kind of built-in “autocorrect” feature: most single-letter mutations produce the same amino acid as the original, or a chemically similar one. This helps make the genetic code more robust to error.
The new research, just published in a pair of related papers in the Journal of Theoretical Biology and BioSystems, quantifies how important each letter is to the final properties of the amino acid. It was also found that the genetic code takes advantage of the fact that not all mutations are equally likely. The mutations in DNA that would cause the most severe changes to proteins are less frequent, and more easily repaired. Dr. Nemzer hopes to use the knowledge gained from this research to improve our understanding of the risk factors for cancer and genetic disorders, as well as trace the evolution of different genes between species.
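The position-dependent information content can be illustrated with a small calculation on the standard codon table. This is only a sketch under the simplifying assumption of uniform codon usage, not the actual analysis in the papers:

```python
from collections import Counter
from math import log2

# The standard genetic code, compactly encoded: the 64 codons in TCAG
# order map onto the amino-acid string below ('*' marks stop codons).
bases = "TCAG"
aas = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
codon_table = dict(zip((a + b + c for a in bases for b in bases for c in bases), aas))

def entropy(counts):
    """Shannon entropy in bits of a frequency distribution."""
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())

# Entropy of the amino acid over all 64 codons (uniform codon usage assumed)
h_aa = entropy(Counter(codon_table.values()))

# Conditional entropy of the amino acid given the base at each codon
# position: the lower the value, the more "determinative" that position is.
h_conds = []
for pos in range(3):
    h = sum(entropy(Counter(aa for c, aa in codon_table.items() if c[pos] == b))
            for b in bases) / 4
    h_conds.append(h)
    print(f"Position {pos + 1}: H(amino acid | base) = {h:.2f} bits (vs H = {h_aa:.2f})")
```

Running this shows the middle base is the most determinative and the third (“wobble”) position the least, which is exactly the redundancy that makes so many third-position mutations silent.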

Louis R. Nemzer. Shannon information entropy in the canonical genetic code.
Journal of Theoretical Biology 415 158–170 (2017) DOI: 10.1016/j.jtbi.2016.12.010

Louis R. Nemzer. A binary representation of the genetic code.
BioSystems 155 10–19 (2017) DOI: 10.1016/j.biosystems.2017.03.001

Art Deco Games

After a long hiatus, I’ve been getting back into the Civilization Series. It only took me a few rounds of Civ 5 to fall in love with the updated gameplay, as well as the Art Deco theme.

The futurism and techno-optimism fit perfectly with the message of the game.


Here is the Rockefeller Center loading-page tribute. The central themes are progress, human ingenuity, and lots of mechanically inspired straight lines. Prometheus presides over the ice-skating rink as the hero, rather than heretic.

Also published by 2K Games is BioShock. Its use of Art Deco in the underwater city of Rapture is somewhat more Ayn-Randian and foreboding.


Lucky or Good

A fortunate bounce that goes the way of your favorite team may tempt you to say: “it’s better to be lucky than good.” But are we even able to distinguish between the two? Advanced sports analytics is beginning to help us disentangle the effects of luck and skill.

To avoid philosophical questions and semantics, let’s simply define luck as something that is not expected to continue (teams experiencing a stretch of good or bad fortune soon regress to the mean), while skill persists over time. The law of large numbers tells us that, if we watched as much sports as we wanted, all lucky or unlucky deviations would “wash out” on average, and we would know the true skill of a player or team. Of course, we don’t live in such a data paradise.

A simple way to quantify future win probability is the Pythagorean expectation. At the most fundamental level, the true skill of a team depends on the number of points it scores and allows. It is possible that there exists some special skill in eking out close games, but winning games is likely to be simply a matter of “bunching” points to the best effect. For example, the 1960 World Series is famous in that the Pirates won even though “the losing team scored more than twice as many runs as the winning team, as the Yankees won three blowout games (16–3, 10–0, and 12–0), while the Pirates won four close games (6–4, 3–2, 5–2, and 10–9).” Analogies to the electoral college are obvious. The Pythagorean win expectation formula is:

Win% ≈ (points scored)^k / [(points scored)^k + (points allowed)^k]

where the exponent k depends on how much luck is involved. Larger values mean that the higher “quality” team wins more often. As I wrote in a previous post, whether a game is won by the best team or the luckiest depends on the sport. The best-fit exponents for different sports have been calculated:

  • English Premier League: 1.3
  • NHL: 2.15
  • NFL: 2.37
  • NBA: 13.91

Surprisingly, the Pythagorean win expectation can be better at predicting a team’s future win/loss record than even its actual current record. Compare this formula with the Hill equation in biology, which models cooperative behavior like the binding of oxygen to hemoglobin.
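As a sketch, the formula and the exponents above can be compared directly (the season totals here are invented for illustration, not real data):

```python
def pythagorean_win_pct(pf: float, pa: float, k: float) -> float:
    """Pythagorean expectation: PF^k / (PF^k + PA^k)."""
    return pf**k / (pf**k + pa**k)

# The same 10% scoring edge implies very different records by sport,
# because a larger exponent means individual games are decided less by luck.
for sport, k in [("EPL", 1.3), ("NHL", 2.15), ("NFL", 2.37), ("NBA", 13.91)]:
    print(f"{sport}: {pythagorean_win_pct(110, 100, k):.3f}")
```

In soccer, outscoring opponents 11:10 overall barely nudges the expected record above .500; in basketball, the same edge implies winning roughly four games out of five.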

Consider the 2016 MLS Cup. The Seattle Sounders prevailed in penalty kicks despite not generating a single shot on goal for 120 minutes of regulation and extra time combined. In contrast, Toronto had seven shots on goal, including one that looked like a sure game-winner, except for an incredible save by Stefan Frei. Had it gone in, everyone would have congratulated Toronto on a dominating 1-0 victory. Instead, Seattle ended up with the cup.

Or how about Leicester City, who overcame 5000:1 odds to win the English Premier League last year? They benefited from poor showings from the traditional EPL powerhouses, and also were lucky enough to edge out quite a few close games.


While most expected a return to Earth after such a meteoric rise, I don’t think many expected such a fiery crash landing. This year, Leicester City is fighting to avoid relegation.

Ice hockey is (somewhat) better when it comes to rewarding the best team, but even then, games can be decided by a bounce of the puck. To help figure out if an NHL team’s success is attributable to luck or skill, we can turn to Corsi, Fenwick, and PDO.

Unlike baseball and football, hockey doesn’t have well-defined “states” to analyze. Instead, we can use shot attempts as a way to approximate puck possession. Corsi is the sum of shots on goal, missed shots, and blocked shots. Fenwick is the same with blocked shots excluded. Why are shots so important?

The basic idea is that generating scoring chances takes skill, but whether a goal is actually scored is unpredictable. Just putting the puck on net allows good things to happen, like a deflection or rebound, even if the original shot doesn’t go in. Also, your opponent can’t score while you have the puck. So Corsi/Fenwick is a measure of skill independent of the “luck” of goals going in. Conversely, the sum of shooting percentage and save percentage is called PDO. The thought is that these values are the luck portion, which should tend to regress to 100% over time. Of course, PDO can remain high if you have an exceptionally good goalie or sharpshooting skaters.
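These statistics reduce to simple sums; a minimal sketch, with invented shot totals for a hypothetical game:

```python
def corsi(shots_on_goal: int, missed: int, blocked: int) -> int:
    """Corsi: all shot attempts, including blocked ones."""
    return shots_on_goal + missed + blocked

def fenwick(shots_on_goal: int, missed: int) -> int:
    """Fenwick: shot attempts with blocked shots excluded."""
    return shots_on_goal + missed

def pdo(shooting_pct: float, save_pct: float) -> float:
    """PDO: shooting % plus save %, expected to regress toward 100."""
    return shooting_pct + save_pct

# Hypothetical game line: 30 shots on goal, 12 missed, 8 blocked,
# with a 9.5% shooting percentage and a 91.5% save percentage
print(corsi(30, 12, 8), fenwick(30, 12), pdo(9.5, 91.5))
```

A PDO well above 100 sustained over many games is the red flag: either the team really does have elite shooters and goaltending, or its record is riding on luck that won’t last.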


From the author who brought you Moneyball and The Big Short comes the Hollywood-ready story of an epic scientific bromance that overturned decades of economic thinking.

Michael Lewis starts his new book, The Undoing Project, with the caveat that it is kind of like the inverse of Moneyball. Instead of focusing on how people might use pure data and analytics to compensate for the fallibility of human judgement, he is going to write about the study of those foibles themselves. Although I had already read “Thinking, Fast and Slow,” Nobel laureate Daniel Kahneman’s magnum opus on how the human brain uses heuristics, not pure rationality, to make decisions, I was still riveted by the narrative of how he and collaborator Amos Tversky worked together on these ideas with very different, but complementary, styles.

Similar to using optical illusions to understand how vision works, Kahneman and Tversky used surveys to demonstrate mental blind spots hardwired into the brain.

Even if you know about it, the conjunction fallacy is very hard to resist:

Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.
Which is more probable?
(A) Linda is a bank teller.
(B) Linda is a bank teller and is active in the feminist movement.

Since the set of all bank tellers includes bank tellers who are feminists, (A) must be more likely. Adding the restriction (“active in the feminist movement”) tricks us into thinking (B) is more probable because it sounds more representative. Another example: estimate the likelihood that at least 1,000 people will have to evacuate from California this year. Now estimate the likelihood that a forest fire will start in Southern California and 1,000 people will have to evacuate this year. Again, the brain uses the representativeness of the narrative as an imperfect proxy for how probable we should think something is.
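The arithmetic behind the fallacy is unforgiving: a conjunction can never be more probable than either of its parts. With made-up numbers (assumptions for illustration, not survey data):

```python
# Hypothetical probabilities, purely for illustration
p_teller = 0.05                  # P(Linda is a bank teller)
p_feminist_given_teller = 0.40   # P(active feminist | bank teller)

# P(teller AND feminist) = P(teller) * P(feminist | teller),
# which can never exceed P(teller) alone.
p_both = p_teller * p_feminist_given_teller
print(p_teller, p_both)
```

However vivid the description, option (B) can be at most as likely as option (A), and is strictly less likely whenever some tellers are not feminists.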

The main principle is that real people are affected by the way choices are framed (whether as a loss or a gain) or by how representative a description sounds. While we might now consider some of these findings obvious (people are clearly not omniscient, perfectly rational, purely selfish members of homo economicus), behavioral economics upset a great deal of economic theory. This is because mathematical models of economic behavior rely on assumptions like the stable preferences of rational actors. The idea of “bounded rationality” makes everything a muddle. And if people are Predictably Irrational, then the errors are not just random noise, but rather a systematic bias that won’t even wash out on average.

[Parenthetically, this is another great example of the difference between uncorrelated errors, which can be improved with aggregation, versus systematic bias, which cannot.]

The upside, however, is that if real people are not perfectly rational, they can be nudged into doing the right thing, like saving for retirement, with simple changes to the “choice architecture,” that is, how the options are framed and presented.


Appendix – Behavioral Economics Bibliography:

Animal Spirits – George Akerlof and Robert Shiller
Misbehaving: The Making of Behavioral Economics – Richard Thaler
Predictably Irrational – Dan Ariely
Stumbling on Happiness – Daniel Gilbert
Thinking, Fast and Slow – Daniel Kahneman
The Undoing Project – Michael Lewis
The Upside of Irrationality – Dan Ariely


Who Will Win?

Today is election day. The big questions everyone is asking are: who will be the next leader of the free world, and who called it?

Nate Silver made his 538 website a household name in the wake of the 2008 Presidential election, for which his model correctly predicted every state. Silver’s approach was to disdain the “hot takes” of pundits and rely instead on complicated algorithms that crunch hard poll numbers. In particular, the 538 model uses national data to compensate for missing state polls, and also accounts for correlations between states.


While Silver’s reputation continues to be sterling, he has caused some consternation among the Hillary faithful during the home-stretch of the election by projecting much higher chances for Trump (above 1 in 3 for a while) than other sites that have cropped up in the past few election cycles. For example, the New York Times forecast gives Trump a 16% chance on election day.


Some sites were even more sanguine for Clinton.


In fact, the battle-lines have shifted from Silver vs. Taleb (as I wrote about here), to Silver vs. Sam Wang and the Huffington Post. Silver is highly skeptical, based on the possibility of polling error, that anyone can say that a Clinton victory is more than 98% likely.


A (very) naive approach that takes all states as independent will lead to a lot of false certainty, since multiplying the state-by-state probabilities leaves Trump only a minuscule path to victory. However, we know that states are not independent spins. News that flips one state is likely to have an effect on other states as well. Nevertheless, if the lead is big enough, sites like the Huffington Post, which assume relatively weak correlations, give Clinton a 98% slam-dunk.
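A small Monte Carlo sketch shows how much the correlation assumption matters. The numbers here are invented (an underdog who must sweep six states at 35% each, with a shared polling error as the only source of correlation), not any site’s actual model:

```python
import random

random.seed(42)
N = 200_000
STATES, P_STATE = 6, 0.35   # hypothetical: underdog needs 6 states at 35% each

def sweep(correlated: bool) -> bool:
    """One simulated election: does the underdog win all six states?"""
    # A shared polling error shifts every state's probability together
    swing = random.gauss(0, 0.10) if correlated else 0.0
    p = min(1.0, max(0.0, P_STATE + swing))
    return all(random.random() < p for _ in range(STATES))

p_ind = sum(sweep(False) for _ in range(N)) / N
p_cor = sum(sweep(True) for _ in range(N)) / N
print(f"independent: {p_ind:.4f}  correlated: {p_cor:.4f}")
```

Treating the states as independent coin flips puts the sweep near 0.35^6, a fraction of a percent; letting one shared error move all six states together multiplies that chance severalfold, because one favorable polling miss lifts every state at once.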

Conversely, my feeling is that the gears and levers under the hood of 538’s model have created such a non-linear system that even a modest margin of error, especially if you allow for the chance of a systematic bias, can tip an almost-certainty into serious doubt. Remember, 538 runs nationwide simulations in which state-to-state correlations are highly relevant, on top of using nationwide polls to adjust state numbers. This magnifies uncertainty in a close race, as states hover on the edge of being flipped.

So who is right? From a philosophical point of view, we may never be able to know the “correct” probability of a one time event, which depends so much on the assumptions used and cannot be repeated to build up statistics. However, if a model predicts many events – like state-by-state results – Jordan Ellenberg suggests the Brier score may be of use in evaluating its performance. The formula is:


BS = (1/N) Σ (f_t − o_t)²

where f_t is the forecast probability assigned to event t, o_t is 1 if the event occurred and 0 if not, and N is the number of forecasts. The score rewards both accuracy and risk-taking. From Wikipedia:

Suppose that one is forecasting the probability P that it will rain on a given day. Then the Brier score is calculated as follows:

  • If the forecast is 100% (P = 1) and it rains, then the Brier Score is 0, the best score achievable.
  • If the forecast is 100% and it does not rain, then the Brier Score is 1, the worst score achievable.
  • If the forecast is 70% (P=0.70) and it rains, then the Brier Score is (0.70-1)² = 0.09.
  • If the forecast is 30% (P=0.30) and it rains, then the Brier Score is (0.30-1)² = 0.49.
  • If the forecast is 50% (P=0.50), then the Brier score is (0.50-1)² = (0.50-0)² = 0.25, regardless of whether it rains.

Sticking with a maximally uninformed prior (50/50) always gives the same score. To improve on it, a model has both to get it right and to assign useful probabilities. Notice that even a perfect model will be wrong a lot of the time. For example, an event that is correctly predicted to have a 15% chance will still happen… 15% of the time. The quality of the model can thus only be assessed over many predictions.
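The whole scoring rule fits in a few lines; a sketch that reproduces the rain examples above:

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

print(brier_score([1.0], [1]))   # perfect, confident forecast: 0.0
print(brier_score([0.7], [1]))   # right direction, hedged: ~0.09
print(brier_score([0.5], [0]))   # coin flip: 0.25 either way
```

Averaged over a forecaster’s whole track record, the score punishes both timidity (never straying from 50%) and overconfidence (assigning 98% to things that fail 1 time in 5).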