How statistics can confuse investors

Quoting Benjamin Disraeli, Mark Twain famously quipped, “There are three kinds of lies: lies, damned lies, and statistics.” In the field of investments, in which we rely heavily on statistical analysis to evaluate the merits of investment strategies and products, Twain’s point is all too relevant.

Correlation Is Not Causation

Correlation is a statistic that can easily mislead, starting with its definition. How many times have we heard that correlation measures the tendency for two variables to move up and down together? That’s not quite right. What correlation actually measures is the degree to which the deviations of two variables from their own averages are linearly related.
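To make the definition concrete, here is a minimal sketch of the standard Pearson correlation, written out from deviations around each series’ own average. The function name and the two return series are hypothetical, chosen only for illustration: both assets go up every year, yet they are perfectly negatively correlated, because one is above its average exactly when the other is below.

```python
import math

def pearson_correlation(xs, ys):
    """Pearson correlation built from each series' deviations around its own mean."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    # Sum of products of paired deviations: positive when both series tend to
    # sit on the same side of their own averages at the same time.
    co_movement = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    scale = math.sqrt(sum(dx * dx for dx in dev_x) * sum(dy * dy for dy in dev_y))
    if scale == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return co_movement / scale

# Hypothetical annual returns in percent: both assets rise every single year.
asset_a = [1, 3, 1, 3]
asset_b = [3, 1, 3, 1]
correlation = pearson_correlation(asset_a, asset_b)
print(correlation)
```

Both series “move up together” in the everyday sense, but the computed correlation is -1, which is why the popular definition falls short.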

The other major mistake often made with respect to correlation is causation. Seeing that two variables are statistically related, we too easily jump to the conclusion that there is a causal relationship between them. But correlation and causation are two very different things.

Academic David Leinweber drove this point home in a paper in the mid-1990s that showed that there was a very high correlation between the annual level of the S&P 500 and the annual production of butter in Bangladesh. The author presented the results for the period 1981 through 1993 and found that the correlation over this period was about 87%.

I wanted to see if I could find a similar correlation for the S&P/TSX Composite over a more recent period. It didn’t take me long to discover that for the Canadian stock market, it’s the butter production of Brazil from 1994 through 2017 that does the trick.

I plotted the level of the S&P/TSX Composite for each year as a red circle, along with a blue line showing the level predicted by annual butter production in Brazil. The two appear to be strongly related. As in Leinweber’s example, the correlation is about 87%.

If you are thinking that there must be some trick to finding dairy production numbers that are correlated with stock market indexes, you’re right. The trick is to use trended variables.

Over any period in which two variables are both trending upward, such as a stock market index and production in a growing dairy industry, their levels will tend to be positively correlated, even if there is no causal link between them.
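The effect of a shared upward trend can be illustrated with a small simulation. This is only a sketch under stated assumptions: the series are randomly generated, not actual index or dairy data, and the drift, volatility, and number of trials are arbitrary illustrative choices. Each pair of series is generated independently, so any correlation between their levels comes from the trend alone.

```python
import math
import random

def pearson_correlation(xs, ys):
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return co / math.sqrt(var_x * var_y)

def simulate_levels(rng, years, drift, volatility, start=100.0):
    """Compound a starting level by (drift + random shock) each year."""
    levels = [start]
    for _ in range(years):
        growth = drift + rng.gauss(0.0, volatility)
        levels.append(levels[-1] * (1.0 + growth))
    return levels

def average_level_correlation(drift, trials=500, years=24, volatility=0.05, seed=1):
    """Average correlation between pairs of series that share nothing but a drift."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        series_a = simulate_levels(rng, years, drift, volatility)
        series_b = simulate_levels(rng, years, drift, volatility)
        total += pearson_correlation(series_a, series_b)
    return total / trials

# Assumed parameters: 6% annual drift versus no drift, 5% annual volatility.
with_trend = average_level_correlation(drift=0.06)
without_trend = average_level_correlation(drift=0.0)
print(f"average correlation of levels, both trending: {with_trend:.2f}")
print(f"average correlation of levels, no trend:      {without_trend:.2f}")
```

Under these assumptions, the average correlation should be high when both series drift upward and close to zero when neither does, even though no series has any influence on another.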

The solution to trended variables is to remove the trends. With both stock market indexes and production levels, the natural way to remove the trends is to take the percentage rate of change of each variable. I did that for the annual levels of the S&P/TSX Composite and annual Brazilian butter production.

I then plotted the annual percentage rates of change of both variables. Now we get the expected result: almost no correlation, just an insignificant 5%.
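The same detrending step can be sketched in a few lines. The numbers below are hypothetical growth patterns, not the S&P/TSX Composite or Brazilian butter figures: both series average 6% growth a year, but their year-to-year fluctuations are constructed to be unrelated. The helper names are illustrative.

```python
import math

def pearson_correlation(xs, ys):
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return co / math.sqrt(var_x * var_y)

def compound(start, growth_rates_pct):
    """Turn a list of annual growth rates (in percent) into a series of levels."""
    levels = [start]
    for rate in growth_rates_pct:
        levels.append(levels[-1] * (1.0 + rate / 100.0))
    return levels

def percentage_changes(levels):
    """Year-over-year percentage rate of change of a series of levels."""
    return [(curr / prev - 1.0) * 100.0 for prev, curr in zip(levels, levels[1:])]

# Hypothetical 24-year growth patterns with the same 6% average but
# fluctuations on different cycles, so the fluctuations are unrelated.
index_growth = [8.0, 4.0] * 12
production_growth = [8.0, 8.0, 4.0, 4.0] * 6

index_levels = compound(100.0, index_growth)
production_levels = compound(100.0, production_growth)

level_correlation = pearson_correlation(index_levels, production_levels)
change_correlation = pearson_correlation(
    percentage_changes(index_levels), percentage_changes(production_levels)
)
print(f"correlation of levels:             {level_correlation:.3f}")
print(f"correlation of percentage changes: {change_correlation:.3f}")
```

The levels are almost perfectly correlated because both compound upward, while the percentage changes show essentially zero correlation, which is the true relationship built into the example.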

But even if we have constructed the variables properly, correlation is still not causation. If A and B are correlated, it could be that there is a third variable, C, related to both of them, that we cannot observe.
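A simulated example can show how a hidden third variable produces correlation without any causal link between A and B. This is a sketch on made-up random data: the variable names, sample size, and seed are assumptions for illustration. In real data C is unobserved, so the final step, stripping out C, is possible here only because the simulation generated it.

```python
import math
import random

def pearson_correlation(xs, ys):
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return co / math.sqrt(var_x * var_y)

rng = random.Random(42)
n_observations = 5000

# C drives both A and B; A and B never influence each other.
hidden_c = [rng.gauss(0.0, 1.0) for _ in range(n_observations)]
variable_a = [c + rng.gauss(0.0, 1.0) for c in hidden_c]
variable_b = [c + rng.gauss(0.0, 1.0) for c in hidden_c]

correlation_ab = pearson_correlation(variable_a, variable_b)

# Remove C's contribution (possible only because we simulated it).
a_without_c = [a - c for a, c in zip(variable_a, hidden_c)]
b_without_c = [b - c for b, c in zip(variable_b, hidden_c)]
correlation_after_removing_c = pearson_correlation(a_without_c, b_without_c)

print(f"correlation of A and B:              {correlation_ab:.2f}")
print(f"correlation once C is stripped out:  {correlation_after_removing_c:.2f}")
```

With this setup, A and B should show a correlation of roughly 0.5 that all but disappears once C is accounted for.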

When trying to find causation, one must look to economic reasoning, not just statistical links. This is especially important to keep in mind when evaluating quantitative investment strategies, particularly those behind new strategic-beta exchange-traded funds. Any causal explanation must be made apart from the statistics.

Watch Your Back Test

Correlation is not the only statistic in which statistical significance can get confused with causation. A common statistical procedure in investment management is back-testing. Back tests are run because the period of live performance is often quite limited, or non-existent if the strategy has yet to go live. The idea is that if a strategy back-tests well, it should do well in real time.

But this can only be the case if there are causal links between what the strategy does at each point in time and its subsequent performance. There needs to be an economic rationale for the strategy before back-testing it. There are several issues that should be considered when evaluating a back test, especially one involving a factor-based strategy:

Positive Results Bias

As my colleague Ben Johnson says, “There is no such thing as a bad-looking back test.” When we are presented with impressive back-test results, we don’t know how many other strategies or factors were tried that didn’t turn out very well.

Zoo of Factors

There are so many factors to choose from that John Cochrane of the University of Chicago coined the term “zoo of factors.” Given this zoo of hundreds of factors, anyone with the right data set, a computer, and some programming knowledge could back-test any number of factor-based strategies in short order and report only the favourable results. As the late Nobel Prize-winning economist Ronald H. Coase said, “If you torture the data long enough, it will confess.”
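How easily selective reporting produces an impressive back test can be sketched with a simulation. Everything here is an assumption for illustration: the “factors” are pure random noise with a true average return of zero, and the number of factors, the 120-month history, the 3% monthly volatility, and the t-statistic threshold of 2 are arbitrary choices, not anyone’s actual methodology.

```python
import math
import random
import statistics

def t_statistic(returns):
    """Average return divided by its standard error."""
    standard_error = statistics.stdev(returns) / math.sqrt(len(returns))
    return statistics.mean(returns) / standard_error

def backtest_random_factors(n_factors=500, n_months=120, monthly_vol=0.03, seed=3):
    """Back-test factors whose returns are noise with a true mean of zero."""
    rng = random.Random(seed)
    t_stats = []
    for _ in range(n_factors):
        monthly_returns = [rng.gauss(0.0, monthly_vol) for _ in range(n_months)]
        t_stats.append(t_statistic(monthly_returns))
    return t_stats

t_stats = backtest_random_factors()
best_t_stat = max(t_stats)
looks_significant = sum(1 for t in t_stats if t > 2.0)

print(f"factors tested:                   {len(t_stats)}")
print(f"best t-statistic found:           {best_t_stat:.2f}")
print(f"factors with t-statistic above 2: {looks_significant}")
```

None of these simulated factors has any real edge, yet by chance a handful will typically clear the conventional significance bar. Reporting only those would make worthless strategies look like discoveries.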

Simulation, Not Reality

The purpose of a back test is to see how a strategy would have performed in the past. But we can never know for sure how it would have done. Some back tests try to be more realistic by including assumptions such as trading costs, but many do not. No matter what assumptions are made, a back test is a simulation, not a historical fact.

No Controlled Experiments

In the hard sciences, empirical work mainly consists of running controlled experiments in which the effects of factors other than the variable being studied are minimized or eliminated. Unfortunately, economists generally cannot perform controlled experiments. Instead, they do statistical analysis on historical data, with the assumption that the underlying processes that generate the data remained the same over time. This is the assumption of stationarity. Back-testing is an example of this kind of analysis.

Beware Statistics

It is all too easy to calculate nonsensical statistics, such as the correlation between a stock market index and butter production. Furthermore, statistical analysis, in the absence of economic analysis, can be misused to demonstrate almost anything, as can happen with back tests that have no link to economic reasoning.

Finally, using properly constructed statistics, but in the absence of other relevant facts, can lead to poor decisions, such as recommending a fund to an investor who lacks the patience to hold it through the rough patches. By paying careful attention to these issues, we can prevent statistics from being the third kind of lie.