I got sucked into a debate about variance and distributions today, with a PhD in engineering, somebody much more numerate than me. At the end, we agreed on the practical. Indeed, at the bottom of this post, I am going to torture him until he supports my point.

I don’t want to tweet on this anymore because I have already stunk Twitter up badly enough for today. But let me give a personal anecdote about why I insist that references to variance or sigmas require an explicit sense of the distribution involved. Without that reference they are meaningless.

This anecdote proves I am stupid, but that at least I do not forget. I remember in high school statistics that they made me square the deviations before adding them up. And they made me do this before telling me anything about probability distributions – which I insist now was backwards pedagogically and just confusing in real time.

Why are you forcing me to square this shit? Who died and made the second power king? Why not cubed? Or just average the absolute deviations? It is not obvious, right?

The reason, as I discovered AFTER grad school, is that the variance calculation arises naturally in the context of the Normal distribution. Here is how Wiki, surely an authority, puts it! The money line is the last bit.

It is not that some guy came up with squaring the deviations based on intuition and then said, *ok, now what are we going to do with this?* Logically, the distribution is prior to the formula, I think — and I think I hear Wiki agreeing.
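One way to make "the distribution is prior to the formula" concrete is to start from the Normal density and ask which mean and spread best fit a sample. This is a sketch of the standard maximum-likelihood argument, not a claim about how the formula was historically arrived at; the symbols $x_i$, $n$, $\mu$ and $\sigma^2$ are notation introduced here for illustration.

```latex
% Normal density with mean \mu and variance \sigma^2
f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}
    \exp\!\left( -\frac{(x-\mu)^2}{2\sigma^2} \right)

% Log-likelihood of an independent sample x_1, \dots, x_n
\log L(\mu, \sigma^2) = -\frac{n}{2}\log\!\left(2\pi\sigma^2\right)
    - \frac{1}{2\sigma^2}\sum_{i=1}^{n} (x_i-\mu)^2

% Setting the derivatives with respect to \mu and \sigma^2 to zero gives
\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i ,
\qquad
\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i-\hat{\mu})^2
```

The squared deviations come straight out of the exponent of the density. Run the same exercise on a Laplace (double-exponential) density, whose exponent is $-|x-\mu|/b$, and the best-fit spread turns out to be the average *absolute* deviation instead. Different distribution, different formula.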

Anybody can calculate a formula. For example, consider the dumb **Taylor Rule**. My problem with it is not that it does not exist or cannot be calculated, but that it is stupid to bother calculating it near the zero bound.

The question, generally, is whether the formula *helps* you with anything. And the formula for variance is helpful only in the context of a distribution. I am not saying you can’t calculate it while being utterly in the dark. Clearly, you can, as we see every day. That is what provoked me in the first place.*

By the way, here is Brian Romanchuk on the same issue from an engineering perspective.

Squaring things and then taking the square root has useful geometric properties (it’s like taking a straight line) that carries through to higher dimensional spaces. Pretty much all work in controls uses signals in a “2-norm” space; the analysis does not work in a 1-norm (taking the sum of absolute values). From a physics standpoint, the 2-norm is often related to energy content, which is why physical systems are best analysed using it.

This is a bit beyond me, but it fits my point, so I will use it. The *formula* is determined by the *purpose*.
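For what the "2-norm" versus "1-norm" distinction looks like with actual numbers, here is a minimal Python sketch. The data are made up for illustration, and the variable names are mine rather than anything from Romanchuk's post.

```python
import math
import statistics

# Made-up sample, purely for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)
mean = statistics.fmean(data)
devs = [x - mean for x in data]

# 2-norm: the straight-line (Euclidean) length of the deviation vector.
two_norm = math.sqrt(sum(d * d for d in devs))

# 1-norm: the sum of absolute deviations.
one_norm = sum(abs(d) for d in devs)

# Population standard deviation is the 2-norm scaled by sqrt(n).
std_from_norm = two_norm / math.sqrt(n)

# Mean absolute deviation is the 1-norm scaled by n.
mean_abs_dev = one_norm / n

print("std via 2-norm:    ", std_from_norm)
print("statistics.pstdev: ", statistics.pstdev(data))
print("mean abs deviation:", mean_abs_dev)
```

The first two printed numbers agree up to floating-point rounding, which is just the statement that standard deviation is a rescaled straight-line distance. The third is a different measure of spread built on a different norm.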

Also, economics is just physics for dummies. But that is separate.

* *The most provocative abuse of standard deviation I have yet encountered happened just after the Swiss lost their ability to hold down their currency against the euro. The subsequent move in euro-swiss was said to be a 14 sigma event, something that would happen maybe once every two existences of the known universe to date. I figured the euro-swiss could go, because I did not buy the idea that central banks have “unlimited balance sheet power.” [The central bank was on the hook for a large bet that they had estimated the equilibrium real exchange rate correctly, and no amount of funny money could make that bet dissolve, while allowing the Swiss to retain an opinion — any opinion — on their own inflation rate.] For me, that is a recurring theme, for sure. But I don’t presume to be able to just intuit up 14 sigma events. I didn’t even get 2008.*
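To see how much a "14 sigma" label leans on the assumed distribution, here is a minimal Python sketch comparing the chance of a k-sigma move under a Normal distribution with the same move under a Laplace (double-exponential) distribution that has the same standard deviation. The Laplace is picked only because its tail has a simple closed form; it is an assumption for illustration, not a claim about how euro-swiss actually behaves. The function names are hypothetical.

```python
import math


def normal_two_sided_tail(k):
    """P(|X - mean| > k standard deviations) under a Normal distribution."""
    return math.erfc(k / math.sqrt(2.0))


def laplace_two_sided_tail(k):
    """The same probability under a Laplace distribution.

    With scale b the standard deviation is b * sqrt(2), so a k-sigma move
    is k * sqrt(2) scale units, and the two-sided tail is exp(-k * sqrt(2)).
    """
    return math.exp(-k * math.sqrt(2.0))


for k in (2, 5, 14):
    p_norm = normal_two_sided_tail(k)
    p_lap = laplace_two_sided_tail(k)
    print(f"{k:>2} sigma: normal {p_norm:.3g}, laplace {p_lap:.3g}")
```

At 2 sigma the two answers are in the same ballpark. At 14 sigma they differ by dozens of orders of magnitude, even though both distributions have exactly the same standard deviation. The sigma count alone tells you nothing about rarity until you say which distribution you have in mind.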