Margin of Error – Mostly Error

February 14, 2016
Posted by Jay Livingston

It’s the sort of social “science” I’d expect from Fox, not Vox. But today, Valentine’s Day, Vox (here) posted this map purporting to show the average amount people in each state spent on Valentine’s Day.


“What’s with North Dakota spending $108 on average, but South Dakota spending just $36?” asks Vox. The answer is almost surely: Error.

The sample size was 3,121. If they sampled each state in proportion to its share of the US population, the sample in each Dakota would be about n = 8 (I originally wrote n = 80). The source of the data, Finder, does not report any margins of error or standard deviations, so we can’t know. Possibly, a couple of guys in North Dakota who’d saved their oil-boom money and spent it on chocolates are responsible for that average. Idaho, Nevada, and Kansas – the only other states over the $100 mark – are also small-n. So are the states at the other end, the supposedly low-spending states (SD, WY, VT, NH, ME, etc.). So we can’t trust these numbers.
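For anyone who wants to check the arithmetic, here is a minimal sketch of the proportional-allocation calculation. The total of 3,121 comes from the survey; the population figures are rough mid-2010s numbers that I am assuming for illustration, and the function name is my own. Nothing here reflects how Finder actually drew its sample.

```python
SAMPLE_SIZE = 3121  # total respondents reported for the survey

# Rough mid-2010s populations in millions. These are assumed,
# approximate figures for illustration, not part of the survey data.
US_POPULATION = 321.0
STATE_POPULATION = {
    "ND": 0.76,
    "SD": 0.86,
    "CA": 39.0,
    "TX": 27.5,
    "NY": 19.8,
}


def proportional_n(state_pop, total_pop=US_POPULATION, sample=SAMPLE_SIZE):
    """Expected respondents from a state if sampling is proportional to population."""
    return sample * state_pop / total_pop


for state, pop in STATE_POPULATION.items():
    print(f"{state}: n is roughly {proportional_n(pop):.0f}")
```

Under these assumptions each Dakota comes out in the single digits, while only the very largest states get into the hundreds.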

The sample in the states with large populations (NY, CA, TX, etc.) might have been as high as 300-400, possibly enough to make legitimate comparisons, but the differences among them are small – less than $20.
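Why does n matter so much? The standard error of a mean shrinks with the square root of the sample size. The sketch below compares a Dakota-sized sample (n = 8) with a big-state sample (n = 350, the middle of my 300-400 guess). It assumes that individual spending is equally spread out in both states, which is purely an assumption since Finder reports no standard deviations. The function name is hypothetical.

```python
import math


def relative_standard_error(n_small, n_large):
    """Ratio of the standard error of a mean at n_small to that at n_large.

    Assumes the same standard deviation of individual spending in both
    groups, so the ratio depends only on the two sample sizes.
    """
    return math.sqrt(n_large / n_small)


# n = 8 (roughly a Dakota) versus n = 350 (a large state)
ratio = relative_standard_error(8, 350)
print(f"Standard error is about {ratio:.1f} times larger at n = 8")
```

So even under the generous assumption of equal spread, a small-state average is several times noisier than a big-state one.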

My consultant on this matter, Dan Cassino (he does a lot of serious polling), confirmed my own suspicions. “The study is complete bullshit.”

UPDATE February 24, 2016: Andrew Gelman (here) downloaded the data and did a far more thorough analysis, estimating the variation for each state. His graph of the states shows that even between the state with the highest mean and the state with the lowest, the uncertainty is too great to allow for any conclusions: “Soooo . . . we got nuthin’.”

Andrew explains why it’s worthwhile to do a serious analysis even on frivolous data like this Valentine-spending survey. He also corrects my order-of-magnitude overestimation of the North Dakota sample size. 
