The Grand Locus / Life for statistical sciences

the Blog

## Let's Disqus

Today I opened a Disqus forum on the blog. You can find the discussion threads at the end of every post. I have also added the forum to the previous posts, so that you can retroactively express your opinion.

I started writing about six months ago, and I could have done this earlier (The Grand Locus is a fork of Nick Johnson's Bloggart, which includes support for Disqus), but I must confess that my tolerance for trolls is very low (there, that's one on the left). I hate having to search for information in the middle of personal insults on forums.

But then I became active on Cross Validated, Stack Overflow's statistics spin-off, and it was a double surprise. I realized that Internet communities are not intrinsically dysfunctional, and that extremely competent people, who deserve academic respect, are part of these communities. And the awesome news is that some of them read this blog!

Anyway, the forum is now open, and here are a couple of guidelines in case you don't feel like fooling around and trying stuff.

• To write or reply...

## Focus on: multiple testing

With this post I inaugurate the focus on series, where I go into much more depth than usual. I could just as well have called it the gory details, but focus on sounds more elegant. You might be in for a shock if you are more into easy reading, so the focus on label is also here as a warning sign, so that you can skip the post altogether if you are not interested in the details. For those who are, this way please...

In my previous post I exposed the multiple testing problem. Every null hypothesis, true or false, has at least a 5% chance of being rejected (assuming you work at the 95% confidence level). By testing the same hypothesis several times, you increase the chances that it will be rejected at least once (with $n$ independent tests, this probability climbs to $1 - 0.95^n$), which introduces a bias because that one rejection is much more likely to be noticed, and then published. However, being aware of the illusion does not dissipate it. For this you need insight and statistical tools.
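To make this concrete, here is a quick back-of-the-envelope calculation (my own illustration, not from the post): if a true null hypothesis is tested independently $n$ times at the 5% level, the chance that it is rejected at least once is $1 - 0.95^n$, which grows quickly with $n$.

```python
# Probability that a TRUE null hypothesis is rejected at least once
# when tested independently n times at significance level alpha.
# P(at least one rejection) = 1 - (1 - alpha)^n

alpha = 0.05

for n in (1, 5, 10, 20, 50):
    p_at_least_one = 1 - (1 - alpha) ** n
    print(f"n = {n:2d} tests: P(at least one rejection) = {p_at_least_one:.3f}")
```

With as few as 20 independent tests, the true null has better than even odds of being rejected at least once, which is exactly why the one "significant" result among many attempts is so misleading.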

### Fail-safe $n$ to measure publication bias

Suppose $n$ independent research teams test the same null hypothesis, which happens to be true — so not interesting. This means that the...

## The most dangerous number

I have always been amazed by faith in statistics. The research community itself shakes in awe before the totems of statistics. One of its most powerful idols is the 5% level of significance. I never knew how it attained such universality, but I can venture a hypothesis. The first statistical tests, such as Student's t test, were compiled in statistical tables that gave reference values for only a few levels of significance, typically 0.05, 0.01 and 0.001. This gave huge leverage to editors, and especially to peer reviewers (famous for their abusive comments), to reject a scientific work on the grounds that it is not substantiated even by the weakest level of significance available. The generation of scientists persecuted for showing p-values equal to 0.06 learned this bitter lesson, and taught it back when they rose to the position of reviewer. It then took very little to transform a social punishment into the established truth that 0.06 is simply not significant.

And frankly, I think it was a good thing to enforce a minimum level of statistical reliability. The part I disagree with is the converse statement...