The Grand Locus / Life for statistical sciences


## Fisher information (with a cat)

It is still summer but the days are getting shorter (p < 0.05). Edgar and Sofia are playing chess while Immanuel purrs on a sofa next to them. Edgar has been holding his head for a while, thinking about his next move. Sofia starts:

“Something bothers me, Immanuel. In the last post, you told us that Fisher information could be defined as a variance, but that is not what I remember from my mathematical statistics classes.”

“What do you remember, Sofia?”

“Our teacher said it was the curvature of the log-likelihood function around the maximum. More specifically, consider a parametric model $f(X;\theta)$ where $X$ is a random variable and $\theta$ is a parameter. Say that the true (but unknown) value of the parameter is $\theta^*$. The first terms of the Taylor expansion of the log-likelihood $\log f(X;\theta)$ around $\theta^*$ are

$$\log f(X;\theta^*) + (\theta - \theta^*) \cdot \frac{\partial}{\partial \theta} \log f(X;\theta^*) + \frac{1}{2}(\theta - \theta^*)^2 \cdot \frac{\partial^2}{\partial \theta^2} \log f(X;\theta^*).$$

Now compute the expected value and obtain the approximation below. We call it $\varphi(\theta)$ to emphasize that it is...
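The two definitions that Sofia contrasts, a variance and a curvature, can be compared numerically. Here is a minimal sketch, assuming a Poisson model with mean $\theta^*$; the model, the function names and the sample size are illustrative assumptions, not taken from the text. It estimates by simulation the variance of the first derivative of the log-likelihood and the negative expected second derivative, both evaluated at $\theta^*$. For a Poisson variable, both should come out close to $1/\theta^*$.

```python
import math
import random
import statistics


def sample_poisson(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def score(x, theta):
    """First derivative in theta of log f(x; theta) = x*log(theta) - theta - log(x!)."""
    return x / theta - 1.0


def curvature(x, theta):
    """Second derivative in theta of the same log-likelihood."""
    return -x / theta**2


def fisher_estimates(theta_star, n_samples=100_000, seed=0):
    """Return (variance of the score, minus the mean curvature), both at theta_star."""
    rng = random.Random(seed)
    xs = [sample_poisson(theta_star, rng) for _ in range(n_samples)]
    var_score = statistics.pvariance([score(x, theta_star) for x in xs])
    neg_mean_curv = -statistics.fmean(curvature(x, theta_star) for x in xs)
    return var_score, neg_mean_curv


# Hypothetical true value theta* = 2, so both estimates should be near 1/2.
var_score, neg_mean_curv = fisher_estimates(theta_star=2.0)
print(var_score, neg_mean_curv, 1 / 2.0)
```

The two printed estimates agree up to simulation noise, which is the sense in which the variance view and the curvature view describe the same quantity in this example.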

## Does science need statistical tests?

Some time ago, my colleague John asked for help with the statistics for one of his manuscripts.

“We have this situation where we knocked out a gene with CRISPR and I want to test if it affects viability. I know that you are supposed to use a non-parametric test when the sample is small, but I have heard that you can still use the t test if the variables are Gaussian. So now I am genuinely confused. Which test should I use?”

“I agree. It’s confusing. Why do you want to run a statistical test, by the way?”

“Same as everyone. I want to know if the effect is significant. Plus, I’m a hundred percent sure that the reviewers will ask for it.”

“I see. I will rephrase my question then. What decision do you have to make?”

“I can give you all the details of our experiments if you want, but I’m surprised. Nobody has ever asked me that before and I thought that experimental details do not really matter so much for a statistical test. So what kind of details do you need?”

“Nothing in particular. I just want to know whether you...

## A gentle introduction to the Cramér-Rao lower bound (with a cat)

It is summer, Edgar and Sofia are comfortably sitting on the terrace, watching the beautiful light of the end of the day. Edgar starts:

“Let’s play a game to see who is the better statistician! Immanuel, my cat, will give each of us a secret number strictly greater than zero. The other person will have to guess it.”

“How are we going to guess?”

“Let’s say that each secret number is the mean of a Poisson variable. We generate samples at random. The one who gets the closest estimate by dinner time wins.”

“That sounds easy! Will Immanuel give us the same number?”

“What is the fun in that? Let’s ask him to give two different numbers. You know what to do. Just give me your first sample whenever you are ready and I will try to guess your secret number.”

Immanuel whispers something in Sofia’s ear and then does the same with Edgar. Sofia opens her laptop and after a few keystrokes she says, “The first number I have for you is 1.”

“OK, I give up. You win.”

Sofia is puzzled at first, but then she notices how Immanuel is rolling...

## Scientific models

Literature discussions were usually very quiet in the laboratory, but somehow, this article had sparked a debate. Linda thought it was very bad. Albert liked it very much. Kate, the PI, was undecided. At some point the discussion stalled, so Kate made a move to wrap up.

“So, Linda, why do you think the article is bad?”

“Because they are missing a thousand controls.”

“I find their model in figure 6 really cool. Actually, if it is true, it…”

“Precisely my point!” interrupted Linda. “It’s pure speculation!”

Kate intervened.

“Albert, you describe figure 6 as a model. What makes it a model?”

Albert spoke after a pause.

“It’s an idealized summary of their findings.”

“Fantasized, you mean!” replied Linda.

Kate ignored the point and turned to Linda.

“Linda, do you think that figure 6 is a model?”

“Of course not! It’s just speculation.”

“Now I have a question for you, Albert: what is the difference between a model and a summary?”

While Albert was thinking, Kate continued.

“And I also have a question for you, Linda: what is the difference between speculation and assumption?”

Now they were...

## I’m the boss!!

“You know how scientists will communicate in the future, don’t you?”

“Of course I do!”

It is a shameless lie. I have no clue what Frédéric has in mind, but I don’t want to look stupid.

“And you, Vincent, I bet you know it too... right?”

On this afternoon of 2004, somewhere on the south coast of Madagascar, Vincent gives one of his majestic puzzled looks. That was exactly what Frédéric had hoped for.

“Well, in the future, scientists will no longer publish in journals. They will have public lab notes. They will post their results on their personal Internet page, day by day... like a blog. Peer scientists will be allowed to leave their comments, criticize the protocols and the results. In short, the information will go directly from the producers to the consumers, and it will spread much better because science will become open source.”

I believed it then.

But now, I have just become an independent researcher. I have my own team. I am the boss!! And I realize that Frédéric was wrong. To stay in research you need a good track record. And as far as track...