The Grand Locus / Life for statistical sciences

the Blog

## Does the Earth have a mind?

The life of Jean-Dominique Bauby took a dramatic turn on Friday 8 December 1995. On that day, he had a cerebrovascular accident that left him in a coma. When he woke up twenty days later, his body did not respond: his brainstem was damaged beyond repair. Able to move only his left eyelid, Jean-Dominique patiently wrote a whole book, The Diving Bell and the Butterfly, in which he describes his experience.

In the past, it was known as a “massive stroke,” and you simply died. But improved resuscitation techniques have now prolonged and refined the agony. You survive, but you survive with what is so aptly known as “locked-in syndrome.” Paralyzed from head to toe, the patient, his mind intact, is imprisoned inside his own body, unable to speak or move. In my case, blinking my left eyelid is my only means of communication.

Up until his death, three days after the publication of the book, there was no doubt for anyone that Jean-Dominique had retained every aspect of his consciousness. There was no doubt that this motionless mass was genuinely conscious.

### The Gaia hypothesis

But how do we know that someone is conscious, what...

## Is there a gene for alcoholism? (2)

In the post Is there a gene for alcoholism? I explained how claims to discover the gene for such and such complex behavior (mostly alcoholism and homosexuality) are based on correlations that are never confirmed by experimentation. We will have to wait until neurogenetics comes of age before we can seriously tackle this kind of question. But when that happens, how likely is it that we really discover a gene for alcoholism?

To get my point across, I will have to say a few words about the problem of missing heritability. According to current estimates, the human genome consists of ~ 25,000 protein-coding genes and about as many non protein-coding RNAs, the function of which still remains to be established. The implicit meaning of "gene for alcoholism" is actually a mutation that would somehow affect one of these ~ 50,000 functional entities.

"Mutation" is somewhat inaccurate in this context, as we should speak of polymorphism. A piece of our genome is monomorphic if everybody has exactly the same sequence; otherwise, it is polymorphic. The vast majority of polymorphic sequences in humans are SNPs (single-nucleotide polymorphisms), i.e. sequences that differ by only one nucleotide...

## Lessons from Intelligent Design

The first time I heard a friend of mine, a clever guy, claim that he does not believe in Evolution, I was speechless. Over the years I realized something important: he is not alone. As many as 40% of US citizens believe in strict creationism, while only ~ 15% believe in Evolution (source: Gallup polls).

The latest incarnation of creationism, Intelligent Design, received some media attention during the trial of the Dover Area School District. Following the announcement that Intelligent Design would be taught side by side with Evolution in biology classes, a group of parents sued the public school district and finally convinced the court that this constitutes a breach of the First Amendment of the Constitution.

In essence, Intelligent Design claims to build on scientific observations. The rationale of the argument is that biological organisms, human beings in particular, are too complex to be the product of chance. They are designed. And if there is a design, there must be a designer.

### Crimestop

If you have read George Orwell's novel Nineteen Eighty-Four, you will perhaps remember the fictional language Newspeak. By removing words from the English vocabulary, the powers that be gradually enclose the...

## Let’s Disqus

Today I opened a Disqus forum on the blog. You can find the discussion threads at the end of every post. I also have added the forum to the previous posts, so that you can retroactively express your opinion.

I started writing about 6 months ago, and I could have done this earlier (The Grand Locus is a fork of Nick Johnson's Bloggart, which includes support for Disqus), but I must confess that my tolerance for trolls is very low (there, that's one on the left). I hate having to search for information in the middle of personal insults on the forums.

But then I started to be active on Cross Validated, which is Stack Overflow's statistics spin-off. And it was a double surprise. I realized that Internet communities are not intrinsically dysfunctional, and I also realized that extremely competent people, who deserve academic respect, are part of these communities. And the awesome news is that some of them read this blog!

Anyway, the forum is now open and here are a couple of guidelines in case you don't feel like fooling around and trying stuff.

• To write or reply...

## Focus on: multiple testing

With this post I inaugurate the focus on series, where I go much more in depth than usual. I could as well have called it the gory details, but focus on sounds more elegant. You might be in for a shock if you are more into easy reading, so the focus on is also here as a warning sign so that you can skip the post altogether if you are not interested in the detail. For those who are, this way please...

In my previous post I described the multiple testing problem. Every null hypothesis, true or false, has at least a 5% chance of being rejected (assuming you work at the 95% confidence level). By testing the same hypothesis several times, you increase the chances that it will be rejected at least once, which introduces a bias because this one time is much more likely to be noticed, and then published. However, being aware of the illusion does not dissipate it. For this you need insight and statistical tools.
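To put a number on "at least once", here is a minimal sketch under two assumptions that the paragraph above does not state: the null hypothesis is true, and the tests are independent. In that case the probability of at least one rejection among $n$ tests at level $\alpha$ is $1 - (1 - \alpha)^n$. The function name `prob_at_least_one_rejection` is made up for the illustration.

```python
def prob_at_least_one_rejection(n_tests, alpha=0.05):
    """Chance that at least one of n_tests rejects a true null hypothesis.

    Assumes the tests are independent and each one rejects with
    probability alpha (the significance level).
    """
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    # Probability that no test rejects is (1 - alpha) ** n_tests.
    return 1.0 - (1.0 - alpha) ** n_tests


# How fast the chance of a spurious rejection grows with the number of tests.
for n in (1, 5, 14, 20, 100):
    print(f"{n:>3} tests: {prob_at_least_one_rejection(n):.3f}")
```

Under these assumptions, a spurious rejection becomes more likely than not from 14 tests onward. With correlated tests the number would differ, so this is only an order of magnitude.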

### Fail-safe $n$ to measure publication bias

Suppose $n$ independent research teams test the same null hypothesis, which happens to be true — so not interesting. This means that the...

## The most dangerous number

I have always been amazed by faith in statistics. The research community itself shakes in awe before the totems of statistics. One of its most powerful idols is the 5% level of significance. I never knew how it could reach such a level of universality, but I can venture a hypothesis. The first statistical tests, such as Student's t test, were compiled in statistical tables that gave reference values for only a few levels of significance, typically 0.05, 0.01 and 0.001. This gave huge leverage to editors and especially peer-reviewers (famous for their abusive comments) to reject a scientific work on the ground that it is not even substantiated by the weakest level of significance available. The generation of scientists persecuted for showing p-values equal to 0.06 learned this bitter lesson, and taught it back when they came to the position of reviewer. It then took very little to transform a social punishment into the established truth that 0.06 is simply not significant.

And frankly, I think it was a good thing to enforce a minimum level of statistical reliability. The part I disagree with is the converse statement...

## Is there a gene for alcoholism? (1)

This is usually the next thing I hear when I say that I am a geneticist. Behind this question and its variants lies a profound and natural interrogation, which could be phrased as "how much of me is the product of my genes?" I made a habit of not answering that question, but instead highlighting its inanity by lecturing people about genetics. So, for once, and exclusively on my blog, here is the tl;dr answer: no, there is not. Now comes the lecture about genetics.

I will start with mental retardation (unrelated to my opinion of those claims, really) and more precisely with the fragile X syndrome. James Watson, the co-discoverer of the structure of DNA and the pioneer of the Human Genome Project, declared:

I think it was the first triumph of the Human Genome Project. With fragile X we've got just one protein missing, so it's a simple problem. So, you know, if I were going to work on something with the thought that I were going to solve it, oh boy, I'd work on fragile X.

In other words, there seems to be a gene for mental retardation. The incidence...

## The chaos and the doubt

Probability is said to be born of the correspondence between Pierre de Fermat and Blaise Pascal, some time in the middle of the 17th century. Somewhat surprisingly, many texts retrace the history of the concept up until the 20th century; yet it has gone through major transformations since then. Probability always describes what we don't know about the world, but the focus has shifted from the world to what we don't know.

Henri Poincaré investigates in Science et Méthode (1908) why chance would ever happen in a deterministic world. Like most of his contemporaries, Poincaré believed in absolute determinism: there is no phenomenon without a cause, even though our limited minds may fail to understand or see it. He distinguishes two flavors of randomness, of which he gives examples.

If a cone stands on its point we know that it will fall but we do not know which way (...) A very small cause, which escapes us, determines a considerable effect that we can not but see, and then we say that this effect is due to chance.

And a little bit later he continues.

How do we represent a container filled with gas? Countless molecules...

## The autistic computer

I was the shadow of the waxwing slain
By the false azure in the windowpane

What did Vladimir Nabokov see in the first verses of Pale Fire? Was it "weathered wood" or "polished ebony"? As a synesthete, his perception of words, letters and numbers was always tainted with a certain color. Synesthesia, the leak of a sensation into another, is a relatively rare condition. It was known to be more frequent among artists, such as the composer Alexander Scriabin or the painter David Hockney, but it turns out that it might also be frequent among autists. This might even be the reason that some of them have a savant syndrome (a phenomenon first popularized by the movie Rain Man).

One of those autistic savants, Daniel Tammet, explains in the video below how he sees the world and how this allows him to carry out extraordinary intellectual tasks.

In his talk, Daniel Tammet explains how he performs a multiplication by analogical thinking. Because he sees a pattern in the numbers, he gives the problem another interpretation, another meaning, where the solution is effortless. This would happen at the level of the semantic representation (i.e. when the brain deciphers...

## The geometry of style

This is it! I have been preparing this post for a very long time and I will finally tell you what is so special about IMDB user 2467618, also known as planktonrules. But first, let me take you back where we left off in this post series on IMDB reviews.

In the first post I analyzed the style of IMDB reviews to learn which features best predict the grade given to a movie (a kind of analysis known as feature extraction). Surprisingly, the punctuation and the length of the review are more informative than the vocabulary. Reviews that give a medium mark (i.e. around 5/10) are longer and thus contain more full stops and commas.

Why would reviewers spend more time on a movie rated 5/10 than on a movie rated 10/10? There are at least two possibilities, which are not mutually exclusive. Perhaps the absence of a strong emotional response (good or bad) makes the reviewer more descriptive. Alternatively, the reviewers who give extreme marks may not be the same as those who give medium marks. The underlying question is how much does the style of a single reviewer change with his/her...
