Like many other academic journals, Molecular and Cellular Biology takes copyrights very seriously. And to trace the criminals who share scientific publications funded by public institutions, they add to the margin of the pdf reprints downloaded from their website the date and the identity of the license owner.
I recently heard that some people downloaded and installed the pdf toolkit pdftk and issued a command like the one below at the Linux terminal, replacing article.pdf with the name of the pdf they had downloaded.
pdftk article.pdf output uncompressed-article.pdf uncompress
Using their text editor, they opened the uncompressed pdf file and looked for lines like the ones below and commented them out with a % sign (or even deleted them, just in case).
10 0 0 10 0 0 cm BT
/R19 11 Tf
0 -1 1 0 579.5 456.847 Tm
[( on some day by Institution of the Evil Person)556]TJ
-94.148 0 Td
-89.2543 0 Td
[(Downloaded from )278]TJ
They then ran pdftk again to fix the pdf document, and the download information was gone.
pdftk uncompressed-article.pdf output...
In the first post of this series on genetics and racism, I explained how Richard Lewontin concluded from his work on human diversity that human races are of no value for taxonomy (the classification of living beings). This view was later criticized and even termed Lewontin's fallacy by A. W. F. Edwards. Yet, nobody ever doubted that Lewontin was honest in his approach. But more recently came another case that gives the shivers. The great Stephen Jay Gould, the author of the acclaimed Mismeasure of Man, was accused of data manipulation.
The mismeasure of Gould
Stephen Jay Gould was the kind of scientist who pops up everywhere. I discovered him in a comment about the opinion of the Vatican on Evolution, others knew him for his statistical analyses of baseball records, while he was actually a paleontologist, author of the theory of punctuated equilibrium. But his most famous work is undoubtedly The Mismeasure of Man.
Like the author, the book is a strange chimera, somewhere in between scientific research and history, with a touch of lyricism. The Mismeasure of Man is a journey through the differences between people, or more precisely through the scientific discourse over this...
Important note: Please read the Erratum at the end of the post.
It is 1879. Leo Tolstoy, then rich and famous for War and Peace and Anna Karenina, works on another kind of text. In A Confession he explains at length that he regrets writing those novels. The focus of his remorse and his anger towards himself is the heart of his talent, this innate sense of human nature. Tolstoy's pen had no equal when it came to painting the Russian society of the time, its characters and its culture. However, he explains that this attitude towards writing is wrong, because he has been telling without preaching, he has been describing without judging. He will even abandon the royalties of War and Peace and Anna Karenina, refusing to earn money from such immoral writings.
We were all then convinced that it was necessary for us to speak, write, and print as quickly as possible and as much as possible, and that it was all wanted for the good of humanity. And thousands of us, contradicting and abusing one another, all printed and wrote — teaching others. And without noticing that we knew nothing, and that...
What if I told you to choose a card at random? Simply choose one from a standard deck of 52 cards, think of a card, do not draw one from a real deck... Make sure you have one in mind before you read on.
Assuming that you have a card in mind, I bet you chose the Ace of Spades. Of course, I don't know the card you were thinking of, but that is the one I bet on. Every textbook on probability says that I have a 1/52 chance of having guessed right. Actually, that is not quite true... About one out of 4 readers will choose the Ace of Spades, and one out of seven will choose the Queen of Hearts, as shown by a study by Jay Olson, a researcher in the psychology of magic, and his collaborators.
In this experiment, is the choice of a card purely random? And what does random mean anyway? Even if purely random is not properly defined, most would agree that it means no information at all, or completely unpredictable. The outcome of the experiment above is clearly not what you would call purely random. But...
The life of Jean-Dominique Bauby took a dramatic turn on Friday 8 December 1995. On that day, he had a cerebrovascular accident that left him in a coma. When he woke up twenty days later, his body did not respond: his brainstem was damaged beyond repair. Able to move only his left eyelid, Jean-Dominique patiently wrote a whole book, The Diving Bell and the Butterfly, in which he describes his experience.
In the past, it was known as a “massive stroke,” and you simply died. But improved resuscitation techniques have now prolonged and refined the agony. You survive, but you survive with what is so aptly known as “locked-in syndrome.” Paralyzed from head to toe, the patient, his mind intact, is imprisoned inside his own body, unable to speak or move. In my case, blinking my left eyelid is my only means of communication.
Up until his death, three days after the publication of the book, there was no doubt for anyone that Jean-Dominique had retained every aspect of his consciousness. There was no doubt that this motionless mass was genuinely conscious.
The Gaia hypothesis
But how do we know that someone is conscious, what...
In the post Is there a gene for alcoholism? I explained how claims to discover the gene for such and such complex behavior (mostly alcoholism and homosexuality) are based on correlations that are never confirmed by experimentation. We will have to wait until neurogenetics comes of age before we can seriously tackle this kind of question. But when that happens, how likely is it that we really discover a gene for alcoholism?
To get my point across, I will have to say a few words about the problem of missing heritability. According to current estimates, the human genome consists of ~ 25,000 protein-coding genes and about as many non protein-coding RNAs, the function of which still remains to be established. The implicit meaning of "gene for alcoholism" is actually a mutation that would somehow affect one of these ~ 50,000 functional entities.
Mutation is somewhat inaccurate in this context as we should speak of polymorphism. A piece of our genome is monomorphic if everybody has exactly the same sequence, otherwise, it is polymorphic. The vast majority of polymorphic sequences in humans are SNPs (single-nucleotide polymorphisms), i.e. sequences that differ by only one nucleotide...
The first time I heard a friend of mine, a clever guy, claim that he does not believe in Evolution, I was speechless. Over the years I realized something important: he is not alone. As many as 40% of US citizens believe in strict creationism, while only ~ 15% believe in Evolution (source: Gallup polls).
The latest incarnation of creationism, Intelligent Design, received some media attention during the trial of the Dover Area School District. Following the announcement that Intelligent Design would be taught side by side with Evolution in biology classes, a group of parents sued the public school district and finally convinced the judges that this constitutes a breach of the First Amendment of the Constitution.
In essence, Intelligent Design claims to build on scientific observations. The rationale of the argument is that biological organisms, human beings in particular, are too complex to be the product of chance. They are designed. And if there is a design, there must be a designer.
If you have read George Orwell's novel Nineteen Eighty-Four, you will perhaps remember the fictional language Newspeak. By removing words from the English vocabulary, the powers that be gradually enclose the...
Today I opened a Disqus forum on the blog. You can find the discussion threads at the end of every post. I also have added the forum to the previous posts, so that you can retroactively express your opinion.
It has been about 6 months since I started writing, and I could have done this earlier (The Grand Locus is a fork of Nick Johnson's Bloggart, which includes support for Disqus), but I must confess that my tolerance for trolls is very low (there, that's one on the left). I hate having to search for information in the middle of personal insults on the forums.
But then I started to be active on Cross Validated which is Stack Overflow's statistics spin-off. And it was a double surprise. I realized that Internet communities are not intrinsically dysfunctional, and I also realized that people of an extreme competence, who deserve academic respect, are part of these communities. And the awesome news is that some of them read this blog!
Anyway, the forum is now open and here are a couple of guidelines in case you don't feel like fooling around and trying stuff.
- To write or reply...
With this post I inaugurate the focus on series, where I go much more in depth than usual. I could as well have called it the gory details, but focus on sounds more elegant. You might be in for a shock if you are more into easy reading, so the focus on is also here as a warning sign so that you can skip the post altogether if you are not interested in the detail. For those who are, this way please...
In my previous post I exposed the multiple testing problem. Every null hypothesis, true or false, has at least a 5% chance of being rejected (assuming you work at 95% confidence level). By testing the same hypothesis several times, you increase the chances that it will be rejected at least once, which introduces a bias because this one time is much more likely to be noticed, and then published. However, being aware of the illusion does not dissipate it. For this you need insight and statistical tools.
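To put a number on "at least once", here is a minimal sketch in Python. It assumes that the null hypothesis is true, that the tests are independent, and that each one rejects with probability exactly 5%; the function name and the values of n are only for illustration.

```python
def prob_at_least_one_rejection(n_tests, alpha=0.05):
    """Chance that a true null hypothesis is rejected at least once
    in n_tests independent tests, each run at significance level alpha."""
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    # Each test fails to reject with probability 1 - alpha, so all
    # n_tests fail to reject with probability (1 - alpha) ** n_tests.
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    for n in (1, 5, 10, 20, 50):
        p = prob_at_least_one_rejection(n)
        print(f"{n:3d} tests: P(at least one rejection) = {p:.3f}")
```

Under these assumptions, the chance of at least one rejection grows quickly with the number of tests, and this one rejection is the result most likely to be noticed and published.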
Fail-safe $(n)$ to measure publication bias
Suppose $(n)$ independent research teams test the same null hypothesis, which happens to be true — so not interesting. This means that the...
I have always been amazed by faith in statistics. The research community itself shakes in awe before the totems of statistics. One of its most powerful idols is the 5% level of significance. I never knew how it could reach such a level of universality, but I can venture a hypothesis. The first statistical tests, such as Student's t test, were compiled in statistical tables that gave reference values for only a few levels of significance, typically 0.05, 0.01 and 0.001. This gave huge leverage to editors and especially peer-reviewers (famous for their abusive comments) to reject a scientific work on the grounds that it is not even substantiated by the weakest level of significance available. The generation of scientists persecuted for showing p-values equal to 0.06 learned this bitter lesson, and taught it back when they came to the position of reviewer. It then took very little to transform a social punishment into the established truth that 0.06 is simply not significant.
And frankly, I think it was a good thing to enforce a minimum level of statistical reliability. The part I disagree with is the converse statement...