In the first days of my PhD, I sincerely believed that there was a chance I would find a cure for cancer. As this possibility became more and more remote, and as it became obvious that my work would not mark a paradigm shift, I became envious of those few people who did change the face of science during their PhD. One of them is Andrey Kolmogorov, whose PhD work was nothing less than the modern theory of probability. His most famous result was the strong law of large numbers, which essentially says that random fluctuations become infinitesimal on average. Simply put, if you flip a fair coin a large number of times, the frequency with which ‘tails’ turns up will be very close to the expected value 1/2.
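To make the coin example concrete, here is a minimal simulation sketch. The function name `tails_frequency`, the fixed seed and the sample sizes are illustrative choices of mine, not anything from Kolmogorov's work; the code only shows the frequency of ‘tails’ settling near 1/2 as the number of flips grows.

```python
import random


def tails_frequency(n_flips, seed=0):
    """Return the fraction of 'tails' in n_flips simulated fair coin flips."""
    # A seeded generator makes the run reproducible (the seed is arbitrary).
    rng = random.Random(seed)
    # Each flip comes up 'tails' with probability 1/2.
    tails = sum(rng.random() < 0.5 for _ in range(n_flips))
    return tails / n_flips


if __name__ == "__main__":
    # The frequency should drift towards 0.5 as the number of flips grows.
    for n in (10, 1_000, 100_000):
        print(n, tails_frequency(n))
```

With only 10 flips the frequency can be far from 1/2; with 100,000 flips it should typically be within a fraction of a percent.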
The chaos of large numbers
Most fascinating about the strong law of large numbers is that it is a theorem, which means that it comes with hypotheses that do not always hold. There are cases where repeating a random experiment a very large number of times does not guarantee that you will get closer to the expected value. I wrote up the gory details on Cross Validated, for those interested...
The most annoying thing about us biologists is that we keep using words that we don’t understand. “Epigenetics” is one of those that has drawn my attention for several years, as I explained in my last post. I suggested that the invasion started in 2001, the year that the histone code hypothesis was proposed by Thomas Jenuwein and David Allis in a seminal paper entitled Translating the Histone Code.
The histone code hypothesis was arguably the most influential concept of the last decade in molecular biology. Yet, most biologists would be hard pressed to say what the hypothesis is. All you have to do is read what Thomas Jenuwein and David Allis actually wrote. But believe it or not, this blog is one of the few places on the Internet where the histone code hypothesis is spelled out clearly. Most sources, including the Wikipedia article, diverge substantially from the original statement, which is the following.
Distinct qualities of higher order chromatin, such as euchromatic or heterochromatic domains, are largely dependent on the local concentration and combination of differentially modified nucleosomes.
DNA in the nucleus comes in a structure called the nucleosome. The picture above is a molecular...
I started to study biology at the time epigenetics became a buzzword. I first heard the term at university in 2001, and like many young enthusiastic people of the time, I did my PhD on epigenetics because it was cool. But buzzes come and go; I finished my PhD and I got bored with epigenetics. Meanwhile, I thought that my interest had been mirroring that of the community, and that the trend was towards a loss of interest in epigenetics. I was about to write a blog post entitled “The death of epigenetics” when I did a quick PubMed search and realized that the peak of popularity was... 2013. Epigenetics is not dead, it is on the rise!
Above is the number* of PubMed hits for “epigenetics” per month since 1996, with “chromatin” shown as a reference for comparison. PubMed now displays a histogram of the occurrence of your search term over the years (check here for epigenetics). The growth is not due to articles published in late-adopting journals, since the trend-setters Cell, Nature and Science published more than half of their papers labelled “epigenetics” in the last three years.
What is epigenetics anyway?
Some of you may remember planktonrules from my series on IMDB reviews. For those of you who missed it, planktonrules is an outlier. In my attempt to understand what IMDB reviewers call a good movie, I realized that one reviewer in particular had written a lot of reviews. When I say a lot, I mean 14,800 in the last 8 years. With such a boon, I could not resist the temptation to use his reviews to analyze the variation of style between users, and to build a classifier that recognizes his reviews.
I finally got in contact with Martin Hafer (planktonrules’ real name) this year, and since he had planned to visit Barcelona, we set up a meeting in June. I have to admit that I expected him to be a sort of weirdo, or a cloistered sociopath. The reality turned out to be much more pleasant; we had an entertaining chat, speaking very little about movie reviews. He also pointed out to me that doing statistics on what people write on the Internet is a bit weird... True that.
Anyway, as an introduction, here is a mini interview of planktonrules. You can find out more...
I recently gave a motivational speech at the CRG/Institut Curie international PhD retreat. There was only one slide and the content was fairly general, so I thought I could reproduce it here. My goal was to motivate people, but also to surprise them a little, especially at the end. Finally, I wish such a nice title were mine, but I have to acknowledge Jeff Atwood. I stole it from his post on Coding Horror (which I also invite you to read).
How to stop sucking and be awesome instead
Think about what we can do today. We can send people to the moon. We can talk to each other any time anywhere on the planet. We can go anywhere in about a day. We can transplant a heart. We can cure diseases that were fatal only 30 years ago. And yet, there is still one thing that we cannot do. We don’t know how to motivate people.
That’s right, we do not know how to make our colleagues enthusiastic about their work. If you watch a couple of TED videos or if you read a couple of books on management, you will see that we...
Like most French students of my generation, I had to study Candide, a short philosophical novella written by Voltaire. Back then, I was convinced that Voltaire was an arrogant prick, and I never imagined that his dumb criticism of Leibniz’s theory of pre-established harmony, which he barely understood, would ever echo in my work as a biologist.
But here we are, years have passed, I have made peace with Voltaire, and the ENCODE consortium has issued its major and controversial statement that they find “biochemical functions for 80% of the genome”. As the arguments and the comments flow on the blogs and in the academic press, I cannot help thinking about the words of Dr. Pangloss – incarnating narrow optimism.
Observe, for instance, the nose is formed for spectacles, therefore we wear spectacles. The legs are visibly designed for stockings, accordingly we wear stockings.
What I will call the Panglossian reading of the “80% functional” statement above is the idea that 80% of the genome is meant to be the way it is. The architecture of a given locus is somehow designed to produce what happens there (transcription, transcription enhancing, transcription factor binding etc). Notice...
This article is neither interesting nor well written.
Everybody in academia has a story about reviewer 3. If the words above sound familiar, you will definitely know what I mean, but for the others I should give some context. No decent scientific editor will agree to publish an article without taking advice from experts. This process, called peer review, is usually anonymous and opaque. According to an urban legend, reviewer 1 is very positive, reviewer 2 couldn't care less, and reviewer 3 is a pain in the ass. Believe it or not, the quote above is real, and it is all the review consists of. Needless to say, it was from reviewer 3.
For a long time, I wondered whether there is a way to trace the identity of an author through the text of a review. What methods do stylometry experts use to identify passages from the Q source in the Bible, or to know whether William Shakespeare had a ghostwriter?
The 4-gram method
Surprisingly, the best stylistic fingerprints have little to do with literary style. For instance, lexical richness and complexity of the language are very difficult to exploit efficiently. The unconscious foibles...
In the previous posts of this series on genetics and racism, I talked about two recent academic disputes over human races. With this post I hope to give a wider overview of what biology has to say about species, breeds and races.
Modern genetics was born in 1900 with the re-discovery of Mendel's laws. Since the Neolithic Revolution, genetics had been an empirical art. Our ancestors isolated most of the breeds of animals and plants that we know today, i.e. groups that carry a trait of interest to the next generation when crossed together (for instance Chihuahuas are small dogs and Great Danes are large dogs).
But over the generations, pedigrees got lost in the mists of time and the overwhelming differences between some breeds of the same species raised the question of whether they share the same natural origin. Before Darwin, it was difficult to imagine that Chihuahuas and Great Danes would have a common ancestor, and the theory went that breeds actually came from different species. This is actually one of the first questions tackled by Darwin in The Origin of Species. In the following passage, he exposes his...
Like many other academic journals, Molecular and Cellular Biology takes copyright very seriously. And to trace the criminals who share scientific publications funded by public institutions, they add the date and the identity of the license owner to the margin of the pdf reprints downloaded from their website.
I recently heard that some people downloaded and installed the pdf toolkit pdftk and at the Linux terminal issued a command like the one below, where they replaced article.pdf by the name of the pdf they had downloaded.
pdftk article.pdf output uncompressed-article.pdf uncompress
Using their text editor, they opened the uncompressed pdf file and looked for lines like the ones below and commented them out with a % sign (or even deleted them, just in case).
10 0 0 10 0 0 cm BT
/R19 11 Tf
0 -1 1 0 579.5 456.847 Tm
[( on some day by Institution of the Evil Person)556]TJ
-94.148 0 Td
-89.2543 0 Td
[(Downloaded from )278]TJ
They then ran pdftk again to fix the pdf document, and the download information was gone.
pdftk uncompressed-article.pdf output...
In the first post of this series on genetics and racism, I explained how Richard Lewontin concluded from his work on human diversity that human races are of no value for taxonomy (the classification of living beings). This view was later criticized and even termed Lewontin's fallacy by A. W. F. Edwards. Yet, nobody ever doubted that Lewontin was honest in his approach. But more recently came another case that gives the shivers. The great Stephen Jay Gould, the author of the acclaimed Mismeasure of Man, was accused of data manipulation.
The mismeasure of Gould
Stephen Jay Gould was the kind of scientist who pops up everywhere. I discovered him in a comment about the opinion of the Vatican on Evolution, others knew him for his statistical analyses of baseball records, while he was actually a paleontologist, author of the theory of punctuated equilibrium. But his most famous work is undoubtedly The Mismeasure of Man.
Like the author, the book is a strange chimera, somewhere in between scientific research and history, with a touch of lyricism. The Mismeasure of Man is a journey through the differences between people, or more precisely through the scientific discourse over this...