The Grand Locus / Life for statistical sciences

## Did Mendel fake his results?

You went to high school and you learned genetics. You heard about a certain Gregor Mendel who crossed peas and came up with the idea that there is a dominant and a recessive allele. You did not particularly like the guy because there would always be a question about peas with recessive and dominant alleles at the exam. But you grew up, became wiser and just as you started to like him, you heard from someone that he faked his data. You felt disoriented for a while: why annoy you with this stuff at school if it is wrong? But then you came to the conclusion that he just got lucky and that he was right for the wrong reasons. After all, he was just a monk on gardening duties, so why would you expect him to understand anything about real science?

### Gregor Mendel

Gregor Mendel was a monk, but he was also a trained scientist. He studied assiduously for twelve years (including about seven years on physics and mathematics), to then become a teacher of physics and natural sciences at the gymnasium of Brno. He prepared his most famous experiment for two years, meticulously checking and choosing his specimens while setting up the experimental greenhouse that he required to avoid cross-fertilization. But this was just the beginning. He spent the next eight years with the help of two assistants inspecting around 28,000 pea plants until an epidemic destroyed his culture.

Twelve years of training, two years of preparation, eight years of execution. So much for the simple monk on gardening duties.

### The accusations

The charges against Mendel were laid by Ronald Fisher, the father of statistics. It is intriguing that he started to doubt the results of Mendel in the first place, because he greatly contributed to the modern evolutionary synthesis, i.e. the use of Mendelian genetics to explain Darwin’s theory of natural selection. Why doubt the honesty of someone you believe to be right?

At the age of 21, Fisher came across the unpublished work of Raphael Weldon, who happened to have serious doubts about Mendel’s theory. Before his death 5 years earlier, Weldon had started most of the statistical work to disprove Mendel, so the analyses were ready for Fisher to pick up and give a nice show on something he still knew little about. This was published in 1911, but nobody paid attention at the time.

Fisher picked up this line 25 years later and published an article in 1936 in which, in fact, he did not accuse Mendel of fraud. Here is how he expressed his opinion:

> (...) it remains a possibility among others that Mendel was deceived by some assistant who knew too well what was expected. This possibility is supported by independent evidence that the data of most, if not all, of the experiments have been falsified so as to agree closely with Mendel’s expectations.

Only later was this article used to buttress the claims of fraud against Mendel. Fisher died in 1962 without knowing that his work had been used to defame Mendel’s reputation.

### The statistics

The very first numerical result from “Versuche über Pflanzen-Hybriden” goes as follows.

> Expt. 1: Form of seed. From 253 hybrids 7324 seeds were obtained in the second trial year. Among them were 5474 round or roundish ones and 1850 angular wrinkled ones. Therefrom the ratio 2.96:1 is deduced.

Is it too good to be true? The ideal ratio 3:1 corresponds to 5493 round seeds and 1831 wrinkled ones. What are the chances that the results deviate by at most 19 seeds? The number of wrinkled seeds is a binomial variable with $n = 7324$ trials, each with probability of success $p = 0.25$. The probability of obtaining a number within 19 of the expected 1831, that is between 1812 and 1850 included, is $0.401$. Nothing wrong about it, says Fisher, except that it made Mendel overconfident.
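The figure above can be checked with a few lines of Python. This is only a sketch of one way to do the computation: it sums the exact binomial probabilities in log space (to avoid underflow), and the helper `binom_pmf` and the variable names are illustrative, not taken from Fisher or Mendel. The result should land close to the $0.401$ quoted above.

```python
from math import exp, lgamma, log


def binom_pmf(k, n, p):
    """Exact binomial probability P(X = k), computed in log space."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))


n_seeds = 7324      # seeds counted in Expt. 1
p_wrinkled = 0.25   # expected share of wrinkled seeds under a 3:1 ratio
expected = 1831     # n_seeds * p_wrinkled
observed = 1850     # wrinkled seeds reported by Mendel
gap = abs(observed - expected)  # deviation of 19 seeds

# Probability of landing within `gap` seeds of the expected count, both sides.
prob_within_gap = sum(
    binom_pmf(k, n_seeds, p_wrinkled)
    for k in range(expected - gap, expected + gap + 1)
)
print(f"P(|X - {expected}| <= {gap}) = {prob_within_gap:.3f}")
```

In other words, a result at least this close to the ideal ratio is expected roughly two times out of five, which is why this experiment alone raises no suspicion.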

> It may be that the seed counts of 1859 were a revelation to him of the precision with which his system worked, and could be demonstrated; they may also possibly have given him an exaggerated impression of the precision with which the theoretical ratios should be verified, but from that moment it is clear, from the form his experiments took, that he knew very surely what to expect, and designed them as a demonstration for others rather than for his own enlightenment.

The real issue appears a little further in the text, when Mendel tests the genotype of 600 plants by self-fertilizing them and growing 10 seeds from each. Mendel expected the proportion of homozygotes to be $1/3$, but Fisher’s now famous argument is that he should have misclassified about $5\%$ of the heterozygotes. Indeed, a heterozygote can have 10 offspring that all show the dominant trait, in which case it is scored as a homozygote, and this happens with probability $0.75^{10} = 0.056$. According to Fisher, the expected number of plants scored as homozygotes is thus $600 \times (1/3 + 2/3 \times 0.056) \approx 222.5$, but Mendel obtained 201. The probability of obtaining a deviation of at least 21 from the mean is $0.076$, so the blame is not as much that this is unlikely as that it is very close to the 'wrong' expected value of 200.
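Fisher’s correction is easy to reproduce. The sketch below recomputes the misclassification rate and the corrected expectation, then estimates how surprising 201 is under that corrected model. The two-sided probability is obtained here by doubling the exact lower binomial tail; that convention is an assumption (other two-sided definitions give slightly different values), as are the helper `binom_pmf` and the variable names.

```python
from math import exp, lgamma, log


def binom_pmf(k, n, p):
    """Exact binomial probability P(X = k), computed in log space."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))


n_plants = 600       # plants whose genotype Mendel tested
n_offspring = 10     # seeds grown from each plant
observed_scored = 201  # plants Mendel scored as homozygotes

# A heterozygote is mistaken for a homozygote when all of its offspring
# happen to show the dominant trait.
p_missed = 0.75 ** n_offspring

# Probability that a tested plant is scored as a homozygote:
# true homozygotes (1/3) plus misclassified heterozygotes (2/3 * p_missed).
p_scored_homozygous = 1 / 3 + (2 / 3) * p_missed
expected_scored = n_plants * p_scored_homozygous

# Assumed convention: two-sided probability as twice the lower tail.
lower_tail = sum(
    binom_pmf(k, n_plants, p_scored_homozygous)
    for k in range(observed_scored + 1)
)
two_sided = 2 * lower_tail

print(f"misclassification rate: {p_missed:.3f}")
print(f"expected scored as homozygotes: {expected_scored:.1f}")
print(f"two-sided probability (doubled lower tail): {two_sided:.3f}")
```

The expectation comes out at about 222.5, and the probability is of the same order as the $0.076$ quoted above; the exact third decimal depends on the convention and on whether a normal approximation is used.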

To emphasize his point, Fisher finally collects all the experiments of Mendel, computes the chi-square statistic and concludes that the probability of obtaining higher deviations from the expected values is 0.99993.

This is certainly true, but it is important to realize that this spectacular number is more attributable to the size of the data set than to the magnitude of the deviation. In the end, just a few errors out of tens of thousands of counts are sufficient to explain the figure.
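To make the chi-square statistic concrete, here is a minimal sketch that computes it for the first experiment alone, using only the seed counts quoted earlier. It is not Fisher’s pooled computation over all experiments, just the building block he summed. With two categories there is one degree of freedom, and the tail probability can then be written with the complementary error function from the standard library; the variable names are illustrative.

```python
from math import erfc, sqrt

# Seed counts from Expt. 1 and the counts expected under a 3:1 ratio.
observed = {"round": 5474, "wrinkled": 1850}
total = sum(observed.values())  # 7324 seeds
expected = {"round": 0.75 * total, "wrinkled": 0.25 * total}

# Pearson's chi-square: squared deviations scaled by the expected counts.
chi_square = sum(
    (observed[kind] - expected[kind]) ** 2 / expected[kind]
    for kind in observed
)

# With 1 degree of freedom, chi-square is the square of a standard normal
# variable, so P(chi2 > x) = erfc(sqrt(x / 2)).
p_larger = erfc(sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.4f}")
print(f"P(larger deviation) = {p_larger:.3f}")
```

For this single experiment, the probability of a larger deviation is unremarkable, around 0.6. The suspicious figure only emerges when the statistics of many experiments, each a little too close to expectation, are added together.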

### So, did he do it?

Since Fisher did not accuse Mendel of fraud, why should anyone do so? Prejudices and implicit biases are a reality of science. It is as easy for us to notice the prejudices of others as it is hard to notice our own. Looking at a sample of phenotypes assembled by Weldon, I doubt that I could classify all those seeds without mistake. So who is to blame? Mendel? His assistant? Or every minute that each of us does not spend fighting our own biases? As Fisher put it:

> Each generation, perhaps, found in Mendel’s paper only what it expected to find.

### References

I recommend that interested readers refer to the original articles of Mendel and Fisher in order to form their own opinion. The work of Mendel is available at http://mendelweb.org/. All the quotes from Fisher are taken from the article “Has Mendel’s work been rediscovered?” (Annals of Science 1936, vol. 1, issue 2), which is relatively easy to find online. I also recommend the article “Beyond the ‘Mendel-Fisher controversy’” published in Science (behind a paywall, unfortunately).