One of the central talking points in the recent hullabaloo over evolutionary psychology has been the difference between correlation and causation. Thanks to Echidne, I came across this remarkable example of correlation being confused with causation in USA Today (not exactly a bastion of good science):
Breast-feeding has well-documented benefits. Studies have shown it nourishes babies while fighting off infections and even boosting IQ. Now a study in Monday’s Pediatrics suggests nursing also may protect infants from neglect. In a study of 6,621 Australian children over 15 years, researchers found that those who were breast-fed were far less likely to be neglected or abused by their mothers. Babies who weren’t breast-fed were more than 2½ times as likely to be maltreated by their mothers as those who were nursed for four months or more, the study shows. There was no link between breast-feeding and the risk of maltreatment by fathers or others.
Apparently a hormone released during breastfeeding that strengthens the bond between mother and child is responsible for this correlation. Although this dubious claim is discredited by a disinterested psychologist later in the article, the wording of the claim contains an interesting linguistic twist. Compare the sentence in the article:
those who were breast-fed were far less likely to be neglected or abused by their mothers
with a revised version that switches the components around:
those who were neglected or abused by their mothers were far less likely to [have been] breast-fed.
What exactly is the difference between these two sentences? Well, among other obvious things, I had to change the aspect of the second sentence because the abuse usually comes after the breastfeeding, and the perfect aspect indicates a completed action (i.e., the breastfeeding was completed before the abuse started). The order of events is important in this case because the two sentences have no overt semantic indication of causation other than the order in which the events occurred.
If we took a sample population that was made up of 20 cows and 20 dalmatians, and out of the 20 cows only 5 were Holsteins, we might say that those animals that are spotted are much more likely to be dalmatians. This claim has no inherent or implied indication that the cause of the animals being dalmatians is their spots; rather, the relationship between being spotted and being a dalmatian is simply one of strong correlation. (Note that the inverse, “those animals that are dalmatians are much more likely to be spotted,” is also true without need for the perfect aspect.)
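For anyone who wants to see the numbers, here is a rough sketch of that sample in Python. It rests on two assumptions that the example only implies: every dalmatian is spotted, and the 5 Holsteins are the only spotted cows. The table layout and the helper names are made up for illustration.

```python
from fractions import Fraction

# Hypothetical tallies for the sample of 20 cows and 20 dalmatians.
# Assumption: all dalmatians are spotted; only the 5 Holsteins are spotted cows.
counts = {
    ("dalmatian", "spotted"): 20,
    ("cow", "spotted"): 5,   # the Holsteins
    ("cow", "plain"): 15,    # the remaining cows
}


def is_dalmatian(key):
    return key[0] == "dalmatian"


def is_spotted(key):
    return key[1] == "spotted"


def conditional(table, target, given):
    """Share of the animals satisfying `given` that also satisfy `target`."""
    given_total = sum(n for key, n in table.items() if given(key))
    both = sum(n for key, n in table.items() if given(key) and target(key))
    return Fraction(both, given_total)


# 20 of the 25 spotted animals are dalmatians.
p_dalmatian_given_spotted = conditional(counts, is_dalmatian, is_spotted)
# All 20 dalmatians are spotted.
p_spotted_given_dalmatian = conditional(counts, is_spotted, is_dalmatian)

print(p_dalmatian_given_spotted, p_spotted_given_dalmatian)
```

Under those assumptions the two proportions come out to 4/5 and 1. Both are read off the same table, in either direction, and nothing in the arithmetic says which property came first or which one caused the other.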
However, if we take the same sample, and out of the 15 brown cows, 13 of them were born in May, we might say that those cows that are born in May are much more likely to be brown. In this example, although there are still no overt or deliberate signs of causation, the sentence is more easily interpreted as depicting a relationship of causation because of the time element. Unlike in the case of the last example, the inverse of this sentence has to be “those cows that are brown are much more likely to have been born in May.”
I don’t want to give the impression that the cow examples are meant to be parallel to the breastfeeding example; I just want to show that the time element is encoded in the first type of sentence, and that time element implies causation rather than simply correlation. The use of the progression of time to indicate a relation of causation is known, in the parlance of our times, as the post hoc fallacy – post hoc, ergo propter hoc is Latin for “after this, therefore because of this,” and it is the name for the classic tendency to confuse correlation with causation that has been around since the dawn of argumentation.
Like the post hoc fallacy, the tendency for researchers to notice relationships of correlation and then make up reasons why the correlation might be causal is one of those things that just won’t go away.