Correlation vs Causation [closed]

Elsewhere on Stack Exchange I came across the following comment.

The sorting is based on values, not family. If you value knowledge, you will be set to Ravenclaw, for example. Needless to say, if your parents value the acquisition of knowledge most and foremost, their children are likely to share those values, but it is not a guarantee. Correlation, not causation.

I stated that this was causation rather than correlation because the commenter is arguing for a cause-and-effect relationship. The fact that the cause-and-effect relationship isn't guaranteed doesn't change that. Other commenters have not agreed with me, and rather than start an off-topic discussion there, I thought I'd ask over here to see whether my understanding of causation vs correlation is correct.


You are using cause in a different sense to the quoted comment: the quotation seems to be talking about proximate cause while you are going deeper. The sorting is based on the individual, regardless of whether the parents had any effect on that individual.

Taller than average parents more often than not have taller than average children: you would regard that as caused by genetics. Similarly, taller than average children more often than not have taller than average parents. Is that causal? Not to mention the Sam Levenson line “Insanity is hereditary — you get it from your kids.”

There is evidence that cities with more storks' nests have more human births. Is that causal or correlation? Is it indirectly causal if both seem to be related to the number of buildings in the city, which gives both storks and people potential places to live?
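The storks question can be made concrete with a small simulation. This is only a sketch with made-up numbers: the coefficients, the noise levels and the idea of modelling a city by a single "buildings" count are all assumptions of mine, not data. Storks and births are each generated from the number of buildings and never from each other, yet they come out strongly correlated. Once the effect of buildings is regressed out of both, the remaining correlation is close to zero.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after a least-squares straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical cities: buildings is the common cause (C).
buildings = [rng.uniform(100, 1000) for _ in range(2000)]
# Storks (A) and births (B) each depend on buildings, never on each other.
storks = [0.05 * b + rng.gauss(0, 5) for b in buildings]
births = [2.0 * b + rng.gauss(0, 200) for b in buildings]

raw_corr = pearson(storks, births)
partial_corr = pearson(residuals(storks, buildings), residuals(births, buildings))

print(f"storks vs births:               {raw_corr:.2f}")
print(f"same, with buildings regressed: {partial_corr:.2f}")
```

Whether you call the storks-births relation "indirectly causal" or "mere correlation" is then a matter of wording; the structure underneath is the same either way.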

Ultimately this boils down to how cause is used. It is used differently by different people in different circumstances, which leads to a lot of confusion.

And then there is the xkcd 552 comment

http://xkcd.com/552/


The author of the comment you cite does not seem to be remarking on genetics but rather on indoctrination. Consider one possibility. A child could be raised by parents who fit the author's description and yet, despite the parents' best efforts, turn out to be the sort who despises knowledge. This is obviously unlikely. But it's possible and, no doubt, has occurred. Under such conditions we wouldn't reasonably say that his hatred of knowledge is a result of his parents' love of it (though certainly that happens too). Likewise, even though there are bound to be cases where the parents' love of knowledge literally engenders it in the child (which, yes, would be cause and effect), the fact that this isn't always the result is enough to prevent us from playing an if-a-then-b tune. The concept of correlation allows for an increased or decreased likelihood without making any such sweeping ascription. There is that which causes, and there is that which increases the odds of a certain outcome. It seems the author is correct in his conclusion.


There are several different relationships in play in the comment quoted, and like much internet discourse, it's written pretty informally, so it's unclear which of the relationships the commenter argues are causative and which are correlated.

Assuming for the purposes of this answer that your values cause you to be sorted into a particular house (whether with certainty or, since an individual can hold multiple values, with a greater or lesser likelihood), it seems reasonable to say that there is a correlation between the values of your parents and the house that you're sorted into.

It seems less reasonable, mathematically, to say that the values of your parents cause you to be sorted into a particular house. It may be so, but it doesn't follow from the values->house relationship, and we would need more information to ascertain whether there was a causative relationship.

However, English is not mathematics, and it is reasonable, in English, to say that your parents' values have a cause-and-effect relationship with which house you end up in, because parents often pass on their values to their children.


The short answer: yes, the relation suggested in that comment by Borror is causal.


There can be several kinds of causal relations between phenomena A and B (sometimes involving a third phenomenon C):

  1. Direct causal relations:

    • I) A causes B (blowing up the earth causes the death of many people).
    • II) A is caused by B (the falling of a rock to the ground is caused by gravity).
  2. Indirect causal relations:

    • I) A and B are both caused by C (the tsunami and the fires were caused by the earthquake).
    • II) A is caused by B through C (the earthquake caused the reactor meltdown, because the tsunami disabled the cooling systems).
  3. Other relations:

    • I) A and B together cause C (the combination of high tide and west wind causes dikes to break, but there is no causal relation between tide and wind).
    • II) A seems to cause B but in fact does not (I pray to God for the sun to come up every morning).

Many other variations and combinations are possible.


Another important distinction is that between a necessary cause, and a sufficient cause. Any cause can be one or the other, or both, or neither.

  • A. Sufficient and necessary: a sizeable quantity of alcohol entering my bloodstream causes me to get drunk (nothing else is needed to get me drunk, and nothing else could).

  • B. Sufficient (but not necessary): spreading a lethal virus causes the death of many people (you don't need anything more, but you could use something else instead).

  • C. Necessary (but not sufficient): light causes an oak tree to grow (it needs light, but it also needs water).

  • D. Neither: Smoking causes lung cancer (you don't always get it even if you smoke, and you could get it even if you don't smoke).
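The four cases above can be checked mechanically against a list of observed cases. The following is a minimal sketch; the function name `classify` and the observation lists are invented for illustration. Note that it only tests regularity in the cases you feed it ("whenever the cause, then the effect" and "whenever the effect, then the cause"), so it says nothing about whether the link is really causal.

```python
def classify(observations):
    """Return (necessary, sufficient) for a list of
    (cause_present, effect_present) boolean pairs."""
    # Sufficient: every case with the cause also shows the effect.
    sufficient = all(effect for cause, effect in observations if cause)
    # Necessary: every case with the effect also shows the cause.
    necessary = all(cause for cause, effect in observations if effect)
    return necessary, sufficient


# Hypothetical cases for example C: (light, oak grows).
oak = [(True, True), (True, False), (False, False)]

# Hypothetical cases for example D: (smokes, gets lung cancer).
smoking = [(True, True), (True, False), (False, True), (False, False)]

print(classify(oak))      # (True, False): necessary but not sufficient
print(classify(smoking))  # (False, False): neither
```

The second oak case stands for a tree that gets light but no water, which is what breaks sufficiency.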


The last example is of the type 1.I: A causes B. So is your example from the Harry Potter question: there is a direct causal relation between your parents' valuing knowledge and your ending up in a house that values knowledge, though the cause is neither necessary nor sufficient.

However, between your parents' having been in that house (A) and your ending up there (B), there is no direct causal link. But there is an indirect causal link: the fact that your parents value knowledge (C) causes both their ending up in the house (A) and your ending up there (B), of type 2.I (A and B are both caused by C). But it is still a causal link.

In the example of my praying for the sun to come up and its coming up, there is certainly no direct causal link; but one could argue that the earth's rotation on the one hand causes the sun to come up in the morning, while, on the other hand, it causes darkness of night, which in turn causes me to be afraid and pray every night.

Similarly, I could say my pressing the "fire" button causes my enemy to die (direct causal link) or, conversely, I could say my pressing the button causes an electric charge to build up in my phaser gun, which in turn causes the lethal beam to strike my enemy, who then dies, so that the relation is only indirect. That is a matter of definition; it shows the limits of the usefulness of the concept of "causation".


If we define correlation as a "statistically significant relation between phenomena A and B", we are dealing with two things that are probably causally linked, though the relation might be a coincidence after all. If it turns out to be a coincidence, it is of the type 3.II (appears to be causal but is not). In any case, where there is causation, there must be correlation, so that Borror's suggestion about the children of smart parents' having a higher chance of ending up in a certain house is about both correlation and causation.


To complicate things further, most competent philosophers since Hume and Kant have held that causality is a human category that we apply to our perceptions, not something that exists independently (i.e. not a Ding an sich). All we really know is a statistical relation based on previous instances (people fall down when they jump off a cliff); we know nothing else that adds anything of interest over and above this correlation, nothing that could help us make a more accurate prediction than the correlation already can. So causal relations are just a simplification of a reality of correlations. "A causes B" just means "there is a correlation between A's taking place at a certain time and B's happening a certain amount of time afterwards."