Is there a rule for pronouncing “th” at the beginning of a word?

Consider the th in thistle versus the th in this: the former is unvoiced, while the latter is voiced.

Is there a rule or reason for the differences?


Solution 1:

Part I

The rule that Peter pointed out in the comments is that word-initial th is voiced only in function words, not in others. (In fact, this is more of a law than a rule, because it has no exceptions in English.)

The complete list, excluding derived terms based on words in this list, is:

than, that, the, thee, their, them, then, thence, there, these, they, thilk¹, thine, this, thither, those, thou, though, thus, thy, thyne, thyself.

Notice how those are all function words of one sort or another, not nouns or verbs.

Note that English also has a few words that begin with th where the h is silent, like Thomas and Thames. Those you just have to learn by rote. There are not very many of them, but they can be quite common.


Part II

The second part of the question asks whether there is a reason for these differences. We don’t know for sure, but function words tend not to be stressed, which tends to make them run together and experience greater assimilation with surrounding sounds. This might have caused the voicing to stick around there.

Wikipedia suggests:

In early Middle English times, a group of very common function words beginning with /θ/ (the, they, there, etc.) came to be pronounced with /ð/ instead of /θ/. Possibly this was a sandhi development; as these words are frequently found in unstressed positions they can sometimes appear to run on from the preceding word, which may have resulted in the dental fricative being treated as though it were word-internal.


In a comment, I note it is also rare to find a word that ends in voiced -th (without an e following it), with smooth and the verb mouth /maʊð/ being notable exceptions. And by rights, the verb mouth really ought to be spelled mouthe, like all the others (bathe, clothe, breathe, etc.). Voicing wasn’t phonemic at the ends of words, and happened only with a following inflectional vowel. Hence unvoiced in nouns house, wolf, bath but voiced in verbs house, wolve, bathe, even when not in the third-person singular -s form.

This is related to the intervocalic voicing described above. It is retained even when we have lost final e, whether just in pronunciation or in spelling as well.


Footnotes

  1. Thilk is an archaic or dialectal demonstrative, which per the OED is also:

thick /ðɪk/ is in dialect use from Cornwall and Hants to Worcester and Hereford; and also in Pembroke, Glamorgan, and Wexford. In many parts it has also the form thicky, thickee, or thicka. It generally means ‘that’, but. . . .

Solution 2:

In English, the digraph th is pronounced as the voiceless [θ] at the beginning of a word in almost all circumstances. The exceptions are all short function words, such as articles, demonstratives, and commonplace adverbs:

  • the
  • this / these
  • that / those
  • there
  • then

(The above list is not exhaustive.) These words all use the voiced sound [ð].

Any other regular noun or verb which begins with th uses the voiceless sound [θ].

The reason for this is that in Old English and earlier forms of the Germanic languages, there was only a single interdental fricative, which alternated regularly between the voiceless form [θ] at the beginning of words and the voiced form [ð] in the middle of words. Modern English retains traces of this regularity, as th is usually voiceless at the beginning of a word (thin, think) and usually voiced in the middle of a word (mother, bathe). However, the intervening centuries have muddled this regularity so that there is no longer any completely consistent rule. It seems that short, common words like the ones listed above acquired the voiced pronunciation because they are usually reduced in speech, and not really pronounced as separate words.

Solution 3:

After reading all these brilliant replies and the Wikipedia entry on Phonology and distribution, it seems that one can get more than 90% of cases right by following these few simple rules:

/ð/

  • At the beginning of short function words (them, that, these, there, ...)
  • In the middle of native words
  • At the end of verbs

/θ/ - The rest
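
For anyone who wants to try the word-initial part of these rules mechanically, here is a minimal Python sketch. It is only an illustration: the function name initial_th_sound is made up for this example, the voiced set is copied from the function-word list in Solution 1, and the silent-h set contains only the two examples mentioned there. It does not attempt the mid-word or end-of-verb rules, which would need etymology and part-of-speech information, and it does not handle derived terms (themselves, therefore, ...), which that list deliberately leaves out.

```python
# Function words with voiced initial th, copied from the list in Solution 1.
# Derived terms (e.g. "themselves", "therefore") are NOT included here.
VOICED_FUNCTION_WORDS = {
    "than", "that", "the", "thee", "their", "them", "then", "thence",
    "there", "these", "they", "thilk", "thine", "this", "thither",
    "those", "thou", "though", "thus", "thy", "thyne", "thyself",
}

# Words where the h is silent; only the two examples from Solution 1.
SILENT_H_WORDS = {"thomas", "thames"}


def initial_th_sound(word: str) -> str:
    """Guess how word-initial 'th' is pronounced: 'ð', 'θ', or 't'.

    Hypothetical helper for illustration; it is a lookup, not a
    pronunciation dictionary.
    """
    w = word.strip().lower()
    if not w.startswith("th"):
        raise ValueError(f"{word!r} does not start with 'th'")
    if w in SILENT_H_WORDS:
        return "t"   # silent h: learned by rote
    if w in VOICED_FUNCTION_WORDS:
        return "ð"   # voiced: function words only
    return "θ"       # voiceless: everything else


for example in ("this", "thistle", "Thames"):
    print(example, initial_th_sound(example))
```

Keeping the function words as a closed set mirrors the point made in Solution 1: the voiced cases are a short, fixed list, so a lookup is enough and everything else falls through to /θ/.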