Do "cook the" and "cooked the" get pronounced differently?
How are they different in pronunciation?
In other words, how can one recognise the difference purely by sound?
Solution 1:
The pronunciation can vary with the speaker's accent. While many speakers may pronounce "cook" and "cooked" the same way before "the", as an Australian English (EN_AU) speaker I would:
- in slow speech, say "cook't the turkey", with two adjacent consonant sounds, or
- in ordinary speech, glottalize the 't' sound that realizes the 'ed' ending, also known as "swallowing the 't'". This may sound almost identical to "cook", but it feels quite different to say, and I suspect it does not sound exactly the same.
See also T-glottalization on Wikipedia.
Solution 2:
John Lawler in a comment wrote:
In practice, there is no difference in pronunciation and the addressee is expected to infer the tense, if necessary. Tense is not very important in English (there's only the two tenses, and half the verbs are tenseless infinitives or gerunds anyway) and the difference rarely matters. If it does, one can enunciate more carefully.
Let me try to elaborate on that if I can. There will be times that both cook the and cooked the will end up sounding the same or almost the same in actual speech.
That doesn’t mean it is somehow impossible for native speakers to pronounce them differently. We certainly can when we want to do so or are specifically directed to do so. It’s just that it doesn’t always work out that way in all possible utterances, which is why we do not attempt to rely on sound alone to know which of the two was said.
It’s very easy for phonemic /t/ from cooked the to be phonetically realized as any of:
- an emphatically or intentionally aspirated alveolar stop [tʰ]
- an unaspirated alveolar stop [t], possibly without an audible release [t̚]
- a voiced alveolar stop [d], a flap [ɾ], a glottal stop [ʔ], or nothing at all (outright deletion)
- an affricate made up of a weak dental stop coarticulated with the following dental fricative, such as [t̪͡ð] or [d̪͡ð]
All of those versions are perfectly natural in English due to the phonological effects seen in connected speech, especially when fast or casual or both.
Because of all this you cannot invariably use the sound alone to know which one has been said.
Native speakers therefore never need to hear /t/ represented physically to know which tense was used here. We have other mechanisms that kick in automatically to tell us which is which, and when that happens, we don’t even notice that there was no literal [t] sound there.
We still know they said cooked the and think nothing of it, so much so that when asked immediately afterwards which we heard, we often feel that we heard a /t/ realized even without a [t] there. This is what happens when mapping phones to phonemes in listening.
Real Examples of This
You can and should listen to many speakers saying cooked the books in YouTube videos here. Each of the 22 clips starts with the sentence that includes cooked the books. Hit the "play next" arrow at the bottom right to skip to the next one each time.
Notice how many do not make a /t/ there? Some do and some don’t. It simply is not audible in those who don’t. Many of those speakers are not "pronouncing" any /t/ there in real speech.
But you always know which they said, too, even when you can’t hear it. That’s how you learn this.
Boring Details
/ðeɪˈkʊktðəˈguːs/ → [ðeˈkʰʊk͉̬̚d̪͡ðə̆ˈguːs]
Theory aside, in practice when spoken quickly or casually by a native speaker in normal conversation where the fast-speech rules of connected speech apply — not dictionary pronunciations! — there is no difference between how those two sound.
If someone does not understand you and asks which one you meant, you can go slower and enunciate the sounds more carefully and deliberately. But that isn’t how connected speech is usually realized.
For example, here is how They cooked the goose, which in phonemic dictionary notation is simply /ðeɪˈkʊktðəˈguːs/, really works out in casual connected speech: (the decoding key is at the bottom of this post)
[ {allegro ðeˈkʰʊk͉̬̚d̪͡ðə̆ˈguːs allegro} ]
See how different that phonetic notation is from the phonemic notation you might be expecting to hear? Trying to hear some theoretical difference to figure out which one was said isn’t going to work here. You need other cues.
To know what was said, you do not try to hear a difference that is not there. That is not how native speakers determine which one was said. Sounds that occur in isolated citation forms are nothing at all like what people actually say. So we use other cues based on our lifelong experience of what makes sense here.
That’s why native speakers do not rely principally upon pronunciation in instances such as these when in their minds they assign one or the other sequence. Such sequences never exist in isolation in actual connected speech. They occur only with surrounding context. On rare occasion they may initially guess wrong before later clues appear; that usually happens so quickly they don't even notice it.
Under the fast-speech rules (also called allegro rules) that apply to all natural speech, many complex reductions occur both within a word and across word boundaries. No one puts convenient gaps from one word to the next in real speech. Consonant clusters are always simplified one way or another, but what happens in one utterance will often happen differently in another utterance of that same sequence even when it’s the same speaker in both cases.
Like the consonant clusters in sixths, twelfths, and on both ends of strengths, the abstract phonemic sequence /ktð/ always changes and simplifies phonetically. Your tongue can’t move quickly enough nor carefully enough to separate all those sounds. You certainly have no aspiration or gaps here in connected casual speech. The velar stop and the alveolar stop will likely fuse or be co-articulated, and they will have no audible release.
Consider what happens when you speak these two example sentences aloud at the speed of normal conversation such as you might hear in a book review given over the radio:
1. Much more than a cookbook, Jennifer McLagan’s Odd Bits: How to Cook the Rest of the Animal delves into the rich geographical, historical, and religious roles of nose-to-tail cooking.
2. In My Goose is Cooked: The Continuation of a West Texas Ranch Woman’s Story, we follow a century in the life of pioneer rancher Hallie Crawford Stillwell in the Big Bend country.
In (1), native speakers automatically assign the bare form of the verb, because the to that precedes it marks an infinitive. In (2), they likewise automatically assign the -ed form to the verb (here a past participle), because the is that precedes it does not license any other possibility.
These are just two very simple examples. Other contexts will provide their own distinct clues. You have to practice listening until your brain makes predictive determinations like these automatically.
Key
The notation [{allegro ... allegro}] is the specific prosodic notation used to indicate fast speech. The detailed International Phonetic Alphabet symbols used there were:
[ðeˈkʰʊk͉̬̚d̪͡ðə̆ˈguːs]
ð voiced dental fricative U+00F0 LATIN SMALL LETTER ETH
e close-mid front unrounded vowel U+0065 LATIN SMALL LETTER E
ˈ primary stress U+02C8 MODIFIER LETTER VERTICAL LINE
kʰ voiceless velar plosive U+006B LATIN SMALL LETTER K
aspirated U+02B0 MODIFIER LETTER SMALL H
ʊ near-close near-back rounded vowel U+028A LATIN SMALL LETTER UPSILON
k͉̬̚ voiceless velar plosive U+006B LATIN SMALL LETTER K
weak articulation U+0349 COMBINING LEFT ANGLE BELOW
voiced U+032C COMBINING CARON BELOW
not audibly released U+031A COMBINING LEFT ANGLE ABOVE
d̪͡ voiced alveolar plosive U+0064 LATIN SMALL LETTER D
dental U+032A COMBINING BRIDGE BELOW
affricate or double articulation U+0361 COMBINING DOUBLE INVERTED BREVE
ð voiced dental fricative U+00F0 LATIN SMALL LETTER ETH
ə̆ mid-central vowel U+0259 LATIN SMALL LETTER SCHWA
extra-short U+0306 COMBINING BREVE
ˈ primary stress U+02C8 MODIFIER LETTER VERTICAL LINE
g voiced velar plosive U+0067 LATIN SMALL LETTER G
uː close back rounded vowel U+0075 LATIN SMALL LETTER U
long U+02D0 MODIFIER LETTER TRIANGULAR COLON
s voiceless alveolar sibilant U+0073 LATIN SMALL LETTER S
The phonemic notation, which natives “never” really produce in casual connected speech spoken at a good clip but only in unnaturally careful citation forms, decodes to:
/ðeɪˈkʊktðəˈguːs/
ð voiced dental fricative U+00F0 LATIN SMALL LETTER ETH
e close-mid front unrounded vowel U+0065 LATIN SMALL LETTER E
ɪ near-close near-front unrounded vowel U+026A LATIN LETTER SMALL CAPITAL I
ˈ primary stress U+02C8 MODIFIER LETTER VERTICAL LINE
k voiceless velar plosive U+006B LATIN SMALL LETTER K
ʊ near-close near-back rounded vowel U+028A LATIN SMALL LETTER UPSILON
k voiceless velar plosive U+006B LATIN SMALL LETTER K
t voiceless alveolar plosive U+0074 LATIN SMALL LETTER T
ð voiced dental fricative U+00F0 LATIN SMALL LETTER ETH
ə mid-central vowel U+0259 LATIN SMALL LETTER SCHWA
ˈ primary stress U+02C8 MODIFIER LETTER VERTICAL LINE
g voiced velar plosive U+0067 LATIN SMALL LETTER G
uː close back rounded vowel U+0075 LATIN SMALL LETTER U
long U+02D0 MODIFIER LETTER TRIANGULAR COLON
s voiceless alveolar sibilant U+0073 LATIN SMALL LETTER S
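If you want to check the code points in the two tables above yourself, a minimal sketch along these lines should work, assuming Python 3 and its standard unicodedata module. The function name decode and the variable phonetic are just illustrative choices, not anything from the answer, and the string is written with escapes so that the combining marks cannot get lost in copying. It only recovers code points and Unicode character names; the phonetic descriptions (such as "voiced dental fricative") still have to come from an IPA chart.

```python
import unicodedata

def decode(ipa: str):
    """Return (display form, code point, Unicode name) for each code point."""
    rows = []
    for ch in ipa:
        # Combining marks are shown on a dotted circle (U+25CC) so they stay visible.
        shown = "\u25cc" + ch if unicodedata.combining(ch) else ch
        rows.append((shown, f"U+{ord(ch):04X}", unicodedata.name(ch)))
    return rows

# The narrow transcription of "They cooked the goose" from the key,
# spelled out with escapes for every non-ASCII code point.
phonetic = (
    "\u00f0e\u02c8k\u02b0\u028a"      # eth, e, stress mark, aspirated k, upsilon
    "k\u0349\u032c\u031a"             # k + weak articulation, voiced, no audible release
    "d\u032a\u0361\u00f0"             # dental d tied to eth
    "\u0259\u0306\u02c8gu\u02d0s"     # extra-short schwa, stress mark, g, long u, s
)

for shown, code, name in decode(phonetic):
    print(f"{shown}\t{code}\t{name}")
```

Passing the phonemic string instead (the same letters without the combining marks, plus U+026A and a plain t) gives the second table's code points in the same way.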