Why do we pronounce a long second vowel in "decide", but a short second vowel in "decision"?

Background info on pronunciation of Latinate words in English

Latin vowel length very rarely has a direct effect on the pronunciation of English vowels in Latinate words. (It can have an indirect effect, since Latin vowel length affected the placement of stress, and the placement of stress affects how we pronounce vowels in English.) The following Wikipedia article provides a fairly good overview: Traditional English pronunciation of Latin.

Instead, vowel length in English words taken from Latin usually depends on the morpho-phonological context (surrounding letters and sounds) and the placement of the stress. (By "vowel length" I don't mean phonetic length, but the choice between the traditionally/historically paired vowel phonemes such as "short i" [ɪ] and "long i" [aɪ]. In this post, I won't go too much into the rules for placing the stress in words like this.) Usually, the most relevant part of the context is the letters after the vowel.

In "iCiV" words, the first "i" is pronounced as "short i", not "long i"

In decision, the first "i" comes directly before a consonant followed by an unstressed letter "i" and then another vowel letter. The vowel letter "i" (and in addition, "y") tends to have a "short" pronunciation in this position (which I will represent as "iCiV"). This also applies to words with the spelling pattern "iCeV" or "yCeV", so overall it could be represented as "{i,y}C{i,e}V". The theoretical/historical reason for this pattern seems to be unclear, but I did find a paper that discusses it from a linguistic point of view: "English Syllable Structure and Vowel Shortening" by Balogné Bérces Katalin (1998), which calls it "-ion shortening." It seems people writing about English have been aware of this phenomenon for quite some time; I've also found it noted clearly in Walker's Critical Pronouncing Dictionary (1791) as rule 507, and alluded to less clearly in Nares' Elements of Orthoepy (1784), Chapter VII.

Many other words show this alternation: ignīte-ignĭtion, incīse-incĭsion, opīne-opĭnion, sacrifīce-sacrifĭcial, vīce-vĭcious, reptīle-reptĭlian (I realize some people reduce the vowel in the second syllable of reptile, in which case it would not be an example). You can see that all the words with short "i" fit into the pattern "iCiV"; the identity of the consonant and the vowel letter after the second i can vary. For y, there are fewer alternating words to give as examples, but we have, for example, pȳthon-Pythia and Scythia, chlamydia, steatopygia, myriad.
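As a rough illustration, the spelling pattern "{i,y}C{i,e}V" can be written as a regular expression. This is only a sketch under my own assumptions: the name `has_short_i_context` and the short list of consonant digraphs are made up for the example, and the check looks at letters only, so it knows nothing about stress or about "silent e".

```python
import re

# Assumption: "C" is one consonant letter or a common digraph (th, ph, ch, sh).
CONSONANT = r"(?:th|ph|ch|sh|[bcdfghjklmnpqrstvwxz])"

# {i,y} C {i,e} V, checked on spelling alone (no stress information).
ICIV = re.compile(rf"[iy]{CONSONANT}[ie][aeiou]")

def has_short_i_context(word):
    """Return True if the spelling contains an {i,y}C{i,e}V sequence."""
    return ICIV.search(word.lower()) is not None

pairs = [("ignite", "ignition"), ("incise", "incision"), ("opine", "opinion"),
         ("sacrifice", "sacrificial"), ("vice", "vicious"),
         ("reptile", "reptilian"), ("python", "Pythia")]
for base, derived in pairs:
    # The base form should not match; the derived form should.
    print(base, has_short_i_context(base), derived, has_short_i_context(derived))
```

Because the check is purely orthographic, it also returns True for spellings such as "tiniest" and "timeous", where the pronunciation does not follow the shortening pattern.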

This is different from antepenultimate shortening/"trisyllabic laxing" (as in sāne-sănity, serēne-serĕnity, divīne-divĭnity, verbōse-verbŏsity). Trisyllabic laxing applies to all vowels except for u. In contrast, {a, e, o} are generally long, not short, before "_CiV" (as in relāte-relātion, complēte-complētion, devōte-devōtion). (I made a separate post about that lengthening tendency: Why is “salient” pronounced with a “long a” sound?)
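To make the contrast concrete, here is a small sketch that looks up the tendency for the vowel letter standing before "_CiV". The function name `length_before_civ` and the digraph list are my own assumptions, the check is based on spelling only, and "u" is left out because I have not described its behaviour in this position.

```python
import re

# Assumption: spelling-only context "vowel + C + i + V"; C may be a digraph.
CIV = re.compile(r"([aeioy])(?:th|ph|ch|sh|[bcdfghjklmnpqrstvwxz])i[aeiou]")

# Tendencies described above: i/y short, a/e/o long before "_CiV".
TENDENCY = {"i": "short", "y": "short", "a": "long", "e": "long", "o": "long"}

def length_before_civ(word):
    """Return 'short' or 'long' for the vowel before the first _CiV, else None."""
    m = CIV.search(word.lower())
    return TENDENCY[m.group(1)] if m else None

for w in ["relation", "completion", "devotion", "decision", "sanity"]:
    print(w, length_before_civ(w))
```

Words like "sanity" come back as None: trisyllabic laxing happens in a different environment, so this sketch says nothing about them.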

As far as I know, there has been some debate about whether it is more accurate to analyze these length alternations in terms of vowels being shortened in certain contexts vs. being lengthened in certain contexts, but that's only a theoretical concern.

This rule is sometimes violated, but it's fairly reliable

Even though rules like this aren't as well-known as the rule about using a short vowel before doubled consonants or the rule about using a long vowel before "silent e," they still work most of the time for predicting the pronunciation of words.

There seem to be very few violations of the shortening rule for "i/y" before "_CiV"/"_CeV", but some do exist.

  • suffixed comparative or superlative forms of adjectives ending in /aɪCi/: Janus Bahs Jacquet pointed out in a comment that /aɪCiV/ is found in words like wīliest, tīniest, tīdiest, which are superlative forms of wīly, tīny, tīdy. Actually, I'm not sure if this is a true "violation" of the rule, or just an exception where the rule would not even be expected to apply if it were formulated correctly. Other words like this are whīniest, grīmiest, īciest, shīniest, spīciest, prīciest, spīkiest, dīcier.

Also, the rule obviously does not apply when the "e" is just a "silent e", as in the word "timeous", pronounced /ˈtaɪməs/, from the monosyllable "time" (although the OED entry for timeous says "Spelling pronunciations are sometimes found, e.g. Brit. /ˈtʌɪmɪəs/, U.S. /ˈtaɪmiəs/, South African /ˈtʌɪmiəs/, Brit. /ˈtɪmɪəs/, U.S. /ˈtɪmiəs/; N.E.D. (1912) mentions also a spelling pronunciation (ti·myəs) /ˈtɪmjəs/.").

Here is a list of some miscellaneous, mostly rare Latinate words and names with pronunciations (sometimes just optional variants) that violate this rule: hygienic, cyclothymia, pineal, piceous, ileus, pileus, pileate(d), pileole, pileous, Dionysius. There might be a few more that I am not aware of. See below for more detailed discussion:

  • hȳgienic (which as Merriam-Webster (MW) indicates is sometimes pronounced /ˌhaɪdʒiˈɛnɪk/; thanks to fdb for bringing my attention to this fact on Linguistics SE!)

  • cyclothȳmia (I actually have no idea how this word came to have the pronunciation listed in dictionaries)

  • pineal can have an irregular pronunciation, although the regular pronunciation with short "i" /⁠ɪ⁠/ also exists. I found three pronunciations listed in dictionaries: /ˈpɪniəl/ PIN-ee-al (regular), /ˈpaɪniəl/ PINE-ee-al (irregular), and /paɪˈniːəl/ pie-NEE-al (irregular stress placement, but regular aside from that). I can't find a good explanation for why the last two exist.

    • The Latin etymon pīnea does have a "long i" (/iː/) in the first syllable, but normally that would be irrelevant to the English pronunciation, as we see from comparing this word with lineal, from Latin līneālis < līnea. It's related to the English word pine, which always has /aɪ/, but lineal is likewise related to the English word line. It seems to have entered English via the French word pinéal(e), but the French word doesn't have the sound /aɪ/.

    • The pronunciation with stress on the second syllable is irregular because the e in this word was short in Latin, although at least one other word with established stress is technically "irregular" in the same way. Idea is standardly stressed on the second syllable despite coming from Latin idea with short e. However, idea does have stress on the e in Greek (ἰδέα), and this seems to have passed into the Romance languages (e.g. Spanish "idea", which has stress on the "e", and French idée). There doesn't seem to be any similar excuse for stressing the e in pīneus/pīnea/pīneum; the descendants in Romance that I have found all indicate stress on the first syllable, like Spanish piña and Italian pigna.

    The etymologically related aphid genus name Pineus has an irregular pronunciation "⁠\ ˈpīnēəs \⁠" according to Merriam-Webster.

  • piceous (optionally): The OED says that this uncommon adjective meaning "pitchy" or "resembling pitch" (and more specifically, used in entomology to mean "of a pitchy or brownish-black colour") has various pronunciations, including a regular and an irregular variant, and apparently for British English a third variant with irregular penult stress: "Brit. /pʌɪˈsiːəs/, /ˈpʌɪsɪəs/, /ˈpɪsɪəs/, U.S. /ˈpaɪsiəs/, /ˈpɪsiəs/". Collins also mentions the two variants with /aɪ/ and /ɪ/ (it doesn't mention the penult-stressed variant). Merriam-Webster and the AHD only give the irregular pronunciation with a stressed "long i" sound in the first syllable.

  • ileus (optionally): The Oxford English Dictionary (OED) gives the pronunciation as "Brit. /⁠ˈɪlɪəs/, /ˈʌɪlɪəs/, U.S. /ˈɪliəs/". It gives the etymology as "< Latin īleus, īleos, < Greek ἰλεός or εἰλεός".

  • Words starting with pile-: The noun pileus (which the OED says is from Latin "pīleus, pilleus") is listed with only a single pronunciation, with an irregular "long i" sound, in the OED ("/ˈpʌɪlɪəs/"), MW ("\ ˈpī-lē-əs \"), AHD ("pī´lē-əs"), and Oxford Dictionaries. But Collins ("(ˈpaɪlɪəs, ˈpɪl-)") and Dictionary.com/Random House mention a regular pronunciation with "short i" in addition to the irregular pronunciation with "long i". Webster's 1913 puts the stress mark after the i, unlike in e.g. "cilia", which seems to indicate a pronunciation with a long vowel.

    Irregular pronunciations with "long i" also seem to be possible for the related words pileate(d) (the OED gives "/ˈpʌɪlɪeɪtɪd/, /ˈpɪlɪeɪtɪd/"), pileole (the OED gives "/ˈpʌɪlɪəʊl/"; the non-anglicized form pileolus, which may be more common, is stressed on the antepenult according to MW ("\ pīˈlēələs \") in which case the "long i" in the first syllable is not irregular) and pileolated as in "pileolated warbler" (MW says "\ ˈpīlēəˌlātə̇d- \").

    The adjective pileous, despite its similar appearance, doesn't seem to be etymologically related to pileus (the OED says pileous is "< classical Latin pilus hair (see pilus n.) + -eous suffix, after osseous adj., carneous adj., etc. Compare earlier pilous adj."), but pileous also has an irregular pronunciation with "long i", which is the only pronunciation given by the OED ("/ˈpʌɪlɪəs/"). MW and Collins mention a regular alternative, however. As with pineal, I don't see any reason for the existence of the irregular pronunciation; perhaps it arose due to the influence of the regular "long i" sound in the related words pilus n. and pilous/pilose adj.

  • Dionysius (occasionally): The name "Dionysius" apparently is sometimes pronounced with a "long i" sound, I would guess in part due to influence from the similar name "Dionysus".

    The AHD gives the pronunciation as "dī-ə-nĭsh´ē-əs, -nĭsh´əs, -nī´sē-əs"; MW gives "\ ˌdī-ə-ˈni-shē-əs , -sē-əs , -shəs ; -ˈnī-sē-əs \"; Collins gives only "ˌdaɪəˈnɪsɪəs" for British English but "ˌdaɪəˈnɪʃəs ; dīˌənishˈəs; ˌdaɪəˈnɪsiəs ; dīˌənisˈēəs; ˌdaɪəˈnaɪsiəs ; dīˌənīˈsēəs" for American English.

Note: Greek names ending in "-eus" may be pronounced either with a hiatus or with the "long u" sound. When "eu" is pronounced as "long u", the shortening rule does not apply (apparently), so the name "Phineus" for example has a short i when it is pronounced with three syllables ("FIN-e-us") or a long i when it is pronounced with two syllables ("FINE-yoos"). This isn't really an exception to the rule, but it's close enough that I thought I'd mention it.

Violations of the rule in fictional names

It's probably not surprising that the rule can be violated in names that occur in works of fiction. I doubt I could provide a comprehensive list, but one example that I know of is that the video game series The Legend of Zelda has some proper nouns and related terms pronounced with [aɪ] before "_CiV": a goddess named Hylia /haɪliə/ and an adjective of ethnicity Hylian /haɪliən/, both related to a fictional land with the name of Hyrule /ˈhaɪruːl/. (For information on the pronunciation of these terms, I looked at the following GameFAQs threads: "Now I have to change how I pronounce Hylia/Hylian.", "Pronunciation of Hyrule, Hylia and other according to japanese pronunciation".)

(Possible) Exceptions with a "long e" sound (IPA /iː/)

There are some words and proper nouns spelled with "iCiV" that are pronounced by some people with /iː/ (a "long e" sound, not a "long i" sound) instead of /ɪ/, such as Parisian, artemisia, aphrodisiac, Tunisia, bulimia, prestigious, Phoenicia(n), Cecilia.

  • I haven’t found any dictionary that mentions it, but I’ve noticed that some people also seem to use /iː/ in words ending in -philia such as p(a)edophilia. Here are a few examples from videos on YouTube that I found using Youglish: 1) Matthew Vines: "God and the Gay Christian" | Talks at Google, 2) Timothy Snyder, Monday, March 9, 2015, 3) Gregory Orfalea, "Journey to the Sun" | Talks at Google, 4) Critical Thinking: Theatre in another language.

I'm not sure how the use of /iː/ in the above words can be explained. In at least some cases, it seems plausible that it might be due to French influence, as in Parisian, Tunisia, bulimia, and prestigious (which is related to the word prestige, which is always pronounced with /iː/).

In all the examples that I have found, a regular pronunciation with /ɪ/ exists alongside the one with /iː/, although it may not be as common.

I found out about this phenomenon while researching my answer to the following question: Why are "suffice" and "sufficient" pronounced so differently? You may be interested in reading it and the other answers to that question.

herisson says in their answer that this is different from trisyllabic shortening; I beg to differ. According to "The Sound Pattern of English" (by Noam Chomsky and Morris Halle), it is a case of trisyllabic shortening. They explain it in depth in the book I linked. They say that the suffix "ion" must have been disyllabic when trisyllabic shortening applied to words like "decision" and "division" (and some others like them that I can't recall at the moment). If the suffix is disyllabic, then two unstressed syllables follow the stressed syllable, so the vowel in the stressed syllable is likely to get shortened.

Decide [dɪˈsaɪd]
Decision [dɪˈsaɪ.zɪ.ən] (the transcription of the suffix "ion" is only a supposition, since I don't know what its pronunciation would actually have been).

I wrote [zɪən] because the current pronunciation of "decision" has [ʒ], and I'm pretty sure that comes from the fusion of [z] with the following vowel, which is in fact a glide.

Now we have [dɪˈsaɪ.zɪ.ən], and the diphthong [aɪ] gets shortened to [ɪ] (as you probably know), so the whole pronunciation became [dɪˈsɪ.zɪ.ən] and then, after the fusion of [z] and the glide, [dɪˈsɪ.ʒən].
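The same derivation can be sketched as two ordered string rewrites. This is just an illustration of the order of the steps: the function name `derive` is made up, the starting form is my supposed transcription from above, and plain string replacement is a stand-in for real phonological rules.

```python
def derive(form):
    """Apply the two assumed sound changes in order and return every stage."""
    stages = [form]
    # 1. Trisyllabic shortening: the diphthong [aɪ] becomes [ɪ].
    form = form.replace("aɪ", "ɪ")
    stages.append(form)
    # 2. Fusion: [z] plus the following glide-like [ɪ] becomes [ʒ].
    form = form.replace("zɪ.ən", "ʒən")
    stages.append(form)
    return stages

print(" > ".join(derive("dɪˈsaɪ.zɪ.ən")))
# prints: dɪˈsaɪ.zɪ.ən > dɪˈsɪ.zɪ.ən > dɪˈsɪ.ʒən
```

The order matters for the argument: shortening has to apply while the suffix still counts as two syllables, so it comes before the fusion step.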

