Identifying compound words in Modern English

Words like SNOWMAN are obvious compound words in Modern English, as both words that make up the compound exist as words in Modern English.

However, a word like SHEPHERD isn't made up of 2 actual English words (SHEP isn't a word in Modern English), but it was originally a compound word in Old English: the word originates from Middle English schepherde, from Old English sċēaphierde, a compound of sċēap (“sheep”) and hierde (“herdsman”). Other words of this kind were compound words in another language, such as Latin. In these instances, would these words (words that seem like compound words but aren't made up of 2 actual English words) be defined as "compound words" in Modern English?

Secondly, is https://en.wiktionary.org/wiki/Category:English_compound_words a reliable/accurate source for identifying whether a specific word is a compound word?

EDITED FOR CLARIFICATION: I'm sorry for the vagueness of the question; I'm asking as a layman who's confused about the conditions for a word to be defined as a compound word. For clarification, I'm referring only to closed compound words (just a single word), and more specifically to words as they are spelled/written out. My interpretation of compound words matches what Edwin Ashworth has quoted, 'Compounding derives a new word by joining two morphemes that would each usually be free morphemes.', and I define a free morpheme as 'a morpheme that can stand alone as its own word'. My issue with SHEP in SHEPHERD is that SHEP isn't a word in the dictionary, which is why I am uncertain as to whether SHEPHERD meets the requirement to be a compound word.

My questions are: (Q1): is SHEPHERD a compound word?

(Q2): does a (closed) compound word have to be made up of 2 or more English words that exist in the dictionary (SNOW+MAN = SNOWMAN)? (SHEP is typically not a word in the dictionary, so I do not think of it as a free morpheme.)

(Q3): if the answer to (Q1) is yes, did SHEPHERD inherit its status as a compound word due to it evolving from the Old English "sċēaphierde" which is a compound word formed by sċēap and hierde?


Solution 1:

At least one highly regarded authoritative source, CGEL (The Cambridge Grammar of the English Language), would say that etymology is not a reliable guide to whether a word should be considered a compound. Here is the relevant segment (p. 1627); the relevant paragraph is the second one, but I include the first one for context.

Morphological analysability vs etymology

Words are most clearly analysable into constituent parts when the latter occur with the same or similar meaning elsewhere, as with bed·room, un·kind, soft·ness, etc. But this is by no means a necessary condition for analysability. There is no difficulty in recognising straw·berry and draw·ing·room as compounds even though the meaning of the whole is not predictable from the meanings of the component bases: it is enough that the second base is formally and semantically identifiable with the berry and room that occur as separate words or in semantically more transparent compounds like black·berry and bed·room. Similarly with derivative bases like dur·able and the others listed in [2ii], even though affixes characteristically have less specific meanings than bases. There are even cases where neither component contributes a clearly separable component of meaning to the whole. The meaning of black·mail, for example, is not predictable from the meanings of black and mail as independent words, but black remains easily recognisable as a separate morphological unit because it occurs in a considerable number of compounds and phrases where it likewise does not have its literal meaning: blackleg, blacklist, blacksmith, black magic, black mark, black market, black spot, and so on.

The case of blackmail is to be distinguished from that of blackguard. Blackmail is morphologically analysable but semantically opaque as a result of historical change. (The original meaning of the mail component was "coin, rent" and with black having the meaning "illicit" still seen in black market: the compound was interpretable as "illicit money".) But with blackguard historical change has resulted in the loss of /k/ from black, and /blæɡɑːd/ is now a simple base, not a compound: the first syllable is neither phonologically nor semantically identifiable with /blæk/. The original base black is retained in the spelling, but this can be seen as a reflection of the historical source rather than as a justification for treating blackguard as a compound. Compare, similarly, /kʌbəd/ cupboard, /ˈbrɛkfəst/ breakfast, and so on. With husband even the orthography gives no indication of the original compounding of house and bonda "householder", a word which has now vanished from the language. Any analysis of such words as blackguard, cupboard, breakfast, husband, and the like belongs therefore to the field of etymology, the study of the historical source of words, not to the field of morphology, the study of the grammatical structure of words. There is nothing in the present-day language system to motivate an analysis of such words into smaller morphological units.

Q1. If we adopt CGEL's criteria, it would seem that shepherd would not be considered as having a compound base.

Q2. Setting aside compound bases for a moment, let's first note that, in general, the base need not be a separate word in a dictionary. A base that can stand on its own is called free; otherwise it's bound (CGEL, p. 1625). Some examples of English words with bound bases are dur·able, dur·ation, aggress·ive, pre·empt, and dis·perse.

Getting back to compounds, there doesn't seem to be any rule of English morphology that would prohibit compounds consisting of more than one bound base. They are likely to be quite rare, but I would suggest regi·cide as an example.
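To make the contrast concrete, here is a minimal sketch of the purely spelling-based "dictionary test" that Q2 describes: it checks whether a written word can be split into two parts that are both listed words. The word list `WORDS` and the function name `free_base_splits` are hypothetical and exist only for illustration; a real check would need a full dictionary. This is not CGEL's method, which relies on phonological and semantic identifiability rather than spelling.

```python
# Hypothetical, tiny word list standing in for a real dictionary.
WORDS = {"snow", "man", "bed", "room", "black", "mail",
         "straw", "berry", "cup", "board", "sheep", "herd"}


def free_base_splits(word, lexicon=WORDS):
    """Return every split of `word` into two parts that are both in `lexicon`.

    This is a naive orthographic test: it looks only at spelling and knows
    nothing about pronunciation, meaning, or bound bases.
    """
    word = word.lower()
    return [
        (word[:i], word[i:])
        for i in range(1, len(word))
        if word[:i] in lexicon and word[i:] in lexicon
    ]


if __name__ == "__main__":
    for w in ["snowman", "shepherd", "cupboard", "regicide"]:
        print(w, free_base_splits(w))
```

Under these assumptions the test accepts SNOWMAN and rejects SHEPHERD, matching the intuition in the question. It also accepts CUPBOARD, which the CGEL passage treats as a simple base, and rejects REGICIDE, which is suggested above as a compound of bound bases. That mismatch is the reason a dictionary lookup on the spelling cannot settle the question by itself.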