Of sweet tooths and black sheep: when does the plural of a compound turn regular?

According to many dictionaries, the plural of sweet tooth is sweet tooths, and not *sweet teeth (see e.g. here and here; the OED doesn't address the issue explicitly, but one of the examples it lists is A symphony for sweet-tooths). Judging from the examples at Lexico and Google Books, sweet tooths is mostly used in the sense of 'people who have a sweet tooth', i.e. who have a liking for sweet foods. Still, there are plenty of examples where sweet tooths is the plural of 'a liking for foods that are sweet':

So for a lighter effect and taste, especially for young children whose taste buds and sweet tooths are just being established, dilute juice with water… (source)
They were soon joined by many home-bakers, indulging their sweet tooths, honing their whisking and frosting skills, wowing their friends, colleagues and families with impressive multi-layer cakes, fluffy cupcakes and abundant sweet bakes. (source)
“I love sweet stuff. I have a real sweet tooth. Do you have a sweet tooth, Hatch?” “I have a whole mouth full of sweet tooths. Or is that sweet teeth?” (source)
Another couple with matching sweet tooths created an elaborate Candy Land theme, complete with… (source)
And there must have been creatures of such affluence that I cannot even speculate about their day-to-day outside the fact of their sweet tooths. (source)
Natives nurse their sweet tooths with Sacher Torte (a rich chocolate cake layered with marmalade) and Linzer Torte (a light yellow cake with currant jam). (source)
The younger ones started squirming around, their sweet tooths temporarily satisfied. (source)

On the other hand, the plural of black sheep seems to be—black sheep and not *black sheeps (see e.g. here; the OED again doesn't address the issue explicitly, but one of its examples is To pick out of the whole mass of English clergy, one or two, or one or two and twenty black sheep.)

Is there a systematic reason for why, in these constructions, the plural of tooth becomes regular, but the plural of sheep doesn't?

My usual sources (CGEL and ComGEL) seem to be silent on the subject, but it is certainly possible that I simply missed the relevant sections.

So far, all I have been able to find is one relevant discussion (here), which says that, in the case of sweet tooths, the reason the plural of tooth becomes regular is that the compound as a whole is both exocentric (meaning, the head of the compound doesn't match what the whole term describes) and idiomatic.

First of all, if this is correct, it would be nice to find an actual reputable source for it (like a comprehensive grammar, a monograph, or a journal article). At the very least, it would seem we need lots more examples of such behavior.¹

¹ Just as important, we also need nonexamples, such as endocentric idiomatic compounds where the plural remains irregular.

Even more urgently, however, it seems to me that the theory does not quite work: black sheep would also seem to be both exocentric and idiomatic, and yet the plural remains irregular. Let me explain.

What CGEL does say (p. 1645) is that

Compounds formed by patterns that invariably result in non-hyponymic compounds are commonly called 'exocentric' with others being, by contrast, 'endocentric'. Considerable problems arise, however, in giving rigorous definitions for these categories, and we shall not make use of this taxonomy in the present discussion.

As for hyponymy, they explain it like this (same page):

A high proportion of compounds, especially compound nouns, are hyponymic: the compound as a whole is a hyponym of the base that functions as head. Hyponymy is a semantic relation that can in the first instance be most easily explained by reference to nouns. We say that noun X is a hyponym of noun Y when X denotes a subset of what is denoted by Y. This relation may hold between morphologically unrelated words. For example, tulip, daffodil, and rose are hyponyms of flower, while alsatian, poodle, and cocker-spaniel are hyponyms of dog: a tulip is a kind of flower, an alsatian is a kind of dog, and so on. With compounds, the relation of hyponymy is reflected in the morphological structure: wall-flower consists of wall as dependent and flower as head, and denotes (in its literal sense) a kind of flower; bulldog has bull as dependent, dog as head, and denotes a kind of dog. We can generalise from nouns to compounds of all categories by talking in terms of entailment rather than subsets. This is a wall-flower entails This is a flower, but This is a flower does not entail This is a wall-flower. Similarly for such an adjective as paper-thin: This is paper-thin entails This is thin, but not conversely. And for a verb like hand-wash: They hand-washed it entails They washed it but again the reverse entailment does not hold. For a compound to be hyponymic can be regarded as the default case: it is when the compound is not a hyponym of the head that we need to consider why this is so. There may be a variety of reasons why a compound fails the entailment test for hyponymy. Consider:

[2] hotshot, glow-worm, cholesterol-free, sunset, breath-taking, redskin

The informal term hotshot does not denote a kind of shot, but a person who is skilled or successful in some field: this illustrates the common case where the non-hyponymic property of a compound is simply a matter of lexicalisation, an idiosyncratic feature of the particular compound in question. Glow-worm, which denotes a kind of beetle, is also lexicalised, but in this case there has also been a historical change in the meaning of worm, which earlier had a broader denotation than is now current, being applicable to any animal that crawled, such as snakes, legless lizards, caterpillars, and long-bodied insects like glow-worms. Cholesterol-free is not lexicalised but it is non-hyponymic because free in the sense it has here cannot stand alone as a phrase but requires a complement. The sense of free in It is free of/from cholesterol is not the same as in It is free. Sunset involves a particular sense of set which occurs only as a verb (the corresponding noun being the derivative setting), so again It was a beautiful sunset doesn't entail It was a beautiful set. Similarly with the adjective breath-taking: there is no adjective taking (except with the specialised sense of "captivating"), and hence His arrogance was breathtaking does not entail His arrogance was taking. Redskin "Red Indian" is another example of lexicalisation, but it illustrates a pattern of compounding which necessarily results in a non-hyponymic form. It belongs to the pattern (discussed in §4.2.1 below) where the literal meaning gives a property of the entity the compound denotes: a redskin is not a kind of skin but a kind of person, the kind that has (or is perceived as having) red skin.

Thus the question becomes, is black sheep hyponymic or not? In the comments, users herisson and GEdgar suggest that it is. This would mean that He is a black sheep of the family entails He is a sheep (possibly He is a sheep of the family). It seems to me that this is clearly not the case, and that black sheep is more like² redskin in the passage from CGEL.

² I apologize for using that offensive term; I use it only to make a grammatical point and would not use it if I could find a different term which is explained to be non-hyponymic by an authority as reliable as CGEL.

And if it turns out that, despite everything I just said, black sheep is hyponymic/endocentric after all, I would still very much like to see a reputable source for the theory that exocentric and idiomatic noun compounds have regular plurals even if their heads do not. At the very least, I would like to see more examples (and, just as important, nonexamples) supporting that theory.

Appendix: what is a compound?

This is motivated by Araucaria's question in the comments. I did think about whether compound is the right term. CGEL gives no example of a morphological compound that isn't at least hyphenated. On the other hand, ComGEL's examples of compound nouns include assistant director, cleaning woman, washing machine, book review, office management, crime reporter, spending money, diving board, air rifle, steam engine, oil well, food poisoning, hay fever, piano keys, oak tree, toy factory, cough drops, and others. Their definition is (p. 1567):

A compound is a lexical unit consisting of more than one base and functioning both grammatically and semantically as a single word.

They add that (pp. 1569-1570)

If prosody reflects the semantic structure, so too does orthography. The semantic unity of a compound is reflected in an orthographic unity:

a black bird but a blackbird

Spelling conventions are however less dependable than prosody. Practice varies in words and some compounds may even occur in three different forms, 'solid', hyphenated, and 'open'; eg:

a flower pot   a flower-pot   a flowerpot

But in general there is a progression from open to solid as a given compound becomes established, and hence widely recognized and accepted as a 'permanent' lexical item.

In AmE, hyphenation is less common than in BrE, and instead we find the items open or solid (more usually the latter) where BrE may use a hyphen:

language retarded (esp AmE), language-retarded (esp BrE),
psychosomatic (esp AmE), psycho-somatic (esp BrE).

and

It may be useful to conceive of 'partial' compounding to account for the formal and semantic gradience between phrase and compound:
We need some furniture for the offices.            [1]
We need some office furniture.                         [2]
Office furniture is getting more expensive.      [3]
In [2] we have an expression appropriate to phrase structure with no necessary lexicalization of 'office furniture' but merely referring to furniture that will be used in the office(s). In [3] however, the generic statement makes the beginning of lexicalization a more plausible interpretation: it is implied that there is furniture of a kind designated specifically for office use. 'Partial' compounding may be said to have taken place, though the stress pattern and spelling still lean in the phrasal direction.


"Black sheep" is always endocentric

"Black sheep" is endocentric. CGEL's decision to avoid using that term because of difficulty in defining it is probably well justified, but even without a rigorous definition, I think it is possible to give a more accurate definition of the concept than the one given in the linked blog post. And I think it is necessary to do so to address your question because I don't know of any claim that non-hyponymic phrases or compounds in general pluralize regularly: the claim that I am familiar with is specifically about exocentric compounds (I linked to some relevant Language Log posts in the answer I wrote to Bigfoots or Bigfeet?).

"Exocentric" refers to compounds that have an implied head rather than being headed by any of the constituent words

The terms "endocentric" and "exocentric" are based on the concept of a "head". I don't have a rigorous understanding of this concept, but my informal description would be that the head of a linguistic structure is the part from which it inherits its syntactic category and basic meaning. E.g. the noun "hair" can be combined with the adjective "black" to form the noun (phrase) "black hair". Because "black hair" is headed by the noun "hair", it is used in the same kinds of grammatical contexts and has a similar meaning to "hair". There are various complications that can arise; e.g. disputes about whether determiners are heads. The term "head" is relative to a given linguistic construct; the construct "child with black hair" is headed by the noun "child", but contains the construct "black hair" which is headed by the noun "hair".

An exocentric compound is one that is analyzed as lacking a head, or having an implicit head that is not any of the explicit constituents of the compound. In reference to a person, "sweet-tooth/sweet tooth" can be interpreted as an exocentric compound standing for something like "someone who has a sweet tooth". The implied head of the construct would be some invisible element that contributes the meaning "someone who has...". "Tooth" would be the head of the embedded construct "sweet tooth" that refers to what the person has, but not the head of the construct that refers to the person.

The types of compounds referred to as "exocentric" are things like Bahuvrihi compounds: the linked Wikipedia article gives a list of examples including lowlife and redhead. Redskin would fall into this category because the word "skin" refers to an attribute of the referent rather than to the referent itself: it stands for something like "a person with red skin", as covered by the last paragraph of the CGEL quote.

It is disputed whether compound words in this category are actually exocentric. Laurie Bauer in "English Exocentric Compounds" suggests that there is no implied head and instead the term refers to a person through synecdoche (p. 7).

Exocentricity is not the same thing as non-hyponymicity

When present, the head of a grammatical construct cannot always be used by itself to accurately describe the referent of the overall construct. For example, the construct "a fake passport" is headed by the noun "passport", even though it would be misleading to describe a fake passport by just saying that it "is a passport". "Fake" is what is called a "non-subsective adjective", an adjective that is regularly used to create non-hyponymic constructs. These constructs can have irregular plural forms; e.g. the plural of "a fake tooth" is "fake teeth".

"Black sheep" in a sentence like "He is a black sheep of the family" doesn't refer to a subset of "sheep" in the usual sense because it has come to be used as a metaphor. (Note the careful wording of CGEL's "formed by patterns that invariably result in non-hyponymic compounds": "black sheep" is not formed in such a way, since it also has a hyponymic literal use.) But whether used literally or metaphorically, "black sheep" is still headed by the noun "sheep", so "black sheep" is an endocentric construct.

"Sweet teeth" exists as an endocentric compound

Separately, I wanted to note that "sweet teeth" is not unacceptable in all contexts. Of course, it could be used compositionally if you had to describe literal teeth that literally tasted sweet. But it is also used metaphorically by some speakers in situations describing something people have rather than something people are. Here's one example from Google Books:

Every year there are more little feet padding about, and more cheeky faces craving goodies to satisfy their sweet teeth.

("Introduction: Sweets and Treats", All Things Sweet)