In 'The hat is red', how do we know that 'is red' is a constituent?

As far as I understand, in the sentence

[1] The hat i̲s̲ ̲r̲e̲d̲.

the underlined part, is red, is a syntactical constituent ('a group of words that functions as a single unit within a hierarchical structure'). I'm interested in the point of view of phrase structure grammar.

So far, I can justify the claim that it is a constituent only by appeal to authority. For example, on p. 50 of CGEL, it is stated that in the sentence he is ill, the sequence is ill is a verb phrase (VP) and a constituent, as is evident from the following tree diagram:

[Figure: constituent structure of 'he is ill', from CGEL, p. 50]

And surely, whatever reasons CGEL has to favor its analysis of he is ill, completely analogous arguments will show that is red is a VP and a constituent in [1].

The trouble is, it is not entirely clear what those arguments are.

The authority of grammars and textbooks must ultimately come from linguistic evidence, which, in turn, is collected one specific example at a time. In particular, I was hoping that it might be possible to demonstrate that is red is a constituent by applying some sort of constituenthood test. An example of such a test is the pro-form substitution test. A word sequence 'passes that test' (and is thus a constituent) if we can find a sentence, acceptable to most native speakers, in which that word sequence is replaced by a pro-form.

The pro-form substitution test can be used to show that large classes of verb phrases are constituents. For example, consider

[2] John h̲i̲t̲ ̲t̲h̲e̲ ̲b̲a̲l̲l̲.

We suspect that hit the ball is a constituent. To demonstrate this, note that the following is a grammatically acceptable sentence:

[3] Jack h̲i̲t̲ ̲t̲h̲e̲ ̲b̲a̲l̲l̲ and John d̲i̲d̲, too.

We see that in the second conjunct, hit the ball is replaced by the pro-form did; hit the ball is therefore a constituent.

However, I haven't been able to come up with a definitive constituenthood test that is red clearly passes. As far as I understand, testing for constituenthood of verb phrases (VPs) is indeed trickier than in the case of other kinds of phrases. Apart from pro-form substitution, there are other kinds of reliable tests that can often be used for determining whether some word sequence is a VP. One, for example, is (what McCawley calls) V'-deletion. Unfortunately, thus far, I am unable to find a way to use that (or any other) reliable test in the case of is red.

True, is red can be coordinated with something that is definitely a VP, as in

[4] The hat is red and weighs 151 g.

However, this is not enough to show that is red is a VP and a constituent, because (as I'm told in many sources, including Wikipedia) coordination is not a reliable test for constituenthood.

Could someone provide a reliable, clear test of constituenthood that is red passes? Or is the argument for its constituenthood more indirect? If the latter, could you provide an outline of the argument? (If possible, I'd like something more concrete than this sentence, found in CGEL (p. 21): 'The full support for a decision in grammatical description consists of confirmation from hundreds of mutually supportive pieces of evidence of many kinds'.)


I did mention in my question that there is one piece of evidence that is red is a constituent: it can be coordinated with something that is definitely a VP, as in

[1] The hat is red and weighs 151 g.

I also said that this is weak evidence because it is known that one can sometimes coordinate non-constituents.

However, I now think that if one supplements [1] with additional considerations, as I will do below, this will turn it into a much stronger argument, possibly strong enough to be acceptable as the answer to my question.

Please tell me what you think by upvoting (or downvoting) and by commenting. (But if you do downvote, please explain why.) Whether I accept this answer or not will depend on the feedback I receive (unless, of course, a clearly better answer appears in the meantime).

The required additional considerations are outlined in CGEL (pp. 1348-1350): 'Coordination as evidence for a VP constituent'. Basically, we will ask 'what else could is red in [1] be but a constituent?', consider all known alternatives in turn, and conclude that the analysis where is red is a constituent is better than all the alternatives.

The principle

What CGEL says is that we have the following principle:

[35] In general, if a sequence X can be coordinated with a sequence Y to form a coordination X and Y, then X and Y are constituents,

but adds:

Coordination clearly does not provide a simple and absolute criterion for constituent structure: the qualification 'in general' in [35] is indispensable. It nevertheless remains a useful criterion: if a sequence X can be coordinated, then the simplest account will be one where it is a constituent entering into basic coordination, and we will adopt some other, more complex, analysis only if there are independent reasons for doing so.

Thus we will say that X is a constituent if that is the simplest analysis that accounts for all the facts. That is usually the most we can ask for in any analysis, linguistic or otherwise, since there is no upper bound on how complex an analysis one can invent.

CGEL's discussion uses the sentence Sue found the key as its test case. I will reproduce what they say about it, and then, at each stage, discuss how what was said relates to The hat is red.

How does coordination support the analysis of a clause like Sue found the key into two immediate constituents, as in [36i], rather than three, as in [ii]?

[36] i  Sue | found the key.      [NP + VP: the VP analysis]
     ii Sue | found | the key.    [NP + V + NP: the 'no VP' analysis]

Principle [35] supports the VP analysis

The sequence found the key can be readily coordinated:

[37] Sue found the key and unlocked the door.

Only analysis [36i] is therefore consistent with principle [35]; other things being equal, it is to be preferred over [36ii] because it allows us to subsume [37] under basic coordination, so that it requires no special treatment. Principle [35], however, is qualified, not absolute, so we need to consider the matter further.

So far, word for word, the discussion applies to The hat is red: does the sentence have two immediate constituents (The hat | is red) or three (The hat | is | red)?
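To make the two candidate structures concrete, here is a minimal Python sketch. The nested-tuple encoding, the category labels (Clause, NP, VP, V, AdjP, D, N) and the helper names are my own assumptions for illustration, not CGEL's notation. The sketch lists the word sequences that each analysis treats as constituents; it only shows what each analysis claims and cannot by itself decide which one is right.

```python
# A tree is either a word (str) or a tuple: (label, child, child, ...).
# The labels below are illustrative assumptions, not CGEL's exact notation.

# Two immediate constituents: The hat | is red
VP_ANALYSIS = ("Clause",
               ("NP", ("D", "the"), ("N", "hat")),
               ("VP", ("V", "is"), ("AdjP", "red")))

# Three immediate constituents: The hat | is | red
NO_VP_ANALYSIS = ("Clause",
                  ("NP", ("D", "the"), ("N", "hat")),
                  ("V", "is"),
                  ("AdjP", "red"))


def words(tree):
    """Return the words dominated by a node, in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    result = []
    for child in tree[1:]:
        result.extend(words(child))
    return result


def constituents(tree):
    """Return the set of word sequences that correspond to some node."""
    if isinstance(tree, str):
        return {tree}
    found = {" ".join(words(tree))}
    for child in tree[1:]:
        found |= constituents(child)
    return found


def is_constituent(sequence, tree):
    """True if the word sequence is exactly the yield of some node."""
    return sequence in constituents(tree)


print(is_constituent("is red", VP_ANALYSIS))     # True
print(is_constituent("is red", NO_VP_ANALYSIS))  # False
```

Matching on strings is enough for this four-word sentence; a sentence that repeats a word sequence would need positions (spans) rather than strings.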

The additional considerations

VP-coordination vs clause-coordination with ellipsis of the subject

An alternative treatment of the coordination here, one that is consistent with [36ii], is to say that the coordinates are clauses, the second having ellipsis of the subject. The first coordinate will then be not found the key but Sue found the key, and the second will be __ unlocked the door. This accounts for the equivalence between [37] and Sue found the key and she unlocked the door. But we saw in §1.3.1 that there are many cases where no such equivalence obtains, as in:

[38] i  No one treats me like that and gets away with it.            [VP-coordination]
     ii No one treats me like that and no one gets away with it.    [clause-coordination]

An elliptical clause analysis doesn't provide a satisfactory account of coordination like [38i].

In our case, the challenge is to explain why The hat is red and weighs 151 g must be an instance where we coordinate the VP is red with the VP weighs 151 g (so-called 'VP-coordination'). Why couldn't it instead be an instance of 'clause-coordination', with an ellipted subject in the second coordinate? In other words, why can't we say that The hat is red and weighs 151 g is completely equivalent to The hat is red and [it]/[the hat] weighs 151 g? It seems that either reading would work equally well.

CGEL points out that in some settings, the two readings do not work equally well. Consider No hat is red and weighs 2 lbs, and note that it does not mean the same thing as No hat is red and no hat weighs 2 lbs. The former says only that there are no hats that are both red and weigh 2 lbs, whereas the latter says that there are no red hats, whatever their weight, and also that there are no hats that weigh 2 lbs, whatever their color. Since the elliptical clause analysis gives the wrong meaning here, we conclude that The hat is red and weighs 151 g is better analyzed as a VP-coordination than as a clausal coordination.
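The difference between the two readings can be checked against a toy model. The sketch below is only an illustration: the sample hats, the `Hat` class and the function names are invented for the purpose, and weights are given in pounds to match the example. It evaluates the VP-coordination reading and the clause-coordination reading of the No hat sentences over the same small set of hats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hat:
    color: str
    weight_lbs: float


def is_red(hat):
    return hat.color == "red"


def weighs_2_lbs(hat):
    return hat.weight_lbs == 2


def no_hat(hats, predicate):
    """'No hat PRED': true if the predicate holds of none of the hats."""
    return not any(predicate(h) for h in hats)


def vp_coordination(hats):
    """'No hat [is red and weighs 2 lbs]': one subject, one coordinated predicate."""
    return no_hat(hats, lambda h: is_red(h) and weighs_2_lbs(h))


def clause_coordination(hats):
    """'[No hat is red] and [no hat weighs 2 lbs]': two separate clauses."""
    return no_hat(hats, is_red) and no_hat(hats, weighs_2_lbs)


# Hypothetical sample: a light red hat and a heavier blue hat.
hats = [Hat("red", 1), Hat("blue", 2)]

print(vp_coordination(hats))      # True: no single hat is both red and 2 lbs
print(clause_coordination(hats))  # False: there is a red hat
```

With a definite subject such as the hat, both readings reduce to the same conjunction of two properties of one hat, which is why the difference only becomes visible with a subject like no hat.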

Basic vs right nonce-constituent coordination

Another alternative consistent with [36ii] would be to say that the underlined sequences in [37] are merely nonce-constituents, constituents in the coordination but not elsewhere. This is to treat [37] like our earlier example:

[39] I gave $̲1̲0̲ ̲t̲o̲ ̲K̲i̲m̲ and $̲5̲ ̲t̲o̲ ̲P̲a̲t̲. (=[i6ii])

There are, however, two important differences between [37] and [39]. In the first place, the nonce-constituents in [39] have to be parallel in structure (as noted in §4.3), whereas those in [37] do not - compare, for example, Sue found the key and departed. Secondly, the reason why $10 to Kim is not a normal constituent is that there is no direct syntactic relation between the parts, $10 and to Kim: they are, rather, separately dependents (complements) of give. Found the key in [37] is quite different: here there is a syntactic relation between the parts, found being head and the key dependent. The nonce-constituent analysis is a more complex type of construction than basic coordination, applying under restricted conditions (the requirement of parallelism) and justified by strong independent arguments against recognising the coordinates as normal constituents: in the case of [37] we have no reason to prefer the more complex analysis to the one that follows the general principle given in [35].

In our case, there is likewise no requirement of parallelism (e.g. The hat is red and reeks), and there is likewise a syntactic relation between is and red: is is the copula, and red is its predicative complement. Thus, like CGEL, we have no reason to prefer the more complex nonce-constituent analysis to the one that follows the general principle given in [35].

Finally,

Basic vs delayed right constituent coordination

The relevance of [37] to constituent structure might be challenged on the grounds that it is also possible for coordination to group found with Sue:

[40] M̲a̲x̲ ̲l̲o̲s̲t̲ but S̲u̲e̲ ̲f̲o̲u̲n̲d̲ the key.

If coordination can group found with either the key or Sue, the argument would go, then it can't provide evidence for a constituent grouping of found with just one of them, the key. But such an argument fails to recognise that there is a major difference between the coordination of [37] and that of [40]. The latter represents a much less usual type of coordination than [37], and this instance of it is indeed of somewhat marginal acceptability because of the low weight of the key; acceptability is increased by expanding to the key to the safe but greatly diminished by reducing to it. Example [40] would characteristically have special prosody, with a clear break before the key. [These properties do not hold for all cases of delayed right constituent coordination, but even when they do not there will be independent evidence for treating the coordination as non-basic. Take, for example, [24iii], He was accused but found not guilty of stalking a woman for seven years: it is evident that in the non-coordinative He was found not guilty of stalking a woman for seven years the of phrase is a complement of guilty, not find not guilty, because it regularly occurs with guilty quite independently of the presence of find, as in He was/seemed guilty of treason.] But [37], by contrast, has no such limitations or special prosody, and can be taken to represent the most elementary type of coordination: as such, it does provide valid evidence in support of the VP analysis.

Again, what CGEL says of its example, we may say of ours. Consider the following instances of 'delayed right constituent coordination' (CGEL's preferred term for what everyone else calls right node raising):

[2] a. The hat is but the scarf is not red.
    b. The scarf only seems but the hat is red.

As in CGEL's example [40], [2a] and [2b] are of marginal acceptability, which increases if the 'weight' of the 'extracted element' is increased:

[3] a. The hat is but the scarf is not a family heirloom left to me by my grandpa.
    b. The scarf only seems but the hat is well over 100 years old.

Also, when [2a] and [2b] are pronounced, there will characteristically be a prosodic break before red.

In contrast, there are no special limitations on the 'weight' of what can be coordinated with is red at a given level of acceptability: at one extreme we may coordinate it with a single word, The hat is red and reeks, and at the other with a much longer phrase, The hat is red and makes me think of my childhood summers which I spent with my grandpa, and both are equally acceptable. Also, there is no special prosodic break in either of these.

Conclusion

We have contrasted the analysis of

[1] The hat is red and weighs 151 g.

(and similar sentences) in which is red is a constituent with three possible alternatives: 1. one where [1] coordinates clauses and the second clause has subject ellipsis; 2. one where is red and weighs 151 g are nonce-constituents; and 3. one where is is argued to be just as easily grouped with the hat as it is with red. We found that in all three cases, the analysis where is red is a constituent is simpler, more plausible, or both. This gives us solid evidence that is red is indeed a constituent.