Using "non-" to prefix a two-word phrase

Does "non-" prefixed to a two-word phrase permit another hyphen before the second word?

If I want to refer to an entity which is defined as the negation of another entity by attaching "non-", it seems strange to attach the "non-" only to the first word when the second one is really the word naming the entity. For example,

non-control freak

looks like a person obsessed with not being in control rather than one who is simply not obsessed with control.

non-control-freak

actually looks better because the "freak" is attached to the "non-" as much as it is to "control-", without the space implying the presence of a phrase break.

This question and its answers make it sound like totally detaching the "non" could be an option, but if I were following the convention of hyphenating "non-", could I add a hyphen after the next word to resolve this problem?


Solution 1:

The standard, but not very satisfying, answer is that you use an EN DASH (codepoint U+2013) as a higher-order HYPHEN (codepoint U+2010). Wikipedia says:

In English, the en dash is usually used instead of a hyphen in compound (phrasal) attributives in which one or both elements is itself a compound, especially when the compound element is an open compound, meaning it is not hyphenated itself.

So for example, it would be a non–Red Sox game, because it is an open compound. Or when you have something that is already a compound, you need a non–child-molester for someone who is not a child-molester, and a non-child–molester for someone who molests non-children. Or if you have a flower that is colored red-violet, then it is a red-violet–colored flower.
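Because a hyphen and an en dash look nearly identical in many fonts, it can help to see the two readings as code points. The following Python sketch is purely illustrative: the helper name `dashes` is made up for this example, and it assumes the ordinary keyboard hyphen (U+002D HYPHEN-MINUS) stands in for the hyphen, as it does in most typed text.

```python
HYPHEN_MINUS = "\u002D"  # the ordinary keyboard hyphen (assumed here)
EN_DASH = "\u2013"

# "not a child-molester": the en dash joins "non" to the whole compound
not_a_molester = "non" + EN_DASH + "child" + HYPHEN_MINUS + "molester"
# "molester of non-children": the en dash joins "non-child" to "molester"
molests_non_children = "non" + HYPHEN_MINUS + "child" + EN_DASH + "molester"

def dashes(text):
    """Return the code point of each hyphen or en dash in text, in order."""
    return [f"U+{ord(ch):04X}" for ch in text if ch in (HYPHEN_MINUS, EN_DASH)]

print(dashes(not_a_molester))        # ['U+2013', 'U+002D']
print(dashes(molests_non_children))  # ['U+002D', 'U+2013']
```

The two strings differ only in which position holds the en dash, which is exactly the distinction a reader has to notice on the page.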

However, opinions and recommendations — and perhaps expectations and familiarity — do vary regarding what to do in these situations. An example is how in the draft manuscript of my last book, we originally said (with regard to pattern matching with regular expressions) that:

A \W matches a non–word character.
A \H matches a non–horizontal-whitespace character.

But in copyedit, it was decided that although correct, this was too alien for normal people to immediately apprehend. So we adopted a courageous but unambiguous convention that programmers would immediately apprehend:

A \W matches a non-(word character).
A \H matches a non-(horizontal whitespace) character.

We did it that way because we felt this style, although innovative and hardly something you will find in Strunk and White, was more likely to be clearly and immediately understood by computer programmers than carefully distinguishing en dashes from hyphens. We retained en dashes only in their two traditional and uncontroversial uses:

  • for ranges, like values in the 128–256 range or supplying 1–3 arguments;
  • in dash compounds like a Boyer–Moore search, which should of course not be hyphenated.

See also this question for more about hyphens, en dashes, and em dashes.

Note also that most North American publishers use a hyphen after non only when it precedes a capital letter, so non-British and non-European, but nonbeliever and even nonnative. British publishers are much more apt to hyphenate all non- compounds no matter the following letter, so non-believer and non-native. Just don’t hyphenate nonchalant. :)


Unicode Considerations

In Unicode, there are more dashes than you would believe. In fact, Unicode v6.1 attributes to all these code points the Dash character property, along with their general category and script properties:

U+0002D ‭ -  GC=Pd SC=Common       HYPHEN-MINUS
U+0058A ‭ ֊  GC=Pd SC=Armenian     ARMENIAN HYPHEN
U+005BE ‭ ־  GC=Pd SC=Hebrew       HEBREW PUNCTUATION MAQAF
U+01400 ‭ ᐀  GC=Pd SC=Canadian_Aboriginal CANADIAN SYLLABICS HYPHEN
U+01806 ‭ ᠆  GC=Pd SC=Mongolian    MONGOLIAN TODO SOFT HYPHEN
U+02010 ‭ ‐  GC=Pd SC=Common       HYPHEN
U+02011 ‭ ‑  GC=Pd SC=Common       NON-BREAKING HYPHEN
U+02012 ‭ ‒  GC=Pd SC=Common       FIGURE DASH
U+02013 ‭ –  GC=Pd SC=Common       EN DASH
U+02014 ‭ —  GC=Pd SC=Common       EM DASH
U+02015 ‭ ―  GC=Pd SC=Common       HORIZONTAL BAR
U+02053 ‭ ⁓  GC=Po SC=Common       SWUNG DASH
U+0207B ‭ ⁻  GC=Sm SC=Common       SUPERSCRIPT MINUS
U+0208B ‭ ₋  GC=Sm SC=Common       SUBSCRIPT MINUS
U+02212 ‭ −  GC=Sm SC=Common       MINUS SIGN
U+02E17 ‭ ⸗  GC=Pd SC=Common       DOUBLE OBLIQUE HYPHEN
U+02E1A ‭ ⸚  GC=Pd SC=Common       HYPHEN WITH DIAERESIS
U+02E3A ‭ ⸺  GC=Pd SC=Common       TWO-EM DASH
U+02E3B ‭ ⸻  GC=Pd SC=Common       THREE-EM DASH
U+0301C ‭ 〜 GC=Pd SC=Common       WAVE DASH
U+03030 ‭ 〰 GC=Pd SC=Common       WAVY DASH
U+030A0 ‭ ゠ GC=Pd SC=Common       KATAKANA-HIRAGANA DOUBLE HYPHEN
U+0FE31 ‭ ︱ GC=Pd SC=Common       PRESENTATION FORM FOR VERTICAL EM DASH
U+0FE32 ‭ ︲ GC=Pd SC=Common       PRESENTATION FORM FOR VERTICAL EN DASH
U+0FE58 ‭ ﹘ GC=Pd SC=Common       SMALL EM DASH
U+0FE63 ‭ ﹣ GC=Pd SC=Common       SMALL HYPHEN-MINUS
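A few of these entries can be spot-checked with Python's standard `unicodedata` module. This is only a sketch under stated assumptions: `unicodedata` exposes the general category and the character name but not the binary Dash property, so the sample code points are copied by hand from the list rather than derived, and the results reflect whichever Unicode version your Python build ships. The helper name `describe` is made up for this example.

```python
import unicodedata

# Code points copied by hand from the list above (not derived from the
# Dash property, which the standard library does not expose).
SAMPLE = [0x002D, 0x2010, 0x2013, 0x2014, 0x2053, 0x2212]

def describe(cp):
    """Return (general category, character name) for a code point."""
    ch = chr(cp)
    return unicodedata.category(ch), unicodedata.name(ch)

for cp in SAMPLE:
    gc, name = describe(cp)
    print(f"U+{cp:05X}  GC={gc}  {name}")
```

Looking up the general category this way shows which of the sampled characters are Dash Punctuation (Pd) and which, like the MINUS SIGN, fall under another category.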

Note that codepoints with the general category Dash Punctuation (GC=Pd) do not include U+2212, the MINUS SIGN, which has the Math Symbol general category, GC=Sm. Here are codepoints whose names include "MINUS" but which do not have the Dash character property (which is different from the Dash Punctuation general category, perversely enough):

U+000B1 ‭ ±  GC=Sm SC=Common       PLUS-MINUS SIGN
U+002D7 ‭ ˗  GC=Sk SC=Common       MODIFIER LETTER MINUS SIGN
U+00320 ‭ ◌̠  GC=Mn SC=Inherited    COMBINING MINUS SIGN BELOW
U+02052 ‭ ⁒  GC=Sm SC=Common       COMMERCIAL MINUS SIGN
U+02213 ‭ ∓  GC=Sm SC=Common       MINUS-OR-PLUS SIGN
U+02216 ‭ ∖  GC=Sm SC=Common       SET MINUS
U+02238 ‭ ∸  GC=Sm SC=Common       DOT MINUS
U+02242 ‭ ≂  GC=Sm SC=Common       MINUS TILDE
U+02296 ‭ ⊖  GC=Sm SC=Common       CIRCLED MINUS
U+0229F ‭ ⊟  GC=Sm SC=Common       SQUARED MINUS
U+02756 ‭ ❖  GC=So SC=Common       BLACK DIAMOND MINUS WHITE X
U+02796 ‭ ➖  GC=So SC=Common       HEAVY MINUS SIGN
U+0293C ‭ ⤼  GC=Sm SC=Common       TOP ARC CLOCKWISE ARROW WITH MINUS
U+02A29 ‭ ⨩  GC=Sm SC=Common       MINUS SIGN WITH COMMA ABOVE
U+02A2A ‭ ⨪  GC=Sm SC=Common       MINUS SIGN WITH DOT BELOW
U+02A2B ‭ ⨫  GC=Sm SC=Common       MINUS SIGN WITH FALLING DOTS
U+02A2C ‭ ⨬  GC=Sm SC=Common       MINUS SIGN WITH RISING DOTS
U+02A3A ‭ ⨺  GC=Sm SC=Common       MINUS SIGN IN TRIANGLE
U+02A41 ‭ ⩁  GC=Sm SC=Common       UNION WITH MINUS SIGN
U+02A6C ‭ ⩬  GC=Sm SC=Common       SIMILAR MINUS SIMILAR
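A name search like this one can be approximated in Python, with caveats. The sketch below scans every code point for names containing MINUS and tallies the general categories. Since the standard library cannot test the Dash property, its output also includes characters that do have that property (HYPHEN-MINUS and MINUS SIGN, for example), and a newer Unicode version may return more entries than are listed here. The function name `named_like` is hypothetical.

```python
import sys
import unicodedata
from collections import Counter

def named_like(word):
    """Return (code point, general category, name) for names containing word."""
    hits = []
    for cp in range(sys.maxunicode + 1):
        name = unicodedata.name(chr(cp), "")  # "" for unnamed code points
        if word in name:
            hits.append((cp, unicodedata.category(chr(cp)), name))
    return hits

minus_chars = named_like("MINUS")
for cp, gc, name in minus_chars:
    print(f"U+{cp:05X}  GC={gc}  {name}")

# Tally the hits by general category
print(Counter(gc for _, gc, _ in minus_chars))
```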

Solution 2:

The Chicago Manual of Style gives these recommendations for hyphenating compounds formed with prefixes:

Compounds formed with prefixes are normally closed, whether they are nouns, verbs, adjectives, or adverbs. A hyphen should appear, however,

  1. before a capitalized word or a numeral, such as sub-Saharan, pre-1950;
  2. before a compound term, such as non-self-sustaining, pre–Vietnam War (before an open compound, an en dash is used; see 6.80);
  3. to separate two i’s, two a’s, and other combinations of letters or syllables that might cause misreading, such as anti-intellectual, extra-alkaline, pro-life;
  4. to separate the repeated terms in a double prefix, such as sub-subentry;
  5. when a prefix or combining form stands alone, such as over- and underused, macro- and microeconomics.

In your example, control freak is an open (non-hyphenated) compound, and so would be compounded with the prefix non– as non–control freak (using an en dash, although an ordinary hyphen would be acceptable in email, or other such places where typographic niceties are overlooked).

Solution 3:

No, in written English you may not detach 'non', unless you're reporting a spoken utterance verbatim - in this case you're probably best off with no hyphens or dashes, since any hyphen or dash represents an editorial interpretation.

Yes, 'non-' (or 'non–', which as @tchrist's answer tells you is an ingenious and elegant neopunctism for resolving some ambiguities) may be attached to a hyphenated phrase to indicate its negation: 'non–interest-bearing account', for instance.

Yes, you may insert a hyphen into an unhyphenated phrase to which you have prefixed 'non-' (or, again, 'non–') in order to clarify your meaning; but this should be done only if the hyphenated phrase accurately reflects the meaning which is being negated. In your instance 'control-freak' would be correct only if used adjectivally: 'I dislike his control-freak way of handling things.'

And inserting a hyphen or en dash may not resolve the ambiguity. For instance: we all know what a child molester is - a child-molester, a person who molests a child, rather than a molester who is a child. But what about 'non-child-molester'? Is it a person who is not a child-molester, or one who molests adults? This ambiguity is resolved by the usage @tchrist describes—but this is a relatively new device which may not be correctly interpreted or even perceived by readers like me who did not grow up with it.

The best solution, to my mind, is to find another way of saying what you mean, without 'non-' (or 'non–'). Instead of 'He's a non-control(-)freak' say 'He's not a control freak'. Instead of 'I like his non-control-freak way of handling things' say 'I like his way of handling things; he's not a control freak.'