When does whitespace matter in HTML?
The reality is somewhat complicated. There are two parts:
- What the parsing does.
- What the rendering does.
The parser actually removes very little whitespace while parsing text (as opposed to markup). It will remove an initial line feed character at the start of <textarea> and <pre> elements, and also of the invalid <listing> element, but that's about it.
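As a rough illustration of that rule (a toy model, not a real HTML parser; the function name `parsed_text` and the constant are made up for this sketch), the parser's treatment of text content could be written as:

```python
# Elements for which one line feed directly after the start tag is dropped,
# as listed in the answer above. Names here are hypothetical.
LEADING_NEWLINE_ELEMENTS = {"textarea", "pre", "listing"}

def parsed_text(tag, raw):
    """Return the text a parser would keep for `raw` placed inside <tag>."""
    if tag.lower() in LEADING_NEWLINE_ELEMENTS and raw.startswith("\n"):
        return raw[1:]  # only the first line feed is removed
    return raw  # everything else, whitespace included, is kept verbatim

print(repr(parsed_text("pre", "\nline one\n")))  # 'line one\n'
print(repr(parsed_text("pre", "\n\nline one")))  # '\nline one'
print(repr(parsed_text("div", "\nline one\n")))  # '\nline one\n'
```

Note that only a single leading line feed is dropped, and only for those three elements; a <div> keeps all of its whitespace at the parsing stage.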
Jukka refers to the HTML 4.01 section B.3.1 Line breaks saying that "a line break immediately following a start tag must be ignored, as must a line break immediately before an end tag" but that is in a non-normative appendix and browsers do not follow it except for the three elements mentioned above.
That can be demonstrated using Jukka's example here on line breaks with no spaces. Note the #text: nodes around the button elements in the tree display, and that if the line breaks are removed, the #text: nodes no longer appear.
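The same effect can be observed outside a browser. The sketch below uses Python's standard-library `html.parser`, which is only a tokenizer and not a full HTML5 tree builder, so it is an approximation of what the tree display shows; the class and function names are invented for this example.

```python
from html.parser import HTMLParser

class TextNodeCollector(HTMLParser):
    """Collects every run of character data the tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def text_chunks(markup):
    """Return the character-data runs found in `markup`, in order."""
    collector = TextNodeCollector()
    collector.feed(markup)
    collector.close()
    return collector.chunks

with_breaks = "<div>\n<button>one</button>\n<button>two</button>\n</div>"
without_breaks = "<div><button>one</button><button>two</button></div>"

print(text_chunks(with_breaks))     # ['\n', 'one', '\n', 'two', '\n']
print(text_chunks(without_breaks))  # ['one', 'two']
```

The line breaks show up as their own text runs around the buttons, and they disappear only when the line breaks themselves are removed from the markup.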
We can also see that the rule is not applied by using the first example from the specification here. Adding white-space:pre makes it clear that the line breaks are not actually ignored; the two examples render the same only because the default white-space handling is white-space:normal.
Which brings us to the relevant spec, which is 16.6.1 The 'white-space' processing model in the CSS spec. This covers the systematic rules to be applied to the text characters for each of the white-space setting values.
HTML rendering collapses a run of consecutive whitespace characters into a single space, but doesn't eliminate it. A newline is whitespace, so it is rendered as a space.
It's not just leading/trailing - serial whitespace within an element is rendered as single whitespace as well. These will render identically:
<div>   a    b   </div>
<div> a b </div>
<div>a b</div>
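A very simplified model of this collapsing can be sketched in Python. It assumes the default white-space:normal and a single line of text, and it ignores most of the real CSS rules; `render_normal` is a hypothetical helper, not a browser API.

```python
import re

# Simplified model of white-space: normal. Real CSS processing has more
# rules (line wrapping, inline boundaries, and so on) than this covers.
_WS_RUN = re.compile(r"[ \t\n\r\f]+")

def render_normal(text):
    """Collapse each whitespace run to one space, then drop the
    spaces at the start and end of the line."""
    return _WS_RUN.sub(" ", text).strip(" ")

samples = ["   a    b   ", " a b ", "a\nb", "a b"]
print([render_normal(s) for s in samples])  # ['a b', 'a b', 'a b', 'a b']
```

All four inputs come out the same, including the one where the only separator is a newline.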
A full answer to the question “When does whitespace matter in HTML?” would be rather long and detailed and would need to discuss things like whitespace between attribute specifications and elements with special rules like textarea. But to address what seems to be the primary concern:
Whitespace between tags generally creates anonymous text nodes. Whitespace inside leaf elements (elements not containing other elements) constitutes part of the text content of the element.
There are few fixed rules for rendering, but browsers generally ignore leading and trailing whitespace within an element’s text content.
If there is whitespace between elements that are rendered inline (such as button elements by default), it normally acts as a separator equivalent to one space.
However, whitespace consisting of one line break is ignored, by specifications and usually in browser practice, when it immediately follows a start tag or precedes an end tag. So
<div>
<button>one</button>
<button>two</button>
</div>
would be treated as if the line breaks were not there. But when spaces are used, as in
<div>
  <button>one</button>
  <button>two</button>
</div>
then there are anonymous text nodes before the first button element and between the two button elements. Normally only the latter matters, and it acts like a normal word space.
Update: As a comment below and Alohci’s answer point out, this old (HTML 4.01) principle has mostly not been implemented in browsers and has mostly been removed in HTML5. So in most cases, a line break between elements creates a text node, containing a line break character, which is treated as equivalent to a space.