At the end of the day, why choose XHTML over HTML? [closed]
I wonder why I should use XHTML instead of HTML.
XHTML is supposed to be "modularized", but I haven't seen any server side language take advantage of any of that.
XHTML is also stricter, and I don't see the advantage. What does XHTML offer that I need so badly? How does it make my code "better"?
EDIT: another question I found in the comments: Does XHTML parse faster than HTML?
EDIT2: after reading all your comments and the links, I indeed agree that another post deserves to be the correct answer, so I chose the one that directly links to the best source.
Also, goes to show that people upvote the green comment without even reading it.
Solution 1:
You should read Beware of XHTML, which is an informative article that warns about some of the pitfalls of XHTML over HTML.
I was pretty gung-ho about XHTML until I read it, but it makes several valid points, including the following:
XHTML 1.x is not “future-compatible”. XHTML 2, currently in the drafting stages, is not backwards-compatible with XHTML 1.x. XHTML 2 will have lots of major changes to the way documents are written and structured, and even if you already have your site written in XHTML 1.1, a complete site rewrite will usually be necessary in order to convert it to proper XHTML 2. A simple XSL transformation will not be sufficient in most cases, because some semantics won't translate properly.
HTML 4.01 is actually more future-compatible. A valid HTML 4.01 document written to modern support levels will be valid HTML 5, and HTML 5 is where the majority of attention is from browser developers and the W3C.
Future compatibility can be huge when working on some projects. The article goes on to make several other good points, but I think that may have stood out the most for me.
Don't mistake the article for a rant against XHTML. The author does talk about the good points of XHTML, but it is good to be aware of the shortcomings before you dive in.
Solution 2:
I was going to add this as a comment to one of the other posts, but it grew a little too large.
The fundamental point that most people seem to be missing is the purpose behind XHTML. One of the major reasons for developing the XHTML specification was to de-emphasise presentation-related tags in the markup, and to defer presentation to CSS. Whilst this separation can be achieved with plain HTML, this behaviour isn't promoted by the specification.
Separating meta-markup and presentation is a vital part of developing for the 'programmable web'. It will not only improve SEO and access for screen readers and text browsers, but will also make your website easier to analyse for those wishing to access it programmatically (in many simple cases, this can negate the need for developing a specific API, or simply allow client-side scripts to do things like readily identify phone numbers). If your web page conforms to the XHTML specification, it can easily be traversed using XML-related tools and things such as XPath, which is fantastic news for those who want to extract particular information from your website.
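As a rough illustration of the XPath point, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The sample page, the "tel" class name, and the extract_phone_numbers helper are all made up for the example; it only works because the input is well-formed XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML page; the "tel" class name is an assumption for illustration.
XHTML_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Contacts</title></head>
  <body>
    <ul>
      <li><span class="tel">555-0100</span></li>
      <li><span class="name">Front desk</span></li>
      <li><span class="tel">555-0199</span></li>
    </ul>
  </body>
</html>"""

# XHTML elements live in this namespace, so XPath queries must be qualified.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}


def extract_phone_numbers(document: str) -> list:
    """Return the text of every <span class="tel"> in a well-formed XHTML document."""
    root = ET.fromstring(document)
    spans = root.findall(".//x:span[@class='tel']", XHTML_NS)
    return [span.text for span in spans]


if __name__ == "__main__":
    print(extract_phone_numbers(XHTML_PAGE))  # ['555-0100', '555-0199']
```

ElementTree supports only a limited subset of XPath, so more elaborate queries would need a fuller XPath implementation.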
XHTML was not developed for use by itself, but for use with a variety of other technologies. It relies heavily on the use of CSS for presentation, and lays a foundation for things like Microformats (whether you love them or hate them) to offer a standardised markup for common data presentation.
Don't be fooled by the crowd who think that XHTML is insignificant, and is just overly restrictive and pointless... it was created with a purpose that 95% of the world seems to ignore/not know about.
By all means use HTML, but use it for what it's good for, and take the same approach when looking at XHTML.
With regard to parsing speed, I imagine there would be very little difference in the parsing of the actual documents between XHTML and HTML. The trade-off will come purely in how you describe the document using the available markup. XHTML tags tend to be longer, due to required attributes, proper closing, etc. but will forego the need for any presentational markup in the document itself. With that being the case, I think you're talking about comparing one type of apple, with a very slightly different type of apple... they're different, but it's unlikely to be of any consequence (in terms of parsing and rendering) when all you want is a healthy, tasty apple.
Solution 3:
For the visitor of a website it probably doesn't make any visible difference. Furthermore, XHTML is usually more of a pain to use as at least one widespread browser still doesn't know how to handle it and you need to serve it as text/html in that case (which yields invalid HTML).
If your HTML is going to be regularly processed by automated tools instead of being read by humans, then you might want to use XHTML: its structure is stricter and, being XML, it is easier to parse from an application standpoint (not that XML is inherently easy to parse, though).
Apart from that, I don't see any compelling reasons to use it. XHTML was created as an attempt to make use of XML features for HTML, and basically it boils down to "HTML 4 with several annoying side-effects" (IMHO, at least).
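To make the "stricter structure" point concrete, here is a small sketch contrasting Python's standard XML parser with its lenient HTML tokenizer. The helper names and sample markup are invented for the example, and html.parser is only a tokenizer, not a full browser-grade HTML parser.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records start tags; like HTML parsers in general, it tolerates unclosed elements."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def is_well_formed_xml(markup: str) -> bool:
    """Return True only if the markup parses as well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def html_start_tags(markup: str) -> list:
    """Tokenize markup leniently and return the start tags that were seen."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


# Hypothetical inputs: the first omits the closing </p> tags, as HTML allows.
SLOPPY = "<div><p>First paragraph<p>Second paragraph</div>"
STRICT = "<div><p>First paragraph</p><p>Second paragraph</p></div>"

if __name__ == "__main__":
    print(is_well_formed_xml(SLOPPY))  # False: the XML parser rejects it outright
    print(is_well_formed_xml(STRICT))  # True
    print(html_start_tags(SLOPPY))     # ['div', 'p', 'p']: the HTML tokenizer carries on
```

An automated tool consuming XHTML can therefore fail fast on broken input, whereas a tool consuming HTML has to decide for itself how to recover.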
Solution 4:
Use HTML (HTML4 Strict or HTML5).
HTML can fully utilize CSS, and it can be validated and parsed unambiguously. Separation of structure and presentation was already done in HTML4, and XHTML merely continued that.
All browsers support HTML. Only some browsers support XHTML, and those that do often have more mature, better tested and better optimized support for HTML (because only a tiny fraction of pages use XML mode).
If you care about IE and Google, you have to use HTML or the subset of XHTML and HTML defined in Appendix C of the XHTML spec. The latter is almost the worst of both worlds, because such XHTML cannot be generated with standard XML tools, cannot use extension mechanisms new to XHTML, and has additional limitations over those in HTML alone.
XHTML1.0 is now over 10 years old. It was designed in "Web1.0" times, and as the head of the W3C said, in retrospect it didn't work out and a better approach is needed. W3C HTML5 is being written as we speak; it addresses the needs of web applications used today and has very good backwards compatibility.
HTML5 closes many gaps that existed between HTML4 and XHTML1 (e.g. it adds inline SVG, MathML and RDF) and cleans up the language beyond what was done in XHTML1.0 and XHTML1.1.
XHTML2 is not going to be supported by web browsers in the foreseeable future. It's likely that it will never be supported (all browser vendors heavily support [X]HTML5, and some have already declared that they won't implement XHTML2).
XHTML1.0 has exactly the same semantics and separation of presentation from structure as HTML4.01. Anybody who says otherwise hasn't read the specification. I encourage everybody to read the spec – it's surprisingly short and uninteresting.
- Stylesheets were introduced in HTML4.01 and were not changed in XHTML1.0.
- Presentational elements were deprecated in HTML4.01 and were not removed in XHTML1.0.
XHTML myths.
There are no intractable differences between HTML and XHTML that would make parsing of one much slower than the other. It depends on how the parser is implemented.
- Both SGML and XML parsers need to load and parse the entire DTD in order to understand entities. This alone is usually more work than parsing the document itself. HTML parsers almost always "cheat" and use hardcoded entities and element information. XHTML parsers in browsers cheat too.
- Parsing of HTML requires handling of implied start and end tags, and real-world HTML requires additional work to handle misplaced tags.
- Proper parsing of XHTML requires tracking of XML namespaces.
- Draconian XML rules require checking if every character is properly encoded. HTML parsers may get away without doing this, but OTOH they need to look for <meta>.
The overall difference in the cost of parsing is tiny compared to the time it takes to download the document, build the DOM, run scripts, apply CSS and do all the other things browsers have to do.
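The point about entities in the list above can be sketched with Python's standard library, as an analogy for what browsers do. html.unescape relies on a built-in table of HTML entity names, while the XML parser, given no DTD, knows only XML's five predefined entities. The snippet and helper names are made up for the example.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical fragment using two named entities that XML itself does not predefine.
SNIPPET = "<p>caf&eacute;&nbsp;menu</p>"
# The same fragment written with numeric character references, which need no DTD.
NUMERIC = "<p>caf&#233;&#160;menu</p>"


def parse_as_xml(markup):
    """Return the root element's text, or None if the markup is rejected as XML."""
    try:
        return ET.fromstring(markup).text
    except ET.ParseError:
        # Without a DTD, &eacute; and &nbsp; are undefined entities.
        return None


def decode_html_entities(markup):
    """Decode named entities using Python's hardcoded HTML entity table."""
    return html.unescape(markup)


if __name__ == "__main__":
    print(parse_as_xml(SNIPPET))                 # None
    print(parse_as_xml(NUMERIC) == "caf\u00e9\u00a0menu")  # True
    print(decode_html_entities(SNIPPET))
```

The hardcoded table is the "cheat": no DTD is fetched or parsed, yet the named entities still resolve.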
Solution 5:
I'm surprised that all the answers here recommend XHTML over HTML. I am firmly of the opposite opinion - you should not use XHTML, for the foreseeable future. Here's why:
- No browser interprets XHTML as XHTML unless you serve it with the MIME type application/xhtml+xml. If you just serve it with the default MIME type, all browsers will interpret it as HTML, e.g. accepting unclosed or improperly nested elements.
- However, you should never actually serve it as application/xhtml+xml, as Internet Explorer does not recognise that MIME type and would fail to render the page completely.
- There are significant differences in the DOM between XHTML and HTML. Since all so-called XHTML pages are being served as HTML at the moment, all JavaScript code is written against the HTML DOM. If support for the XHTML MIME type ever becomes significant enough to convince people to start using it, most of their JavaScript code will break, even if they think their pages validate as XHTML.
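To make the MIME type dependence concrete, here is a minimal sketch of how a server could pick the Content-Type from the request's Accept header. The choose_content_type function is hypothetical and its handling of quality values is simplified. It only shows the mechanism; it does nothing about the DOM differences described above.

```python
def choose_content_type(accept_header: str) -> str:
    """Return application/xhtml+xml only when the client explicitly accepts it."""
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        quality = 1.0  # per HTTP, a missing q parameter means q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q value as "not acceptable"
        if media_type == "application/xhtml+xml" and quality > 0:
            return "application/xhtml+xml"
    # Wildcards such as */* are deliberately ignored: a client that does not
    # name the XHTML type gets plain text/html.
    return "text/html"


if __name__ == "__main__":
    # Example header values are assumptions for illustration.
    print(choose_content_type("text/html,application/xhtml+xml,*/*;q=0.8"))
    print(choose_content_type("image/gif, image/jpeg, */*"))
```

With a scheme like this, the same markup is parsed as XML by some clients and as HTML by others, so any script on the page would have to work against both DOMs.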