W3C Considering An HTML 5
An anonymous reader writes "When the decision was initially made to move in the direction of XHTML, instead of a new version of HTML proper, it seemed like a good idea. Years later and the widespread adoption of CSS (among other things) has proven that things don't always develop the way we expect. As a result, HTML 5 has been revived by the W3C. After some lobbying and continued work by the Web Hypertext Application Technology Working Group, the old web markup language is getting an official face-lift. A post to the Webforefront blog explains the history behind the initial decision to move to XHTML, and why things are so different in the here and now."
The Author is Not Completely Wrong (Score:5, Informative)
Re:The Author is Not Completely Wrong (Score:5, Informative)
http://blog.whatwg.org/html-vs-xhtml [whatwg.org]
Re:W3C is aggrivating sometimes (Score:2, Informative)
Re:W3C is aggrivating sometimes (Score:2, Informative)
First, the italic and bold tags don't have semantic replacements either. You have the em tag, which is supposed to represent emphasised text, and the strong tag, which is supposed to represent strongly emphasised text. Following standard typographic conventions, emphasised text is rendered in italics, and strongly emphasised text is rendered in bold. These are not the same thing as the i and b tags. A screen reader would completely ignore those, but might use the em or strong tags to apply emphasis to the spoken version of a page. A non-graphical user agent might represent them in a different way.
That's why the i, u, and b elements still exist in XHTML 1.1. They're just for italic, bold, and underlined text that has no semantic meaning. When writing something in a foreign language, for example, it's conventional to use italics. That's not the same as emphasised text - a screen reader should not emphasise the italic portion just because it's in italic. So you'd use an i element, or some other element with italic font styling applied to it.
Of course lists aren't allowed in paragraphs. How could they be? If you're writing a paragraph, and you then start writing a list, have you not in fact finished the paragraph first?
The HTML entities thing is from SGML.
Re:W3C is aggrivating sometimes (Score:1, Informative)
URLs have NEVER reserved & as part of their syntax. Go read the RFC [gbiv.com]. This ampersand business came from the CGI spec, it's been wrong from day 1, and they introduced use of semicolons instead to compensate. Using a bare ampersand in an XML doc will always give you grief, no matter where you put it, unless it's in CDATA. Using &amp; will always result in a URL with an ampersand in it, and if your browser renders that as five characters instead of one, it's broken.
I rather imagine every browser will degrade to a "quirks mode" for any bare ampersands in URLs, so you can continue doing what you do. It just won't be strictly compliant, and it's already going to result in errors in XML-based frameworks like Facelets.
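To make the ampersand point concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree as a stand-in for "an XML parser". The URL page.cgi?a=1&b=2 and the helper name href_of are made up for illustration; nothing here is specific to any browser or framework.

```python
import xml.etree.ElementTree as ET

def href_of(markup):
    """Parse one <a> element as XML and return its href,
    or None if the markup is not well-formed."""
    try:
        return ET.fromstring(markup).get("href")
    except ET.ParseError:
        return None

# Hypothetical URL, written once with a bare ampersand and once escaped.
bare = '<a href="page.cgi?a=1&b=2">link</a>'
escaped = '<a href="page.cgi?a=1&amp;b=2">link</a>'

print(href_of(bare))     # None: the bare ampersand is a well-formedness error
print(href_of(escaped))  # page.cgi?a=1&b=2
```

The escaped form is five characters in the source but one character in the parsed attribute value, which is what ends up in the URL.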
Re:Cry for relevency (Score:5, Informative)
Um, what? Seriously, the b, u, i and big tags are _exactly the same_ in XHTML. There was never a super element in HTML 4, it's just sup, and it's unchanged. The a tag does everything from HTML 4 the same way in XHTML. The only difference in it is that it's allowed extra attributes.
Out of all of those things, the only one that's changed at all is the img tag, and that's only in two places - first, in XHTML you are required to provide an alt= attribute (instead of just strongly recommended like in HTML 4), and second, you have to close the tag properly, with a trailing "/>".
Frames are also still part of the XHTML spec.
The font tag, however, is gone and won't be missed any more than the blink tag was, by anyone other than FrontPage (which absolutely loves adding thirty or so font tags in a row setting and unsetting the color 'white' from the text).
Re:Absolutely right (Score:5, Informative)
That's a bit cynical, don't you think?
HTML5 is the result of the hard work done by the Web Hypertext Application Technology Working Group [whatwg.org] (WHATWG). The WHATWG is composed of members from all browser makers, with the occasional public comment thrown in for good measure. As a result, the group has been removing or reducing the ambiguity about implementing the various standards (especially the parser!) and has added features that bring HTML up to a true application platform. Their work is represented in web browsers every time someone uses the Canvas tag, Audio object, Storage API, and other modern features.
The WHATWG was formed because the W3C was seen as too slow to execute such new technologies. Now that the WHATWG specs are stabilizing, the W3C has taken a dump of the WHATWG HTML 5 standard and proposed it for ratification under W3C bylaws. This has several advantages over the WHATWG standardization, not the least of which is extracting patent waivers from companies like Apple who technically "own" some of the technologies behind the WHATWG standards.
Note that the HTML5 group at the W3C is a bit different from most. In an attempt to remain as open as the WHATWG, they are accepting just about anyone as an "invited expert" to provide input and comments on the standards process. This is a huge departure from the way that most W3C standards are handled, and is probably a good choice for a standard as comprehensive and complex as HTML5.
Re:Absolutely right (Score:4, Informative)
Re:Absolutely right (Score:5, Informative)
Don't believe me? Here are the two standards. Compare:
WHATWG HTML5 [whatwg.org]
W3C HTML5 [w3.org]
Save for some slight divergences as the WHATWG's standard is updated, they're exactly the same.
Re:Cry for relevency (Score:4, Informative)
I think a reason that XHTML has not taken off is its unforgiving strictness. From what I understand, if you make a single mistake in XHTML the page will not work and for that reason it is not intended to be handwritten. But with HTML you often have different ways of achieving the same effect, such as with centering.
Actually, one of the reasons many people have picked up on XHTML is that it's a lot "cleaner" than "good" ol' HTML 4. The strict rules are one of the reasons for this: in XHTML you're not allowed to do stupid shit like "<i>foo and <b>bar</i> are both words</b>". And writing XHTML by hand is much easier than relying on some horrible WYSIWYG tool's generated code.
This is the reason for the continuing appeal of HTML: its simplicity. My understanding is that XHTML requires that document formatting be separate from the content of the document. Yet sometimes it is so much simpler to use a CENTER tag versus having to mark a section of text with a customized tag and then go into a style sheet to center a single section of text.
Actually, formatting should be kept separate from the content for several very good reasons. Maintainability is a biggie, as anyone who's ever had to redesign a static HTML website riddled with <font> tags knows. Extra points if it was made using a WYSIWYG tool that uses three or four tags when you only need one...
Anyway, I for one hope that XHTML is the path we stay on. And I think the main problem that XHTML+CSS has had is Internet Explorer and its craptastic handling of CSS (still crappy in IE7 although it's gotten slightly better).
/Mikael
Re:Absolutely right (Score:3, Informative)
Re:Absolutely right (Score:5, Informative)
Well seeing as it's starting from their work I rather suspect it will include the bulk of it, because it's highly interdependent.
Then again, you seem to have an axe to grind with the W3C, so don't let me stop you.
Re:Absolutely right (Score:5, Informative)
That is incorrect: the HTML5 parsing algorithm [whatwg.org] never just stops and returns an error message (like in XML) - it specifies how every single stream of bytes is parsed into a DOM, with error-correction where necessary, in a way that tries hard to be compatible with the ~10^11 existing HTML pages on the web (which, in most cases, means being compatible with the behaviour of IE6).
Almost all the content on the web today is invalid HTML, and it's never going to go away, which is why the browser developers have been pushing for a specification that describes how to handle invalid content instead of pretending it's not important.
Re:Absolutely right (Score:5, Informative)
I'm a participant in the HTML Working Group [w3.org] and I can tell you that this is incorrect. You're thinking of XHTML2, not HTML 5. XHTML2 has the XML parser strictness and pages will fail to display if they're not well-formed. HTML 5 is going the complete opposite direction, assuming that people will code poorly and defining failure modes for browser vendors to follow when that happens.
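The contrast between XML strictness and forgiving HTML handling can be sketched with Python's standard library. This is only an illustration: html.parser is a lenient tokenizer, not an implementation of the HTML 5 parsing algorithm, and it builds no DOM and does no error correction. The names TagLogger, xml_accepts and html_events are hypothetical helpers, and the sample markup is the misnested example quoted earlier in the thread.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Misnested markup of the kind discussed in the thread.
MISNESTED = "<p><i>foo and <b>bar</i> are both words</b></p>"

class TagLogger(HTMLParser):
    """Record start and end tags in the order the tokenizer reports them."""
    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append("<" + tag + ">")

    def handle_endtag(self, tag):
        self.events.append("</" + tag + ">")

def xml_accepts(markup):
    """True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

def html_events(markup):
    """Tokenize the markup leniently and return the tag events."""
    logger = TagLogger()
    logger.feed(markup)
    logger.close()
    return logger.events

print(xml_accepts(MISNESTED))  # False: XML stops at the mismatched </i>
print(html_events(MISNESTED))  # tags are reported, no exception is raised
```

An XML-style parser gives up at the first well-formedness error, while the lenient tokenizer keeps going and leaves it to the consumer to decide what tree the tags should produce. HTML 5 goes further by specifying that decision.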
Re:Absolutely right (Score:3, Informative)
Re:Cry for relevency (Score:3, Informative)
This is not true. There is only one class of errors that causes a fatal error, and that's when the document isn't well-formed. Invalid pages can still be served without tripping the mandatory error-handling.
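A small sketch of the well-formed versus valid distinction, again using Python's non-validating xml.etree.ElementTree. The function check and its single "img without alt" rule are assumptions made for illustration; a real validator checks the whole document type, not one attribute.

```python
import xml.etree.ElementTree as ET

def check(markup):
    """Return ("not well-formed", []) if the XML parser rejects the markup,
    otherwise ("well-formed", problems), where problems lists the toy
    validity issues found (here only: img elements lacking alt)."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        return ("not well-formed", [])
    problems = ["img without alt"
                for img in root.iter("img")
                if img.get("alt") is None]
    return ("well-formed", problems)

print(check('<p><img src="a.png" alt="A"/></p>'))  # well-formed, no problems
print(check('<p><img src="a.png"/></p>'))          # well-formed but invalid
print(check('<p><img src="a.png"></p>'))           # not well-formed: fatal
```

Only the third case is the kind of error that stops an XML parser. The second parses fine even though it breaks the XHTML requirement for an alt attribute.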
No, handwritten is still fine. Handwriting XHTML and then publishing it without any attempt to load it in a browser or check it for errors first is a bad idea though. But then again, it was a bad idea with HTML too.
No, this is also not true. It's considered best practice with both HTML and XHTML, both have document types that enforce this separation, and both have document types that don't enforce this separation. It's your choice.
XHTML 1.0 has the <center> element type (not "tag").
HTML 5 Won't Matter... (Score:3, Informative)
Re:Absolutely right (Score:5, Informative)
That being said, Chris Wilson (at least) talks the talk, and IE 7 was a (small) step in the right direction.
The more important, and encouraging, signal imo is MS hiring Standardista Molly Holzschlag. Given her history, I think we can expect more and better from MS on this front in the future.
Re:date tag? (Score:2, Informative)
Re:Absolutely right (Score:5, Informative)
The working group is open to the public [hixie.ch] and costs nothing to join. If you don't like the state of HTML, come over and help make it better.
Re:date tag? (Score:3, Informative)
The problem comes when languages get religion. Lisp went from a list-based language to a list-based syntax jihad. Ditto for Pascal and its strictly enforced strongly typed functions and Java and its everything-is-an-object global-variables-are-forbidden jihad.
SGML as well as the newer versions of HTML are in a format-is-100%-orthogonal-to-content jihad.
Notice that all of the principles listed above are good and correct. The problem comes from enforcing them too strictly.
Re:It's CSS thats the problem (Score:3, Informative)
You mean like display:table-cell that's part of CSS since 1996, and works reliably in every browser except IE?
For constants use server-side processing (color:%foo% is trivial to implement). For really variable-variables you have DHTML and W3C DOM2 Style.
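The "color:%foo% is trivial to implement" idea might look something like this on the server side. It is a rough sketch, not any particular framework's mechanism: render_css, the %name% placeholder syntax as parsed here, and the sample color value are all assumptions.

```python
import re

def render_css(template, constants):
    """Replace %name% placeholders with values from constants.
    Unknown placeholders are left untouched so mistakes stay visible."""
    def lookup(match):
        name = match.group(1)
        return constants.get(name, match.group(0))
    return re.sub(r"%(\w+)%", lookup, template)

# Hypothetical stylesheet template with one named constant.
template = "a { color: %foo%; } h1 { color: %foo%; width: 50%; }"
print(render_css(template, {"foo": "#336699"}))
# a { color: #336699; } h1 { color: #336699; width: 50%; }
```

Because the pattern requires word characters between two percent signs, ordinary percentages such as 50% pass through unchanged.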
Re:Absolutely right (Score:3, Informative)
HTML 5 revived <embed>, <iframe> and added <video> precisely because <object> has failed. It turned out to be too difficult to implement interoperably -- you have one element that might be an image, a page, a video, an applet or any plugin content, but you can never be sure what it is, because it's dependent on a remote resource and can even change dynamically. It must have a DOM API for all possible content types. It sometimes has intrinsic dimensions, sometimes it doesn't. And on top of it all, Eolas has been given a patent for its most obvious implementation.
HTML 5 has fully documented parsing, including error handling. So you have the unambiguousness of XML (handling of every possible input is covered by the spec) with the fail-safety of real-world HTML (because users prefer browsers that don't show the Yellow Screen of Death, ever).