W3C Considering An HTML 5

An anonymous reader writes "When the decision was initially made to move in the direction of XHTML, instead of a new version of HTML proper, it seemed like a good idea. Years later and the widespread adoption of CSS (among other things) has proven that things don't always develop the way we expect. As a result, HTML 5 has been revived by the W3C. After some lobbying and continued work by the Web Hypertext Application Technology Working Group, the old web markup language is getting an official face-lift. A post to the Webforefront blog explains the history behind the initial decision to move to XHTML, and why things are so different in the here and now."

  • by ronadams ( 987516 ) on Friday July 20, 2007 @08:52AM (#19925629) Homepage
    There was an interesting discussion about this on the xml-dev mailing list. Rick Jelliffe had this to say:

    XML was developed as a subset of SGML. Most of the ISO working group which looked after SGML were also involved with the creation of XML (Clark, Kimber, Bosak, also Goldfarb, Peterson, me, and others). The correction for SGML came out before XML was finally put as a recommendation (AFAIR) so there never was a time when XML was not a true subset of SGML. Where there were differences, ISO 8879 was corrected specifically to make sure that XML was indeed a subset. In fact, Charles Goldfarb even said at one stage "XML *is* the revision of SGML" (debate on the revision of ISO 8879 had started years before: XML was the embodiment of that).
    XML can be argued as both the revision to and a subset of SGML. Hence my disappointment in anything new that seems to shy away from this path, like HTML 5 instead of XHTML.
  • by tolan-b ( 230077 ) on Friday July 20, 2007 @08:59AM (#19925713)
    HTML 5 is also 'XHTML 5'. You can use well-formed XHTML style syntax, and deliver it with an application/xml or application/xhtml+xml mimetype, *or* you can format it HTML style and deliver it with a standard HTML mimetype.
  • by sveard ( 1076275 ) on Friday July 20, 2007 @09:07AM (#19925763) Homepage
    Isn't the & character forbidden in XHTML links? You have to use the &amp; entity, if I remember correctly. Sure, it's allowed in HTML 4.01, but not in XHTML (again, if I remember correctly).
  • by Anonymous Coward on Friday July 20, 2007 @09:13AM (#19925837)
    Quickly, and in the order you mentioned them...

    First, the italic and bold tags don't have semantic replacements either. You have the em tag, which is supposed to represent emphasised text, and the strong tag, which is supposed to represent strongly emphasised text. Following standard typographic conventions, emphasised text is rendered in italics, and strongly emphasised text is rendered in bold. These are not the same thing as the i and b tags. A screen reader would completely ignore those, but might use the em or strong tags to apply emphasis to the spoken version of a page. A non-graphical user agent might represent them in a different way.

    That's why the i, u, and b elements still exist in XHTML 1.1. They're just for italic, bold, and underlined text that has no semantic meaning. When writing something in a foreign language, for example, it's conventional to use italics. That's not the same as emphasised text - a screen reader should not emphasise the italic portion just because it's in italic. So you'd use an i element, or some other element with italic font styling applied to it.

    Of course lists aren't allowed in paragraphs. How could they be? If you're writing a paragraph, and you then start writing a list, have you not in fact finished the paragraph first?

    The HTML entities thing is from SGML.
  • by Anonymous Coward on Friday July 20, 2007 @09:15AM (#19925847)
    > URLs have already had & reserved for years,

    URLs have NEVER reserved & as part of their syntax. Go read the RFC. This ampersand business came from the CGI spec, it's been wrong from day 1, and they introduced use of semicolons instead to compensate. Using a bare ampersand in an XML doc will always give you grief, no matter where you put it, unless it's in CDATA. Using &amp; will always result in a URL with an ampersand in it, and if your browser renders that as five characters instead of one, it's broken.

    I rather imagine every browser will degrade to a "quirks mode" for any bare ampersands in URLs, so you can continue doing what you do. It just won't be strictly compliant, and it's already going to result in errors in xml-based frameworks like facelets.
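
The escaping rule argued over above can be checked in a few lines of Python. This is a minimal sketch using only the standard library; the URL is a made-up example and `is_well_formed` is a hypothetical helper, not something from any spec or framework named in the thread.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# Hypothetical URL with a CGI-style query string.
URL = "http://example.com/search?q=html5&page=2"


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Bare ampersand pasted straight into the attribute value.
bare = f'<a href="{URL}">results</a>'
# quoteattr() escapes & as &amp; and adds the surrounding quotes.
escaped = f"<a href={quoteattr(URL)}>results</a>"

bare_ok = is_well_formed(bare)
escaped_ok = is_well_formed(escaped)
# After parsing, the attribute value is the original URL again.
round_trip = ET.fromstring(escaped).get("href")
```

The parser rejects the bare form and accepts the escaped one, and the value it hands back from the escaped form contains a single ampersand character, which is the behaviour the comment describes.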

  • Re:Cry for relevency (Score:5, Informative)

    by HappyHead ( 11389 ) on Friday July 20, 2007 @09:27AM (#19925945)
    After years and years, a critical mass of people are finally learning a, b, u, i, big, super, img, and other standard tags, most of which just don't work the same or at all under XHTML.

    Um, what? Seriously, the b, u, i and big tags are _exactly the same_ in XHTML. There was never a super element in HTML 4, it's just sup, and it's unchanged. The a tag does everything from HTML 4 the same way in XHTML. The only difference in it is that it's allowed extra attributes.

    Out of all of those things, the only one that's changed at all is the img tag, and that's only in two places - first, in XHTML you are required to provide an alt= attribute (instead of just strongly recommended like in HTML 4), and second, you have to close the tag properly, with a /> at the end.

    Frames are also still part of the XHTML spec.

    The font tag, however, is gone and won't be missed any more than the blink tag was, by anyone other than FrontPage (which absolutely loves adding thirty or so font tags in a row setting and unsetting the color 'white' from the text).
  • Re:Absolutely right (Score:5, Informative)

    by AKAImBatman ( 238306 ) * <> on Friday July 20, 2007 @09:38AM (#19926079) Homepage Journal

    Or are the W3C just trying to justify their existence?

    That's a bit cynical, don't you think?

    HTML5 is the result of the hard work done by the Web Hypertext Application Technology Working Group (WHATWG). The WHATWG is composed of members from all browser makers, with the occasional public comment thrown in for good measure. As a result, the group has been removing or reducing the ambiguity about implementing the various standards (especially the parser!) and has added features that bring HTML up to a true application platform. Their work is represented in web browsers every time someone uses the Canvas tag, Audio object, Storage API, and other modern features.

    The WHATWG was formed because the W3C was seen as too slow to execute such new technologies. Now that the WHATWG specs are stabilizing, the W3C has taken a dump of the WHATWG HTML 5 standard and proposed it for ratification under W3C bylaws. This has several advantages over the WHATWG standardization, not the least of which is extracting patent waivers from companies like Apple who technically "own" some of the technologies behind the WHATWG standards.

    Note that the HTML5 group at the W3C is a bit different from most. In an attempt to remain as open as the WHATWG, they are accepting just about anyone as an "invited expert" to provide input and comments on the standards process. This is a huge departure from the way that most W3C standards are handled, and is probably a good choice for a standard as comprehensive and complex as HTML5.
  • Re:Absolutely right (Score:4, Informative)

    by tolan-b ( 230077 ) on Friday July 20, 2007 @09:43AM (#19926125)
    No, you're wrong I'm afraid: the HTML5 that W3C is talking about *is* based on WHATWG's HTML5. It supports HTML and XHTML syntaxes to the same serialisation, so MS supporting XHTML isn't wasted. They're basically merging HTML and XHTML.
  • Re:Absolutely right (Score:5, Informative)

    by AKAImBatman ( 238306 ) * <> on Friday July 20, 2007 @09:52AM (#19926239) Homepage Journal
    Who modded this informative? Suv4x4 is incorrect. The W3C came up with their HTML5 standard by taking a dump of the WHATWG HTML5 standard and putting the W3C colors on it. Which isn't surprising as most of the WHATWG members are also W3C members. It was always their intention to make their standard more "legitimate" by submitting it to the W3C once it was ready.

    Don't believe me? Here are the two standards. Compare:

    W3C HTML5 []

    Save for some slight divergences as the WHATWG's standard is updated, they're exactly the same.
  • Re:Cry for relevency (Score:4, Informative)

    by mikael_j ( 106439 ) on Friday July 20, 2007 @10:14AM (#19926493)

    I think a reason that XHTML has not taken off is due to its unforgiving strictness. From what I understand, if you make a single mistake in XHTML the page will not work and for that reason it is not intended to be handwritten. But with HTML you often have different ways of achieving the same effect, such as with centering.

    Actually, one of the reasons many people have picked up on XHTML is that it's a lot "cleaner" than "good" ol' HTML 4, and the strict rules are one of the reasons for this: in XHTML you're not allowed to do stupid shit like "<i>foo and <b>bar</i> are both words</b>". And writing XHTML by hand is much easier than relying on some horrible WYSIWYG tool's generated code.

    This is the reason for the continuing appeal of HTML: its simplicity. My understanding that XHTML requires is that document formatting be separate from the content of the document. Yet sometimes is so much simpler to use a CENTER tag versus having to mark a section of text with a customized tag and then go into a style sheet to center a single section of text.

    Actually, formatting should be kept separate from the content for several very good reasons. Maintainability is a biggie, as anyone who's ever had to redesign a static HTML website riddled with <font> tags knows. Extra points if it was made using a WYSIWYG tool that uses three or four tags when you only need one...

    Anyway, I for one hope that XHTML is the path we stay on. And I think the main problem that XHTML+CSS has had is Internet Explorer and its craptastic handling of CSS (still crappy in IE7 although it's gotten slightly better).
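
The overlapping-tags example quoted in the comment above is easy to try against a strict XML parser. A rough sketch in Python, using the standard library's non-validating parser as a stand-in for whatever an XHTML user agent would use; `xml_error` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def xml_error(fragment: str):
    """Return the parser's complaint for a fragment, or None if it is well-formed."""
    try:
        # The wrapper element supplies the single root that XML requires.
        ET.fromstring(f"<div>{fragment}</div>")
    except ET.ParseError as exc:
        return str(exc)
    return None


# The overlapping markup from the comment: </i> closes while <b> is still open.
overlapping = "<i>foo and <b>bar</i> are both words</b>"
# One properly nested way to get the same visual result.
nested = "<i>foo and <b>bar</b></i><b> are both words</b>"

overlap_error = xml_error(overlapping)
nested_error = xml_error(nested)
```

Here `overlap_error` holds the parser's error message and `nested_error` is `None`, so the strictness costs the author one extra pair of tags.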


  • Re:Absolutely right (Score:3, Informative)

    by 1110110001 ( 569602 ) <<slashdot-0904> <at> <>> on Friday July 20, 2007 @10:38AM (#19926763)
    I copied both specs to a pure text file and the only difference I found is a mark in the header lines:

    -Working Draft -- 28 June 2007
    +W3C Editor's Draft 28 June 2007
    -You can take part in this work. Join the working group's discussion list.
    +This Version:
    +Latest Published Version:
    +Latest Editor's Draft:
    + Ian Hickson, Google, Inc.
    + David Hyatt, Apple, Inc.
    -Web designers! We have a FAQ, a forum, and a help mailing list for you!
    -One-page version:
    -Multiple-page version:
    - multipage/
    -Version history:
    - Twitter messages (non-editorial changes only):
    - Commit-Watchers mailing list:
    - Interactive Web interface:
    - Subversion interface:
    - Ian Hickson, Google,
    +This specification defines the 5th major revision of the core language of the World Wide Web, HTML. In this version, new features are introduced to help Web application authors, new elements are introduced based on research into prevailing authoring practices, and special attention has been given to defining clear conformance criteria for user agents in an effort to improve interoperability.
    +Status of this document
    -© Copyright 2004-2007 Apple Computer, Inc., Mozilla Foundation, and Opera Software ASA.
    +This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at
    -You are granted a license to use, reproduce and create derivative works of this document.
    +If you wish to make comments regarding this document, please send them to (subscribe, archives) or (subscribe, archives). All feedback is welcome.
    -This specification introduces features to HTML and the DOM that ease the authoring of Web-based applications. Additions include the context menus, a direct-mode graphics canvas, inline popup windows, and server-sent events.
    -Status of this document
    +Implementors should be aware that this specification is not stable. Implementors who are not taking part in the discussions are likely to find the specification changing out from under them in incompatible ways. Vendors interested in implementing this specification before it eventually reaches the Candidate Recommendation stage should join the aforementioned mailing lists and take part in the discussions.
    +The latest stable version of the editor's copy of this specification is always available on the W3C CVS server and in the WHATWG Subversion repository. The latest editor's draft (which may contain unfinished text in the process of being prepared) is available on the WHATWG site. Detailed change history can be obtained from the following locations:
    -This is a work in progress! This document is changing on a daily if not hourly basis in response to comments and as a general part of its development process. Comments are very welcome, please send them to Thank you.
    + * Twitter messages (non-editorial changes only):
    + * Interactive Web interface:
    + * Commit-Watchers mailing list:
    + * Subversion interface:
    + * CVS log:
    -Implementors should be a

  • Re:Absolutely right (Score:5, Informative)

    by tolan-b ( 230077 ) on Friday July 20, 2007 @10:41AM (#19926811)

    WHATWG are the group that pitched W3C to consider HTML5. W3C's HTML5 isn't based on anything right now since it doesn't exist yet.
    From the WHATWG list:

    The W3C's HTML working group today resolved to start from the current WHATWG work. Specifically, the group resolved to review our work, and will probably build on it. They also resolved to call this work HTML5. Thus, the "Web Applications 1.0" spec is now officially named "HTML5"! I have also checked a copy of the two main WHATWG specs (but with the W3C boilerplate) into the W3C CVS server. Going forward, any changes will be committed to both the WHATWG and the W3C repositories simultaneously.

    It may include in some form some HTML5 features, but don't delude yourself that W3C will beat the heck out of it, until it's a tortured mix of their XHTML2 standard and WHATWG's HTML5.

    Well seeing as it's starting from their work I rather suspect it will include the bulk of it, because it's highly interdependent.

    Then again you seem to have an axe to grind with the W3C, so don't let me stop you...
  • Re:Absolutely right (Score:5, Informative)

    by Excors ( 807434 ) on Friday July 20, 2007 @11:11AM (#19927159)

    HTML 5.0 = HTML 4 with some new sugar + XHTML parser strictness.

    That is incorrect: the HTML5 parsing algorithm never just stops and returns an error message (like in XML) - it specifies how every single stream of bytes is parsed into a DOM, with error-correction where necessary, in a way that tries hard to be compatible with the ~10^11 existing HTML pages on the web (which, in most cases, means being compatible with the behaviour of IE6).

    Almost all the content on the web today is invalid HTML, and it's never going to go away, which is why the browser developers have been pushing for a specification that describes how to handle invalid content instead of pretending it's not important.
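
To get a feel for "never just stops", here is a small Python sketch. Note the assumption: the standard library's `html.parser` is only a tolerant tokenizer, not an implementation of the HTML5 parsing algorithm, and it builds no DOM. It does share the one property discussed above, namely that malformed input produces events rather than a fatal error. `TagLogger` and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser

# Deliberately sloppy markup: unclosed <p>, overlapping <b>/<i>, bare <br>.
BROKEN = "<p>Unclosed paragraph<b>bold <i>both</b> italic?<br><p>next"


class TagLogger(HTMLParser):
    """Record every start and end tag the tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def tolerant_events(markup: str):
    """Feed markup to the tolerant parser and return the tag events it saw."""
    logger = TagLogger()
    logger.feed(markup)
    logger.close()
    return logger.events


events = tolerant_events(BROKEN)
```

No exception is raised; `events` simply lists the tags in the order they appeared. What HTML5 adds on top of this kind of tolerance is a written rule for how every such sequence turns into a tree, so that browsers agree on the result.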

  • Re:Absolutely right (Score:5, Informative)

    by jalefkowit ( 101585 ) <> on Friday July 20, 2007 @11:15AM (#19927211) Homepage

    You jest, but it is actually that simple. HTML 5.0 = HTML 4 with some new sugar + XHTML parser strictness.

    The result is that browsers will show you the finger if you don't code to the standard.

    I'm a participant in the HTML Working Group and I can tell you that this is incorrect. You're thinking of XHTML2, not HTML 5. XHTML2 has the XML parser strictness and pages will fail to display if they're not well-formed. HTML 5 is going the complete opposite direction, assuming that people will code poorly and defining failure modes for browser vendors to follow when that happens.

  • Re:Absolutely right (Score:3, Informative)

    by jalefkowit ( 101585 ) <> on Friday July 20, 2007 @11:19AM (#19927251) Homepage
    Microsoft has several people participating in the HTML Working Group, and Chris Wilson, the leader of the IE team, is the chair of the group. So you don't have to worry about Microsoft being left out.
  • Re:Cry for relevency (Score:3, Informative)

    by Bogtha ( 906264 ) on Friday July 20, 2007 @11:21AM (#19927283)

    I think a reason that XHTML has not taken off is due to its unforgiving strictness. From what I understand, if you make a single mistake in XHTML the page will not work

    This is not true. There is only one class of errors that causes a fatal error, and that's when the document isn't well-formed. Invalid pages can still be served without tripping the mandatory error-handling.

    for that reason it is not intended to be handwritten.

    No, handwritten is still fine. Handwriting XHTML and then publishing it without any attempt to load it in a browser or check it for errors first is a bad idea though. But then again, it was a bad idea with HTML too.

    My understanding that XHTML requires is that document formatting be separate from the content of the document.

    No, this is also not true. It's considered best practice with both HTML and XHTML, both have document types that enforce this separation, and both have document types that don't enforce this separation. It's your choice.

    Yet sometimes is so much simpler to use a CENTER tag

    XHTML 1.0 has the <center> element type (not "tag").
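
The well-formed versus valid distinction drawn above can be illustrated with a non-validating XML parser. A sketch in Python under one stated assumption: the standard library parser checks well-formedness only, much as the comment describes for XHTML error handling, so it is a stand-in and not a model of any particular browser. The document is a made-up example that puts a list inside a paragraph, which an earlier comment in this thread notes is not allowed.

```python
import xml.etree.ElementTree as ET

XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

# Well-formed (every tag closed and nested), but a <ul> inside a <p>
# breaks the content model, so the document is not valid.
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>demo</title></head>"
    "<body><p>Items:<ul><li>one</li></ul></p></body>"
    "</html>"
)

# Parsing succeeds: well-formedness is all this parser checks.
root = ET.fromstring(DOC)

# A separate, hand-rolled check is needed to notice the content-model problem.
lists_in_paragraphs = root.findall(".//x:p/x:ul", XHTML_NS)
```

The parse raises nothing, and `lists_in_paragraphs` contains the offending element. Catching that kind of mistake is the job of a validator run by the author, not of the mandatory error handling.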

  • by Nom du Keyboard ( 633989 ) on Friday July 20, 2007 @11:43AM (#19927647)
    HTML 5 won't matter until Microsoft almost handles it in Internet Explorer. I'd guess that might happen 5 years after the standard is adopted.
  • Re:Absolutely right (Score:5, Informative)

    by Trails ( 629752 ) on Friday July 20, 2007 @12:03PM (#19928017)
    Chris Wilson is a guy with his heart in the right place working for people who, in the past, put business strategy over standards support (I'm not editorializing, that's what they did). This is why MS's standard support is lame.

    That being said, Chris Wilson (at least) talks the talk, and IE 7 was a (small) step in the right direction.

    The more important, and encouraging, signal imo is MS hiring Standardista Molly Holzschlag. Given her history, I think we can expect more and better from MS on this front in the future.
  • Re:date tag? (Score:2, Informative)

    by 'The '.$L3mm1ng ( 584224 ) on Friday July 20, 2007 @12:05PM (#19928069)
    You got it.

    The input element's type attribute now has the following new values:

    • datetime
    • datetime-local
    • date
    • month
    • week
    • time
    • number
    • range
    • email
    • url
    HTML 5 differences from HTML 4 []
  • Re:Absolutely right (Score:5, Informative)

    by jalefkowit ( 101585 ) <> on Friday July 20, 2007 @12:23PM (#19928349) Homepage

    Wow. I hate you.

    The working group is open to the public and costs nothing to join. If you don't like the state of HTML, come over and help make it better.

  • Re:date tag? (Score:3, Informative)

    by Alomex ( 148003 ) on Friday July 20, 2007 @12:27PM (#19928427) Homepage

    The problem comes when languages get religion. Lisp went from a list based language to list based syntax jihad. Ditto for Pascal and its strictly enforced strongly typed functions and Java and its everything-is-an-object global-variables-are-forbidden jihad.

    SGML as well as the newer versions of HTML are in a format-is-100%-orthogonal-to-content jihad.

    Notice that all of the principles listed above are good and correct. The problem comes from enforcing them too strictly.

  • by porneL ( 674499 ) on Friday July 20, 2007 @04:09PM (#19931841) Homepage

    What CSS needs is a way of defining columns, or a way of gluing DIVs together so it's easier to stack them side by side without running into all the problems you get if you float them.

    You mean like display:table-cell that's part of CSS since 1996, and works reliably in every browser except IE?

    It also REALLY REALLY needs variables.

    For constants use server-side processing (color:%foo% is trivial to implement). For really variable-variables you have DHTML and W3C DOM2 Style.
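
The `color:%foo%` idea mentioned above could look something like this on the server. A minimal Python sketch; the `%name%` placeholder syntax comes from the comment, while `render_css`, the template and the colour value are hypothetical.

```python
import re
from typing import Mapping

# A placeholder is a run of word characters wrapped in percent signs, e.g. %foo%.
PLACEHOLDER = re.compile(r"%(\w+)%")


def render_css(template: str, constants: Mapping[str, str]) -> str:
    """Replace every %name% in the template with its value from constants."""

    def substitute(match):
        name = match.group(1)
        if name not in constants:
            # Fail loudly rather than ship a stylesheet with a hole in it.
            raise KeyError(f"undefined CSS constant: {name}")
        return constants[name]

    return PLACEHOLDER.sub(substitute, template)


TEMPLATE = "a { color: %foo%; }\na:hover { border-color: %foo%; }"
css = render_css(TEMPLATE, {"foo": "#336699"})
```

The stylesheet that reaches the browser is ordinary CSS, so the approach works regardless of what the client supports; the cost is that values are fixed at render time.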

  • Re:Absolutely right (Score:3, Informative)

    by porneL ( 674499 ) on Friday July 20, 2007 @07:48PM (#19934291) Homepage

    HTML 5 revived <embed>, <iframe> and added <video> precisely because <object> has failed. It turned out to be too difficult to implement interoperably -- you have one element that might be an image, a page, a video, an applet or any plugin content, but you can never be sure what it is, because it's dependent on a remote resource and can even change dynamically. It must have a DOM API for all possible content types. It sometimes has intrinsic dimensions, sometimes it hasn't. And on top of it all, Eolas has been given a patent for its most obvious implementation.

    HTML 5 has fully documented parsing, including error handling. So you have the unambiguousness of XML (handling of every possible input is covered by the spec) with the fail-safety of real-world HTML (because users prefer browsers that don't show the Yellow Screen of Death, ever).
