W3C's New XHTML 2.0 Draft A Mistake?
EchoMirage writes "The World Wide Web Consortium (W3C) has been quietly working on drafts for a proposed XHTML 2.0 standard. But some well-known and well-respected web authors are balking at the proposal, because it invalidates several well-used tags. Given that XHTML 1.1 hasn't even seen any wide use yet, and many browsers are still working on basic HTML 4 and CSS1 compatibility, many people are questioning the W3C's push to create new standards before the old ones are solidly in place."
why some people are upset (Score:4, Informative)
I can see why some people are questioning the value of this standards proposal.
Re:why some people are upset (Score:5, Informative)
It was always misused anyway. An acronym is pronounceable.
They were only special cases of the <object> element anyway, which is still there, and far more flexible.
Remember, they are called elements, not tags. The tags are the funny things in angle brackets, elements are the whole things.
The question remains, why are there deprecated elements in a non-backwards compatible markup language?
Re:why some people are upset (Score:1, Troll)
It sounds like no logic, or perhaps, Microsoft Logic.
Serious Question: How much influence does our favorite pet, Microsoft, have in the W3C?
No, I am not trying to bash Microsoft, we all know they are good at some things (like sucking....)
Re:why some people are upset (Score:3, Interesting)
On an organisational level? Little, I believe, it certainly doesn't seem like Microsoft's participation in the w3c is putting forth the typical Microsoft nastiness.
On a specifications level? They were intimately involved in a couple of crucial recommendations, I believe, including CSS (which of course, originated within Opera).
On a day-by-day basis? Tantek Çelik [tantek.com] is usually found hanging around the w3c mailing lists and commenting on current affairs in his blog. He seems like a smart guy, he worked on IE5/Mac, which was one of the first decent CSS1 implementations.
Re:why some people are upset (Score:2, Informative)
Huh? What have you been smoking?
CSS was first proposed in 94. MS got on board first (which is why they're in on the recommendations) and had it in IE3 in 96, the first commercial browser with support. Netscape then gave up on its JavaScript Style Sheets and included CSS in Navigator 4. It wasn't until 98 that Opera included CSS.
The guy that invented CSS (Hakon Lie) later went to work for Opera but that was years after the standard was created and in wide use.
Re:why some people are upset (Score:2)
I wasn't aware of that, I was under the impression that he was working at Opera at the time. Thanks for the info.
Re:why some people are upset (Score:4, Insightful)
I don't.
# acronym tag is gone
Who used this? I've never seen it used. It's demise has been overdue for a long time. I hope they killed its identical twin ABBR too. While you're at it get rid of CODE (computer code), KBD (keyboard input), VAR (program variable), DFN (defining instance), SAMP (sample program output), and ADDRESS (was this supposed to be email or postal or both? It's ambiguity destroys any semantic value).
# q tag is gone
Good. We already have something for this. It's called the quote. It looks like this: "
# cite tag is gone
CITE would be useful in a document, but without any citation fields defined it's useless. Is this a standard citation like APA or MLA [utoronto.ca]? Is it IEEE or ACM? Is it some sort of made up one? Or does CITE just indicate a superscript, parenthetical citation, or bracket citation? Without these fields denoted you can't do anything with it.
# img tag is gone (yes, really!)
It's handled by OBJECT, after all this is the HyperText Markup Language. Layout and textually there's no difference between an image and an applet or plugin.
Which brings us to...
# applet tag is gone (also really!)
Good. Why should Java be handled differently from Shockwave or Flash? APPLET is better replaced by OBJECT. IMG can also be handled by OBJECT as well.
# br tag is deprecated
This is a style markup, not unlike U, TT, SUP, SUB, STRONG, STRIKE, SMALL, S, I, EM, CENTER, and B.
BR will be replaced with LINE, which encloses a single line of text.
# h1 thru h6 are deprecated
They were only sort of used properly. People knew they were headings, but they tended to pick them based on how they looked rather than what they meant. By deprecating these, they can be replaced by something with stricter semantics.
Re:why some people are upset (Score:2)
Neither. How about you read the specification [w3.org]? It certainly doesn't seem very ambiguous.
Re:why some people are upset (Score:2)
I did. My problem with it is "What constitutes 'contact information'?", and what kind of information do you put in it. "Whatever you want" really isn't useful in a generic (cross document) sense.
Re:why some people are upset (Score:1, Informative)
Differentiating speech marks from quote tags is useful for different languages.
Re:why some people are upset (Score:1)
Talking about destroying semantic value ... Do you know that this probably is the most annoying error for non-native speakers of English?
There's more than a mile between a new subject / verb and a possessive pronoun.
http://www.wsu.edu:8080/~brians/errors/its.html
English is quite an easy language to learn and use. Please don't make it harder than necessary by voluntarily destroying its assets.
Thanks.
b
Re:why some people are upset (Score:1)
Good. We already have something for this. It's called the quote. It looks like this: "
The graphic representation of the quote depends on the language.
In French, a quoted text might look like this: << quoted text >>
The <q> tag was supposed to be rendered with the correct representation according to the user's language preference.
Re:why some people are upset (Score:2)
So the quotation marks are "translated", but the text itself isn't? I fail to see the advantage. After all, when I'm looking at Japanese or Chinese I know quotes are >. When I'm viewing Spanish I know what the inverted ! and ? mean. My problem isn't that I can't understand the punctuation, it's that I can't understand the language the text is written in.
<q> is a solution for a problem that doesn't exist.
Misguided, not mistaken (Score:4, Insightful)
I'd agree with Zeldman [zeldman.com] on this one. They need to rename it to something else, and do some sort of forking, because switching terminology in midstream (for no appreciably visible reason) is simply not a good idea. I can see that it makes a certain amount of logical sense to convert images to objects, etc. ... but getting rid of H* tags and, as Mark mentioned, CITE? There isn't really anything to replace that kind of semantic markup, which is unfortunate.
I do remember reading some other stuff by IBM [ibm.com] on this, and it was a fairly cogent explanation of the reasoning, but I still don't agree with the theories behind the changes.
I wouldn't mind most of the changes myself, but it will take decades for this standard to replace the current version if they drop compatibility. I'm sure that a few days afterwards Mozilla will have support for it, but the massive numbers of people who haven't yet upgraded to CSS-capable browsers tell me that we might see a few sites using XHTML 2 in, say, 2010.
Re:Misguided, not mistaken (Score:2, Informative)
First off, CSS support has been around since IE4 NS4.
Second, if you have Windows 98, you have IE4.
Third, Mozilla simply doesn't follow all CSS properties.
Re:Misguided, not mistaken (Score:2, Informative)
First off, CSS support has been around since IE4 NS4.
Limited yes, but it is there.
Second, if you have Windows 98, you have IE4.
Sorry, not everyone does.
Third, Mozilla simply doesn't follow all CSS properties.
And neither does any other browser on the market at this time. However, aside from the horrible float bugs in Mozilla, [mozilla.org] it does have the best CSS in any browser right now.
Re:Misguided, not mistaken (Score:1)
Sorry, not everyone does.
Okay, fine, be that way. If you have an operating system that was created less than 6-7 years ago, you have a capable browser. Unless you use a text-based browser or promptly downgraded in some odd fashion. There.
However, aside from the horrible float bugs in Mozilla, it does have the best CSS in any browser right now.
IE doesn't have that bug. IE has a CSS bug, and it has to do with bordering. Honestly, floating is a LITTLE bit of a bigger deal than not having the right pixel count on a box.
Re:Misguided, not mistaken (Score:1)
This is true, but it does have its fair share of CSS bugs [richinstyle.com] as well. It's a bit more than one bug.
Re:Misguided, not mistaken (Score:5, Informative)
If you had ever read the HTML 4.01 spec fully, you would have noted that it suggests the object element should be used for all embedded objects. To quote w3c [w3.org]:
And now they have removed IMG and APPLET. Are you surprised?
As for CITE, as many have already noted, that was simply an authoring mistake made while writing the draft, and it has already been announced on the mailing list that it'll be back in the next draft.
I think that anybody who says h1-h6 should be kept instead of the SECTION and H elements simply doesn't know what she is talking about. The problem with Hx is that one can skip levels. If the headers were numbered by default it would be highly visible. I know many slashdotters start their web page with H3 simply because "it looks better than that god-awful-huge h1". If the first header said "1.1.1. I'm the first header" even an idiot author would think something smells fishy. In addition, there's no way to tell where one section of a document ends. The new way is to enclose sections inside SECTION elements. Those sections can be nested, and a section can contain a header enclosed in an H element. The browser then uses the nesting depth of the SECTIONs that enclose a given H element to decide the size the header should be rendered at.
Example:
<section><h>header1</h>foo
<section><h>header2</h>bar</section>
baz</section>
Note that "baz" belongs to logical section named "header1". You cannot represent a structure like this with H1-H6.
One possible rendering could be:
header1
foo
| header2
| bar
baz
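For anyone who wants to see the nesting rule in action, here is a rough Python sketch of how a user agent might derive a heading level from the SECTION depth. It is only an illustration: the function name heading_levels is made up, it assumes the markup is well-formed XML with no namespace declared, and it says nothing about how a real browser would do it.

```python
import xml.etree.ElementTree as ET


def heading_levels(markup):
    """Return (level, text) pairs for every <h> element.

    The level is the number of <section> elements enclosing the <h>.
    Hypothetical helper: assumes un-namespaced, well-formed markup.
    """
    root = ET.fromstring(markup)
    found = []

    def walk(element, depth):
        if element.tag == "section":
            depth += 1  # entering one more level of nesting
        elif element.tag == "h":
            found.append((depth, (element.text or "").strip()))
        for child in element:
            walk(child, depth)

    walk(root, 0)
    return found


# The same structure as the example above.
SAMPLE = (
    "<section><h>header1</h>foo"
    "<section><h>header2</h>bar</section>"
    "baz</section>"
)

for level, text in heading_levels(SAMPLE):
    print("  " * (level - 1) + f"{level}: {text}")
```

The author never picks a level; it falls out of the structure, so there is no way to skip one.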
But the truth is, none of this really matters. People will continue to author HTML 3.2, stick an XHTML2 DOCTYPE identifier on it and serve the documents as text/plain, regardless of what the possible final spec will say. And they will complain if their cool frameset doesn't work. Those few of us who care to follow the recommendation can easily do it. Simply change those few remaining IMG elements to objects and upgrade your document structure to SECTION and H and you're pretty much done. If you feel adventurous you could start authoring XHTML2 pages immediately and simply write the missing rendering instructions in the user stylesheet. Yeah, it might work in Mozilla only, but that can be used for testing and as a real world prototype.
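The "change those few remaining IMG elements to objects" step could be mechanised with something like the sketch below. Treat it as an assumption-laden illustration: img_to_object is a made-up name, the input must already be well-formed and un-namespaced, and the data attribute and fallback-content pattern are taken from the HTML 4.01 OBJECT element, so the attribute names in the XHTML 2 draft may differ.

```python
import xml.etree.ElementTree as ET


def img_to_object(root):
    """Rewrite every <img> under root, in place, as an <object>.

    src becomes data (the HTML 4.01 attribute name, assumed here) and
    the alt text becomes the fallback content of the object.
    """
    # Take a snapshot first so renaming tags cannot disturb iteration.
    for img in list(root.iter("img")):
        src = img.attrib.pop("src", None)
        alt = img.attrib.pop("alt", "")
        img.tag = "object"
        if src is not None:
            img.set("data", src)
        img.text = alt  # shown when the object cannot be rendered
    return root


page = ET.fromstring('<p><img src="logo.png" alt="Company logo" /></p>')
print(ET.tostring(img_to_object(page), encoding="unicode"))
```

A nice side effect is that the fallback is real content rather than an attribute value, so it could carry markup of its own.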
If you want to bitch about something it should be that XForms are too complex, but anything is better than current HTML forms.
Standards . . . HAH! (Score:1)
oops, just used a <br
Damnitall, we'll be seeing websites with HTML 4.0 forever.
Microsoft, Mozilla, Opera, KDE, etc., etc.. Screw the interface of your browser! Just get the damned thing to display pages consistently with the law of the internet! [w3.org]
Then again, the law has no teeth, you abide because "you're supposed to."
Or that's what I get from it at least.
Re:Standards . . . HAH! (Score:3, Informative)
I don't believe the w3c has ever claimed xhtml as a standard. Or any of their html specifications either. I think the only "standard" html is ISO-HTML [cs.tcd.ie].
Comment removed (Score:3, Interesting)
new standard (Score:4, Interesting)
maybe we shouldn't try to fix HTML.
perhaps it's just time to
screw all backward compatibility
and make a
small
simple
modular
markup language
from scratch
with a well defined way to do scripts and other dynamic things from the beginning
Re:new standard (Score:1)
Re:new standard (Score:2)
Re:new standard (Score:1)
Missing the point (Score:5, Insightful)
XHTML 1.0 made HTML 4.0 into an XML language, by requiring that you close all tags you open and quote all attributes, and that's pretty much it. Anyone who has ever tried to write code to manipulate HTML in middleware or on the client end will see the point of making sure that the markup is syntactically well-formed.
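To make the well-formedness point concrete, here is a small Python sketch that hands a few fragments to a stock XML parser. The helper name is_well_formed is invented for the example, and it only checks well-formedness, not validity against any XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, else False."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# HTML 4 habits that an XML parser rejects outright:
print(is_well_formed("<p>line one<br>line two</p>"))    # unclosed <br>
print(is_well_formed("<a href=index.html>home</a>"))    # unquoted attribute

# The XHTML 1.0 spellings of the same fragments:
print(is_well_formed("<p>line one<br />line two</p>"))
print(is_well_formed('<a href="index.html">home</a>'))
```

Once the parser either accepts the document or refuses it, middleware no longer has to guess what the author meant.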
XHTML 1.1 with modularization breaks XHTML up into modules; modularization is a key idea in building extensible systems of any type, and it is put to good use in HTML. You get a tables module, a forms module, etc. There is now modularization for both DTD-based XML and XML Schema-based XML, so you can get your job done either way.
But the real goal is to make the modules well enough defined and have the semantics and the presentation and the underlying infrastructure (what is a link, when does it get followed, what are events, etc) all well enough separated so that HTML can become just another application of XML, without a lot of special knowledge hard-coded into browsers about what "i m g" or "b r" or "r e l" means. In other words, the goal is to define modules of functionality in XML, and then be able to use those together with other modules (things like SVG) in kind of the same way that people can use class libraries in PHP and Java and other systems, without having to have someone write a new browser for every possible combination.
Remember how years ago during the browser wars, vendors kept inventing their own tags and writing browsers that understood the tags? Well, now we've got enough of all that figured out to be able to factor the web and its interfaces into a bunch of different parts, and then let you mix and match those parts together to make your own cogent language. You use CSS for presentation styling, XML Events for events, and markup for semantically describing the content. If you have to build your own language for displaying your weblog and call it RSS, well you just go do that and you put your tags in a namespace named by your favorite URI, and you go off. You and your friends (and enemies) can write CSS style sheets or XSLT transformations or what-have-you to display the resulting pointy bracket file in a browser, and it will look and act indistinguishable from today's HTML, but will offer other advantages -- blind people will be able to use their style sheets for reading it, and programs will be able to parse the format without having to screen-scrape the HTML, and you won't have to have six versions of it around for different devices and so on.
So after we move all of the essential stuffness of the web (events, hyperlinks, object embedding, forms, styling) into their own standards and get browsers and other user agents to hard-code those implementations, what you're left with is a need for a common semantic markup language where things are clearly expressed and easy to write.
That's where tags like <section> and <h> come from, and why they make perfect sense as transitions from <h*>.
In summary, XHTML 2.0 is just the meaning-laden parts of web pages that are left over after all of the plumbing has been moved out to other specs, where it can be shared.
Wrong area of focus? (Score:3, Interesting)
I'm all for using XHTML, and have been doing so for at least a year now. Mostly the Transitional part, but it's still XHTML. However, these new standards are being defined much too fast for the real world to catch up. Backwards compatibility will really go away with 2.0, so it will be YEARS until major sites are fully compliant.
Might I suggest that the focus move to stuff like, say, SOAP? It's a good little proposal, but the W3C moves SOOOO slowly there that Microsoft and other large companies just go ahead and implement their own extensions, which will then find their way into the standards later - much like the chaos that was HTML 3.2 (shudder!). The end result? Crappy standards, to the detriment of most of us.
So if anyone from the W3C happens to be reading this (not likely, I know), *PLEASE* focus your energy on where the action is *AT THE MOMENT*.
It could be you (Score:2)
Otherwise the way they conduct their business is kind of none of your business.
how things change ... (Score:3, Interesting)
It wasn't so long ago that the W3C couldn't keep up with the pace of change. Netscape and Microsoft were adding elements like the dreaded <BLINK> and features like tables, while the HTML DTD's languished in draft form. Now the W3C are the ones pulling ahead, while everyone else struggles to implement the last generation of their specifications.
Chris
Re:how things change ... (Score:1)
Fact is, Netscape never got around to implementing HTML 2.0 or 3.0 fully before adding crap like and layers. To say that they were pulling ahead is to bend the truth.
This is what XHTML 1.0 should have been (Score:2, Interesting)
I'm glad that XHTML 2.0 is no longer backwards compatible. Given that it's not, fixing known problems with the HTML vocabulary is a good idea.
Why do I dislike XHTML 1.0's backwards compatibility? XHTML 1.0 encouraged authors to serve XHTML as text/html, the same MIME type as legacy HTML. Furthermore, it didn't provide any guidelines for how browsers should decide whether something served as text/html should be handled using an XML parser. (Had XHTML 1.0, right from the start, decreed that any HTML document beginning with "<?xml " be treated as XHTML, the problem might have been avoided.) Some authors (although not that many) started writing XHTML right after the spec came out, thinking it was the cool new thing to be doing. This meant that there was already a good bit of invalid XHTML as text/html on the web before any user agents could start parsing it as XML and enforcing the strict error handling that is one of the main advantages of XML.
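The rule in the parenthesis is easy to state in code. This Python sketch is purely hypothetical, since no spec ever decreed it: looks_like_xhtml is a made-up name, and a real rule would also have had to say what to do about byte order marks and non-ASCII encodings.

```python
def looks_like_xhtml(body: bytes) -> bool:
    """Hypothetical sniffing rule for documents served as text/html.

    Anything that starts with an XML declaration would go to the strict
    XML parser; everything else would stay on the legacy HTML path.
    """
    return body.startswith(b"<?xml ")


print(looks_like_xhtml(b'<?xml version="1.0"?>\n<html></html>'))
print(looks_like_xhtml(b"<html><body><p>tag soup</body></html>"))
```

The appeal of a rule like this is that authors would opt in to strict error handling explicitly, one document at a time.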
This is a good thing (Score:2)
1) Most of these changes have been anticipated for a long time. applet has been deprecated since HTML 4.0. The functionality of the H/Section tags is much better than H1-6. The formatting aspects of H1-6 can easily be accomplished through CSS. The same can be said for all of the semantic tags like cite, acronym, quote and so on. Use XML to create your own tags and CSS to give them formatting. Then use whatever engine, it may not be a browser, to handle your new tags in whatever way is necessary. There are so many possible specific instances that browsers can't, and shouldn't, be required to handle all of them.
2) Knowing that it will take a while before the standards are adopted, whatever they are, is all the more reason to come out with standards that are so different from the current standards. If you try to make small incremental fixes through standards recommendations, you will languish in constant browser lag time and nothing substantial will change in a uniform way. Instead you will get browser manufacturers creating their own tags to handle these situations and they will undoubtedly be incompatible with other browsers.
This is a good thing. Prepare for the standards to be implemented now, so you won't be caught off guard when browsers finally start implementing them.
It is not just about technology (Score:1)
DOCTYPE (Score:2)
That's what Content-Type: text/html
and <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
are for.
If you have just validated your website then the DOCTYPE will be present, requiring no more effort on your part.
HTML has evolved into itself.
That doesn't preclude an XHTML Revolution.
Don't worry (Score:2)
w3c standards always read like RFCs written by martians, as if RFCs weren't hard enough to read.
W3C can _say_ whatever they want but... (Score:1)