Microsoft

Stephane Rodriguez Dismantles Open XML

Elektroschock writes "Stephane Rodriguez, a reengineering specialist who became popular for his article on MS Office 2007 binary data, now comprehensively debunks Microsoft's new Open XML format. With small case studies he demonstrates the impossible challenges third-party developers will face. His conclusion: it is 'defective by design.' Next week members of the International Organization for Standardization (ISO) are likely to approve the format as a second official ISO standard for office documents, even though most nations have submitted comments. Rodriguez claims he is 'not affiliated to any pro-MS or anti-MS party/org[anization]/ass[ociation].'"
This discussion has been archived. No new comments can be posted.

  • by tsa ( 15680 ) on Sunday August 26, 2007 @08:13AM (#20361263) Homepage
    This is not proof of OOXML being defective by design. It only shows that apparently MS's software isn't able to handle OOXML properly.
  • by darkatom ( 94914 ) on Sunday August 26, 2007 @08:26AM (#20361313)
    But that's still a problem. Microsoft's implementation becomes the de facto standard and all others must (attempt to) conform to the behavior of that implementation or be judged defective. This is what happened when MS published the MAPI (Mail API) spec and then released an implementation alongside it. Lotus and others could never fully mimic what the MS implementation did, so they eventually languished.
  • by setagllib ( 753300 ) on Sunday August 26, 2007 @08:33AM (#20361349)
    It's deliberate. The standard is just a distraction, to keep competitors busy trying to implement it, while documents are actually being created in the Office 2007 variant of OOXML. A few months of legacy almost guarantees a transition to the real OOXML would be an uphill battle, especially with no real documentation of how *either* format works. So even with a supposed 'standard' and a near-enough implementation, the vendor lockin is just as strong as it was with the binary formats.
  • by edxwelch ( 600979 ) on Sunday August 26, 2007 @09:03AM (#20361499)
    You are correct.
    That's why the title says "Microsoft Office XML Formats? Defective by design"
    not "OOXML defective by design"
    He is dissing Microsoft's claims of transparency and openness for Microsoft Office XML.
  • ISO Credibility (Score:1, Insightful)

    by Anonymous Coward on Sunday August 26, 2007 @10:16AM (#20361841)
    We all clearly benefit from international open standards. It's also clear that a central coordinating authority like ISO can expedite widespread adoption of such standards by virtue of being even-handed impartial experts on such matters. However, when ISO starts publishing so-called "standards" that are nothing more than paid-for advertising, the credibility of everything they do is called into question. If ISO is unable to withstand political and financial pressure, and we can no longer depend on them to impartially adjudicate the standards making process, then ISO has become irrelevant, at least as far as the IT industry is concerned. The ISO directors have a choice: they can either be paid shills, or a respected standards body - but not both.
  • This is not proof of OOXML being defective by design. It only shows that apparently MS's software isn't able to handle OOXML properly.

    If Office can't read OOXML files produced by other tools, and other tools can't read Office OOXML files, where do you suppose end users will place the blame?

    And what do you suppose users will do when faced with incompatibilities?

    It's a brilliant strategy: Define a new "standard" but don't quite implement it yourself, ensuring that no one can implement a competitive office suite that is compatible with yours. Further, make the standard complex and weird enough that you can always blame inconsistencies on the other implementations. Voila! You get to proclaim to the world that your de facto standard office suite supports an open, ISO-blessed international standard format -- but with no worries about losing your lock-in.

  • by SwashbucklingCowboy ( 727629 ) on Sunday August 26, 2007 @10:21AM (#20361871)

    For example, the part about "Entered versus stored values" is certainly valid (though I wonder if that's not a problem with Excel itself, and not the format). The complaint about the date format is also on the money.

    However, other things seem either wrong or have a bias towards hand editing of the files, e.g. "International, but US English first and foremost". He complains that it uses U.S. English settings. He may not like the U.S., but it's called picking a canonicalized format. Consider the alternative for implementing this in software, parsing of the values in the XML would now depend on settings also found in the XML. That would be insane.
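
The canonical-format argument can be made concrete with a minimal Python sketch. The function names and separator parameters are hypothetical and are not taken from the OOXML specification; the sketch only shows that a fixed stored format has one reading, while a locale-dependent one changes meaning with settings that must be read first.

```python
def parse_canonical(text: str) -> float:
    """Parse a stored number that always uses '.' as the decimal separator."""
    return float(text)


def parse_localized(text: str, decimal_sep: str, group_sep: str) -> float:
    """Parse a number whose separators depend on settings read from elsewhere."""
    # Remove grouping characters first, then map the decimal separator to '.'.
    normalized = text.replace(group_sep, "").replace(decimal_sep, ".")
    return float(normalized)


stored = "1.234"
print(parse_canonical(stored))            # one meaning, whatever the locale
print(parse_localized(stored, ",", "."))  # German-style separators: a different number
```

With a canonical format the parser needs no context; with a localized one, the same characters cannot be interpreted until the separator settings are known.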

  • by kabz ( 770151 ) on Sunday August 26, 2007 @10:33AM (#20361945) Homepage Journal

    He may not like the U.S., but it's called picking a canonicalized format. Consider the alternative for implementing this in software, parsing of the values in the XML would now depend on settings also found in the XML. That would be insane.
    Here's a reference to XML DTDs [w3schools.com]. This is exactly what should be used for defining localized formula names etc. With XML, you might not be able to do much with it, but given a 'real', properly defined XML format, it should at *least* be possible to parse all the information in the damn thing!!

    Why use a DTD?

    XML provides an application independent way of sharing data. With a DTD, independent groups of people can agree to use a common DTD for interchanging data. Your application can use a standard DTD to verify that data that you receive from the outside world is valid. You can also use a DTD to verify your own data.

    A lot of forums are emerging to define standard DTDs for almost everything in the areas of data exchange. Take a look at: CommerceNet's XML exchange and http://www.schema.net./ [www.schema.net]
    Where is a DTD referenced? That's right, at the top of the XML file.
  • by Karellen ( 104380 ) on Sunday August 26, 2007 @11:00AM (#20362113) Homepage
    Uh, UTF-8 files do not need a BOM. What the fuck is the point of a byte-order-mark on an encoding that is byte-order neutral?

    One of the advantages of UTF-8 for text files is that you don't need a BOM. With XML it's even easier because, as you point out, the XML declaration ("XMLDecl" in the spec) header can contain the "EncodingDecl" to tell you explicitly that the file is in UTF-8. If the EncodingDecl says UTF-8, and the file is encoded in UTF-8, then if an XML parser cannot handle that, it's seriously fucked and needs to be fixed.

    You might also want to go read STD-63 at some point. It points out that there are a few problems with using BOMs in UTF-8, and that if there is a way for UTF-8 to be determined in a way other than with the use of a BOM, that should be used instead. Given that XML specifically includes support for an "EncodingDecl" in the "XMLDecl", it is clear that best practices dictate that you *shouldn't* use a BOM when working with UTF-8 encoded XML files. Even if your tools _insist_ on writing BOMs to such files, they had *better* still be able to work if the BOM is missing.

    Heck, with OOXML, you could also use the ZIP's manifest file to keep track of file metadata like the character encoding.
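
As a rough illustration of the "work with or without a BOM" point, here is a minimal Python sketch using only the standard library. The helper name `parse_utf8_xml` and the sample document are made up for this example; the helper strips an optional UTF-8 BOM and then lets the parser rely on the encoding declaration.

```python
import codecs
import xml.etree.ElementTree as ET


def parse_utf8_xml(raw: bytes) -> ET.Element:
    """Parse UTF-8 XML bytes, tolerating an optional leading BOM."""
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]  # drop the 3-byte EF BB BF prefix
    return ET.fromstring(raw)


# Hypothetical sample document containing one non-ASCII character.
body = '<?xml version="1.0" encoding="UTF-8"?><note>caf\u00e9</note>'.encode("utf-8")
without_bom = parse_utf8_xml(body)
with_bom = parse_utf8_xml(codecs.BOM_UTF8 + body)
print(without_bom.text == with_bom.text)
```

Both inputs should yield the same element text, which is the behaviour the comment argues any reasonable tool ought to have.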
  • Re:Personally.. (Score:4, Insightful)

    by Tony ( 765 ) on Sunday August 26, 2007 @11:24AM (#20362269) Journal
    Can you show us any evidence he hasn't?

    Yes.

    Didn't you read the original article? Haven't you been following the OOXML story at all? There is every evidence that Microsoft has not changed, and works hard to pervert standards and processes to favor their platform over any other. Not just here, but in other areas, as well. Name one major Microsoft product that follows open, published standards without proprietary deviation. Just one. I dare you.

    Also important to note, Bill Gates isn't running MS anymore.

    No. Ballmer is. Bill Gates is a very smart guy (in business, at least). Ballmer is vicious, and even more cold-blooded than Gates (if that can be possible). And the corporation idolizes Gates. His influence will remain long after he's completely retired from the company.
  • Call me a cynic (Score:3, Insightful)

    by PinkyGigglebrain ( 730753 ) on Sunday August 26, 2007 @12:06PM (#20362577)
    I already know how this is going to turn out.

    OOXML will be voted in as an ISO standard.

    Third-party vendors trying to implement the "standard" will waste time, money and effort and accomplish nothing of import.

    MS will continue as normal, claiming support for open standards while locking anyone they can into formats/software they own.

    ODF will continue as a marginalized format used by people on the "fringe".
  • by Anonymous Coward on Sunday August 26, 2007 @12:29PM (#20362761)

    Microsoft's implementation becomes the de facto standard

    No, I don't think so. It will serve Microsoft's purposes better if they too cannot properly implement the OOXML standard. Then their fully proprietary file formats would continue to be used since no one could trust that an OOXML document hasn't been corrupted by the OOXML save process.

    This is how Microsoft destroyed the nascent RTF standard that the US Navy wanted to use: they implemented it, but gee there were problems in getting it to work right so maybe all you sailor boys should use Word's native file formats until we get things worked out (which never happened).

    Windows just doesn't belong on a battleship or aircraft carrier. You would have thought the US Navy would have known that, but no, they had to go and try it anyway.

  • by TaoPhoenix ( 980487 ) <TaoPhoenix@yahoo.com> on Sunday August 26, 2007 @01:02PM (#20363059) Journal
    Don't forget the delicious language. Instead of the legendary "syntax error", we now get a "catastrophic failure". Do it yourself FUD!

    (Scene at office)
    ComputerGuy: "Sure, let's open that with GoogleApps."
    Colleague: "Why am I getting a catastrophic failure? Maybe I better use Excel."

  • In addition, Excel happens to recover nicely from the lack of data that Stephane complains so loudly about, you just happen to get a warning if the file you feed it happens to be incorrectly formed and even offers you an option to "repair" it.

    Yep. Brilliant, isn't it? Given a horribly complex and incomplete specification, Microsoft can easily blame any problems on the other tools -- and they can do this with a straight face because they'll be right! (Quietly ignoring the fact that their own tool produces non-compliant OOXML). Even better, they can smugly point out how their tools fix the "errors" caused by other crappy tools, even as the text of their messages frightens users away from trying any tool that doesn't come from Microsoft ("catastrophic failure", no less!).

    If MS weren't trying to pull a fast one, they'd have designed a more reasonable format, one that does make it practical to make small edits to the XML and expect reasonable results or, even better, used an existing standard like ODF. If ODF can't fully represent all facets of Office documents, the format has a well-defined technical and procedural path to add any necessary extensions.

    By way of comparison, try the same series of experiments with a .ods document, using any of the handful of available applications that supports it, and you'll quickly see how a format that is designed to be straightforward, accessible and specifiable in less than 500 pages compares to the brilliantly-executed monstrosity that is OOXML.

  • by Anonymous Coward on Sunday August 26, 2007 @04:48PM (#20364979)
    Here he comes to save the day! It's WonderMiguel. Always ready to come to the defense of Microsoft.

    Otherwise, horrible things could happen, like ODF could be used instead, or it could be extended to include stuff in OOXML and then the world would have one unified standard, instead of two that even experts can't use and that are not interoperable. We couldn't have that.

    So the entire FOSS world wishes to thank Miguel for helping Microsoft keep its users locked in. Hey, man, what kind of game are you playing?
  • by Jeremy_Bee ( 1064620 ) on Sunday August 26, 2007 @04:49PM (#20364991)

    However, other things seem either wrong or have a bias towards hand editing of the files, e.g. "International, but US English first and foremost". He complains that it uses U.S. English settings. He may not like the U.S., but it's called picking a canonicalized format.
    This is offensive bull.

    I don't think you intended it that way, but you should be aware of the vast number of people you just insulted. US English and US dates are only "canonical" in the minds of US citizens. If not for Microsoft purposely and determinedly screwing up the implementation of anything but US standards in their software the usage would have no traction at all.

    The majority of the "English speaking" world still uses the English language and English formats and standards, not US variant ones. The fact that the USA has seen fit to re-invent English, still refer to that as English, and then foist it on the rest of the world doesn't make it "canonical."

    As the author of this article so aptly describes, date formats and language implementations are a multi-stage nightmare in Office, to the point that the majority of users, even in English-speaking countries like Canada, Australia, New Zealand and the UK itself, often end up using American English and American dates simply because Office is the only game in town and you can only bash your head against the wall on these things for so long. That doesn't make it right, and that doesn't mean that those users wouldn't be happier and more productive if they were not forced to use a US standard when they may not have even traveled to the US.

    Any kind of English except the US variant is severely broken in Office and always has been. Your answer sounds to me a lot like: "So what, they should all be using our standards and language anyway." Not helpful at all, and illogical as well.
  • by harlows_monkeys ( 106428 ) on Sunday August 26, 2007 @05:00PM (#20365079) Homepage
    What he should have done for the first example is take the original document, change that one cell from a formula to a constant, like he was trying to do in his by-hand edit, save out that document, and show the differences between it and his by-hand document. That would show us just what has to be changed to keep Excel happy.

    Then we could judge if his example is reasonable or not. I realize we could all do this ourselves, but I for one am not going to go out and buy Excel 2007 just to do that!
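
The comparison proposed here could be scripted along these lines. This is only a sketch: it assumes both documents are ordinary ZIP packages whose parts are UTF-8 text, and the file names in the usage note are hypothetical. It relies solely on Python's standard `zipfile` and `difflib` modules.

```python
import difflib
import zipfile


def read_part(package: zipfile.ZipFile, name: str) -> list:
    """Return a part's text as lines, or an empty list if the part is absent."""
    try:
        data = package.read(name)
    except KeyError:
        return []
    # Break between adjacent tags so single-line XML still diffs tag by tag.
    text = data.decode("utf-8", errors="replace").replace("><", ">\n<")
    return text.splitlines()


def diff_packages(path_a: str, path_b: str):
    """Yield unified-diff lines for every part that differs between two packages."""
    with zipfile.ZipFile(path_a) as za, zipfile.ZipFile(path_b) as zb:
        for name in sorted(set(za.namelist()) | set(zb.namelist())):
            a, b = read_part(za, name), read_part(zb, name)
            if a != b:
                yield from difflib.unified_diff(
                    a, b, fromfile=f"a/{name}", tofile=f"b/{name}", lineterm=""
                )
```

Something like `print("\n".join(diff_packages("saved_by_excel.xlsx", "edited_by_hand.xlsx")))` would then list which parts differ and how, where both file names are placeholders for the two documents described above.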

  • Not so much. (Score:3, Insightful)

    by SanityInAnarchy ( 655584 ) <ninja@slaphack.com> on Sunday August 26, 2007 @05:49PM (#20365503) Journal
    He wanted to remove a formula from a given cell. His first attempt was to simply remove the formula and change the value.

    Instead, he has to go update all the reference and dependency information, which programs have to generate and update all the time anyway. I can't really think of a good reason this information needs to be saved to disk, and I certainly can't think of a good reason that Excel deletes the cell, rather than updating the dependencies itself to reflect the physical document.

    In fact, I can't think of a good reason to store the value alongside the formula, except as an optional cache, which a program can recalculate if needed.

    They are using XML in the first place. The point of XML is interoperability and human-readability/editability, not performance.
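
The "optional cache" idea can be sketched in a few lines of Python. This is a toy model, not how Excel or the OOXML format works: the `Cell` class, its fields, and the sum-only formulas are all invented for illustration, and there is no cycle detection.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Cell:
    refs: Optional[List[str]] = None  # if set, the cell is the sum of these cells
    value: Optional[float] = None     # a constant, or a cached result


def evaluate(sheet: Dict[str, Cell], name: str) -> float:
    """Return a cell's value, recomputing formula cells instead of trusting the cache."""
    cell = sheet[name]
    if cell.refs is not None:
        cell.value = sum(evaluate(sheet, ref) for ref in cell.refs)
    return cell.value


sheet = {
    "A1": Cell(value=2.0),
    "A2": Cell(value=3.0),
    "A3": Cell(refs=["A1", "A2"], value=999.0),  # deliberately stale cache
}
print(evaluate(sheet, "A3"))    # recomputed from A1 and A2; the stale cache is ignored

sheet["A3"] = Cell(value=10.0)  # the "hand edit": swap the formula for a constant
print(evaluate(sheet, "A3"))
```

In this model, replacing a formula with a constant needs no separate dependency bookkeeping, because dependencies are derived from the formulas on load rather than stored alongside them.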
  • by QuestorTapes ( 663783 ) on Sunday August 26, 2007 @06:34PM (#20365967)
    > But that's still a problem. Microsoft's implementation becomes the de facto standard
    > and all others must (attempt to) conform to the behavior of that implementation or
    > be judged defective.

    It's worse than that. Since MS defines a number of aspects of the specification solely
    in terms of compliance with MS application software, the MS implementation is not only
    the -de facto- standard, but the very explicit standard. Not only can no one conform
    to a sufficient level to be judged compliant in the marketplace, for all contractual
    specifications, -nothing- but MS software can -ever- be 100% compliant.

    This means that on big, contract-driven projects, such as many government projects, MS
    and vendors using MS tools are effectively the only possible competitors, unless
    the contracts and specifications specifically waive vendor compliance with those
    parts of the spec.

    And I strongly doubt anyone would ever write a contract like that.
