Dark Corners of the OpenXML Standard



Posted by CowboyNeal on Friday January 05, 2007 @12:54AM from the dared-to-comply dept.

Standard Disclaimer writes "Most here on Slashdot know that Microsoft released its OpenXML specification to counter ODF and to help preserve its market position, but most people probably aren't aware of all the interesting legacy code the OpenXML specification has brought to light. This article by Rob Weir details many of the crazy legacy features in the dark corners of OpenXML. As it concludes after analyzing specification requirements like suppressTopSpacingWP, 'so not only must an interoperable OOXML implementation first acquire and reverse-engineer a 14-year old version of Microsoft Word, it must also do the same thing with a 16-year old version of WordPerfect.'"

This discussion has been archived. No new comments can be posted.


It's not a true standard... (Score:4, Funny)

by __aaclcg7560 ( 824291 ) writes: on Friday January 05, 2007 @12:58AM (#17469738)

Until it supports WordStar [wikipedia.org] documents.

- Bah! (Score:3, Funny)
  
  by mrchaotica ( 681592 ) * writes:
  
  The strings command supports all legacy document formats! What more could you possibly need? Besides, formatting is overrated anyway...
  - - Re: (Score:2)
      
      by MrMr ( 219533 ) writes:
      
      pdftotext
      
      ( http://www.glyphandcog.com/index.html [glyphandcog.com] )
Length (Score:4, Funny)

by jcnnghm ( 538570 ) writes: on Friday January 05, 2007 @01:08AM (#17469792)

I don't know why anyone would complain, the spec is only 6,000 pages long.

- Size (Score:5, Funny)
  
  by Kadin2048 ( 468275 ) writes: <slashdot.kadin@xox y . net> on Friday January 05, 2007 @01:30AM (#17469924) Homepage Journal
  
  I don't know why anyone would complain, the spec is only 6,000 pages long.
  
  And the best part is, these [umn.edu] are the pages it uses... (I mean, why else do those specs cost so much?)
  
  - Re:Size (Score:4, Funny)
    
    by SuluSulu ( 1039126 ) writes: on Friday January 05, 2007 @06:20AM (#17471326)
    
    6,000 pages is nothing! Try reading The Wheel of Time series.
    
    - - Re: (Score:3, Insightful)
        
        by h4rm0ny ( 722443 ) writes:
        
        I read the back cover. Looked derivative. Put it back.
  - Re: (Score:2, Funny)
    
    by Squigley ( 213068 ) writes:
    
    Ow! I got a paper cut, and now I need a prosthetic arm!
- Re: (Score:2)
  
  by MillionthMonkey ( 240664 ) writes:
  
  Well, if you were to start implementing that spec at the rate of one page per hour, you'd be done in just 6000 hours.
  - Re: (Score:2)
    
    by Heir Of The Mess ( 939658 ) writes:
    
    The company boss would say why not just give 8 pages each to 750 developers and by the end of the day we should have a fully working product.
    
    While this is ridiculous, I'm sure the spec could be broken up into specs for a few different modules. After all, if Microsoft wrote the spec, and has implemented the spec, then how difficult could it be?
    I once spent 18 months writing a 3000 page spec, and it only took a team of 5 another year to implement it. Of course since then whenever someone asks me if I would li
    - Re: (Score:3, Funny)
      
      by chthon ( 580889 ) writes:
      
      Yes, but modular programming is antithetical to Microsoft's way of doing things.
    - Re: (Score:2, Insightful)
      
      by redcane ( 604255 ) writes:
      
      I think they may have implemented it, and then made a spec to take into account their horrible implementation.
    - Re: (Score:3, Insightful)
      
      by 99BottlesOfBeerInMyF ( 813746 ) writes:
      
      After all, if Microsoft wrote the spec, and has implemented the spec, then how difficult could it be?
      
      Did you read the article? Some of the spec is things like "do what MS Word 5.1.4 did with line spaces." How exactly is anyone other than MS supposed to implement that? By reverse engineering a whole slew of old products that are not even available on the market anymore?
      I once spent 18 months writing a 3000 page spec, and it only took a team of 5 another year to implement it.
      That's fine but this spec isn
- MS are slow learners (Score:4, Interesting)
  
  by WebCowboy ( 196209 ) writes: on Friday January 05, 2007 @03:14AM (#17470484)
  
  ...but they do learn....slowly...eventually.
  
  Their "open" XML format for office docs is a prime example of this.
  
  I think Steve Jobs was the one who first said "Microsoft just doesn't get it". Microsoft was probably the very first third-party software developer for the Mac and this was Jobs' reaction to Microsoft's first Mac applications (I think a port of Multiplan--which was re-incarnated into Excel IIRC, and MSBasic). They really WERE "tasteless", ugly and took almost no advantage of the revolutionary GUI interface--their DOSness really showed through--I think in the case of Multiplan the mouse could be used only to jump the cursor to a certain cell and that was it--the rest was all like in DOS.
  
  MS Windows is another example--Microsoft didn't "get it" well enough until the third major release. Now MS is SLOWLY "getting it" with the beneficial characteristics of XML standards. Microsoft's early XML efforts are like Windows 1.0--there is some very rudimentary understanding of the mechanics but not the philosophy of XML, and I wonder if this is why SOAP ended up NOT so simple (given Microsofties were involved in its creation and seemed to be trying to make it a DCOM-in-XML-but-dumber thing). Microsoft's "Version1" XML might look like this:
  
  <Soap:Envelope>
    <Soap:Body>
      <wsWriteLegacyData>
        <encodedBinaryData>
          SDFgkdfkljSDFJLDFSJKLkjdfbks df jklsdfklj;hk/jkjnb.kndf jk.sdfjkldfsddfsdfkkjsdfh kvbkjnkjkjksdfkjsdfkeuieru903 oijooeoefvkmefmklef lmkseflkvfeklmlmermklemleflmdvldflk
        </encodedBinaryData>
      </wsWriteLegacyData>
    </Soap:Body>
  </Soap:Envelope>
  
  "See? We're using XML and SOAP! We're hip! We're cooool! You can't say we don't play by the rules now!"
  
  Of course, this is an obtuse, opaque and obfuscated way to use XML and totally NOT in the spirit of interoperability and openness. I won't even go into the nifty XML tools MS has made...nifty to use but they've done a lot to obliterate the S out of SOAP in their crazy output.
  
  The OOXML (Opaque and Obfuscated XML) standard is "version 2.0"--they're doing their best to eliminate ambiguity but now we've gone over to hyper-specificity, and the standard is being shared a bit better...problem is that they don't fully describe the interpretation of the standard elements so as to keep its advantage. All they've done is taken every formatting option and mapped it to an XML element--it is monolithic and completely non-extensible. But hey, at least it's publicly available and doesn't involve weirdness like encoded-binary-blobs.
  
  In a few years MS will reach version 3.0 of "getting" XML...
  
The author is exactly right. (Score:4, Insightful)

by JoshJ ( 1009085 ) writes: on Friday January 05, 2007 @01:09AM (#17469800) Journal

This is why the Microsoft Office XML (let's not kid ourselves, this is far from "open") format should not become an ISO standard.

- Re: (Score:3, Interesting)
  
  by Zaiff Urgulbunger ( 591514 ) writes:
  
  Totally agree. I wonder how it managed to get approved by ECMA? IIRC only IBM didn't agree to its approval; all other parties (whoever they are) agreed. I don't understand what they felt was good about this "standard" especially given that ODF had already been approved.
  - Re: (Score:2, Funny)
    
    by Helldesk Hound ( 981604 ) writes:
    
    I always thought the ECMA was something to be purchased. ;o)
  - Re: (Score:3, Informative)
    
    by mwvdlee ( 775178 ) writes:
    
    Nobody takes ECMA seriously anyway.
    
    You probably know that JavaScript has been standardized as "EcmaScript" by ECMA; everybody just ignores that standard.
- Re:The author is exactly right. (Score:5, Insightful)
  
  by _|()|\| ( 159991 ) writes: on Friday January 05, 2007 @02:51AM (#17470366)
  
  Prior to reading this article, I was ambivalent about Office XML. The push to standardize Office's "DNA sequence" seemed disingenuous, but at least the format was described in detail. Now I see that the table-sagging 6,000 pages is just the tip of the iceberg: this "standard" effectively includes, by reference, the source code for every prior version of Office, to which only Microsoft has access.
  
- Isn't ODF an updated OpenDoc? (Score:2)
  
  by msobkow ( 48369 ) writes:
  
  I thought ODF was an updated version of the venerable OpenDoc standard pioneered by IBM, Apple, and others. Doesn't it mean "Open Doc Format"?
  If so, it was a de facto industry standard long, long, long before OpenOffice existed.
  - Re: (Score:2, Informative)
    
    by itlurksbeneath ( 952654 ) writes:
    
    No. See OpenDocument [wikipedia.org] and OpenDoc [wikipedia.org]. Two different things. Sort of...
  - Re:Isn't ODF an updated OpenDoc? (Score:5, Informative)
    
    by imroy ( 755 ) writes: <imroykun@gmail.com> on Friday January 05, 2007 @05:34AM (#17471094) Homepage Journal
    
    No. Everyone shortens the expansion of ODF to "Open Doc Format". As you note, there was an "OpenDoc [wikipedia.org]" long before OpenOffice.org and the OpenDocument [wikipedia.org] Format, but they have nothing to do with one another. ODF is based on the StarOffice format that has been around a while, that much is true. Just a sad case of mixing up names...
    
- Disadvantages of ISO (Score:5, Interesting)
  
  by BillGatesLoveChild ( 1046184 ) writes: on Friday January 05, 2007 @03:08AM (#17470444) Journal
  
  Once it is ratified as an ISO Standard, the standard is locked up and anyone who does want a copy has to buy it from ISO. These are copyrighted. They're not cheap; thousands of dollars. Out of the reach of the average hobbyist, and not listed anywhere on the Internet. That 6,000 page draft will vanish into the mists of time.
  
  Larger Companies can afford this, but garage companies and hobbyists definitely can't. So what's the chance of an open source or even small upstart challenging Microsoft's Documentonopoly? Zero.
  Want another example? ISO country codes. The country codes (e.g. .us, .jp) are actually ISO, and ISO ended up backing off on a demand for royalties for this(!) But if you want state codes (e.g. California, Kantou), well, forget it unless you want to buy them off ISO. http://www.alvestrand.no/pipermail/ietf-languages/2003-September/001472.html [alvestrand.no]
  ISO aren't the only ones guilty of doing this. IEEE do it as well. Want the latest simulation standard? Then get out your checkbook: http://standards.ieee.org/catalog/olis/compsim.html [ieee.org]
  ISO and the IEEE are enemies of openness. Microsoft is taking a page out of their gamebook.
  ISO or IEEE certification is a *bad* thing.
  
  - Re: (Score:2, Informative)
    
    by EvanED ( 569694 ) writes:
    
    Once it is ratified as an ISO Standard, the standard is locked up and anyone who does want a copy has to buy it from ISO. These are copyrighted. They're not cheap; thousands of dollars. Out of the reach of the average hobbyist, and not listed anywhere on the Internet. That 6,000 page draft will vanish into the mists of time.
    
    You mean like the C++ standard (ISO:14882) which can be downloaded as a PDF for $32 or purchased hardcopy for something like $300, and for which there are multiple sources for drafts
    - Re: (Score:2, Informative)
      
      by BillGatesLoveChild ( 1046184 ) writes:
      
      They wouldn't get too far gouging you for a C++ manual. Here are some examples of what I am saying: ISO/IEC TR 9126 "Software engineering -- Product quality" US$153 each volume * 4 volumes = US$612; IEEE 1278 US$151 each volume * 6 volumes = US$906. Problem is when you are told your software has to comply with one of these, these are the only shops in town. They prohibit copying or sharing the information. Anyone who wants to meet the standard has to send I$O or I money, and there are many, many of these
The power of legacy systems... (Score:5, Insightful)

by Anonymous Coward writes: on Friday January 05, 2007 @01:09AM (#17469804)

The power of legacy systems is at once both Microsoft's greatest strength and greatest weakness. Nobody in OSS is going to have the patience to rebuild the same level of backwards compatibility needed to displace them but the code must be an absolute tarpit of accumulated cruft and security holes that's incredibly difficult for them to keep going.

- Re: (Score:2)
  
  by blincoln ( 592401 ) writes:
  
  Nobody in OSS is going to have the patience to rebuild the same level of backwards compatibility
  
  I'm sure there are plenty of people that would do it if they had access to the dev docs that Microsoft works from.
  
  The hitch here is that *not* having them means tons and tons of reverse engineering, and that's only after tracking down every release of every version of every MS Office ever. Reverse engineering can be fun, but I have a hard time imagining that figuring out character spacing in the Mandarin version
  - the real hitch - it never was clear (Score:5, Interesting)
    
    by Erris ( 531066 ) writes: on Friday January 05, 2007 @05:31AM (#17471080) Homepage Journal
    
    The hitch here is that *not* having them means tons and tons of reverse engineering, and that's only after tracking down every release of every version of every MS Office ever.
    
    The real hitch, as the article hints, is that the releases are contradictory. For instance, the Mac version of small caps is different from others. This is part of the reason Word is so bloated and does not preserve printing type setting from one machine to the next.
    
    Ten years ago, a state agency I was working for was forced to move from Word Perfect to Word. Hundreds, if not thousands, of documents were painstakingly converted from one format to the other. The typesetting, which they had never had a problem with previously, was easily broken by moves from one machine to the other or by changing printers. That is the kind of thing that no program can account for - it was broken from then and can not be created correctly today. It's also probably the reason for all of the nebulous "guidance" sections that don't tell you anything other than to look at, and presumably measure, old printed examples. Not even M$ knows what it was really doing in the field. As I saw at the time, no two were alike.
    
    Of course, the time to get things right is not in your XML; it's when you import the document. The author tells us this in so many words. The XML should be general enough to encompass any kind of typesetting. It is the importing program's task to figure out what the old format wanted things to look like. As the author points out, the spec does not do anything other than create something impossible to follow. It's not going to magically make things look right no matter how hard they wish it would.
    
    - Re: (Score:3, Informative)
      
      by stg ( 43177 ) writes:
      
      I had a fun problem with a version of Word (for Windows 2.0, I think) many years back. Some friends came by to print a paper for a CS class, and the files they brought were made with the Brazilian Portuguese version of Word.
      
      I had the English version of Word. When I tried to print, I discovered (after a lot of pages, of course) that I had to fix the formatting because some of the formatting was translated... And not even logical stuff like accents - page breaks, footnotes, etc.
14 year-old Word and 16-year old WP (Score:2, Funny)

by AiY ( 175830 ) writes:

Sweet! I actually have copies of those somewhere. The reverse engineering process will begin immediately. Now where did I put my 286....
Basically (Score:5, Insightful)

by DrYak ( 748999 ) writes: on Friday January 05, 2007 @01:11AM (#17469810) Homepage

ODF is the former SXW format that was taken and transformed into a standard by a committee comprising several office software makers. It's supposed to describe the normal features that anyone should expect from any word processing application, be it OpenOffice.org, KWord, AbiWord, Corel WordPerfect, etc., all in a perfectly neutral way. It was designed with a function in mind (storing word processing documents in an open and interoperable way). Its benefits are comparable to the standardisation of HTML.

OpenXML is Microsoft trying to translate its proprietary DOC file into an XML container (because it's a big buzzword) and propose it as a standard to ECMA (because everyone is talking about ODF being an ISO standard). It describes not only what is to be expected from a word processor, but also all the MS-Word-specific Microsoftisms. It was designed with a specific piece of software in mind (and partly derives from the internal functioning of MS-Word). It's only a small improvement over the previous MS XML format (which had a lot of information hidden in a binary blob).
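
To make the blob-versus-markup distinction concrete, here is a minimal Python sketch. The element names (`document`, `binData`, `paragraph`) are hypothetical and chosen only for illustration; neither snippet is taken from OOXML, ODF, or any earlier Microsoft XML format. It only shows what a generic XML parser can learn from each style of document.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical element names, for illustration only; neither document is
# taken from OOXML, ODF, or any earlier Microsoft XML format.
BLOB_DOC = """<document>
  <binData>{}</binData>
</document>""".format(base64.b64encode(b"\x00\x01legacy-bytes\xff").decode("ascii"))

MARKUP_DOC = """<document>
  <paragraph style="Heading1">Dark corners</paragraph>
  <paragraph style="Body">Legacy features, spelled out.</paragraph>
</document>"""


def describe(xml_text):
    """Return (tag, attributes, text) for every child of the root element."""
    root = ET.fromstring(xml_text)
    return [(child.tag, dict(child.attrib), (child.text or "").strip())
            for child in root]


# The blob document is well-formed XML, but all a parser sees is one string.
blob_view = describe(BLOB_DOC)

# The markup document exposes structure and style names to any XML tool.
markup_view = describe(MARKUP_DOC)
```

Both documents parse cleanly, but in the first one everything of interest is still locked inside the decoded bytes, so a reader needs out-of-band knowledge of the binary layout to do anything with it.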

The good thing for Microsoft is that they can pretend this limitation is "Not-a-bug-but-a-feature", and brag that there is a lot of stuff that MS-Word couldn't store inside an ODF and only OpenXML can carry.

Microsoft's plan:
1. Embrace
2. Extend <- They are here
3. Extinguish

- Don't forget the page counts... (Score:5, Interesting)
  
  by Anonymous Coward writes: on Friday January 05, 2007 @01:34AM (#17469946)
  
  ODF spec page count: 722 [iso.org].
  
  OpenXML spec page count: 6000 [regdeveloper.co.uk]!!
  
- No bragging rights there. (Score:2, Troll)
  
  by Erris ( 531066 ) writes:
  
  OpenXML is Microsoft trying to translate its proprietary DOC file into an XML container .... The good thing for Microsoft is that they can pretend this limitation is "Not-a-bug-but-a-feature", and brag that there is a lot of stuff that MS-Word couldn't store inside an ODF and only OpenXML can carry.
  
  Pretend is the operative word. Translation is supposed to happen when you import the crufty old crap. M$ may have an advantage there, but you won't find that ability in the 6000 pages of their spe
  - Re: (Score:2, Funny)
    
    by medlefsen ( 995255 ) writes:
    
    When you hold down shift and slowly extend your finger towards the 4 key are you seriously thinking to yourself, "Ha, take that Microsoft!" Cause if you are you need a new hobby.
- Re: (Score:3, Insightful)
  
  by megabyte405 ( 608258 ) writes:
  
  ODF is a nice idea in theory, but really, it's a similar situation (OpenOffice.Org internal dataformat jammed into a standard, so designed with OO.o in mind by necessity) just with more OSS-positive karma associated. There's nothing wrong with saving in a file format that matches your internal representation, in fact, it's a darn good idea (see .ABW for AbiWord, .DOC for Word, .WPD for WordPerfect I would also wager is the same idea). However, interoperability seems to work best when taken from the ground
  - Re:Basically (Score:4, Insightful)
    
    by blincoln ( 592401 ) writes: on Friday January 05, 2007 @02:45AM (#17470338) Homepage Journal
    
    There's nothing wrong with saving in a file format that matches your internal representation, in fact, it's a darn good idea (see .ABW for AbiWord, .DOC for Word, .WPD for WordPerfect I would also wager is the same idea).
    
    I would argue that when it's taken to the extreme of Office prior to 2007, it *is* a bad thing. AFAIK, the old Word format is more or less a (very) partial RAM dump (which is why you can often find all sorts of interesting stuff in Word files that the authors think they've deleted). That makes for faster dev times, but because the load and save functions don't really "understand" the content of the file, IMO the developers made things a lot harder for themselves in the big picture. I imagine reproducing issues in testing is a particular nightmare.
    
    - Re: (Score:2)
      
      by megabyte405 ( 608258 ) writes:
      
      Oh, I'm sure testing is a nightmare, and it can't be good for performance to be going from a binary memory dump to a binary memory dump probably encoded somehow and shoehorned into XML so you can use those three letters. (Apologies for the lack of solid knowledge - for legal reasons I'd rather not know too much about the intricacies of Microsoft OpenXML.) I was referring more to the fact that .doc is reasonable for use by Word, though it certainly is a pain to load and no good for interchange even betwee
  - Re:Basically (Score:5, Interesting)
    
    by Nicopa ( 87617 ) writes: <nico.lichtmaierNO@SPAMgmail.com> on Friday January 05, 2007 @02:46AM (#17470342)
    
    No. ODF has several real, factual benefits. It might have originated in a single product but... it reuses existing standard technologies (SVG, CSS...). It has properly designed XML tags that act as "markup", in OpenDocument xml tags act as container for chunks of data. ODF tries to separate content from style.
    
    And about your RTF suggestion... can I draw diagrams with RTF? Can I have a ToC? Can I do complex styling? Can I have a "gallery" of styles? Can I include images? No. RTF is not a solution.
    
    - Re: (Score:3, Interesting)
      
      by megabyte405 ( 608258 ) writes:
      
      Actually, I think for most of the things you suggest, you can do them - I know AbiWord supports them at least. (images, complex styles, TOC) RTF's really not the old dog it seems to be - keep in mind that for copy/paste of any sort of rich text to work in any sensible manner on Windows, one _must_ support RTF well.
    - Re: (Score:2)
      
      by I'm Don Giovanni ( 598558 ) writes:
      
      RTF can do all of the things that you mention. But not all apps support RTF as well as others. I think Word has the most complete impl of RTF.
    - Re: (Score:3, Informative)
      
      by dominator ( 61418 ) writes:
      
      Can I draw diagrams with RTF? Can I have a ToC? Can I do complex styling? Can I have a "gallery" of styles? Can I include images? No. RTF is not a solution.
      Actually, you can. RTF can express most (if not all) of what the Microsoft Word format can. Let me answer your objections using excerpts from the RTF 1.8 specification:
      
      The \tc control word introduces a table of contents entry, which can be used to build the actual table of contents.
      
      The \stylesheet control word introduces the style sheet group, which conta
  - Re:Basically (Score:4, Informative)
    
    by iluvcapra ( 782887 ) writes: on Friday January 05, 2007 @03:15AM (#17470486)
    
    After having written some tools on OS X that do stuff with RTF:
    
    RTF is well documented and you can make an RTF document on all manner of platforms (I've done it in Ruby and Cocoa), but many platforms have extended RTF in their own way in order to support special features. OS X has added a few special methods to RTF files to support Mac OS X typography, and I've noticed that different versions of Word handle document attributes (like headers and page numbers) in different ways.
    
    RTF is great if you want to make up something quick that is ONLY formatted text, but readers have all manner of different ways of interpreting the exact appearance of tables, page layouts and margins, and there doesn't seem to be any manageable common mechanism for including images or other documents, something Word and OO.org excel(pun) at. Even HTML seems to be better at this.
    
    I use RTF output in a few little in-house tools I have, so people can get the text+attributes they create and open them in a text editor of their choice for touching-up and delivery. When my tools have to create something that is supposed to be finished, they make PDFs.
    
    RTF is great for interoperability, but I never expect an RTF file to contain a "finished product," unless the recipient expects quality on par with a Selectric. It is merely a relatively-open serialization format for strings with attributes.
    
  - Documents outlive applications (Score:5, Insightful)
    
    by Geof ( 153857 ) writes: on Friday January 05, 2007 @04:03AM (#17470690) Homepage
    
    There's nothing wrong with saving in a file format that matches your internal representation, in fact, it's a darn good idea (see .ABW for AbiWord, .DOC for Word, .WPD for WordPerfect I would also wager is the same idea).
    
    Documents are worth far more than software, and they outlive the applications used to create them. See the comment [robweir.com] to the original article - reading documents after 5, 20, 30, 100 years or more is not optional. You can pay the price of developing an independent format now, or you can pay the price of reverse engineering over and over again every time you change your internal representation.
    
    Repeated implementation limits future change and innovation. It's expensive: it likely costs more even for Microsoft. But they can afford it; their competitors may not be able to. Plus, Microsoft already has their first implementation.
    
    interoperability seems to work best when taken from the ground up - when working with another application's data structure of any complexity, you simply can't do a lossless roundtrip without losing before you've started.
    
    Perhaps so. But compare that cost to the cost I've just outlined. It is in the best interest of users and software developers (maybe even of Microsoft) to bite the bullet now, do the conversion once, and develop a clean format for the future.
    
    Maybe you have in mind an argument you're not making, but I don't see any sufficient basis for your broad contention that using a file format based on an internal representation is a "darn good idea". In specific cases, yes (e.g. where the cost of development time or effort are the most important factors). In general, I very much doubt it. That successful applications in the past have taken that approach is weak evidence. They were developed when the up-front cost of development in a time of rapid innovation, the loss of customer lock-in, and a lack of open-format competition were good business reasons for making such a choice - even if it was inferior technically, increased cost in the long term, and was bad for consumers. In today's climate of slower innovation, competition from open formats, and customers who are running into their own long-term interests, the situation is different.
    
    Which is not to say Microsoft's apparent attempt to set the rules of the game and throw sand in the gears of change is not in their interests, or that it will be unsuccessful.
    
    - Re: (Score:3, Informative)
      
      by LizardKing ( 5245 ) writes:
      
      Documents are worth far more than software, and they outlive the applications used to create them. See the comment to the original article - reading documents after 5, 20, 30, 100 years or more is not optional.
      
      Which is why medical, legal and military records are often not held in word processor formats. For instance, the military records I have dealt with (NATO mostly) are held in SGML, conforming to carefully designed MIL DTD's that preserve structure rather than presentation. These files can be transl
  - Re: (Score:3, Insightful)
    
    by AuMatar ( 183847 ) writes:
    
    No, saving the internal representation as a file is an utterly fucktarded idea. You see, an internal representation is made to make the most sense for your implementation of various features. It changes frequently, sometimes with every patch. It has performance hacks, redundancy, etc. A file format is supposed to be a representation of the data in an easy to parse format, so it can be loaded by applications.
    
    So what happens when you use the internal representation as the file format? Well, you have a fi
    - Re: (Score:3, Interesting)
      
      by megabyte405 ( 608258 ) writes:
      
      Well, AbiWord serializes its internal data structure into XML, so it's not an exact dump - it lets us do things like have backward-compatible additions such as LaTeX and MathML equations and include an image preview of the equation as a fallback, for instance. There are things you can do to make your internal format more lucid, and binary->text is one of those things: I can fix almost anything that can go wrong with an AbiWord doc (usually only happens in dev releases, but sometimes strange things happe
  - OOXML's Origin Is Not The Problem (Score:4, Interesting)
    
    by NickFortune ( 613926 ) writes: on Friday January 05, 2007 @10:15AM (#17472688) Homepage Journal
    
    ODF is a nice idea in theory, but really, it's a similar situation (OpenOffice.Org internal dataformat jammed into a standard, so designed with OO.o in mind by necessity)
    
    The ODF format must necessarily describe the structure and layout of an office document. There's no need for it to reflect the internal data structures of any specific application, except to the extent that they too describe office documents.
    OOXML includes data elements that should be part of internal import routines rather than being enshrined in the document format, and it includes elements that are not specified except by reference to applications for which no public specs exist. This is the problem, not the fact that OOXML is derived from MS Office file formats.
    
    RTF. It may not get press attention, but it's actually a fairly well-documented standard, has been working as an interchange format for years, and yet is designed with enough expandability that it's still useful with the kinds of documents produced today. It's a true de-facto standard.
    
    Well, I was a big fan of RTF at one time. But a few years back I found that documents with any kind of formatting more complex than paragraph+justification+font just weren't working between MS Office and back. I don't know if this was because the format couldn't cope, or because of faulty implementations. In either case, it led me to give up on RTF.
    In any event, to be a replacement, RTF would need to work for spreadsheets and presentations at a minimum - something I don't think there's a lot of support for in the current RTF specification. We'd also lose the benefits of an XML based format, which given the amount of work on the seamless integration of XML documents into databases, web services and other data management applications means losing a lot of functionality.
    
    for those who really want interoperability, RTF is the way to go with today's software
    
    Interoperability is only part of the problem. We also want a spec that can be fully and freely implemented by anyone, which isn't under the control of any single vendor. We want a format to which we can entrust documents, knowing that in twenty years' time there will be an application capable of reading them.
    
    an unnecessary dichotomy is drawn between OpenXML and ODF with regard to their design goals - both are repurposed native formats for a single application.
    
    I don't know what you mean by native in this case, but the repurposing of OOXML isn't the problem. It's one of size and obfuscation and, as TFA points out, specification by reference to closed formats and the behaviour of extinct proprietary software. These are non-trivial problems with OOXML which are not (to the best of my knowledge) found in ODF.
    There's nothing wrong with ODF. Re-creating it based on the non-XML RTF would be a waste of time and effort.
    
- Re: (Score:2)
  
  by urbanradar ( 1001140 ) writes:
  
  Not to forget:
  
  4. Profit!!!
The site seems to be slow... (Score:5, Informative)

by junglee_iitk ( 651040 ) writes: on Friday January 05, 2007 @01:18AM (#17469854)

You want to hire a new programmer and you have the perfect candidate in mind, your old college roommate, Guillaume Portes. Unfortunately you can't just go out and offer him the job. That would get you in trouble with your corporate HR policies which require that you first create a job description, advertise the position, interview and rate candidates and choose the most qualified person. So much paperwork! But you really want Guillaume and only Guillaume.

So what can you do?

The solution is simple. Create a job description that is written specifically to your friend's background and skills. The more specific and longer you make the job description, the fewer candidates will be eligible. Ideally you would write a job description that no one else in the world except Guillaume could possibly match. Don't describe the job requirements. Describe the person you want. That's the trick.

So you end up with something like this:

* 5 years experience with Java, J2EE and web development, PHP, XSLT
* Fluency in French and Corsican
* Experience with the Llama farming industry
* Mole on left shoulder
* Sister named Bridgette

Although this technique may be familiar, in practice it is usually not taken to this extreme. Corporate policies, employment law and common sense usually prevent one from making entirely irrational hiring decisions or discriminating against other applicants for things unrelated to the legitimate requirements of the job.

But evidently in the realm of standards there are no practical limits to the application of the above technique. It is quite possible to write a standard that allows only a single implementation. By focusing entirely on the capabilities of a single application and documenting it in infuriatingly useless detail, you can easily create a "Standard of One".

Of course, this begs the question of what is essential and what is not. This really needs to be determined by domain analysis, requirements gathering and consensus building. Let's just say that anyone who says that a single existing implementation is all one needs to look at is missing the point. The art of specification is to generalize and simplify. Generalizing allows you to do more with less, meeting more needs with few constraints.

Let's take a simplified example. You are writing a specification for a file format for a very simple drawing program, ShapeMaster 2007. It can draw circles and squares, and they can have solid or dashed lines. That's all it does. Let's consider two different ways of specifying a file format for ShapeMaster.

In the first case, we'll simply dump out what ShapeMaster does in the most literal way possible. Since it allows only two possible shapes and only two possible line styles, and we're not considering any other use, the file format will look like this:
<document>
  <shape iscircle="true" isdotted="false"/>
  <shape iscircle="false" isdotted="true"/>
</document>

Although this format is very specific and very accurate, it lacks generality, extensibility and flexibility. Although it may be useful for ShapeMaster 2007, it will hardly be useful for anyone else, unless they merely want to create data for ShapeMaster 2007. It is not a portable, cross-application, open format. It is a narrowly-defined, single application format. It may be in XML. It may be reviewed by a standards committee. But it is by its nature, closed and inflexible.

How could this have been done in a way which works for ShapeMaster 2007 but also is more flexible, extensible and considerate of the needs of different applications? One possibility is to generalize and simplify:
<document>
  <shape type="circle" lineStyle="solid"/>
  <shape type="square" lineStyle="dotted"/>
</document>
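
To make the difference concrete, here is a rough Python sketch of a reader for the generalized format. It is only an illustration: the function name, the "supported" flag and the sets of known values are made up here and are not part of the article. The point is that a triangle or a dashed line is just another attribute value a reader can detect and handle gracefully, whereas the iscircle/isdotted booleans have no way to express it at all.

```python
import xml.etree.ElementTree as ET

# The generalized ShapeMaster document from the example above.
DOC = """<document>
  <shape type="circle" lineStyle="solid"/>
  <shape type="square" lineStyle="dotted"/>
</document>"""

# Values this hypothetical reader knows how to draw (assumed for illustration).
KNOWN_TYPES = {"circle", "square"}
KNOWN_LINE_STYLES = {"solid", "dotted"}


def read_shapes(xml_text):
    """Return (type, lineStyle, supported) for every <shape> element."""
    shapes = []
    for el in ET.fromstring(xml_text).iter("shape"):
        shape_type = el.get("type")
        line_style = el.get("lineStyle", "solid")  # assumed default
        supported = shape_type in KNOWN_TYPES and line_style in KNOWN_LINE_STYLES
        shapes.append((shape_type, line_style, supported))
    return shapes


if __name__ == "__main__":
    for shape in read_shapes(DOC):
        print(shape)
```

A different application that adds triangles only has to add a value; older readers can still parse the file and flag what they cannot draw.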

Backwards compatibility (Score:3, Interesting)

by Bob54321 ( 911744 ) writes: on Friday January 05, 2007 @01:29AM (#17469916)

I thought most people considered themselves lucky if their documents could open in successive versions of Office. Why would anyone want to implement support for really old versions if Microsoft does not do it themselves?

- Re: (Score:2, Informative)
  
  by kfg ( 145172 ) writes:
  
  Why would anyone want to implement support for really old versions if Microsoft does not do it themselves?
  
  Nobody would. That's the point of it.
  
  KFG
My favorite quote (Score:4, Insightful)

by IvyKing ( 732111 ) writes: on Friday January 05, 2007 @01:50AM (#17470032)

From TFA

This is not a specification; this is a DNA sequence.

Outrageously funny and to the point.

- M$ DNA (Score:2)
  
  by Erris ( 531066 ) writes:
  
  This is not a specification; this is a DNA sequence.
  It's appropriate to note that the 6000 pages will only fit the DNA of a few pathogens [psu.edu]:
  "Measured as Manhattan telephone books, each containing about 1,000 pages of 10-point type," said Simpson, "the genome of the bacterium E. coli is about a third of a book. Baker's yeast, which is my specialty, is a full book. The human genome will occupy two hundred books."
  Other parts of the article about genetic disorders, witches and demonic possession are also approp
Where I'm from, reverse-engineering... (Score:4, Funny)

by PurifyYourMind ( 776223 ) writes: on Friday January 05, 2007 @02:48AM (#17470350) Homepage

...14- and 16-year-olds is illegal.

Application responsible for converting? (Score:2)

by Knutsi ( 959723 ) writes:

This was a worrying, but good, article. I'm sure MS is in a bit of a tight spot as well, if they really desire backwards compatibility (which is what they survive on in a way). But it would make more sense to make supporting legacy documents more optional.

When I save a Word 2007 document to the old .doc format, it warns me that "minor loss of fidelity" may happen. Similarly, when opening a document, support for way-back-then formats could be optional/plugin-based, with the app instead warning that "minor loss of f
Even more lapses of judgment... (Score:2)

by shadowmatter ( 734276 ) writes:

You can view all the atrocities of OpenXML that he's blogged about here [robweir.com]. Highlights include dumping bitmasks into XML as hexadecimal on a byte-by-byte basis, and an XML element for specifying whether the dates in the workbook start in 1904.
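
For anyone wondering why the 1904 flag matters: the same serial number means a different day depending on which date system the workbook uses. Below is a minimal Python sketch of the two systems as I understand them. The function and parameter names are mine, not taken from the spec, and the handling of serials below 61 is deliberately left out because the 1900 system treats 1900 as a leap year.

```python
from datetime import date, timedelta


def serial_to_date(serial, date1904=False):
    """Convert a whole-number spreadsheet date serial to a calendar date.

    date1904=True : serial 0 is 1904-01-01.
    date1904=False: serial 1 is 1900-01-01, but serial 60 is the
                    nonexistent 1900-02-29, so for serials >= 61 the
                    effective epoch is 1899-12-30.
    """
    if date1904:
        return date(1904, 1, 1) + timedelta(days=serial)
    if serial < 61:
        # Sketch only: serials before the phantom leap day need special casing.
        raise ValueError("serials below 61 are not handled in this sketch")
    return date(1899, 12, 30) + timedelta(days=serial)


if __name__ == "__main__":
    print(serial_to_date(39087))                 # 1900 date system
    print(serial_to_date(39087, date1904=True))  # same serial, 1904 system
```

The two systems are 1462 days apart, so a reader that ignores the flag silently shifts every date in the workbook by about four years.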

I can't believe this became a ratified standard.

"Let him who has understanding calculate the number of the beast, for the number is that of a standard; and its number is three hundred and seventy-six." Common-freaking-sense 13:16-18

- shadowmatter
- Re: (Score:2)
  
  by kahei ( 466208 ) writes:
  
  IMHO those are more serious problems. They're enough to make it be what I'd call a Long Ugly Hastily Written Standard, which somehow doesn't really surprise me.
  
  The thing the original article is freaking out about -- legacy compatibility flags -- isn't really an issue. The standard has to include the features offered by existing wps. Sometimes those features are undocumented, obscure, and almost totally forgotten. What do you do? Find the last remaining copy of the code, figure out exactly how WP4 buggi
Blind leading the blind (Score:3, Interesting)

by frisket ( 149522 ) writes: <peter@silm a r i l.ie> on Friday January 05, 2007 @05:43AM (#17471144) Homepage

It's instructive to observe the panic-ridden frenzy with which Microsoft have approached the business of using XML as a file format. The marketing influence is all too plain to see, with the result that they feel an inner compulsion to preserve the appearance of the document at all costs, sacrificing all logic and common-sense to do it.

OOo did the same, but with greater elegance and less haste because they were ahead of the field. Corel screwed it up with WordPerfect by keeping their stylesheet format proprietary so that transfer between WP document code and XML was made as hard as possible (a Class A blunder, given that their XML editor is actually quite good). AbiWord makes a good job of saving DocBook XML, but it's not trying to pretend it's reimportable; it screws up LaTeX formidably, though, by trying to pretend that it absolutely has to preserve line-length and font-size, which is evidence of the same neurotic attitude as Microsoft.
The problem in all cases is not that the assorted authors and coders don't understand XML (although some of them clearly failed that test too), but that they don't understand documents. This is particularly true at Microsoft, where leaders such as Jean Paoli have been proselytizing XML for years. They still think a document is a jumble of letters; they have no idea of structure, and the DOM is simply laughable as a non-model of a document. Microsoft's particular problem with XML is that they came to it too late, and viewed it as a way of storing data, not text...indeed to this day many XML users, trained with Microsoft blinkers on, are unaware that XML can be used for normal text documents.
With this level of ignorance surrounding Microsoft, it's hardly unexpected that they should blunder so badly.

- user expectation (Score:2)
  
  by r00t ( 33219 ) writes:
  
  To normal non-nerd users and most nerds as well, a document is a jumble of letters.
  
  I highlight text. I click the "B" button to make my text bold. I don't screw with styles.
  
  Sorry to burst your bubble, dear Holy Priest Of The Most Highest XML.
Do backward compatibility in the converter (Score:2, Insightful)

by rjungbeck ( 1038398 ) writes:

Where is the problem in doing the conversion (for the legacy features) in the converter, so that the new format is free from this bloat? OK, it's harder to write the converter (which has to implement these old behaviors), but it's Microsoft who wants to have the backward compatibility. So it only needs to be done once.
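
Roughly what I mean, as a Python sketch. Everything in it is hypothetical: the quirk table, the application key and the 0.9 factor are invented for illustration and do not describe how any real converter or word processor works. The legacy knowledge lives only in the converter; the saved document carries a plain number that any implementation can render.

```python
from dataclasses import dataclass

# Hypothetical quirk table: how a legacy application's line spacing maps
# onto the new format's units. The 0.9 value is invented for illustration.
LEGACY_LINE_SPACING_FACTOR = {"wordperfect5": 0.9}


@dataclass
class Paragraph:
    text: str
    line_spacing: float  # explicit value; no "behave like application X" flag


def convert_paragraph(text, line_spacing, source_app):
    """Resolve legacy behavior at conversion time into an explicit value."""
    factor = LEGACY_LINE_SPACING_FACTOR.get(source_app, 1.0)
    return Paragraph(text=text, line_spacing=round(line_spacing * factor, 4))


if __name__ == "__main__":
    print(convert_paragraph("Hello", 1.0, "wordperfect5"))
```

The converter is written once by whoever owns the legacy code; readers of the new format never need to know that the old application existed.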
purism as the enemy of progress (Score:2)

by gjuk ( 940514 ) writes:

As often, purism is the enemy of progress here. Whilst it'd be great to be able to render, faithfully, every detail of any legacy document - it's an unnecessary and unrealistic constraint. One day, Microsoft themselves will choose to drop support for WPx or WW8 etc. They will. Really, they will. For owners of documents whose only record is held in proprietary formats - that will happen one day. Might as well happen with the adoption of a standard which prevents it happening again. Let's face it - PC'
Seems fair enough (Score:2)

by istartedi ( 132515 ) writes:

If you were faced with output from a 15 year old program, what would you do? 15 years? In software, that's an eternity. These tags are essentially saying "here is where this old crap used to be". How many people are actually using these programs? Maintaining documents in the old format? I defy any of you out there in Linux-land to say you wouldn't take the same approach under the same set of circumstances. Actually, Linux people would probably just say "it may not open old documents properly, but tha
- - Re: (Score:2, Insightful)
    
    by theLOUDroom ( 556455 ) writes:
    
    The crazy amount of backwards compatibility is what allowed Microsoft to rise to the position it holds today...
    
    Or maybe it was their illegal business tactics?
    
    It would be pretty easy for me to run a successful business too if I could break federal law with impunity.
    - Re: (Score:3, Insightful)
      
      by Brandybuck ( 704397 ) writes:
      
      Things that are illegal for a monopoly are perfectly legit for a non-monopoly. It's a crazy law, but that's how it works. Microsoft broke no federal laws to *gain* their monopoly.
      - Re:MIcrosoft sucks. (Score:5, Insightful)
        
        by Aadain2001 ( 684036 ) writes: on Friday January 05, 2007 @01:36AM (#17469950) Journal
        
        But they broke plenty of laws to keep their monopoly :) And while their actions during their rise to the top may not have been illegal, they could easily be called 'strong-armed'.
        
        Re: (Score:2)
        
        by kjart ( 941720 ) writes:
        
        But they broke plenty of laws to keep their monopoly :) And while their actions during their rise to the top may not have been illegal, they could easily be called 'strong-armed'.
        What they got in trouble for was actually using their monopoly to get into other markets - i.e. bundling IE with the OS meant that they used their OS monopoly to get into the browser market. There was also doing things like offering Office at a discount if vendors bought Windows. This is the kind of thing that everybody does (thi
        
        Re:MIcrosoft sucks. (Score:5, Insightful)
        
        by hachete ( 473378 ) writes: on Friday January 05, 2007 @07:50AM (#17471726) Homepage Journal
        
        Yes, they got into trouble for bundling, but that misses the point every time. The secret sauce that Microsoft uses is to strong-arm the OEMs into bundling Windows with PCs, especially for consumers. I'm also thinking that the Windows Tax is levied even if you buy Linux on a Dell. This is the linchpin of Microsoft domination; without it all their other strategies wither on the vine. Without bundling of Windows with new PCs, the bundling of IE (and all the other software), the resistance against inter-operability, the mysterious file formats etc. wither on the vine. I've been disappointed that none of the investigations I've read about have gone after the OEM-Microsoft link. Break that, and you'll have a free market again.
        
        I think the Office XML format style is a play straight out of IBM's hand-book: make the standard complex and incomprehensible, and the little players - that's you - will find it hard to compete. In a way, that's a good sign: Microsoft is now lumbering into middle-age, hoist on their own evermore complex petard.
        
        The other thing about middle-age is that every little technological step away from their established base-line is treated as a revolution. In reality, it's no such thing, just a small stepping stone to shouting "pesky kids. Get off my lawn." Or maybe they've reached that stage already.
        
        Re: (Score:3, Interesting)
        
        by Gr8Apes ( 679165 ) writes:
        
        They did. It was "resolved" by being disallowed. There is no per machine "MS tax" anymore. That fell victim in the IBM suit, I believe. (So many suits, so long ago, so many beers.... ahh - that explains it!)
      - Re:MIcrosoft sucks. (Score:4, Insightful)
        
        by edwdig ( 47888 ) writes: on Friday January 05, 2007 @03:26AM (#17470532)
        
        Microsoft broke no laws getting DOS onto every PC. They happened to be in the right place at the right time, and the market fell onto them. But from there, Microsoft bent and broke the law every chance they got to ensure that there never was any competition.
        
        Also don't forget that although MS's purchase of DOS was perfectly legal, it was ethically horrible. They arrived at a handshake agreement to license the code from Seattle Computer Company. While the MS paperwork was being finalized by the lawyers, SCC then made arrangements to finance other business ventures using the MS money. MS then presented them a contract to buy the code rather than license it, and told SCC to take it or leave it. As SCC had already committed to the other deals, they had no choice but to take MS's offer. Sure, no one held a gun to the head of the SCC executives forcing them to take the deal, however, they didn't have any other reasonable alternatives. MS's behavior was legal, but certainly not ethical.
        
      - Re: (Score:3, Insightful)
        
        by tyme ( 6621 ) writes:
        
        Brandybuck [slashdot.org] wrote:
        Things that are illegal for a monopoly are perfectly legit for a non-monopoly. It's a crazy law, but that's how it works. Microsoft broke no federal laws to *gain* their monopoly.
        
        Unfortunately, you are wrong on almost all counts:
        
        Section 1 [cornell.edu] of the Sherman Act [cornell.edu] (Restraints of Trade) applies to everyone, not just to monopolists. If Microsoft engaged in any restraints of trade, even before they were a monopoly (which doesn't have to mean 100% market share, as so many people seem to believe, i
      - Re: (Score:2)
        
        by CAPSLOCK2000 ( 27149 ) writes:
        
        Let's compare it to an all time favorite: Guns.
        
        You don't have to break a law to get a gun. But as soon as you start using it you'd better be very careful. Except for a few very well defined cases (sport, hunting, self-defense), using your gun is illegal.
        
        Even in the cases named above, using your gun in the wrong way will send you to jail.
      - Re: (Score:3, Insightful)
        
        by 99BottlesOfBeerInMyF ( 813746 ) writes:
        
        Things that are illegal for a monopoly are perfectly legit for a non-monopoly. It's a crazy law, but that's how it works.
        
        I think your logic is more than a little broken. Monopolies have a great deal of power that others don't have. They can undermine capitalism in a market and destroy innovation in entire industries. They can spread, causing that damage to other markets. Think of it like this, people piloting airplanes aren't allowed to drink or step outside for a cigar, while those behaviors are perfect
        
        Re: (Score:3, Insightful)
        
        by I'm Don Giovanni ( 598558 ) writes:
        
        "I think your logic is more than a little broken. Monopolies have a great deal of power that other's don't have. They can undermine capitalism in a market and destroy innovation in entire industries. They can spread causing that damage to other markets. Think of it like this, people piloting airplanes aren't allowed to drink or step outside for a cigar, while those behaviors are perfectly legal for people who aren't piloting planes. Isn't that crazy?"
        
        A pilot knows that he's drinking at the time that he's do
      - Re:MIcrosoft sucks. (Score:5, Insightful)
        
        by TrekkieGod ( 627867 ) writes: on Friday January 05, 2007 @03:28AM (#17470544) Homepage Journal
        
        If you get to the point where you build up a company that can even consider garnering the term "monopoly", then get back to us...At that point, maybe, just maybe, you may come to thinking that you earned what you got, and the government has no right to tell you how to run your business...
        
        Yeah. Because the person best suited to decide what a company should or should not be allowed to do are the people who own the company. Of course you're going to want to be completely unrestricted to mow down your competitors using whatever advantages you have if you are in a position to do so. What you're missing is that no one should be allowed to use unfair practices to do it. Some people think we should idolize the free market as some sort of religion. We don't like free market economy because it was given to us by the gods. We like it because it tends to result in better products and lower prices. That ceases to be true when you have a monopoly in the mix.
        That being said, I'm not really informed about any Microsoft specifics, so I'm not going to argue in favor or against any "federal laws" as it applies to them (or failed to apply to them). However, suggesting that only people who have built a company that holds a monopoly should be able to decide what is fair regulation isn't rational. It may even be that the current federal laws regarding monopolies may be unfair and in need of reform, but the fact remains that the existence of a set of laws to regulate businesses is necessary.
        
        Re: (Score:2, Insightful)
        
        by infofc ( 979172 ) writes:
        
        My mind must be failing. I seem to recall that a free market is based on fairly competing businesses, hence no monopoly can be tolerated. We allow monopolies to form and exist as long as competitors have a chance to emerge.
        
        Re:MIcrosoft sucks. (Score:4, Insightful)
        
        by ArsenneLupin ( 766289 ) writes: on Friday January 05, 2007 @04:41AM (#17470850)
        
        If you get to the point where you build up a company that can even consider garnering the term "monopoly", then get back to us. Until then, you have no idea what you're talking about, especially when quoting arbitrary and esoteric "federal laws". Call me nuts, but if you ever got to that point, you might even get a crazy idea in your head that those "federal laws" that you are so damned proud of, are about as fair and just as our drug laws. At that point, maybe, just maybe, you may come to thinking that you earned what you got, and the government has no right to tell you how to run your business that you started in your teens, and proceeded to build to make it one of the most successful companies in the history of capitalism.
        
        Until you get to that point, I suggest that you those "federal laws" out your ass, Mr. Ashcroft.
        I agree 100% with you. However, for fairness' sake, we should then abolish all those unjust business-hampering federal laws, including copyright and patent law.
        Oh, and also those so-called "computer misuse" laws. Indeed, if I want to set up a consultancy where I propose to convert customers' ASP scripts to PHP I should be allowed to demo to my prospective customers in great graphical detail why ASP is so insecure, even if I don't yet have an existing business relationship. Why should I tolerate that the government tells me how I may and may not recruit new customers?
        Anything less would be one-sided and unfair.
        
      - Re: (Score:2, Informative)
        
        by poopdeville ( 841677 ) writes:
        
        Call me a shill if you want, but I've seen NineNine post in a lot of different threads not related to MS. I think he's acting in good faith.
    - Re: (Score:3, Insightful)
      
      by Al Dimond ( 792444 ) writes:
      
      No, actually, I think you'd find it takes the skill of many people, good timing, and luck to be successful in business, even if you could break very many laws. Creating and sustaining a business for many years is hard. Not very many businesses make it.
    - Re: (Score:2)
      
      by WalterGR ( 106787 ) writes:
      
      The crazy amount of backwards compatibility is what allowed Microsoft to rise to the position it holds today...
      
      Or maybe it was their illegal business tactics?
      
      Or maybe I, too, can post unprovable, untestable anti-Microsoft conjecture to slashdot and get modded up?
- Death would be too easy. (Score:5, Funny)
  
  by Kadin2048 ( 468275 ) writes: <slashdot.kadin@xox y . net> on Friday January 05, 2007 @01:43AM (#17469984) Homepage Journal
  
  I think we need to do some sort of "Trading Places [imdb.com]"-esque scheme, where all the Microsoft board members go to sleep one night as usual, but wake up the next morning working in Bangalore at an outsourced call center for OEM tech support.
  
  At the same time we'll let the tech support drones have their way with the Microsoft campus, which I suspect will involve setting it on fire.
  
- Re: (Score:3)
  
  by oohshiny ( 998054 ) writes:
  
  Are you kidding? This is not a format specification. And it reflects badly on Microsoft and the engineers that authored this document: either they are too stupid to know that this is not a specification, or they are taking everybody else to be fools.
- Re: (Score:2)
  
  by Al Dimond ( 792444 ) writes:
  
  Because there's an opportunity for the format to not be ugly, so that the engineers can get as much done with less work and spend the rest of the time doing something that's really useful instead of duplicating their futile efforts. Or they might just kick off early and sip margaritas on the beach for all I care.
- Re:Suck it up (Score:5, Funny)
  
  by animaal ( 183055 ) writes: on Friday January 05, 2007 @04:32AM (#17470820)
  
  What's the deal with you people? I have seen engineers take apart the most difficult situations. You have the format in your hands. It's ugly and crappy, go figure. Just get it done and stop bitching. Why is everyone so lazy?
  Jeff, is that you? Haven't seen you much since you became a project manager. Congrats on getting the MBA!
  
- Re: (Score:2)
  
  by mwvdlee ( 775178 ) writes:
  
  Ever heard of the saying "good programmers are lazy programmers"?
  
  Yes, we could all duplicate the significant effort of reverse engineering the missing parts of the standard (you still have a working copy of WP5 around somewhere?). Or we could just save everybody a lot of time and money in the future by making a one-time small investment of fixing the standard now.
- Re: (Score:3, Insightful)
  
  by gaijin99 ( 143693 ) writes:
  
  You're missing the point. By defining their "standard" in this manner they can now say "Application X doesn't implement OOXML", naturally by "implement OOXML" they mean "fully implement OOXML" so that if even the most obscure and bizarre tag is not supported that's that. At that point they can either demand that application X not claim on its packaging that it supports their "standard", they might have one of those cute little "OOXML Compatible" seals and refuse to let anyone who doesn't fully support th
- Re: (Score:2, Insightful)
  
  by rohan972 ( 880586 ) writes:
  
  I understand that these tags will be needed when converting legacy documents, but how many people are going to meet all the following conditions to even be affected by this:
  
  If it gets adopted as a standard (ISO or similar, not defacto standard) then everyone. The point is not whether people need the features, the point is that MS is trying to get this accepted as a standard. It still can only be implemented by MS, and therefore should not be accepted as a standard. If a government body had as part of a soft
- I can. (Score:2)
  
  by mrchaotica ( 681592 ) * writes:
  What you don't seem to realize is that your list of people "effected" by this is irrelevant. There's exactly one category that matters:
  
  People who will not even consider alternatives to MS Office by the fact that only Microsoft could possibly ever claim "full compliance" with the "standard."
- Re:I can't see this being too big of a problem (Score:5, Insightful)
  
  by Askmum ( 1038780 ) writes: on Friday January 05, 2007 @04:02AM (#17470680)
  
  You seem to be missing the point.
  
  You do not need these features to begin with in a new format that is inherently incompatible with an old format. You don't want to say "now I'm going to do WP style linespacing and my linespacing is 1".
  If you want to convert a WP document to an XML document, the conversion program should know that the linespacing in WP is 0.9 times the linespacing in the XML document (or whatever it really may be) and will then use linespacing=0.9 in the XML document. This is not a task of the new wordprocessor or its specification.
  
  By adding this so-called "backward compatibility" to your specification, you make the spec overly difficult, and in effect you build the conversion program into the new application when this is absolutely not necessary.
  And on top of that, you require that the programmer who uses this spec should have knowledge of all these old versions and be able to program them without error. And as the application will grow because of these unnecessary features, the number of bugs will also rise. So this is not a blueprint for a good application, this is a blueprint for a very buggy implementation of a wordprocessor.
  
  - - Re: (Score:2)
      
      by vidarh ( 309115 ) writes:
      
      You've missed the point - he's presented examples, not an exhaustive list. He even linked to another article detailing more brain damage.
      Also, again, you can get the precise behavior in a generic way. The post you replied to explained how. It's even ok to include information about _why_ certain information is like that, so that tools that have special knowledge about the format can do full roundtrips more easily. The problem comes when the format itself is burdening ALL implementors with replicating undoc
    - Re: (Score:3, Insightful)
      
      by cduffy ( 652 ) writes:
      
      By "updating" these pages by not supporting the old format you will be breaking the layout of some documents that need a consistent layout.
      You're missing his point: When converting the file to OOXML, one can and should add generic tags indicating the specific (broken) behavior which should be emulated (such as "scale small caps by this percentage point") rather than just specifying a generic "Do What I Mean" marker without any useful guidance on how rendering of documents containing this marker should be implemen
- Re: (Score:2)
  
  by tdelaney ( 458893 ) writes:
  
  The number of *people* who need to convert old documents may well be small, but the number of *documents* that need to be converted/imported is *huge*. And you can guarantee that if one of their documents hits an undocumented legacy feature, all their documents will.
- - - Re: (Score:2)
      
      by Ucklak ( 755284 ) writes:
      
      I say batch every old document to PDF then you don't have to worry about Microsoft anymore.
      
      What an asinine piece of crap OOXML is.
- - - Re: (Score:2)
      
      by jlarocco ( 851450 ) writes:
      
      Anyway, what I was going to say was that they simply need to "imitate" the feature, which OpenOffice/WordPerfect already do with their legacy Word doc support, making the point of this article moot. Now, don't get me wrong, this standard sucks, it's a bunch of floof, but so is the basis for this article.
      No, they can't "imitate" the feature. They have to do EXACTLY what Office does, otherwise they're not compliant. Did you even read that guidance block you quoted? "If applications wish to match this b
- Forbidden partial implementation? (Score:5, Interesting)
  
  by tepples ( 727027 ) writes: <tepples.gmail@com> on Friday January 05, 2007 @02:52AM (#17470370) Homepage Journal
  
  OOXML is just as open as ODF
  
  The behavior of years-old proprietary word processing software is included by reference into OOXML. How is any spec that includes by reference the behavior of proprietary software exactly "open"? True, implementors could produce a partial implementation of the spec that degrades away the legacy baggage (more or less) gracefully, but some standards' patent licensors forbid implementors to publish a partial implementation. I don't know if this applies to OOXML's license.
  
  - Re: (Score:2)
    
    by bigpat ( 158134 ) writes:
    
    but some standards' patent licensors forbid implementors to publish a partial implementation. I don't know if this applies to OOXML's license.
    My understanding is that Microsoft was going to enforce full implementations of the standard, except on themselves of course, so it really makes OOXML untenable as a common format. What you get is an open standard that everyone but Microsoft has to follow exactly and implement fully, but only Microsoft could follow it and implement it fully. And then even if someone could manage to implement MS's XML fully and perfectly, then Microsoft could just pull a bait and switch that would break compatibility.
    
    Actually, I
  - Re: (Score:2)
    
    by dkf ( 304284 ) writes:
    
    How is any spec that includes by reference the behavior of proprietary software exactly "open"?
    Easy. It's open... to abuse.
- Who's Scared? (Score:4, Informative)
  
  by Erris ( 531066 ) writes: on Friday January 05, 2007 @05:56AM (#17471200) Homepage Journal
  
  ... immediately render billions of existing MSO documents obsolete if you could get govt to mandate ODF exclusively. And the bonus is that such govt mandate would render any and all features not supported by ODF (i.e. not supported by OO.o) irrelevant.
  
  Eh? Isn't that why M$ made this supposedly "open" format? Because governments were tired of paying through the nose for secret formats that broke between versions? The purpose of an archive is to read it later. Governments and companies have already moved to pdf for archives. They are going to move their working documents to reasonable formats next.
  
  But MS opened their own format, thus leveling the playing field so that you must again compete on features ...
  
  You must not have read the 6000 page spec, which includes lots of sections like this:
  
  Guidance: To faithfully replicate this behavior, applications must imitate the behavior of that application, which involves many possible behaviors and cannot be faithfully placed into narrative for this Office Open XML Standard. If applications wish to match this behavior, they must utilize and duplicate the output of those applications. It is recommended that applications not intentionally replicate this behavior as it was deprecated due to issues with its output, and is maintained only for compatibility with existing documents from that application. end guidance
  
  That's neither open, nor a standard.
  
  Microsoft is hoping people believe what you say, but everyone knows better. Shit like OOXML only proves that they have not changed. It's just another, more elaborate and more expensive lie. Even the name, by using "OO", is intentionally confusing. The New Office is everything the old Office was and always will be. Vista and Office 2007 are non-starters.
  
- Re:Unfair (Score:4, Insightful)
  
  by DavidTC ( 10147 ) writes: <slas45dxsvadiv.v ... m ['box' in gap]> on Friday January 05, 2007 @03:53AM (#17470642) Homepage
  
  Tools that don't care about legacy support are unaffected by this; they can just pick the closest modern option to whatever the legacy flag calls for on input, and not output documents that use them.
  And thus those tools, legally, are not OOXML, and won't qualify for purchasing by companies that specify OOXML. Which is the entire point.
  There's a difference between 'We need to make sure that old documents can be converted correctly.', and 'We will literally convert old documents into a new representation that contains all their weirdness, and we won't explain how to implement said weirdness in the standard.'.
  What Microsoft has produced is not even a standard. Standards must specify everything, or reference other standards that specify everything. They can't reference applications.
  If Microsoft wants to keep secret how to turn Office 95 documents into OOXML, fine. Producing a standard doesn't mean you have to explain how to convert things into that standard.
  It does, however, mean you have to explain exactly what should happen if mwSmallCaps is true, to the pixel. You can't just pawn it off on the unexplained hypothetical behavior of some other application.
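
For concreteness, the "just drop the legacy flags" approach from the quoted text might look roughly like the sketch below. It is only an illustration: the namespace URI is assumed to be the WordprocessingML main namespace, the `LEGACY_FLAGS` set is a hand-picked subset rather than the spec's full list, and `strip_legacy_flags` is a hypothetical helper, not part of any real OOXML library.

```python
import xml.etree.ElementTree as ET

# Assumption: the WordprocessingML main namespace; a draft of the spec may differ.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Assumption: an illustrative subset of legacy compatibility flags, not the full list.
LEGACY_FLAGS = {"mwSmallCaps", "suppressTopSpacingWP"}


def strip_legacy_flags(settings_xml):
    """Hypothetical helper: remove legacy flags from every <w:compat> element.

    Returns the cleaned XML text and the local names of the flags dropped.
    """
    root = ET.fromstring(settings_xml)
    dropped = []
    # Materialize the iterator first so the tree is not mutated while walking it.
    for compat in list(root.iter(f"{{{W_NS}}}compat")):
        for child in list(compat):
            local_name = child.tag.split("}", 1)[-1]
            if local_name in LEGACY_FLAGS:
                compat.remove(child)
                dropped.append(local_name)
    return ET.tostring(root, encoding="unicode"), dropped


SAMPLE = (
    f'<w:settings xmlns:w="{W_NS}">'
    "<w:compat><w:mwSmallCaps/><w:suppressTopSpacingWP/><w:useFELayout/></w:compat>"
    "</w:settings>"
)

cleaned, dropped = strip_legacy_flags(SAMPLE)
print(dropped)
```

Stripping the flags is the easy part. Whether a tool that does this can still call itself conformant is exactly what is being argued above.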
  
- Re: (Score:3, Insightful)
  
  by Askmum ( 1038780 ) writes:
  
  So they did it wrong.
  
  You need to let a conversion program worry about converting Word 2006 documents to XML documents. You need to let the maker of Word 2006 worry about making this conversion program. This can be in the form of a "save as XML" option, but also an external program.
  You cannot say "oh, this is an old feature, let's put it in the spec and let the programmer who uses this spec worry about it because we can't be bothered to convert it or don't know how to convert it".
  Sorry, but XML s
- Re: (Score:2)
  
  by SanityInAnarchy ( 655584 ) writes:
  
  Let's see -- first off, it would help if Microsoft had actually spelled out how this is supposed to work, rather than requiring us to have a copy of every word processor since WordPerfect and carefully reverse-engineer its behavior in every case.
  
  Second, you're right, old format -> OOXML -> old format... yeah, that would kind of suck. However, I don't get why you would ever want such a loop to work. Old format -> OOXML -> some other format -- now that makes sense. Think of it this way -- you can
