Dark Corners of the OpenXML Standard
Standard Disclaimer writes "Most here on Slashdot know that Microsoft released its OpenXML specification to counter ODF and to help preserve its market position, but most people probably aren't aware of all the interesting legacy code the OpenXML specification has brought to light. This article by Rob Weir details many of the crazy legacy features in the dark corners of OpenXML. As it concludes after analyzing specification requirements like suppressTopSpacingWP, 'so not only must an interoperable OOXML implementation first acquire and reverse-engineer a 14-year old version of Microsoft Word, it must also do the same thing with a 16-year old version of WordPerfect.'"
It's not a true standard... (Score:4, Funny)
Bah! (Score:3, Funny)
The strings command supports all legacy document formats! What more could you possibly need? Besides, formatting is overrated anyway...
Re: (Score:2)
( http://www.glyphandcog.com/index.html [glyphandcog.com] )
Length (Score:4, Funny)
Size (Score:5, Funny)
And the best part is, these [umn.edu] are the pages it uses... (I mean, why else do those specs cost so much?)
Re:Size (Score:4, Funny)
Re: (Score:3, Insightful)
I read the back cover. Looked derivative. Put it back.
Re: (Score:2, Funny)
Re: (Score:2)
Re: (Score:2)
The company boss would say why not just give 8 pages each to 750 developers and by the end of the day we should have a fully working product.
While this is ridiculous, I'm sure the spec could be broken up into specs for a few different modules. After all, if Microsoft wrote the spec, and has implemented the spec, then how difficult could it be?
I once spent 18 months writing a 3000 page spec, and it only took a team of 5 another year to implement it. Of course since then whenever someone asks me if I would li
Re: (Score:3, Funny)
Yes, but modular programming is antithetical to Microsoft's way of doing things.
Re: (Score:2, Insightful)
Re: (Score:3, Insightful)
After all, if Microsoft wrote the spec, and has implemented the spec, then how difficult could it be?
Did you read the article? Some of the spec is things like "do what MS Word 5.1.4 did with line spaces." How exactly is anyone other than MS supposed to implement that? By reverse engineering a whole slew of old products that are not even available on the market anymore?
I once spent 18 months writing a 3000 page spec, and it only took a team of 5 another year to implement it.
That's fine but this spec isn
MS are slow learners (Score:4, Interesting)
Their "open" XML format for office docs is a prime example of this.
I think Steve Jobs was the one who first said "Microsoft just doesn't get it". Microsoft was probably the very first third-party software developer for the Mac and this was Jobs' reaction to Microsoft's first Mac applications (I think a port of Multiplan--which was re-incarnated into Excel IIRC, and MSBasic). They really WERE "tasteless", ugly and took almost no advantage of the revolutionary GUI interface--their DOSness really showed through--I think in the case of Multiplan the mouse could be used only to jump the cursor to a certain cell and that was it--the rest was all like in DOS.
MS Windows is another example--Microsoft didn't "get it" well enough until the third major release. Now MS is SLOWLY "getting it" with the beneficial characteristics of XML standards. Microsoft's early XML efforts are like Windows 1.0--there is some very rudimentary understanding of the mechanics but not the philosophy of XML, and I wonder if this is why SOAP ended up NOT so simple (given Microsofties were involved in its creation and seemed to be trying to make it a DCOM-in-XML-but-dumber thing). Microsoft's "Version1" XML might look like this: "See? We're using XML and SOAP! We're hip! We're cooool! You can't say we don't play by the rules now!"
Of course, this is an obtuse, opaque and obfuscated way to use XML and totally NOT in the spirit of interoperability and openness. I won't even go into the nifty XML tools MS has made...nifty to use but they've done a lot to obliterate the S out of SOAP in their crazy output.
OOXML (Opaque and Obfuscated XML) standard is "version 2.0"--they're doing their best to eliminate ambiguity but now we've gone over to hyper-specificity, and the standard is being shared a bit better...problem is that they don't fully describe the interpretation of the standard elements so as to keep its advantage. All they've done is taken every formatting option and mapped it to an XML element--it is monolithic and completely non-extensible. But hey, at least it's publicly available and doesn't involve weirdness like encoded-binary-blobs.
In a few years MS will reach version 3.0 of "getting" XML...
The author is exactly right. (Score:4, Insightful)
Re: (Score:3, Interesting)
Re: (Score:2, Funny)
Re: (Score:3, Informative)
You probably know that JavaScript has been standardized as "EcmaScript" by ECMA; everybody just ignores that standard.
Re:The author is exactly right. (Score:5, Insightful)
Isn't ODF an updated OpenDoc? (Score:2)
I thought ODF was an updated version of the venerable OpenDoc standard pioneered by IBM, Apple, and others. Doesn't it mean "Open Doc Format"?
If so, it was a de facto industry standard long, long, long before OpenOffice existed.
Re: (Score:2, Informative)
Re:Isn't ODF an updated OpenDoc? (Score:5, Informative)
Disadvantages of ISO (Score:5, Interesting)
Once it is ratified as an ISO Standard, the standard is locked up and anyone who wants a copy has to buy it from ISO. These are copyrighted. They're not cheap; thousands of dollars. Out of the reach of the average hobbyist, and not listed anywhere on the Internet. That 6,000 page draft will vanish into the mists of time.
Larger Companies can afford this, but garage companies and hobbyists definitely can't. So what's the chance of an open source or even small upstart challenging Microsoft's Documentonopoly? Zero.
Want another example? ISO country codes. The country codes (e.g. .us, .jp) are actually ISO, and ISO ended up backing off on a demand for royalties for this(!) But if you want state codes (e.g. California, Kantou), well, forget it unless you want to buy them off ISO. http://www.alvestrand.no/pipermail/ietf-languages/2003-September/001472.html [alvestrand.no]
ISO aren't the only ones guilty of doing this. IEEE do it as well. Want the latest simulation standard? Then get out your checkbook: http://standards.ieee.org/catalog/olis/compsim.html [ieee.org]
ISO and the IEEE are enemies of openness. Microsoft is taking a page out of their gamebook.
ISO or IEEE certification is a *bad* thing.
Re: (Score:2, Informative)
You mean like the C++ standard (ISO:14882) which can be downloaded as a PDF for $32 or purchased hardcopy for something like $300, and for which there are multiple sources for drafts
Re: (Score:2, Informative)
The power of legacy systems... (Score:5, Insightful)
Re: (Score:2)
I'm sure there are plenty of people that would do it if they had access to the dev docs that Microsoft works from.
The hitch here is that *not* having them means tons and tons of reverse engineering, and that's only after tracking down every release of every version of every MS Office ever. Reverse engineering can be fun, but I have a hard time imagining that figuring out character spacing in the Mandarin version
the real hitch - it never was clear (Score:5, Interesting)
The hitch here is that *not* having them means tons and tons of reverse engineering, and that's only after tracking down every release of every version of every MS Office ever.
The real hitch, as the article hints, is that the releases are contradictory. For instance, the Mac version of small caps is different from others. This is part of the reason Word is so bloated and does not preserve printing type setting from one machine to the next.
Ten years ago, a state agency I was working for was forced to move from Word Perfect to Word. Hundreds, if not thousands, of documents were painstakingly converted from one format to the other. The typesetting, which they had never had a problem with previously, was easily broken by moves from one machine to the other or by changing printers. That is the kind of thing that no program can account for - it was broken from then and can not be created correctly today. It's also probably the reason for all of the nebulous "guidance" sections that don't tell you anything other than to look at, and presumably measure, old printed examples. Not even M$ knows what it was really doing in the field. As I saw at the time, no two were alike.
Of course, the time to get things right is not in your XML; it's when you import the document. The author tells us this in so many words. The XML should be general enough to encompass any kind of typesetting. It is the importing program's task to figure out what the old format wanted things to look like. As the author points out, the spec does not do anything other than create something impossible to follow. It's not going to magically make things look right no matter how hard they wish it would.
Re: (Score:3, Informative)
I had the English version of Word. When I tried to print, I discovered (after a lot of pages, of course) that I had to fix the formatting because some of the formatting was translated... And not even logical stuff like accents - page breaks, footnotes, etc.
14 year-old Word and 16-year old WP (Score:2, Funny)
Basically (Score:5, Insightful)
OpenXML is Microsoft trying to translate its proprietary DOC file inside an XML container (because it's a big buzzword) and propose it as a standard to ECMA (because everyone is speaking about ODF being an ISO standard). It describes not only what is to be expected from a word processor, but also every MS-Word-specific Microsoftism. It was designed with a specific piece of software in mind (and partly derives from the internal functioning of MS-Word). It's only a small improvement over the previous MS XML format (which had a lot of information hidden in a binary blob).
The good thing for Microsoft is that they can pretend this limitation is "Not-a-bug-but-a-feature", and brag that there is a lot of stuff that MS-Word couldn't store inside an ODF and only OpenXML can carry.
Microsoft's plan :
1. Embrace
2. Extend <- They are here
3. Extinguish
Don't forget the page counts... (Score:5, Interesting)
ODF spec page count: 722 [iso.org].
OpenXML spec page count: 6000 [regdeveloper.co.uk]!!
No bragging rights there. (Score:2, Troll)
OpenXML is Microsoft trying to translate its proprietary DOC file inside an XML container .... The good thing for Microsoft is that they can pretend this limitation is "Not-a-bug-but-a-feature", and brag that there is a lot of stuff that MS-Word couldn't store inside an ODF and only OpenXML can carry.
Pretend is the operative word. Translation is supposed to happen when you import the crufty old crap. M$ may have an advantage there, but you won't find that ability in the 6000 pages of their spe
Re: (Score:2, Funny)
Re: (Score:3, Insightful)
Re:Basically (Score:4, Insightful)
I would argue that when it's taken to the extreme of Office prior to 2007, it *is* a bad thing. AFAIK, the old Word format is more or less a (very) partial RAM dump (which is why you can often find all sorts of interesting stuff in Word files that the authors think they've deleted). That makes for faster dev times, but because the load and save functions don't really "understand" the content of the file, IMO the developers made things a lot harder for themselves in the big picture. I imagine reproducing issues in testing is a particular nightmare.
Re: (Score:2)
Re:Basically (Score:5, Interesting)
And about your RTF suggestion... can I draw diagrams with RTF? Can I have a ToC? Can I do complex styling? Can I have a "gallery" of styles? Can I include images? No. RTF is not a solution.
Re: (Score:3, Interesting)
Re: (Score:2)
Re: (Score:3, Informative)
Actually, you can. RTF can express most (if not all) of what the Microsoft Word format can. Let me answer your objections using excerpts from the RTF 1.8 specification:
The \tc control word introduces a table of contents entry, which can be used to build the actual table of contents.
The \stylesheet control word introduces the style sheet group, which conta
Re:Basically (Score:4, Informative)
After having written some tools on OS X that do stuff with RTF:
RTF is well documented and you can make an RTF document on all manner of platforms (I've done it in Ruby and Cocoa), but many platforms have extended RTF in their own way in order to support special features. OS X has added a few special methods to RTF files to support Mac OS X typography, and I've noticed that different versions of Word handle document attributes (like headers and page numbers) in different ways.
RTF is great if you want to make up something quick that is ONLY formatted text, but readers have all manner of different ways of interpreting the exact appearance of tables, page layouts and margins, and there doesn't seem to be any manageable common mechanism for including images or other documents, something Word and OO.org excel (pun) at. Even HTML seems to be better at this.
I use RTF output in a few little in-house tools I have, so people can get the text+attributes they create and open them in a text editor of their choice for touching-up and delivery. When my tools have to create something that is supposed to be finished, they make PDFs.
RTF is great for interoperability, but I never expect an RTF file to contain a "finished product," unless the recipient expects quality on par with a Selectric. It is merely a relatively-open serialization format for strings with attributes.
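To make the "serialization format for strings with attributes" point concrete, here is a minimal Python sketch (not from the thread) that writes a list of (text, bold) runs as RTF. It relies only on the basic control words \rtf1, \ansi, \deff0, \fonttbl, \f0, \b and \par. The function names and the choice of Helvetica are assumptions made for illustration, and the sketch assumes ASCII input only.

```python
def rtf_escape(text):
    """Escape the three characters that are special in RTF: backslash and braces."""
    # Assumption: text is plain ASCII; non-ASCII would need \'hh or \uN escapes.
    return "".join("\\" + ch if ch in "\\{}" else ch for ch in text)


def runs_to_rtf(runs):
    """Serialize (text, bold) pairs into one RTF paragraph."""
    body = []
    for text, bold in runs:
        escaped = rtf_escape(text)
        # A group {\b ...} limits the bold attribute to this run only.
        body.append("{\\b " + escaped + "}" if bold else escaped)
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Helvetica;}}\\f0 "
    return header + "".join(body) + "\\par}"


if __name__ == "__main__":
    print(runs_to_rtf([("Hello, ", False), ("world", True)]))
```

Anything beyond runs of attributed text (tables, images, page layout) is where, as the comment says, readers start to disagree, which is why the sketch stops here.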
Documents outlive applications (Score:5, Insightful)
Documents are worth far more than software, and they outlive the applications used to create them. See the comment [robweir.com] to the original article - reading documents after 5, 20, 30, 100 years or more is not optional. You can pay the price of developing an independent format now, or you can pay the price of reverse engineering over and over again every time you change your internal representation.
Repeated implementation limits future change and innovation. It's expensive: it likely costs more even for Microsoft. But they can afford it; their competitors may not be able to. Plus, Microsoft already has their first implementation.
Perhaps so. But compare that cost to the cost I've just outlined. It is in the best interest of users and software developers (maybe even of Microsoft) to bite the bullet now, do the conversion once, and develop a clean format for the future.
Maybe you have in mind an argument you're not making, but I don't see any sufficient basis for your broad contention that using a file format based on an internal representation is a "darn good idea". In specific cases, yes (e.g. where the cost of development time or effort are the most important factors). In general, I very much doubt it. That successful applications in the past have taken that approach is weak evidence. They were developed when the up-front cost of development in a time of rapid innovation, the loss of customer lock-in, and a lack of open-format competition were good business reasons for making such a choice - even if it was inferior technically, increased cost in the long term, and was bad for consumers. In today's climate of slower innovation, competition from open formats, and customers who are running into their own long-term interests, the situation is different.
Which is not to say Microsoft's apparent attempt to set the rules of the game and throw sand in the gears of change is not in their interests, or that it will be unsuccessful.
Re: (Score:3, Informative)
Documents are worth far more than software, and they outlive the applications used to create them. See the comment to the original article - reading documents after 5, 20, 30, 100 years or more is not optional.
Which is why medical, legal and military records are often not held in word processor formats. For instance, the military records I have dealt with (NATO mostly) are held in SGML, conforming to carefully designed MIL DTD's that preserve structure rather than presentation. These files can be transl
Re: (Score:3, Insightful)
So what happens when you use the internal representation as the file format? Well, you have a fi
Re: (Score:3, Interesting)
OOXML's Origin Is Not The Problem (Score:4, Interesting)
OOXML includes data elements that should be part of internal import routines rather than being enshrined in the document format, and it includes elements that are not specified except by reference to applications for which no public specs exist. This is the problem, not the fact that OOXML is derived from MS Office file formats.
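As a rough illustration of what "specified only by reference to an application" means for an implementor, here is a hedged Python sketch that scans a settings fragment for such flags. Only the element name suppressTopSpacingWP comes from the story summary. The namespace URI, the w:settings and w:compat wrapper elements, and the function name are assumptions made for this example, not quotations from the spec.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the word-processing vocabulary.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical settings fragment; the wrapper element names are assumptions.
SAMPLE = f"""<w:settings xmlns:w="{W_NS}">
  <w:compat>
    <w:suppressTopSpacingWP/>
  </w:compat>
</w:settings>"""

# Flags whose behaviour is defined only as "do what application X did".
# Only the one named in the story summary is listed here.
UNSPECIFIED_FLAGS = {"suppressTopSpacingWP"}


def legacy_flags(xml_text):
    """Return the flags in the document that this reader cannot honour."""
    root = ET.fromstring(xml_text)
    found = []
    for compat in root.iter(f"{{{W_NS}}}compat"):
        for child in compat:
            local_name = child.tag.rsplit("}", 1)[-1]  # strip "{namespace}"
            if local_name in UNSPECIFIED_FLAGS:
                found.append(local_name)
    return found


if __name__ == "__main__":
    for flag in legacy_flags(SAMPLE):
        print(f"warning: cannot reproduce behaviour requested by <{flag}>")
```

The best an independent reader can do with such a flag is warn, which is the commenter's point: the information belongs in an importer that knows the old application, not in the interchange format.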
Well, I was a big fan of RTF at one time. But a few years back I found that documents with any kind of formatting more complex than paragraph+justification+font just weren't working between MS Office and back. I don't know if this was because the format couldn't cope, or because of faulty implementations. In either case, it led me to give up on RTF. In any event, to be a replacement, RTF would need to work for spreadsheets and presentations at a minimum - something I don't think there's a lot of support for in the current RTF specification. We'd also lose the benefits of an XML based format, which given the amount of work on the seamless integration of XML documents into databases, web services and other data management applications means losing a lot of functionality.
Interoperability is only part of the problem. We also want a spec that can be fully and freely implemented by anyone, which isn't under the control of any single vendor. We want a format to which we can entrust documents, knowing that in twenty years time there will be an application capable of reading them. I don't know what you mean by native in this case, but the repurposing of OOXML isn't the problem. It's one of size and obfuscation, and as TFA points out specification by reference to closed formats and the behaviour of extinct proprietary software. These are non trivial problems with OOXML which are not (to the best of knowledge) found in ODF. There's nothing wrong with ODF. Re-creating it based on the non-XML RTF would be a waste of time and effort.
Re: (Score:2)
4. Profit!!!
The site seems to be slow... (Score:5, Informative)
So what can you do?
The solution is simple. Create a job description that is written specifically to your friend's background and skills. The more specific and longer you make the job description, the fewer candidates will be eligible. Ideally you would write a job description that no one else in the world except Guillaume could possibly match. Don't describe the job requirements. Describe the person you want. That's the trick.
So you end up with something like this:
* 5 years experience with Java, J2EE and web development, PHP, XSLT
* Fluency in French and Corsican
* Experience with the Llama farming industry
* Mole on left shoulder
* Sister named Bridgette
Although this technique may be familiar, in practice it is usually not taken to this extreme. Corporate policies, employment law and common sense usually prevent one from making entirely irrational hiring decisions or discriminating against other applicants for things unrelated to the legitimate requirements of the job.
But evidently in the realm of standards there are no practical limits to the application of the above technique. It is quite possible to write a standard that allows only a single implementation. By focusing entirely on the capabilities of a single application and documenting it in infuriatingly useless detail, you can easily create a "Standard of One".
Of course, this begs the question of what is essential and what is not. This really needs to be determined by domain analysis, requirements gathering and consensus building. Let's just say that anyone who says that a single existing implementation is all one needs to look at is missing the point. The art of specification is to generalize and simplify. Generalizing allows you to do more with less, meeting more needs with few constraints.
Let's take a simplified example. You are writing a specification for a file format for a very simple drawing program, ShapeMaster 2007. It can draw circles and squares, and they can have solid or dashed lines. That's all it does. Let's consider two different ways of specifying a file format for ShapeMaster.
In the first case, we'll simply dump out what ShapeMaster does in the most literal way possible. Since it allows only two possible shapes and only two possible line styles, and we're not considering any other use, the file format will look like this:
Although this format is very specific and very accurate, it lacks generality, extensibility and flexibility. Although it may be useful for ShapeMaster 2007, it will hardly be useful for anyone else, unless they merely want to create data for ShapeMaster 2007. It is not a portable, cross-application, open format. It is a narrowly-defined, single application format. It may be in XML. It may be reviewed by a standards committee. But it is by its nature, closed and inflexible.
How could this have been done in a way which works for ShapeMaster 2007 but also is more flexible, extensible and considerate of the needs of different applications? One possibility is to generalize and simplify:
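The article's own listings for the two ShapeMaster formats did not survive in this copy. As a stand-in only, and not Rob Weir's actual example, here is a Python sketch of the contrast being described. All element and attribute names (drawing, dashedCircle, shape, type, line) are hypothetical.

```python
import xml.etree.ElementTree as ET
from itertools import product

# Two shapes, each a (kind, line style) pair, as in the ShapeMaster description.
SHAPES = [("circle", "dashed"), ("square", "solid")]


def to_literal(shapes):
    """Literal dump: one dedicated element per (style, shape) combination."""
    root = ET.Element("drawing")
    for kind, line in shapes:
        ET.SubElement(root, f"{line}{kind.capitalize()}")  # e.g. dashedCircle
    return ET.tostring(root, encoding="unicode")


def to_general(shapes):
    """Generalized form: one element, with shape and style as attributes."""
    root = ET.Element("drawing")
    for kind, line in shapes:
        ET.SubElement(root, "shape", {"type": kind, "line": line})
    return ET.tostring(root, encoding="unicode")


def literal_vocabulary(kinds, lines):
    """Element names the literal format must define for these options."""
    return [f"{line}{kind.capitalize()}" for kind, line in product(kinds, lines)]


if __name__ == "__main__":
    print(to_literal(SHAPES))
    print(to_general(SHAPES))
    # Adding one shape and one line style grows the literal vocabulary
    # from 4 element names to 9; the general form still needs one element.
    print(len(literal_vocabulary(["circle", "square"], ["solid", "dashed"])))
    print(len(literal_vocabulary(["circle", "square", "triangle"],
                                 ["solid", "dashed", "dotted"])))
```

The literal form needs a new element name for every combination another application might want, while the generalized form only needs new attribute values, which is the sense in which it is "extensible".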
Backwards compatibility (Score:3, Interesting)
Re: (Score:2, Informative)
Nobody would. That's the point of it.
KFG
My favorite quote (Score:4, Insightful)
Outrageously funny and to the point.
M$ DNA (Score:2)
It's appropriate to note that the 6000 pages will only fit the DNA of a few pathogens [psu.edu]:
Other parts of the article about genetic disorders, witches and demonic possession are also approp
Where I'm from, reverse-engineering... (Score:4, Funny)
Application responsible for converting? (Score:2)
When I save a Word 2007 document to the old
Even more lapses of judgment... (Score:2)
I can't believe this became a ratified standard.
"Let him who has understanding calculate the number of the beast, for the number is that of a standard; and its number is three hundred and seventy-six." Common-freaking-sense 13:16-18
- shadowmatter
Re: (Score:2)
IMHO those are more serious problems. They're enough to make it be what I'd call a Long Ugly Hastily Written Standard, which somehow doesn't really surprise me.
The thing the original article is freaking out about -- legacy compatibility flags -- isn't really an issue. The standard has to include the features offered by existing wps. Sometimes those features are undocumented, obscure, and almost totally forgotten. What do you do? Find the last remaining copy of the code, figure out exactly how WP4 buggi
Blind leading the blind (Score:3, Interesting)
OOo did the same, but with greater elegance and less haste because they were ahead of the field. Corel screwed it up with WordPerfect by keeping their stylesheet format proprietary so that transfer between WP document code and XML was made as hard as possible (a Class A blunder, given that their XML editor is actually quite good). AbiWord makes a good job of saving DocBook XML, but it's not trying to pretend it's reimportable; it screws up LaTeX formidably, though, by trying to pretend that it absolutely has to preserve line-length and font-size, which is evidence of the same neurotic attitude as Microsoft.
The problem in all cases is not that the assorted authors and coders don't understand XML (although some of them clearly failed that test too), but that they don't understand documents. This is particularly true at Microsoft, where leaders such as Jean Paoli have been proselytizing XML for years. They still think a document is a jumble of letters; they have no idea of structure, and the DOM is simply laughable as a non-model of a document. Microsoft's particular problem with XML is that they came to it too late, and viewed it as a way of storing data, not text...indeed to this day many XML users, trained with Microsoft blinkers on, are unaware that XML can be used for normal text documents.
With this level of ignorance surrounding Microsoft, it's hardly unexpected that they should blunder so badly.
user expectation (Score:2)
I highlight text. I click the "B" button to make my text bold. I don't screw with styles.
Sorry to burst your bubble, dear Holy Priest Of The Most Highest XML.
Do backward compatibility in the converter (Score:2, Insightful)
purism as the enemy of progress (Score:2)
Seems fair enough (Score:2)
If you were faced with output from a 15 year old program, what would you do? 15 years? In software, that's an eternity. These tags are essentially saying "here is where this old crap used to be". How many people are actually using these programs? Maintaining documents in the old format? I defy any of you out there in Linux-land to say you wouldn't take the same approach under the same set of circumstances. Actually, Linux people would probably just say "it may not open old documents properly, but tha
Re: (Score:2, Insightful)
Or maybe it was their illegal business tactics?
It would be pretty easy for me to run a successful business too if I could break federal law with impunity.
Re: (Score:3, Insightful)
Re:MIcrosoft sucks. (Score:5, Insightful)
Re: (Score:2)
What they got in trouble for was actually using their monopoly to get into other markets - i.e. bundling IE with the OS meant that they used their OS monopoly to get into the browser market. There was also doing things like offering Office at a discount if vendors bought Windows. This is the kind of thing that everybody does (thi
Re:MIcrosoft sucks. (Score:5, Insightful)
I think the Office XML format style is a play straight out of IBM's hand-book: make the standard complex and incomprehensible, and the little players - that's you - will find it hard to compete. In a way, that's a good sign: Microsoft is now lumbering into middle-age, hoist on their own evermore complex petard.
The other thing about middle-age is that every little technological step away from their established base-line is treated as a revolution. In reality, it's no such thing, just a small stepping stone to shouting "pesky kids. Get off my lawn." Or maybe they've reached that stage already.
Re: (Score:3, Interesting)
Re:MIcrosoft sucks. (Score:4, Insightful)
Also don't forget that although MS's purchase of DOS was perfectly legal, it was ethically horrible. They arrived at a handshake agreement to license the code from Seattle Computer Company. While the MS paperwork was being finalized by the lawyers, SCC then made arrangements to finance other business ventures using the MS money. MS then presented them a contract to buy the code rather than license it, and told SCC to take it or leave it. As SCC had already committed to the other deals, they had no choice but to take MS's offer. Sure, no one held a gun to the head of the SCC executives forcing them to take the deal, however, they didn't have any other reasonable alternatives. MS's behavior was legal, but certainly not ethical.
Re: (Score:3, Insightful)
Unfortunately, you are wrong on almost all counts:
Re: (Score:2)
You don't have to break a law to get a gun. But as soon as you start using it you'd better be very careful. Except for a few very well defined cases (sport, hunting, self-defense), using your gun is illegal.
Even in the cases named above, using your gun in the wrong way will send you to jail.
Re: (Score:3, Insightful)
Things that are illegal for a monopoly are perfectly legit for a non-monopoly. It's a crazy law, but that's how it works.
I think your logic is more than a little broken. Monopolies have a great deal of power that others don't have. They can undermine capitalism in a market and destroy innovation in entire industries. They can spread causing that damage to other markets. Think of it like this, people piloting airplanes aren't allowed to drink or step outside for a cigar, while those behaviors are perfect
Re: (Score:3, Insightful)
A pilot knows that he's drinking at the time that he's do
Re:MIcrosoft sucks. (Score:5, Insightful)
Yeah. Because the people best suited to decide what a company should or should not be allowed to do are the people who own the company. Of course you're going to want to be completely unrestricted to mow down your competitors using whatever advantages you have if you are in a position to do so. What you're missing is that no one should be allowed to use unfair practices to do it. Some people think we should idolize the free market as some sort of religion. We don't like free market economy because it was given to us by the gods. We like it because it tends to result in better products and lower prices. That ceases to be true when you have a monopoly in the mix.
That being said, I'm not really informed about any Microsoft specifics, so I'm not going to argue in favor or against any "federal laws" as it applies to them (or failed to apply to them). However, suggesting that only people who have built a company that holds a monopoly should be able to decide what is fair regulation isn't rational. It may even be that the current federal laws regarding monopolies may be unfair and in need of reform, but the fact remains that the existence of a set of laws to regulate businesses is necessary.
Re: (Score:2, Insightful)
Re:MIcrosoft sucks. (Score:4, Insightful)
Until you get to that point, I suggest that you those "federal laws" out your ass, Mr. Ashcroft.
Oh, and also those so-called "computer misuse" laws. Indeed, if I want to set up a consultancy where I propose to convert customers' ASP scripts to PHP I should be allowed to demo to my prospective customers in great graphical detail why ASP is so insecure, even if I don't yet have an existing business relationship. Why should I tolerate that the government tells me how I may and may not recruit new customers?
Anything less would be one-sided and unfair.
Re: (Score:2, Informative)
Re: (Score:3, Insightful)
Re: (Score:2)
Or maybe I, too, can post unprovable, untestable anti-Microsoft conjecture to slashdot and get modded up?
Death would be too easy. (Score:5, Funny)
At the same time we'll let the tech support drones have their way with the Microsoft campus, which I suspect will involve setting it on fire.
Re: (Score:3)
Re: (Score:2)
Re:Suck it up (Score:5, Funny)
Re: (Score:2)
Yes, we could all duplicate the significant effort of reverse engineering the missing parts of the standard (you still have a working copy of WP5 around somewhere?). Or we could just save everybody a lot of time and money in the future by making a one-time small investment of fixing the standard now.
Re: (Score:3, Insightful)
Re: (Score:2, Insightful)
If it gets adopted as a standard (ISO or similar, not defacto standard) then everyone. The point is not whether people need the features, the point is that MS is trying to get this accepted as a standard. It still can only be implemented by MS, and therefore should not be accepted as a standard. If a government body had as part of a soft
I can. (Score:2)
What you don't seem to realize is that your list of people "effected" by this is irrelevant. There's exactly one category that matters:
Re:I can't see this being too big of a problem (Score:5, Insightful)
You do not need these features to begin with in a new format that is inherently incompatible with an old format. You don't want to say "now I'm going to do WP style linespacing and my linespacing is 1".
If you want to convert a WP document to an XML document, the conversion program should know that the linespacing in WP is 0.9 times the linespacing in the XML document (or whatever it really may be) and will then use linespacing=0.9 in the XML document. This is not a task of the new wordprocessor or its specification.
By adding this so-called "backward compatibility" to your specification, you make the spec overly difficult and in fact you make the conversion program in the new application when this is absolutely not necessary.
And on top of that, you require that the programmer who uses this spec should have knowledge of all these old versions and is able to program them without error. And as the application will grow because of these unnecessary features, the number of bugs will also rise. So this is not a blueprint for a good application, this is a blueprint for a very buggy implementation of a wordprocessor.
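The conversion-time approach described above can be sketched roughly as follows. This is only an illustration: the 0.9 ratio is the comment's own guess, not a measured WordPerfect value, and the names `Paragraph`, `convert_wp_paragraph`, `to_xml` and the `line-spacing` attribute are made up for the example and come from neither OOXML nor ODF.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape

# Assumed ratio, taken from the comment's "0.9 ... (or whatever it really is)".
WP_TO_XML_LINE_SPACING = 0.9


@dataclass
class Paragraph:
    text: str
    line_spacing: float  # explicit value, independent of any legacy application


def convert_wp_paragraph(text: str, wp_line_spacing: float) -> Paragraph:
    """Resolve the legacy spacing once, at conversion time."""
    resolved = round(wp_line_spacing * WP_TO_XML_LINE_SPACING, 4)
    return Paragraph(text=text, line_spacing=resolved)


def to_xml(paragraph: Paragraph) -> str:
    """Emit only the resolved number; no 'behave like WP' flag is written."""
    return (
        f'<p line-spacing="{paragraph.line_spacing}">'
        f"{escape(paragraph.text)}</p>"
    )
```

The knowledge about the old application lives entirely in the converter, so a program that later reads the output only needs to understand a plain number.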
Re: (Score:2)
Also, again, you can get the precise behavior in a generic way. The post you replied to explained how. It's even ok to include information about _why_ certain information is like that, so that tools that have special knowledge about the format can do full roundtrips more easily. The problem comes when the format itself is burdening ALL implementors with replicating undoc
Re: (Score:3, Insightful)
You're missing his point: When converting the file to OOXML, one can and should add generic tags indicating the specific (broken) behavior which should be emulated (such as "scale small caps by this percentage point") rather than just specifying a generic "Do What I Mean" marker without any useful guidance on how rendering of documents containing this marker should be implemen
Re: (Score:2)
Re: (Score:2)
What an asinine piece of crap OOXML is.
Re: (Score:2)
No, they can't "imitate" the feature. They have to do EXACTLY what Office does, otherwise they're not compliant. Did you even read that guidance block you quoted? "If applications wish to match this b
Forbidden partial implementation? (Score:5, Interesting)
The behavior of years-old proprietary word processing software is included by reference into OOXML. How is any spec that includes by reference the behavior of proprietary software exactly "open"? True, implementors could produce a partial implementation of the spec that degrades away the legacy baggage (more or less) gracefully, but some standards' patent licensors forbid implementors to publish a partial implementation. I don't know if this applies to OOXML's license.
Re: (Score:2)
but some standards' patent licensors forbid implementors to publish a partial implementation. I don't know if this applies to OOXML's license.
My understanding is that Microsoft was going to enforce full implementations of the standard, except on themselves of course, so it really makes OOXML untenable as a common format. What you get is an open standard that everyone but Microsoft has to follow exactly and implement fully, but only Microsoft could follow it and implement it fully. And then even if someone could manage to implement MS's XML fully and perfectly, Microsoft could just pull a bait and switch that would break compatibility.
Actually, I
Re: (Score:2)
Who's Scared? (Score:4, Informative)
Eh? Isn't that why M$ made this supposedly "open" format? Because governments were tired of paying through the nose for secret formats that broke between versions? The purpose of an archive is to read it later. Governments and companies have already moved to pdf for archives. They are going to move their working documents to reasonable formats next.
But MS opened their own format, thus leveling the playing field so that you must again compete on features ...
You must not have read the 6000 page spec, which includes lots of sections like this:
That's neither open, nor a standard.
Microsoft is hoping people believe what you say, but everyone knows better. Shit like OOXML only proves that they have not changed. It's just another, more elaborate and more expensive lie. Even the name, by using "OO", is intentionally confusing. The New Office is everything the old Office was and always will be. Vista and Office 2007 are non-starters.
Re:Unfair (Score:4, Insightful)
Tools that don't care about legacy support are unaffected by this; they can just pick the closest modern option to whatever the legacy flag calls for on input, and not output documents that use them.
And thus those tools, legally, are not OOXML, and won't qualify for purchasing by companies that specify OOXML. Which is the entire point.
There's a difference between 'We need to make sure that old documents can be converted correctly.', and 'We will literally convert old documents into a new representation that contains all their weirdness, and we won't explain how to implement said weirdness in the standard.'.
What Microsoft has produced is not even a standard. Standards must specify everything, or reference other standards that specify everything. They can't reference applications.
If Microsoft wants to keep secret how to turn Office 95 documents into OOXML, fine. Producing a standard doesn't mean you have to explain how to convert things into that standard.
It does, however, mean you have to explain exactly what should happen if mwSmallCaps is true, to the pixel. You can't just pawn it off on the unexplained hypothetical behavior of some other application.
Re: (Score:3, Insightful)
You need to let a conversion program worry about converting Word 2006 documents to XML documents. You need to let the maker of Word 2006 worry about making this conversion program. This can be in the form of a "save as XML" option, but also an external program.
You can not say "oh, this is an old feature, let's put it in the spec and let's let the programmer that uses this spec worry about it because we can't be bothered to convert it or don't know how to convert it".
Sorry, but XML s
Re: (Score:2)
Second, you're right, old format -> OOXML -> old format... yeah, that would kind of suck. However, I don't get why you would ever want such a loop to work. Old format -> OOXML -> some other format -- now that makes sense. Think of it this way -- you can