Microsoft Word Document ML Schemas Published 439

Lars Munch writes "On Monday the 17th November the XML schemas for the Word Document ML, along with documentation, were uploaded to the Infostructurebase (ISB). With the Word Document ML specification anybody can generate, view and process Microsoft Word documents on any format." (Here are the legal terms under which the schemas can be used.) "The Word Document ML is based on the W3C specification eXtensible Markup Language (XML), thereby providing documents that are easy to integrate into a large variety of systems. The Danish Government Infostructurebase is the first schema repository to make the schemas accessible to the public. The Microsoft Office Document ML schemas and documentation can now be downloaded from the ISB Repository." There are more links on this page.
This discussion has been archived. No new comments can be posted.

  • by warmcat ( 3545 ) * on Monday November 17, 2003 @12:18PM (#7493326)
    With thanks to Seth Johnson on the DMCA Discuss list for forwarding this earlier today:

    Subject: [Patents] MS Office 2003 XML patented
    Date: Mon, 17 Nov 2003 13:48:11 +0100
    From: Carsten Svaneborg
    Organization: www.mpipks-dresden.mpg.de
    To: patents@aful.org

    Hi! Just came across the following:

    http://www.microsoft.com/mscorp/ip/format/xmlpatentlicense.asp
    Office 2003 XML Reference Schema Patent License

    Microsoft may have patents and/or patent applications that are necessary for
    you to license in order to make, sell, or distribute software programs that
    read or write files that comply with the Microsoft specifications for the
    Office Schemas.


    So usage of MS Word XML files requires a patent license:

    You are not licensed to distribute a Licensed Implementation under license
    terms and conditions that prohibit the terms and conditions of this
    license. You are not licensed to sublicense or transfer your rights.


    The licence is royalty free, but section 7 of the GPL requires the right to sublicence
    patent rights to the people who obtain a GPL program from you.

    so in other words Microsoft is using patents to prevent GPLed programs from
    accessing the XML format that MS Word will be using.


    This is very good timing, and goes to show how important it is to ensure
    that the software patent directive has articles that protect
    interoperability from constituting patent infringement.
  • Oh yeah? (Score:0, Insightful)

    by iantri ( 687643 ) <iantri&gmx,net> on Monday November 17, 2003 @12:21PM (#7493346) Homepage
    How much do you want to bet it will be with extremely restrictive licensing, will be incomplete, or both?
  • by k98sven ( 324383 ) on Monday November 17, 2003 @12:21PM (#7493350) Journal
    Given Microsoft's history of skirting around verdicts and legal agreements, how long will this format be valid?

    How long before MS switches to either a new markup scheme, or introduces undocumented 'features'?

  • by Uma Thurman ( 623807 ) on Monday November 17, 2003 @12:26PM (#7493405) Homepage Journal
    It's NOT reasonable. They don't allow any modifications or derivatives of the schema without permission.

    So, Microsoft will be free to continue changing their format with each new release, breaking all the open source programs for a time, causing time and trouble for users to upgrade.

    We don't like Word formats because they change frequently, and they are developed in a direction that suits Microsoft. How does this change anything?

  • by rruvin ( 583160 ) on Monday November 17, 2003 @12:27PM (#7493424)
    So, let me get this straight:

    Microsoft is allowing you to license the patent free of charge but not to sublicense it. The GPL requires that you be allowed to sublicense patents applicable to GPLed software. And that's somehow Microsoft's fault?

  • by 16K Ram Pack ( 690082 ) <(moc.liamg) (ta) (dnomla.mit)> on Monday November 17, 2003 @12:28PM (#7493430) Homepage
    There is no 'about-face' but it seems clever.

    Put XML support on the pro version of the software, so it looks open, but because it's not on all versions, people will have to use the non-open for sending to people in case they don't have Pro.

    I can't see any other reason for not including it in Pro.

    You still won't be able to run Word as a server app either.

  • Most probably the intention is to make the XML formats 'incompatible' with the GPL. However if this is the case, there is at least one easy work around, namely to define a neutral XML format (say the OOo XML format) and use a non-GPL 'connector' (which carefully observes the Microsoft patent license conditions) to do the dirty work.

    Any 'open' standard that imposes conditions on its use is not actually open at all. The owner can decide at any time to change the license, and this in itself should be enough reason to avoid this XML interface.

    I believe these XML standards are what is technically called a "honeypot".

    Of course, I may be paranoid, this may indeed be a munificent gesture by Microsoft who have realized that their XML schemas will serve the global community, add value to their products, and encourage a new generation of Office extension applications that will halt the trickle/rush/avalanche of Linux conversions.

    Indeed.

  • I'll take this over having to reverse-engineer the specs and deal with potential IP issues. For once, Microsoft did us a favor, even if it does come with strings attached.
  • > They don't allow any modifications or
    > derivatives of the schema without permission

    Hm. I guess I'm not sure what would be gained by doing that - i.e., changing the spec and republishing it. Why would that be a good thing to do, even if you could?

    > Microsoft will be free to continue
    > changing their format with each new
    > release, breaking all the open source
    > programs for a time

    Right... but couldn't the same be said of any API? I mean, if the Apache plugin API [apache.org] changes, I'll need to rewrite my mod_foo module to use the new API.
  • Solution: (Score:3, Insightful)

    by Alethes ( 533985 ) on Monday November 17, 2003 @12:38PM (#7493525)
    Create a BSD licensed application that accesses the XML format, so that users will have a choice other than MS Word.

    It seems that Microsoft has inadvertently demonstrated that the GPL does not always protect the users' freedom, as is its intent. If the user can only use MS Word or some other highly restrictive software to access these file formats, because somebody has decided to be a GPL zealot, then the GPL has become a hindrance to the users' freedom.
  • I wonder... (Score:3, Insightful)

    by WIAKywbfatw ( 307557 ) on Monday November 17, 2003 @12:39PM (#7493528) Journal
    Just how long will it be before Microsoft releases a Word Document ML Plus format that is not so open?

    Let's face it, Microsoft loves proprietary technology that it owns and that it controls. There's no long-term advantage to it whatsoever in creating a truly open file format - the biggest reason why Microsoft Office applications are so ubiquitous is because people need to read Word, Excel, PowerPoint and Access documents they've been sent, not necessarily because those are the best tools for everybody.

    Word Document ML is a PR exercise. It's Microsoft saying "See, we're nice and friendly and open, too", at a time when its revenues are beginning (perhaps not significantly yet) to be threatened by open source alternatives. Long-term though, Microsoft will shut up shop again and bring users back to the fold with a proprietary version that's "improved", "enhanced" or "more secure" in some way.

    Want proof? Just look at Hotmail. When Microsoft bought it, it promised that the Hotmail service wouldn't be compromised in any way, and that it would continue to remain free. Well, the basic service might still be free but it's been crippled in so many ways - mail filtering that says it will delete junk mail in 24 hours but doesn't, incredibly bad junk mail filtering in the first place, even fewer mail sorting rules allowed now than were allowed a few years ago, a very limited number of addresses and domains that can be blocked, etc. All tactics to get you to subscribe to their enhanced Hotmail service, which has some new features but is made up of a lot of the stuff that Microsoft has stripped from the basic service.

    Will people use Word Document ML format? If it becomes standard in Microsoft Word then of course they will. They'll have no choice - Microsoft has a practical monopoly when it comes to everyday file formats. Will Microsoft eventually hijack Word Document ML format by making a future iteration proprietary once more and hence shut out any competing product when it releases them via a patch or whatever? Of course it will.

    Why am I so sure of this? Because Microsoft is just like the scorpion in the tale of the scorpion and the frog [allaboutfrogs.org]. It's in its nature.

  • by gillbates ( 106458 ) on Monday November 17, 2003 @12:40PM (#7493534) Homepage Journal

    Microsoft is trying to appear "Open" while denying the actuality thereof.

    Does anyone seriously believe that third party developers will be able to write Office document generators and formatters with this information? Do we really believe that:

    1. Microsoft will comply fully with the spec (it's disclaimed in the legal terms), and
    2. a developer will be able to write document parsers for these schema without infringing on Microsoft's patents?

    Given the fact that there will always be legal encumbrances with anything interfacing with Microsoft technologies, I believe these schema would be better left ignored by the OS community. With Open Office and KOffice maturing (and the former running on Windows, and available for free), there's no good reason to cater to Microsoft document protocols anymore. They are simply irrelevant.

    And no, we in the OS community don't have to copy everything that Microsoft does. Compatibility with Microsoft is no longer a necessity.

    Close, Microsoft, but no cigar. Kudos for the marketspeak.

  • All caps (Score:2, Insightful)

    by nodwick ( 716348 ) on Monday November 17, 2003 @12:41PM (#7493551)
    Putting the "This provided As Is" section in all caps is SOP for licensing. Check any of your software boxes, or Google "software license" [google.com]. Point-and-click examples include the W3C license [w3.org], or Apache license [apache.org].
  • by bokmann ( 323771 ) on Monday November 17, 2003 @12:41PM (#7493553) Homepage

    I already have the ability to save my word processing documents as XML. I already have the ability to transform them into other things I want. So do you. Check it out. [openoffice.org]

    I'm sure someone, someplace is already working on the appropriate xslt to transform Microsoft's stuff into this more open format, and I'm sure Microsoft has some ace up their sleeve technically or legally to push it into a 'gray' area...

    But I just cannot imagine anyone having the gall to say that my data is only available to me in a format that they control the terms and conditions on. How successful would a paper company be if they put 'terms and conditions' on the use of their wood pulp?

  • by wfrp01 ( 82831 ) on Monday November 17, 2003 @12:42PM (#7493558) Journal
    Why bother with proprietary file formats when you have DRM? Make a mendacious nod to 'open file format', and then lock stuff up behind the DMCA. If you want to read a DRM-encoded Word document, you'll need Word. Period.
  • by poot_rootbeer ( 188613 ) on Monday November 17, 2003 @12:42PM (#7493560)
    I think this says no open source implementation is possible, doesn't it?

    Open Source != GNU General Public License.

    Microsoft's licensing terms here seem to be closest to the BSD License out of the major open source models. A good decision if they're looking for rapid and widespread adoption of their design -- how many TCP/IP stacks do you know of that AREN'T derived from BSD?
  • by ciaran_o_riordan ( 662132 ) on Monday November 17, 2003 @12:49PM (#7493607) Homepage
    Previously we could reverse engineer their format and use it. Their work was covered by copyright, no problem once we create our own implementation.

    This schema is patented. Patents are an exclusive right to use an idea. Now if you use their format without upholding their conditions, you're a criminal, even if you figured out the format yourself.

    By publishing the format, they can cast doubt on anyone that does reverse engineer it. "I bet you read the spec on line".

    Also, being able to view the format isn't much use. It's XML, but that doesn't mean it will be meaningful cleartext. They can simply uuencode a big block of binary data, stick it between two tags, and it's valid XML.

    Learn from the past. Microsoft are not here to do us favours.
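
The point above about opaque payloads can be sketched in a few lines. This is a minimal illustration, not anything taken from the published schemas: the element names `document` and `opaqueData` are hypothetical, and base64 is used here as a stand-in for the uuencoding the comment mentions. It shows that a block of binary data wrapped in tags still parses as well-formed XML, while telling a reader nothing about what the bytes mean.

```python
import base64
import xml.etree.ElementTree as ET

# Stand-in for arbitrary application-specific binary data.
blob = bytes(range(256))

# Encode the bytes as ASCII text (base64 here, as a stand-in for uuencode).
encoded = base64.b64encode(blob).decode("ascii")

# Hypothetical element names; they do not come from the Word schemas.
doc = ET.Element("document")
ET.SubElement(doc, "opaqueData").text = encoded
xml_text = ET.tostring(doc, encoding="unicode")

# The result parses as well-formed XML...
parsed = ET.fromstring(xml_text)
payload = parsed.find("opaqueData").text

# ...but the payload is only meaningful to software that already
# knows the layout of the decoded bytes.
recovered = base64.b64decode(payload)
```

Whether the real schemas contain such blocks is not something this sketch establishes; it only shows that being XML does not by itself guarantee a readable format.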
  • I think the key phrase is
    You are not licensed to distribute a Licensed Implementation under license terms and conditions that prohibit the terms and conditions of this license.
    Which just accidentally happens to exclude any software that is licensed under GPL, since the GPL is not compatible with any licence that has a mandatory advertising clause.

    We are clever, aren't we!

  • by 0x0d0a ( 568518 ) on Monday November 17, 2003 @01:08PM (#7493770) Journal
    Yup -- I tossed an article onto Slashdot about a year ago pointing this out.

    I'm not familiar enough with Word internals to know how useful the schema would be in translating documents losslessly between formats (and am very dubious that it would be particularly easy), but it isn't even necessary to go that far -- the point is that the non-Pro copies of Word don't support XML format export.

    Microsoft isn't going to give up the golden strength of a file format lock-in any time soon, even if they let companies use custom indexing tools on their store of documents (which is really what this whole XML business is about).
  • by Doc Ruby ( 173196 ) on Monday November 17, 2003 @01:08PM (#7493775) Homepage Journal
    Any 'open' standard that imposes conditions on its use is not actually open at all.

    No, that's the difference between "open" and "free". Open standards are published, so anyone can see their features, interop interfaces, and internal structure (how they work). Openness is the most important feature of technology, for maintenance (developers and hands-on consumers). (Proper) patents are "open", publishing all details of the invention. So other inventors can utilize the invention in their own inventions, and avoid duplicating the invention of another.

    Technology can be truly "free" when in the public domain, but that includes the freedom to subvert it, like market an incompatible version under the same name (or interface), poisoning the market with an unreliable technology. Or it can be free and open under the GPL, with restrictions solely to keep it just as open and free as it was when first released. Or it can be free under the BSD (or MIT) license, where it can be more or less free than the original: the originator's copyright still applies to the original technology, but it need not be kept open, or free, beyond that copyright.

    Then there's the "free" technology that costs nothing (in money). That's free as in subsidy, which is relevant only to marketing, not to development.
  • Right... but couldn't the same be said of any API? I mean, if the Apache plugin API changes, I'll need to rewrite my mod_foo module to use the new API.

    It's a good thing for MS, because they will, for a time, have the only compliant implementation of the standards every time they change. Every other implementation will lag behind as it seeks to implement the new standard.

    The main difference between these changes and the Apache API changes is that the Apache people are not selling a closed source version of mod_foo that ships fully up to date and compatible with each revision of the API.

  • by Anonymous Coward on Monday November 17, 2003 @01:17PM (#7493855)

    Where can I get VPC, to host windows on my linux box?

    Yeah, THAT'S why vmware is better...

  • Wow, I'm impressed by your grasp of the subtleties of the English language and its use in this context.

    But I believe you are actually wrong. An "open" standard which is usable only under terms of a patent license is not open. It can be as documented as you like, but if there are conditions attached to its simple use, it is not open.

    An example: if I document the interface to my bondoogle so that anyone can programme bondoogle extensions, that is a "documented" standard.

    If I place the bondoogle extension specifications into the hands of an independent body, that is an "open" standard.

    If I provide the community with the rights to the standard itself, it may become a "free" standard.

    But if I document the standard and then say "and all use of this standard is restricted to those applications I agree with", that is neither open nor free, simply licensed.

    Furthermore, this is quite an innovative restriction mechanism: previous mechanisms for making so-called "open" standards such as win32 non-open included deliberate underdocumentation. The use of patent law is new and should be raising red flags all over the place, especially as it's for something as vital as an XML schema.

    Does this mean that XML schemas can be patented?

    A truly frightening idea, given the importance of XML to the Internet ecosystem.
  • by OglinTatas ( 710589 ) on Monday November 17, 2003 @01:19PM (#7493876)
    So write an MSWord document filter module as a plugin for your GPL application, and require a separate download from your main GPL application for that plugin. I think that's how the GIMP [gimp.org] got around the .GIF patent issue. [gnu.org]
  • by lorien420 ( 473393 ) on Monday November 17, 2003 @01:20PM (#7493885)
    Perhaps they opened the specs, but they surely didn't open reading them and using them for any purpose. The License for implementing the specs requires that you attach their license to all files and derivative works.
  • by Uma Thurman ( 623807 ) on Monday November 17, 2003 @01:23PM (#7493910) Homepage Journal
    >Hm. I guess I'm not sure what would be gained by doing that - i.e., changing the spec and republishing it. Why would that be a good thing to do, even if you could?

    1) All specifications are incomplete. The requirements that it addresses today are not static, and in 10 years there will be new requirements.
    2) Microsoft will change their XML schema.
    3) Historically, Microsoft has done things that are in the interest of Microsoft. Everyone else must follow along.
    4) Therefore, the changes that Microsoft will make to the XML schema have a high likelihood of being advantageous to Microsoft.

    When Microsoft keeps all the real control of the format, it turns any open source developer into a sharecropper. We're going to be plowing a field that we don't own, and the price we pay is going to entrench the Microsoft format even further.

  • by Phleg ( 523632 ) <stephen AT touset DOT org> on Monday November 17, 2003 @01:23PM (#7493913)
    Cache it. If Microsoft takes you to court for abusing the license, tell them and prove to them that you followed the link, and that there were no applicable license restrictions. You followed their directions to the word (pun intended).
  • A drowning man... (Score:3, Insightful)

    by alexborges ( 313924 ) on Monday November 17, 2003 @01:27PM (#7493941)
    ...last kicks

    That is what we call this in Mexico. Now this is what I call competitive pressure.

    Now what about Excel?

    Oh and BTW, now MS is playing catch-up with OO.o.

    Thanks Microsoft, I think you are starting to 'get' it.
  • no, it isn't (Score:3, Insightful)

    by penguin7of9 ( 697383 ) on Monday November 17, 2003 @01:28PM (#7493957)
    ....seems like all you have to do is put a notice in the code about using the spec.

    You can't sublicense or transfer the license. That means that Microsoft can stop new implementations any time they choose by simply changing the license on their web site. They may even be able to do that retroactively.
  • by Darth Daver ( 193621 ) on Monday November 17, 2003 @01:36PM (#7494034)
    That certainly is a nice pro-Microsoft spin you put on things, but perhaps you can explain the logic behind your statements. How did they "out-open-source" Open Source software? How can they be more open than what is already completely open?

    I am still skeptical that Microsoft has truly made this open. Excuse me, but I don't just blindly accept what Microsoft says at face value. Microsoft has a serious credibility problem from lying about so much for so long. Even if Microsoft has finally caught up to the Open Source community regarding the openness of file formats, that helps OpenOffice and its users. It would make me feel even better about NOT spending hundreds of dollars on an office suite every few years.

    Microsoft just cut our legs off over security issues? Do you think opening a Word file format just magically makes all of their security issues go away?

    I saw some other Microsoft cheerleader congratulate Microsoft for "leapfrogging" Linux by finally providing a decent (remains to be seen) shell, but this person did not explain how this infant shell surpassed bash, pdksh, or zsh. Just because someone makes some wildly unsubstantiated claim about Microsoft's superiority does not make it true. Why should I believe this is anything more than PR and spin? I'm not convinced they have joined us, let alone beat us, at anything. Honestly, please explain your rationale.
  • by penguin7of9 ( 697383 ) on Monday November 17, 2003 @01:38PM (#7494057)
    Apart from the legal loopholes in Microsoft's license that are big enough to drive a truck through, much more worrisome is the fact that Microsoft asserts that they are getting a patent on an XML Schema. What is the novelty in that schema? It's a standard XML representation of well-known word processing data structures and concepts.

    This would be a very bad precedent. Microsoft is really trying to push the limits of patentability and testing what they can get away with. Their patent application on .NET APIs is a similar trial balloon.

    That is something open source and free software developers should really worry about.
  • by more ( 452266 ) on Monday November 17, 2003 @01:45PM (#7494113)
    It is not the format that is the problem. The format is rather well reverse-engineered already. The problem is the layout algorithm. People do care if their document looks different, a figure has jumped, there is one more page, etc.

    Layout algorithms are very non-linear. Dramatic changes can happen in the layout due to differences in the rounding. Currently, there is absolutely no specification about the layout algorithms.

  • by RobertB-DC ( 622190 ) * on Monday November 17, 2003 @02:12PM (#7494415) Homepage Journal
    That certainly is a nice pro-Microsoft spin you put on things... Honestly, please explain your rationale.

    Dude, did you read my post? :)

    Why should I believe this is anything more than PR and spin?

    That was my point -- unfortunately PR and spin are too often what the PHB's depend on when they're choosing a "strategic" direction. When I suggested that MS has cut you down, I meant in the view of the non-technical manager who believes everything Bill Gates says.

    But in full disclosure, I will admit that I've been coding VB apps for over 10 years, and the Microsoft shackle is firmly attached to my ankle...
  • by IM6100 ( 692796 ) <elben@mentar.org> on Monday November 17, 2003 @02:12PM (#7494422)
    No need to get all preachy on us.

    The FSF and Microsoft have different goals. You're entitled, of course, to claim some goals are more noble than other goals. However, see paragraph one above.

  • by Clith ( 5063 ) * <rae@tnir.org> on Monday November 17, 2003 @02:46PM (#7494742) Homepage Journal
    Well, they were winning in 1996 and 1997. Now that we have StarOffice/OpenOffice, I wonder if a tiny percentage (say 1 or 2%?) is slipping through their fingers. But then, how would you measure the number of free installs? Most of these surveys only count paid installs, which is invalid when you are trying to include Open Source tools.
  • by swillden ( 191260 ) * <shawn-ds@willden.org> on Monday November 17, 2003 @02:55PM (#7494848) Journal

    Microsoft isn't going to give up the golden strength of a file format lock-in any time soon, even if they let companies use custom indexing tools on their store of documents (which is really what this whole XML business is about).

    Unless I'm missing something, I think this does break the lock-in, in large part. With a published, standardized format, non-Microsoft tools can implement support for it, and users can expect it to work reliably. Openoffice.org, for example, can probably support the new MS format simply by adding a pair of XSLT stylesheets (though they may want to take a different approach for performance).

    This means that users of non-MS tools will be able to create documents, confident that MS Office users will be able to read them. There are still limitations going the other way, but that still means that non-MS tools only have to write import filters for the old Office formats, halving the work, and that really won't be an issue in the business world, where Office Pro is the norm anyway.

    I think this move will prove painful for MS, but probably less painful than sticking with completely closed formats, given the way they've been getting beat up about it.

  • by Slime-dogg ( 120473 ) on Monday November 17, 2003 @03:05PM (#7494933) Journal

    why not buy the brand most compatible with the format?

    That's an easy one to answer. You've got 300-400 machines that require an office application suite, but you've got a small budget. Complete compatibility is not much of an issue if you can save $180,000 to $240,000 (300 to 400 machines at $600 each), yet run a "mostly compatible" suite. Now, with the opening of the format, that "mostly compatible" becomes "compatible."

    Then there's the whole issue of MS Licensing 6.0 (as if it's a whole other application itself).

  • by tspauld98 ( 512650 ) on Monday November 17, 2003 @04:38PM (#7495793)
    This explanation is not technically accurate as I read it. IANAL, however, I think that it says that they have patents or patent applications governing the process used to read or write files that adhere to this schema's structure.

    In other words, no one can write a piece of software that can read or write files in this format without a license from MS allowing it.

    If I'm right in my interpretation, it is worse than the previous post states. No OO.o support. No third-party support at all without a license. Please, somebody who can read legal-ese tell me I'm wrong. :)

    This would really suck.

    tims
  • by antiMStroll ( 664213 ) on Monday November 17, 2003 @04:43PM (#7495832)
    Look at the bright side. When the League of Microsoft Moderators scramble to mod up a post so patently broken in fact and logical coherency, someone's in panic mode. That +5 is good news.
  • by penguinrenegade ( 651460 ) on Monday November 17, 2003 @05:20PM (#7496245)
    Just a point of clarification. It might be a PUBLISHED format, but it is not STANDARDIZED. Bastardized, maybe. Microsoft tweaks TCPA just slightly and it becomes Palladium. Their XML based format is NOT the same as XML!

    And if you mod me down - please look at the language. Microsoft continues to make people think that they invent a new format, when they do no such thing.
