Microsoft Embraces and Extends Perl
Anonymous Coward writes "According to this Press Release, Microsoft has signed an agreement with ActiveState to add missing functionality to the Windows version of Perl. But the FAQ states that they also want Perl to "take advantage of platform features on Windows"."
Re:let's not be hypocrites... (Score:1)
> features of *nix" right now, intentionally
> and without shame.
Sure it does. Right now, however, perl on win32 is pretty much the same as perl on unix, except for fork() performance. It *does* make things more tolerable for people trapped in win32-land, much the same as Cygnus' gnu-win32 gives them a bunch of unix tools and gcc. It's great that they're adding Unicode support to perl. The only travesty is that they're doing it for Windows only. They're undoing all of the previous bridge-building that got perl to the point where you could run a perl script anywhere. That's wrong.
MS didn't kill java (Score:1)
Java is huge and getting bigger all the time...
ActivePerl is great! this is good! (Score:1)
Uneducated Slashdotters...Beef Take 1 (Score:1)
The more UNIX-originated tools on NT the better. UNIX has a lot of what NT doesn't (stability, power, flexibility), and if some of that can rub off on NT and make my hellish job better, all the better. Thank dog for ActivePERL, vim/win32, and the GNU tools for NT. Now if only we could get an Open Source GNU NT kernel replacement! (GNU NTURD?)
Don't be so quick to evangelize everything. Try and keep an open mind. It's painful to read the comments sections on Slashdot as of late, because they're usually composed of flames and the "Linux Or Die" attitude, and as a UNIX person, I don't like to see it. It only pushes people away. Look on the bright side: the more NT assimilates UNIX, the easier it is for the average m0r0n MCSE to migrate to UNIX after he realizes NT just doesn't cut it in the enterprise...
This is not cut and dry, analogous to politics. (Score:1)
Much like politicians. Clinton, as evil and inept as he sometimes is, morphed to the right. Not for good reasons, just for votes. But the effect is the same. Forbes is very aware of this effect, and never intends to actually win, but he applies pressure for change in his direction by running.
M$ may be doing things like this just for votes/sales, but this sort of thing will change what M$ is. Our pressure for good can morph the giant instead of killing it. Face it, we can't kill it.
I am not saying I will ever advocate or use M$ voluntarily, but if their power for evil reduces through magnitude or intent below a critical threshold (total brainwashing of PHBs) our lives and good engineering sense may return to some sane state.
Sorry, but you're wildly wrong (Score:1)
I work in CJK computing, and I speak all three languages. I use a Unicode-based machine to read all three. I assure you that I have none of the difficulties you describe. Someone working in just one of those languages would have it even easier. You're not going to get hit in the face by some horrible Chinese font if you are trying to read Japanese on a Japanese-only machine.
You say the following:
"The whole point of Unicode was that code points were supposed to be kept separate between languages - so an A in English and an A in French (as well as German, Dutch, etc. etc. etc.) are all supposed to have different code points, because usage differs between these languages. (This is just an example; I'm not positive that all these languages have a different code point for A)."
This is extremely misinformed. The whole point of Unicode was never such a thing. It was to create a single, unified encoding for all languages, just as Latin-1 (ISO 8859-1) was a unified encoding of major western European languages. Latin-1 was valuable because it kept the French, Germans, Americans, etc. all from using different code points for the same characters.
No, there are not different code points for "A" in different languages in Unicode, and there never were. It's ridiculous to suggest that there should be. How often do we suffer because the French word "chat" (cat) and the English word "chat" are represented by the same code points in Latin-1? Approximately...never.
Because English and French share a common encoding, in the case of Latin-1 (ISO 8859-1), you can tell at a glance what language it is and what it is saying. If you want to switch to a font that is more "French" in appearance, that's easy to do. If you want that change to occur automatically, use language tagging or font tagging.
I agree that the font differences are greater between the C-traditional, C-simplified, J, and K than between English and French. They are not so great, though, that you can't read any of them in any of the others' fonts well enough to easily determine what language they're in and switch to a more appropriate font. Usually, they are so easy to read that you might not even bother to switch fonts unless you plan to print it.
When you need to indicate explicitly what language a sample of text is written in, you tag it with an ISO standard language tag, or one that is customized for your application. If it needs to be represented with a certain font (say MS Gothic for Japanese Win32 instead of MS Song for Chinese Win32), you use a font tag (or some equivalent out of band markup).
It's simply not necessary for the codepoint to declare both the character and the language and absurd to imagine doing so. There are something in excess of 5000 languages that use the Latin alphabet (cf. SIL's Ethnologue). Are programmers clamoring for 5000 different codepoints for the letter 'A'? They would all fit in a 4-byte encoding, but is that what you'd really want? There are dozens of languages that use the Arabic script. Should there be dozens of copies of each character? Even something like the Hebrew script, which you'd think would be limited to Hebrew language, is used in half a dozen other languages (Yiddish, Ladino....)
Essentially, creating a separate code point for each character in each language is the equivalent of adding language tags character by character. Is this really better than having language tags used only where actually necessary?
Imagine the font size! If you just use tables of pointers to the same glyphs instead of containing multiple copies within the same font, imagine the lookup table size! Imagine the performance hit to do that kind of nonsense for each character.
You say even the current system is better than Unicode. Really? Currently, if I send you a snippet of text in some East Asian language, how do you know what characters it represents? The major national encodings for Chinese, Japanese, and Korean completely remap the same double-byte codepoints. You have different characters depending on which encoding it is, so you have to be told the encoding, don't you? Having an encoding tag is just an out-of-band markup, but unlike the case of a language tag, the text is complete gibberish without the encoding tag with things as they currently are.
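As a minimal sketch of that remapping (in Python, chosen only for illustration; the byte pair and the three codecs are my own picks, not something named above), the same two bytes decode to three unrelated characters depending on which national encoding you assume:

```python
# The same two bytes, interpreted under three national double-byte encodings.
# The byte pair 0xB0 0xA1 is an arbitrary illustrative choice.
RAW = b"\xb0\xa1"

# Python's bundled codecs for Chinese (GB2312), Japanese (EUC-JP), Korean (EUC-KR).
ENCODINGS = ["gb2312", "euc_jp", "euc_kr"]

def decode_all(raw):
    """Return {encoding name: decoded text} for the same raw bytes."""
    return {name: raw.decode(name) for name in ENCODINGS}

decoded = decode_all(RAW)
for name, text in decoded.items():
    # Print the Unicode code point each encoding assigns to these bytes.
    print(f"{name:8s} -> U+{ord(text):04X} {text}")
```

Without the encoding name, the receiver cannot tell which of the three characters was meant; with Unicode, each of them has its own code point.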
Also, you say there aren't enough codepoints in double-byte Unicode to hold the needed number of Asian characters. Really? There are no currently implemented computer systems in all of Asia that contain any characters that are not already encoded in today's 2-byte Unicode, and more are being added from obscure old paper sources--characters so rare that nobody has yet used them on any computer. There's still room for many more of these in Plane 0, and surrogates allow for the addition of nearly a million more without going to 4-byte encoding.
There are too many other flaws in your diatribe for me to be willing to go on any longer, especially when I don't know if anyone will even read this comment so late in the game.
You are too uninformed about these issues to be making such a fuss. (For example, you mention a 4-byte Unicode. There is no such thing. You're thinking of ISO 10646's UCS-4, which is not a part of the Unicode standard, but then you only seem to know enough about this to have a lot of strong mistaken opinions, so that's no surprise. I'm not going to get started again....)
Re:Sorry, but you're wildly wrong (Score:1)
"what I said was that what the Unicode standard calls the same character can actually be a character that appears superficially similar (but not necessarily) but has a completely different meaning depending on the language."
Character encoding isn't supposed to resolve "meaning" in the sense you mean. It's supposed to distinguish one character from another. The question was whether those cultures considered it to be the same character in a deeper sense than "what connotations and usages has it taken on in the modern language". It's the same as asking whether all languages consider the ampersand to be an ampersand, even if it has all sorts of different usages and variant preferred glyphs in different cultures. We don't need a different ampersand for each culture.
Most characters' meanings have morphed significantly over the ages, even if you limit yourself to a single language, say Chinese. The Chinese don't consider a character to be two characters just because it used to mean one thing and now it usually means something else. (In cases where they DO consider two characters different, they encoded them separately in their national character sets, and they remain separate in Unicode.) In the same sense the Chinese consider one character to be "the same character", the Chinese, Japanese, and Korean scholars have been able to agree on which characters they share are "the same character", even with variant usages and preferred glyphs.
"This statement alone is enough to condemn Unicode for gross cultural arrogance. "All the languages of the world"? Yeah, right!"
Which ones have they missed? While I admit that it is always possible for me to find an obscure unencoded symbol, or to make up one of my own, and always will be, there are no modern languages whose normal-use characters are not either in Unicode already or are in the process of being standardized. As the quote said, Unicode supports all of the world's languages in the same sense that Latin-1 supports Western European languages.
"Did you actually read what I said? I was using the letter A as a theoretical example."
Sure I did, and I pointed out that your example was wrong. The letter A example you chose was a good example of your point, because it clearly illustrated the fallacy of that point.
"Great. You seem obsessed with on-screen appearance; try printing your text some time (and I'm not talking about using any TrueType crap here. Use a proper PS font.) Not to mention the fact that merely switching the fonts isn't going to work when you have two languages mixed in the same document."
It's you who seem obsessed with "appearance". I'm not talking about appearance at all. I'm talking about character encoding. Character encoding is not a field of DTP. What characters are represented by a bytestream is the point of character encoding. Its only relevance to DTP is that it provides one piece of information that, combined with font info, page layout metrics, and other forms of data, allows a printed representation of the data to be created. Your only valid complaint about Unicode would be that it doesn't do its job of telling you WHICH CHARACTER is meant. All other claims are spurious. Font technology comparisons, for example, are a complete red herring.
Also, if you have two languages mixed together in some sort of display, you either represent them with the same font or with different fonts for the different languages. If you choose to do the latter, you have to have some information other than the text itself to tell you which text is in which language. If you use escape codes to switch from one encoding to another, then assume that a given encoding could only mean one language, then you could use those escape codes as markup to tell you when to switch fonts. That's fine, but it leaves you nothing to complain about when I suggest that you could use other out-of-band markup systems as well, say marking the font changes explicitly, or marking the language changes and letting that imply different fonts the way your encoding change marks currently do.
You seem confused by the difference between characters and the glyphs used to represent them. If I have a magazine page that contains both English and bits of French, I might choose to italicize the French, as is common in serious publishing. That's no reason for me not to use Latin-1 for the whole page, which is also what serious publishers tend to do. The glyphs to be used, in this case normal vs. italics, are chosen on the basis of two pieces of info: the character and the language. The latter can be marked, just as I said, by language tags that your system then uses to automatically pick a font, or by font tags, or by italics tags, or whatever. The info doesn't come from the character encoding.
It's no different if you're using Unicode for a mixture of Asian languages. You have the same issues and solutions as when using Latin-1 for Western languages. Those solutions haven't been widely implemented in software yet, so a lot of people tend to assume that an encoding escape sequence is somehow the same as a language tag. (The fallacy of this would become obvious to you if you worked in Arabic and had to switch among several languages that all use the same Arabic encoding, for the most part (ISO 8859-6, as I recall), but different presentation forms.)
Yes, Unicode requires out of band information to indicate things like language or font. So do all encodings. Some people just use the escape sequence that switches encodings as that out of band info, but that's still out of band info. It's not the text itself.
"Before you go spouting off about how Unicode is sufficient, and how "no currently implemented computer systems in all of Asia" use more characters, take a look at something like the GT Mincho Project, which has already defined 64,000 Japanese characters, and which will be extended to more than 100,000 characters in the near future. (And please don't tell me again about surrogate pairs; the UTC hasn't bothered to do much work on defining the code points yet, making them effectively useless.)"
The JIS character sets have proven quite sufficient for all but the rarest of Japanese applications. All JIS characters are currently encoded in Unicode. As new characters are added by the Japanese national standards org, they are also added.
A project to exhaustively find every character ever written by anybody in Japan, most of which are essentially spelling mistakes, is of extremely limited practical interest to those implementers and users of modern computer systems. Nevertheless, if it turns out that a set of 100,000 characters is of enough value to be worth implementing at all (doubtful, considering the cost/benefit of any fonts that covered them), they will have to be adopted into something larger than the current Japanese JIS, Shift-JIS, or EUC-jp encodings.
If the Japanese do adopt and implement such a system (I'll believe it when I see it), then it will also be added to the surrogate space in Unicode. You say the UTC hasn't done much work to define those codepoints yet. That's because it will be left to projects such as this Japanese one to come up with the character set and encoding support, then they can present it to the Japanese national body who will decide whether they are interested or not. If they are, then they will present it to the UTC, and the code points will be added, almost certainly based on a simple algorithmic conversion of the Japanese national system. It's not the UTC that does this job, "culturally arrogant" as you claim they are.
"Try searching using CJK Unicode. Depending on the characters you use, you'll pick up a whole pile of false hits because of the CJK unification."
If you search a multilingual database without any specification of language, you ALWAYS have this problem. Try searching for "chat" in mixed English and French. Some of the hits will be the English for chat, some the French for cat. There are a gazillion words that are spelled identically and mean different things in different languages. Given that there are almost always multiple languages using the same script system, it's foolish to think that you could ever just mix languages together and search a database that had no language data other than the codepoint of its characters.
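A minimal sketch of the point, in Python with made-up records (the `lang` field, the sample sentences, and the `search` helper are all hypothetical): the same spelling matches in both languages until an out-of-band language tag narrows the search.

```python
# Hypothetical records: each snippet carries an out-of-band language tag.
RECORDS = [
    {"lang": "en", "text": "Join the chat at noon."},
    {"lang": "fr", "text": "Le chat dort."},
    {"lang": "en", "text": "No talking here."},
]

def search(records, word, lang=None):
    """Return texts containing `word`, optionally restricted to one language tag."""
    hits = []
    for rec in records:
        if lang is not None and rec["lang"] != lang:
            continue  # skip records tagged with another language
        if word in rec["text"]:
            hits.append(rec["text"])
    return hits

print(search(RECORDS, "chat"))             # matches English and French text alike
print(search(RECORDS, "chat", lang="fr"))  # the tag removes the false hit
```

Nothing here depends on the character encoding; the false hits come from the missing language data, and the fix is the tag.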
The big database companies are the biggest proponents of Unicode because of their need to address a global market, so they don't seem to agree with you.
"If you really believe surrogate pairs are a good thing, you're an idiot."
Yes, very persuasive argument there. I'm agnostic about surrogates.
Surrogates, for anyone who reads this and doesn't know, are a way of changing Unicode from fixed 2-bytes per character to a variable [usually two bytes, but occasionally, for really obscure characters, 4 bytes]-type encoding. This adds some extra processing, but it's not very complex. It's an analog of what's already being done by most popular Asian encodings, which usually combine single-byte and double-byte characters in the same encoding. There is no horrible complexity problem that isn't being handled just fine by almost every Asian software implementation today, but it is sort of a wart on the purity of double-byte Unicode. It's an escape valve, really, for users who insist on writing in Klingon, for example. Because of the extra processing required to parse variable-width encodings, and the fact that there is so little of any importance to the vast majority of computer systems that ISN'T in the pure 2-byte Unicode, I don't expect to see a lot of implementations that bother with surrogates. Any implementation that decides to use it, though, can add surrogate support in a later version and still maintain complete backward compatibility with pure 2-byte Unicode data. That's really a huge practical advantage.
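For the curious, the surrogate arithmetic can be sketched in a few lines of Python (my choice of language for illustration; the function names are hypothetical). A code point above U+FFFF is split into two 16-bit units, one from the range D800..DBFF and one from DC00..DFFF:

```python
def to_surrogates(code_point):
    """Split a code point above U+FFFF into a (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF use surrogates")
    offset = code_point - 0x10000       # a 20-bit value
    high = 0xD800 + (offset >> 10)      # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
    return high, low

def from_surrogates(high, low):
    """Recombine a surrogate pair into the original code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

def utf16_units(text):
    """Count the 16-bit code units needed to store `text` in UTF-16."""
    return len(text.encode("utf-16-be")) // 2

# 1024 high surrogates times 1024 low surrogates: the extra code points available.
EXTRA_CODE_POINTS = 1024 * 1024

high, low = to_surrogates(0x10000)
print(hex(high), hex(low), EXTRA_CODE_POINTS)
```

Plain 2-byte text never contains values in the surrogate ranges, which is why data written before surrogate support stays valid afterwards.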
In contrast, you propose forcing everyone to switch to 4-bytes per character for every character. There simply is so little to be gained by this, and so much hassle for most users, that it isn't going to happen. It's dead. A font that contained one glyph per codepoint in a 4-byte space would weigh in at, what, nearly a terabyte! Of course, nobody anticipates actually using any sizeable portion of the codepoints, so you'll have an extremely sparse array with all of the complexity of translating codepoints into glyphs moved into font lookup tables and algorithms that simply shift some of the complexity from the encoding to the font subsystem. I could live with that, too, if there were something really important to be gained, but few people see enough advantage in it to be willing to put up with 4 bytes per character.
In practice, the great majority of people are going to just use the user-defined areas of 2-byte Unicode in combination with special fonts to handle those rare characters not in the fixed-width 2-byte plane (BMP). With the addition of language tags, this system will even allow for the listing of Klingon names in your Oracle database without even having to go to surrogates, much less 4-bytes/character. Unless you propose including 5000 different 'A' characters, one for each of the world's romanized languages, the languages will have to be recorded out of band in any fully multilingual searchable database anyway, so that capability will already be there.
My summary would be that going to a double-byte Unicode is a big hassle in a lot of ways, but it adds so much value by its unification of the world's script systems into a single encoding (like using Latin-1 for all Western languages instead of a different encoding for each) that it is inevitable. Surrogates are an escape hatch that probably won't be widely implemented because they add so little for most people.
A pure 4-byte per character system is so much more hassle than a 2-byte system, and adds so little extra value, that it is a hopeless cause. With a pure 4-byte system, you could do more than encode a few rare characters, of course. You could put so many extra goodies into it that it would leave the realm of character encoding and become a richer and more detailed description of written language.
Don't imagine that I see no advantage in that. It's just that I think that modularizing the description of text into characters in a fixed 2-byte encoding, plus other data in out of band markups of various sorts that can be used or not as needed, is the more practical approach and the one that has the best chance of persuading people to move from today's confusion of incompatible encodings to a single, unified encoding for most (but certainly not all) purposes. A 4-byte system is such a big pill to swallow for so little benefit beyond that offered by 2-byte Unicode that it will never be practical as a mainstream technology.
ActiveState & Perl (Score:2)
As long as ActiveState keeps control over their version of the codebase, and stays autonomous, I don't see any problem here. On the other hand, this is Microsoft we're talking about here. With an agreement in hand between ActiveState and Microsoft, I fear ActiveState is not much longer for this world. Microsoft's past behavior with similar agreements is pretty scary.
Re:ActiveState / Perl (Score:2)
My point is that the agreement itself makes little difference, but it shows that a small company (ActiveState) has gotten Microsoft's attention. Historically speaking, such attention generally means a short term benefit for the small company, and a medium-to-long term disaster. If ActiveState goes byebye, it would be IMHO a loss for the Perl community, particularly any Perl for Windows users out there.
Re:Embrace and Extend? (Score:1)
Of course, if ActiveState keeps their version of perl in line with the "main" distribution, and contributes ALL relevant changes back to the community, then this will be an undoubtedly good thing for perl. I should point out that history is against this course of events.
Problem (Score:2)
Where I have a problem is with the proprietary add on components it looks like they will be developing. I'm not familiar with this company, but from the FAQ it looks like much of their business is writing "freedom subtracted" add-ons to Perl. I think this is very unfortunate and do not think companies should be rewarded for leveraging free software to sell proprietary products.
That is _not_ a safe assumption (Score:1)
What MS will be doing if they want to play hardball is this: breaking Unix perl scripts for THEIR OWN PEOPLE. Making little changes so only MS-Perl runs properly, and scripts from the archives or scripts from Unix that shouldn't be causing problems, would cause problems or fail. I don't know exactly how this would be done- it'd be syntactic- but this is definitely what would be happening.
Again: they would break it FOR THEIR OWN PEOPLE, w.r.t. running scripts from the Unix community. This is a dangerous tactic, because if they do it too much (combined with all the lovely license and 'user protection' schemes they're into), the temptation will simply be to abandon ship entirely. However, that is a risk they have to take, because their only real option is to poison their own well: to make MS-Perl begin to act incompatibly and hope they can load so many examples into it that people will still choose that fork. This is in spite of the fact that someone could try to make another Win32 Perl to compete with MS-Perl (at least until legislation is passed that is draconian enough about reverse engineering to make this awkward to impossible; in that case they would change Win32 just a bit and not update the docs, and then the choices for that API become 'fail' or 'break the law').
Expect MS Perl to not run *nix perl scripts forever- however, the example given ('my perl script running on a *nix server') _would_ be safe as the script only has to run on *nix, and the Win32 browser does not have to run it.
Perhaps they will build it into IE and attempt to get people to make MS-Perl scripts that run on the client, not the server. After all, IIS seems not to be the most effective way of producing dynamic content anyhow, and offloading that dreadful processing load could be a real win.
No way (Score:1)
You can't make sure anything runs better on Windows, because you do not own it. You can't count on reverse engineering, because legislation is being forced through to make that a crime. You can't significantly help free software by establishing beachheads of it on MacOS and Windows, because it is a luxury on those platforms, easily outmaneuvered in flash and glitz by proprietary vendors, and it cannot be certain the APIs it relies on are the true APIs- the vendor can change the rules of the game at any time.
Hell, I _program_ GPLed free software on MacOS, and I don't think this is a sane strategy. I program that way because I _use_ MacOS and because I want there to be GPLed software there, not because I have any illusion that this will cause Mac market share to be lost to Linux because I'm putting GPLed stuff there. It is not- the stuff that I write can only _increase_ the market share of MacOS. I'm OK with this, it could do with a bit of increasing and is a nice weird option to have around the industry, plus easy to maintain.
If you want to get market share for Linux then present people with no option- the arrogance of commercial software will do the rest.
Re:Uneducated Slashdotters...Beef Take 1 (Score:1)
If his boss is devising new ways to make it even worse just to keep him from going near _you_ and your oasis...
If his boss is working out contracts removing more and more of his rights and turning him into property rather than an employee...
MAYBE HE BETTER FSCKING QUIT HIS JOB.
Be more ruthless, people. Sympathy is earned, not to be taken for granted. If you see slaves, do you feed 'em and save their owner the expense of feeding their property, give them water saving their owner the expense of bothering to find water for them, do you take over all the responsibilities of the owner without asking for the injustice of their situation to be changed- or do you save your efforts for the ones trying to break free?
(semi-)Great! (Score:1)
This is great news, in a way.
I tried using Perl Win32 about 1.5 years ago to do NT login scripting. It totally sucked. So I'm glad Win32 functionality will be expanded.
OTOH, I'm not sure I like the idea of MS doing the work.
I haven't studied Perl's licensing very closely so I don't know to what extent the GPL applies to modules. And, of course, I have no idea what form MS's extensions will take.
--
"Please remember that how you say something is often more important than what you say." - Rob Malda
Re:Looking at the FAQ... (Score:1)
I am confused by BJH's discussion of CJK characters and Unicode. I am sure s/he is correct, but I just don't understand it.
How does the fact that a Chinese-based character appears differently in a different font differ from the fact that the Roman "A" appears differently in TimesRoman than in Helvetica? Would not the solution to DTP in Chinese-based languages simply be to install multiple Unicode fonts on a system, and choose the correct font based on the context (much like one does in DTP for Roman-based alphabets/fonts)?
Roll your own ;) (Score:1)
My most critical criterion, in a grossly heterogeneous environment (*NIX, OS/390, Tandem -- you name it, we're running it
Unfortunately, I haven't been able to compile many of the modules thusly, such as the libwin stuff.
I've brought this up to perl5-porters and comp.lang.perl.misc; but, I've received only apologies and ``we'll get there eventually''.
Do any of you, dear readers, have truly Open Source Perl deployments running on wintel and *NIX? How do you compile modules?
Interesting admission buried in there... (Score:1)
1) fork() This implementation of fork() will clone the running interpreter and create a new interpreter with its own thread running in the same process. Goal is functional equivalence to fork() on UNIX systems without the performance hit of NT's process creation overhead.
Uhm, aren't they basically admitting publicly that NT's process creation model is dog-slow? It's pretty strange to hear them admit that outright.
And also, isn't that just a horribly bad idea to alter the Perl interpreter so that when a perl script does a 'fork' it ends up making just a thread instead of a process? It sounds like a good idea for performance, but pre-existing scripts that *expect* their forks to create processes are going to run into trouble on this. (One reason for forking in a server is so that if the sub-server dies from a bug the main server is still going - this change blows that idea out of the water.)
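The isolation argument can be sketched in Python (used here only as an illustration; this says nothing about how the proposed Perl fork() emulation actually behaves, and `os._exit` merely stands in for a fatal bug). The outer script plays the main server and survives the death of a separate child process, while inside that child a crashing thread takes the main thread down with it:

```python
import subprocess
import sys

# Source for a child process in which a worker *thread* hits a fatal error.
# os._exit() is a stand-in for a crash; it ends the whole process at once.
THREAD_CRASH = """
import os, threading
def worker():
    os._exit(3)
t = threading.Thread(target=worker)
t.start()
t.join()
print("main thread survived")
"""

def run_isolated(source):
    """Run `source` in a separate process; return (exit code, stdout)."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True, text=True, timeout=30,
    )
    return result.returncode, result.stdout.strip()

code, out = run_isolated(THREAD_CRASH)
# The parent is still running and can report on the dead child.
print("child exit code:", code)
print("child output:", repr(out))
```

The child never gets to print its message, because the thread's failure ended the shared process; the parent keeps going because the failure happened in a different process.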
First MS-HTML ... soon MS-Perl (Score:1)
What about what they have done to "take advantage of platform features on Windows" regarding Java? Another clusterf**k. Luckily Sun had the $$$ to fight it.
Forget the fact that there ARE no advantages to the Windows platform in regards to Perl
...this is also a Micros~1 attempt to beat Apache (Score:1)
the FAQ is to "improve PerlScript performance under IIS" -- I think someone at Micros~1 realized that beating Apache/mod_perl in the perl scripting department would be a big win for them.
I'd have to admit it is a noble cause, but it will take more than that to convince me to run NT web servers instead of UNIX.
Don't panic. Unlike Java, Perl will survive M$ (Score:2)
Micros~1 "embrace and extend" game as Java is because Perl has only around 300 native functions where Java has around 5000 native functions -- meaning there is less under the hood for Micros~1 to mess around with. Also Perl is less GUI dependent (in fact it is GUI independent) than Java which makes it much harder to break than Java. And let's not forget that Perl is Open Source and Java is not (so far).
When Microsoft corrupted Java all they had to do was change the behavior and calling conventions of just a handful of functions and "Voila! It's now incompatible!" -- with Perl that trick is not so easy. If you consider Perl's "API" to be CPAN, well it is almost impossible for Microsoft to mess with CPAN since each module is under the control of each author.
So overall I'm not worried. In fact maybe I would still use Windows if Micros~1 had released a "Visual Perl" package -- especially if they would release a "lite" Visual Perl package for free maybe I would consider using Windows -- no, just kidding -- I won't be using Windows any time soon!
As a perl programmer I sense that this will do us more good than harm. I have faith that Perl is strong enough to withstand any "contributions" from Micros~1.
Tyop? (Score:1)
History tells us more. (Score:2)
As I remember it, some of the folks at ActiveState worked at Hip Communications, the company that did a lot of the original Perl for Win32 work. Guess who provided some of the initial investment to jump-start that port of Perl 5 to Win32? Microsoft.
The last time Microsoft was involved with Perl, the Perl for Win32 port was not in the core Perl distribution. It would have been easy to morph it into WinPerl or VisualPerl or who knows what.
But the truth is, Microsoft didn't hijack it back then. Even if they had intentions to do so now, I think it would be very difficult given that ActiveState's distribution is based on the core Perl source tree. ActiveState and many other people put a lot of time into merging the core Perl and Win32 branches, and I doubt they are going to allow a split to occur!
Perl supports UTF-8, not limited to 16 bit chars (Score:3)
So if Unicode grows beyond 16 bits -- which I'm sure it will -- Perl's UTF-8 support will already be there, ready to support it.
In other words: Don't Panic.
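To see why UTF-8 is not tied to 16 bits, here is a small sketch of the encoding itself (in Python rather than Perl, purely for illustration; `utf8_length` is a hypothetical helper). Code points up to U+FFFF take at most three bytes, and anything above simply takes a fourth:

```python
def utf8_length(code_point):
    """Number of bytes UTF-8 uses for a single code point."""
    return len(chr(code_point).encode("utf-8"))

# Sample code points on either side of the 16-bit boundary.
SAMPLES = [0x41, 0xE9, 0x4E2D, 0xFFFF, 0x10000, 0x10FFFF]

for cp in SAMPLES:
    print(f"U+{cp:04X}: {utf8_length(cp)} byte(s)")

# The first code point beyond 16 bits, as raw UTF-8 bytes.
FIRST_BEYOND_16_BITS = chr(0x10000).encode("utf-8")
print(FIRST_BEYOND_16_BITS.hex())
```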
Re:Good or Bad? (Score:1)
Re:ActiveState & Perl (Score:1)
Re:They have every right to to this (don't they?) (Score:1)
--
Good (Score:3)
Note that Larry Wall himself did a fair amount of work making Java play nice with NT -- O'Reilly paid for that; and nobody complained then.
The only danger I can see here is a glut of Perl scripts that don't run under non-Windows environments -- but it's already perfectly possible to write Perl scripts that call C functions from Windows DLLs (don't ask me how; I skipped past it in comp.lang.perl.misc)
--
ActiveState's Perl is surprisingly good. (Score:1)
Instead, with ActiveState, and the ALREADY existing Win32 extensions, I managed to write a usable program using batch files and Perl in just a few hours.
You can complain about Win32 (I do), but a lot of us have to work on it, and more tools that we know inside out never hurt.
After all, try telling me to use Windows without using the Win32 port of VI.
CJK unification (Score:1)
According to the Unicode web page [unicode.org] and everything I have ever read on Unicode the unification only takes place if the characters have the same meaning. Can you name an exception where that rule hasn't been followed?
depending on the Unicode font used, you might get the Chinese character, you might get the Japanese character, or maybe the Korean character
That's just font management and/or language management. Every decent DTP system needs font management anyway, and if it is going to get hyphenation right it needs language tagging, even if you are only using Latin-1.
The whole point of Unicode was that code points were supposed to be kept separate between languages
What on earth makes you think that was the whole point of Unicode?
One of the big points in Unicode is that it should be possible to convert from any character set encoding to Unicode and then back again. That has caused some compromises, for example the fact that the Greek capital letter Alpha has a different encoding from the Latin capital letter A, although you could argue that they are the same character.
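A small sketch of that round-trip property, in Python for illustration only. ISO 8859-7 is my assumed stand-in for "any character set encoding" (the point above does not name one), and `round_trips` is a hypothetical helper:

```python
# Sketch only: ISO 8859-7 (a legacy Greek character set) is an assumed
# example; the discussion does not name a specific encoding.

GREEK_ALPHA = "\u0391"  # GREEK CAPITAL LETTER ALPHA
LATIN_A = "\u0041"      # LATIN CAPITAL LETTER A

def round_trips(raw: bytes, encoding: str) -> bool:
    """True if decoding to Unicode and re-encoding returns the same bytes."""
    return raw.decode(encoding).encode(encoding) == raw

# A legacy byte string that contains both letters.
legacy = (LATIN_A + GREEK_ALPHA).encode("iso8859_7")

print(GREEK_ALPHA == LATIN_A)            # are the two letters one code point?
print(round_trips(legacy, "iso8859_7"))  # do the legacy bytes survive the trip?
```

Because the legacy set keeps the two letters apart, Unicode has to keep them apart too, or the conversion back could not tell which byte to emit.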
But you can't make that 2-way conversion guarantee for encoding systems that let you switch from character set to character set with escape codes. Among the problems that would make that impossible is the fact that you can use the same escape codes to switch into Unicode, so you would get an infinite recursion.
If you want to convert to and from escape-code switching encoding systems you will have to extract the implicit language and font information and make it explicit in the Unicode version of your data. That is probably a good idea anyway, and is possible in HTML and any other serious text format.
If it's a 'plain text file' then you can't embed the font or the language information, but that's why plain text sucks, and the same problem appears in Latin-1 plain text files.
Hold on; this might not be so bad... (Score:2)
1) M$ will not be doing the work itself. We all know what happens when Microsoft intervenes directly; things get wrecked. At least with some other firm Perl has a chance of remaining intact.
2) The development will be Open-Sourced. Under the same license as Perl itself, no less (which does make sense). In other words, we won't have to scramble so much to keep up with the damage M$ does.
3) M$ doesn't dare try to kill Perl. The Internet is the only thing Microsoft has ever fought against and lost. They tried killing TCP/IP; that didn't work. They're doing their damnedest to wreck the Web, but that isn't working either (the piece of crap known as IE might be popular, but there are not many sites using IE-only features). And they will not be able to stop Perl for the same reason: it is too deeply entrenched already. Java failed because MS attacked early, while it was still weak. Apple's hanging on because M$ was a little too late; MS has weakened it severely (there was a time when Apple II's and Macs had more marketshare than Windoze or DOS, way back in the beginning) but it can't kill Apple off completely. Likewise, Perl has dug itself in too deep for MS to totally uproot.
4) Consider that DOS and Windows have no native scripting systems. Macs have AppleScript, Unix/Linux have Perl and the shells, DOS/Windows have... nothing. Not as a normal part of the operating system, at any rate (I hardly think batch files count). Simply put, Windows needs scripting, and Perl could well fit the bill. MS needs Perl, so it can't harm it (too much) or it hurts itself.
We should have an open mind about this. It's possible that Perl might just get some improvement out of this deal.
Specifics of port (Mostly harmless?) (Score:2)
1) fork()
This implementation of fork() will clone the running interpreter and create a new interpreter with its own thread running in the same process. Goal is functional equivalence to fork() on UNIX systems without the performance hit of NT's process creation overhead.
2) Microsoft Installer Support
3) Globalization
Extend Unicode support to all system calls in the core. This includes file names, environment variables, etc. Note that this functionality will only be available on Windows NT and Windows 2000 systems.
4) PerlScript performance
These look fairly harmless, no? Firstly,
#1 looks like they're probably keeping the
call on windows' being named fork. That's
pretty good, as it isn't a great departure and
would keep XP compat more likely than replacing
fork w/ something else...
#2 probably won't change anything -- if they're
talking about Perl itself (making a fancy installer), shouldn't change a thing. If not,
whatever it is, it probably will be done in
a module.
#3 this is probably the most cause for worry, but
isn't Unicode supposed to be the big thing in
5.006? If so, the "only on windows" statement
here is probably irrelevant, and all we're talking
about is platform parity, which is a good thing.
#4 So long as this happens w/o changing the
language, this isn't going to really hurt
anyone.
We've seen embrace and extend from Microsoft, but
this particular instance doesn't look too
dangerous, as the extend part seems to be little
more than optimizations which won't change the
language...
But... I don't *want* it to crash! (Score:1)
Easy; GPL everything you write. (Score:1)
Simply use the GPL.
Re:BZZZZZT (Score:1)
GPL'd software can also be extended at whim.
Re:BZZZZZT - Didn't think for long enough. (Score:1)
Tell you what, have another go, on me, for free... All you have to do is read the little words one after another to see if you can work out what my point is. Look, I'm even using short words for you now.
Re:!?! That made no sense (Score:1)
with a properly designed perl object this should be a reasonable expectation. but as far as this whole thing goes, i don't see any real danger. if they destroy that version of perl, i'm sure another version will come up. i tried to be a purist and use the win32 compile of perl and all i got were headaches. the activeperl distribution just installed it. i liked that. i still have some problems with some custom modules though. i do wish that i could just do a make on the modules in either platform and have it work. but yeah it would be nice to have it only as a module. i hope that's what they're thinking!
"The lie, Mr. Mulder, is most convincingly hidden between two truths."
Weal license (Score:1)
Let's just STOP comparing Java and Perl, OK? (Score:1)
I'll defer to tchrist's clarifications of the perl side of this statement.
On the Java side, nobody said you had to run Java on a client. In fact, Java's most vital current development is on the server side. Some Perl bigots find this threatening because Java servlets are displacing Perl CGI, but Java and Perl have different strengths and weaknesses -- they are not interchangeable and we should just stop trying to compare them.
What determines "interpreted." (Score:1)
Compiled C/C++ cannot do this. You can generate machine code on the fly, but you can't generate C/C++ code on the fly and run it at run time. It has to be compiled.
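A rough sketch of what I take "this" to mean, namely building source text while the program runs and executing it on the spot, the way Perl's string eval does. It is written in Python purely as a stand-in, and `make_adder` is a made-up name for illustration:

```python
# Sketch: generate source code at run time and execute it immediately,
# with no separate compile-and-link step. Python's exec() stands in here
# for Perl's string eval; make_adder is a hypothetical helper.

def make_adder(n: int):
    """Build the text of a function at run time, then run that text."""
    source = f"def add(x):\n    return x + {n}\n"
    namespace = {}
    exec(source, namespace)      # the generated text becomes live code
    return namespace["add"]

add_five = make_adder(5)
print(add_five(10))
```

A compiled C/C++ program would have to ship and invoke a compiler to do the same thing with C/C++ source.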
Re:What determines "interpreted." (Score:1)
Start with yourself. I know what I mean. Most people know what I mean. I understand what you're getting at, but it's a total waste of time.
The root of the problem is that you can't really have a Science of computers for the same reason you can't have a Science of furniture. For example, everyone knows what a "couch" is, but can you precisely define it? Go ahead and try, but you'll leave some couches out of the definition or, inadvertently, include benches (which are not couches).
Same goes with "interpreted" or "compiled." Suffice to say that compiled code is fed via electrical signals into the processor (whether that processor is on a chip, or a board). If the processor never performs an actual instruction fetch cycle on whatever is considered the program's "executable form" (such as bytecode for Java), then it's interpreted. I know, "ew, gross, hardware, registers, bus cycles! That's... hardware." Yeah, it is and maybe you shouldn't get so high and mighty when entering areas of computer "science" in which you have no expertise, Tom.
Yes, Tom, those of us who have actually designed CPUs know that there may be microcode (or even several levels of microcode) in the processor. That just means the definition isn't perfect, but it's serviceable and everybody knows what it means.
Except you.
I can't believe you have time for this flame. Java digging into your server-side mindshare?
Re:Devils Advocate (Score:1)
Re:What determines "interpreted." (Score:1)
No, it is a compiler the way everyone says it is, you smug bastard.
Everyone (meaning anyone with enough of a clue to be making these kinds of statements) says that javac compiles Java into bytecode for a Java machine. Pretty much nobody has a machine designed to run Java, so the bytecode is interpreted by JVMs which are compiled to run on real machines. Again, nobody seems to have a problem with this, except you.
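The same two-step arrangement can be sketched in a few lines. This uses Python as a stand-in for the javac/JVM pair, only because it exposes both steps as built-ins; the expression and variable names are made up for the example:

```python
# Sketch: "compile" and "interpret" as two separate steps.
# Python's compile() plays the role of javac (source -> bytecode) and
# eval() plays the role of the JVM (a program that runs the bytecode).
import dis

source = "price * quantity"                  # illustrative expression
code = compile(source, "<example>", "eval")  # step 1: source -> bytecode

dis.dis(code)                                # show the bytecode instructions

# Step 2: the bytecode is executed by the interpreter loop, not fetched
# by the CPU as native instructions.
result = eval(code, {"price": 3, "quantity": 4})
print(result)
```

The thing being run in step 2 is the output of a compiler, and it is still being interpreted.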
And, again, the problem is that "compile" does not have a strict definition. Maybe you could formulate one and develop a new career promoting it. Instead of lecturing, teaching and writing on Perl (a subject of interest to many), you could lecture, teach and write on the proper use of the word "compile", which is a subject of interest to pretty much no one.
Personally, because I use Java so much, I tend to use the term "native" instead of "compiled", because "Java compiler" becomes a confusing term when people are trying to discuss Java-to-native versus Java-to-bytecode compilers.
Now, back to your little pet peeve. There is a chip, the picoJava chip, that runs bytecode natively (that's bytecode compiled from Java source). There is no picoPerl chip. You cannot run a Perl program unless you either have a Perl interpreter or you compile the Perl source to a native binary.
Again, here's another career opportunity, Tom. You could invent the Perl Virtual Machine and propose a hypothetical picoPerl chip and then, maybe people would start talking about compiling Perl the way they talk about compiling Java.
Until then, you will just have to suffer from compiler envy.
Gotcha.
Grow up.
tchrist sure can write -- can he read? (Score:1)
Yes, and I'll go further: even if the picoJava chip didn't exist, the fact that you could create one would be sufficient. I don't think a "picoPerl" chip could be designed, much less fabricated.
That means that all I'll have to do to make the Perl thingie-that-wishes-it-were-a-compiler into a true compiler, is, ... wave my magic wand and create a chip that happens to directly use the Perl bytecodes as its native instructions.
Yes, that's what I said. Should I say it again? Would that help?
Meanwhile, get a really good wand, Tom 'cause it will take some powerful magic to implement Perl in silicon. Gosling designed Java to be implementable in an embedded computer. Wall had no such plans with Perl and I don't think it's possible. Of course, you've made it clear to anyone still reading this thread that I'm a numbskull who knows nothing of the true "Computer Sciences", so post your schematic post haste and make me the laughingstock of slashdot.
Because I can already convert the Perl bytecodes into something to feed a C compiler...
If you want to call a Perl-to-native process "compiling," I'll be in agreement. Most of the time, however, Perl is interpreted by a "perl.exe". Again, you seem to be the only person who finds this distinction puzzling.
You have demonstrated... [blah blah blah]
So, when you're losing an argument, you retreat into "my thesaurus can beat up your thesaurus"? Pathetic.
Re:tchrist sure can write -- can he read? (Score:1)
- Converting source code into bytecode is compiling. Period.
Fine. It's compiling. I wrote several times (enough times for you to read it at least once) that "compiling" doesn't have a very precise definition. Nobody is going to argue with you that what you call compiling is compiling. I'm certainly not.
- It matters not what's going to be handling those bytecodes, be it a code generator or a bytecode interpreter.
Again (for what, the fourth time?), I'm not disagreeing with this. What we were debating was whether Perl is "interpreted" or "compiled." In the likely case that your memory is as defective as the rest of your cognitive aperati, the important point of the original poster was that Perl is interpreted. You haven't really refuted that; all you've done is establish that "compile" and "interpret" are not mutually exclusive. Don't pull a muscle patting yourself on the back.
For the last time: if the bytecode is not loaded into the CPU in an instruction fetch, it's being interpreted. Period. Even you, Lord God Price of Perl, use the words "bytecode interpreter."
You can use the word "compile" any way you want. You're not going to change the fact that Perl is interpreted.
You are sooooooo cool; when's your guest spot on "Friends"?
Latin. How novel. (Score:1)
Fluent in a dead language. When Nietzsche wrote about the Superman, did he have you in mind or what?!
- Perl code is always interpreted, just like all programming languages, including Java and C++.
... your firmware is doing that interpretation.
What if there's no firmware? You... uh... did know that not all processors have firmware? Right? You're not trying to argue a point outside your narrow and increasingly inconsequential sphere of expertise? You wouldn't do that, would you?
- Your pop-culture
... references are completely lost on me...
Well, at least they're in good company. My brilliant insights are completely lost on you too.
- I do not do television
... I pursue a long-forgotten diversion called the reading of books.
Fear not, I have not lost the ability to read, although your posts are inspiring me to try. I have also not become so narrowminded and stagnant that I can'n appreciate music, film, television, dance or any number of arts.
You might try it some time.
Get out more.
-trp
P.S. I left you a typo to criticise! I'm try to be kind to the rhetorically handicapped.
Apache -> IIS Migration Path(?) (Score:1)
Am I the only one who smells MS setting up a migration path?
Personally, a better Perl on NT is fantastic. It increases the value of Perl as a programming language... Now I have no reservations about spending the time to learn it. On the other hand, it bugs me that Microsoft seems to be making a good move, and integrating one of my favorite features of Unix into NT. Not necessarily Perl, just a good scripting language.
The less I have to deal with NT the better.
Re:What determines "interpreted." (Score:1)
Re:Not flamebait - I think this is good (Score:1)
(Hardly.)
Let Perl turn the tables on M$! (Score:1)
Before you freak out... (Score:3)
What can Microsoft do to Perl? Ask yourself this: If MS were to mess around with win32 perl, would it break your code? Probably not. Because perl code is run in an environment of *your* choice, there is no place where MS can break your code. The client doesn't have to run a (possibly broken) interpreter - you choose the interpreter. And if someone else uses an "extended" perl, will you still be able to use their site? Of course. They run the MS version on their side, and all is well.
Whether you use perl for system administration or CGI, take a moment to think what microsoft could do to break your environment - probably not much.
--
... already slashdotted! (Score:1)
Everyone hit your reload buttons a few times!!
You have to wonder... (Score:1)
Just my 2 cents...
Re:Active State has always been in bed with MS (Score:1)
Visual Perl...interesting idea... (Score:1)
Re:What everyone's problem is... (Score:1)
The fear is not that it will be _possible_ to write win32 specific code, but that it will be _inevitable_. Meaning code that was designed to be portable won't be. Just as Java code won't run under The MS Program Formerly Known as Java and vice versa. Just as html written for Netscape won't work right under IE, and vice versa.
This is probably a good thing... (Score:1)
ActiveState sent out a message about this on their announcement list this morning. They said that there are four main areas of development with M$:
As for people's fears of embracing/extending, there are already a number of Perl modules specific to the Win32 platform for system services, registry, ODBC, etc. so it's already happened - and it's users who did it, not M$. As other people here have noted, most Perl scripts run in a controlled environment (i.e. your server) rather than an uncontrolled environment (i.e. someone else's web browser), so the comparison to Java is mostly irrelevant.
________________________
Re:!?! That made no sense (Score:1)
Not so fast! A line of text on UNIX ends with an LF, while in the M$ world it is CR + LF. When you use your perl chop() function, will it work on M$ text? Also, I have noticed that unlike UNIX programs, windows programs have output that is difficult to parse, assuming that there is a text version of the utility to begin with. Something like expect might be handy.
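Here is a quick sketch of the worry, in Python for illustration. It assumes the text was read raw, with no translation of line terminators by the I/O layer; `chop` mimics what Perl's chop() does (remove the last character), while `strip_eol` is a hypothetical helper of my own, not a Perl built-in:

```python
# Sketch: removing a line terminator one character at a time breaks on
# two-character terminators. Assumes raw, untranslated input.

def chop(line: str) -> str:
    """Drop the last character, whatever it is (like Perl's chop())."""
    return line[:-1]

def strip_eol(line: str) -> str:
    """Hypothetical helper: drop one trailing CR+LF or LF, if present."""
    if line.endswith("\r\n"):
        return line[:-2]
    if line.endswith("\n"):
        return line[:-1]
    return line

unix_line = "hello\n"
dos_line = "hello\r\n"

print(repr(chop(unix_line)))      # terminator removed cleanly
print(repr(chop(dos_line)))       # a stray "\r" is left behind
print(repr(strip_eol(dos_line)))  # both terminator styles handled
```

Whether a given script actually hits this depends on how the file was opened, so treat it as a description of the failure mode rather than a claim about any particular Perl build.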
Re:!?! That made no sense (Score:1)
I imagine that $/ defaults to CRLF on Windoze.
RTFM.
Re:No suprises (Score:2)
They will make their own version of Perl, and try to convince people that the other versions of Perl are substandard.
They could pull something like they did with their "Frontpage" strategy. If the Web server did not have "Frontpage Extensions" (i.e., did not run MS IIS), a "helpful" alert box would appear and inform the user that their ISP 'sucks' and they should choose a different (MS-friendly, of course) ISP. Of course they eventually allowed 'Frontpage' extensions to be incorporated into other web servers, once they were found out.
No (Score:1)
:-)
Re:let's not be hypocrites... (Score:1)
Key words in press-release (Score:1)
This makes me wonder to whom it will be significant: Microsoft or the OSS community? For the latter I think 100% open source is the only significant amount, otherwise someone will yet again have control over features, dooming portability. This reminds me of a cool language called Java.
AtW,
http://www.investigatio.com [investigatio.com]
The extensions will be mostly open source (Score:1)
You don't need to mount filesystems manually (Score:1)
automatically mounted. See my homepage for an autofs tutorial if you're
interested.
Re:Embrace and Extend? (Score:1)
How is this any different than Xlib or Tk extensions for *NIX? Or native GUI support under BeOS, DECWindows or GEM? Not all Perl scripts are meant to be portable.
MS-Perl? I don't think so... (Score:1)
Now, you may ask, why would I think this? Well, let's see... Java is a language whose purpose is to be able to write full-blown, platform independent applications. Why does this scare Microsoft? Well, it has the potential of making the OS a "commodity", since an app written in Java will run on any platform. So, MS decided to try and corrupt Java to ruin this "write once, run everywhere" philosophy.
However, Perl does not have the same purpose. Perl is designed to be almost a swiss army knife of programming languages... it allows the user to perform a myriad of different tasks and, as Larry Wall stated, acts as an excellent "glue" language, joining various tools and components together. In this way, it doesn't threaten the Microsoft hegemony, since it doesn't commoditize the OS in the same way that Java potentially can. As well, there is no strong corporation controlling the Perl language, so there's less of a corporate threat.
So relax, people. From what I can tell, the point of this partnership is to make Perl work better on Windows. And the only thing this can do is benefit the community, since more people will become aware of Perl and start using it. And isn't this what we all want?
A beginning for better things? (Score:2)
The idea of PERL from the beginning, has been to make the job you have in mind easier to achieve.
And faster. It's a damn good tool, and I love it dearly.
Due to the market, I have to use Windows products, as well as my preferred Linux, because, well.. That's what's out there.
One of the things that's always made me sit back and sigh about using Active State PERL is the lack of a fork(). And now, with a little financial aid from MS, it's getting put in.
There are many other functions in PERL that rely on *NIX platforms, and you can find these by reading the PERL docs on functions unsupported by Win32.
I would rate some of these among *NIX platform specific options.
But is that a problem? Not really. If I want to know how to do something specific on a particular platform, I RTFM. And there it is.
I currently use PERL on both *NIX and Win32, and am heartily glad I _can_ use PERL on Windows, as it cuts the hassle of having to learn VB (Use PERL/TK for most GUI functions), works across platform with few mods, and gives Windows a useable script language without having to worry too much about paying huge licence fees to a company selling a new and totally incompatible other language.
By introducing people to PERL, you're introducing them also to the ethic of Open Source (to a good degree), and the wider world of the Perl Mongers, and in general a very nice bunch of people who are altruistic in nature...
Maybe this will push home to even more companies that Open Source works for everyone, be it MS users, Linux, BSD or whatever...
And having a better port only lets me do my job even better, which is no bad thing.
PERL is a well engineered, well thought out language that's stood the test of time..
I say "hurrah!" that windows users now can slowly catch up on where the rest of us have been for years.
Just my tuppence worth,
Malk
Oh Please! (Score:1)
Come on! Any extensions that Microsoft add that are windows-specific are likely to be just that: specific to windows. You *wouldn't want* to do them on Unix - things like playing with the registry etc..
If anything, any windows-specific extensions will probably result in *more* portable perl code being written, as the 'write portable code' issue becomes more visible. Windows perl users are used to the idea that you have to give a little thought to portability, because of all the Unix-specific perl out there that they've had to alter.
Re:What everyone's problem is... (Score:1)
The reason they went after Java was because of the ability to run precompiled binaries on any Java compliant platform, which would undermine people's reliance on their operating system. However, perl is just another language that people find useful - it's not got the 'write once, run anywhere' features (i.e. abstraction of most of the operating system). Compare it to C - you can write portable code, but there's quite a few little incompatibilities to do with endianness, standard libraries etc - perl is and probably will be exactly the same. A largely portable core language with platform-specific modules.
Re:Don't panic. Unlike Java, Perl will survive M$ (Score:1)
Re:compitition (Score:1)
I'm afraid you've got that backwards. Unix started, more or less, in 1970, and much of its early development was done on DEC PDP machines. VMS was first released in 1978 and developed in conjunction with VAX microprocessor, which was one or two generations past the processor in the PDP systems, depending on whether you're looking at the PDP-11 or PDP-8 machines.
Both systems have seen development since, of course, though VMS in general still holds a mild technical lead over Unix in some areas. Cluster support on VMS, for example, makes the Unix and NT offerings look sad and pitiful. (If you think the FUD microsoft spews at linux is bad, try wading through the cluster MarketSpeak Micro$oft emits after comparing VMS clusters with NT 'clusters')
And yes, everyone thinks VMS is dead and sucks technically. That's only because Digital's Marketing department was filled with AntiMarketroids... DEC had a marketing department every bit as good as Microsoft's. Alas, they were using that skill to drive customers away, rather than keep them.
Re:Good or Bad? (Score:1)
FOR IMMEDIATE RELEASE
FAQ - ActiveState and Microsoft
Vancouver, Canada - Update June 2nd, 1999
We want to make sure that you, as members of the Perl community, are informed on our latest efforts to
advance Perl technology for Windows. We are quite excited about this development, and see it as a truly
winning proposition for Perl.
Below we have attempted to answer the questions that many of you will have about this development. If you
have further questions or concerns, please send them to Press@ActiveState.com
Q: What is the scope of the work that is being done?
A: ActiveState proposed many potential areas of work to Microsoft, based on feedback we have had from
Perl users over the years. Microsoft accepted the items of work listed below as being important enough for
them to support with funding. As a result, there are four main areas of development, all of which target the
Windows platform.
The interfaces and implementation of all parts of the work that have a chance of being generally useful will be
discussed amidst the Perl development community (perl5-porters@perl.org, archived at www.deja.com) for
inclusion in Perl.
fork()
This implementation of fork() will clone the running interpreter and create a new interpreter with its
own thread, but running in the same process space. The goal is to achieve functional equivalence to
fork() on UNIX systems without suffering the performance hit of the process creation overhead on
Win32 platforms.
Emulating fork() within a single process needs the ability to run multiple interpreters concurrently in
separate threads. Perl version 5.005 has experimental support for this in the form of the
PERL_OBJECT build option, but it has some shortcomings. PERL_OBJECT needs a C++ compiler,
and currently only works on Windows. ActiveState will be working to provide support for revamped
support for the PERL_OBJECT functionality that will run on every platform that Perl will build on,
and will no longer require C++ to work. This means that other operating systems that lack fork() but
have support for threads (such as VMS and MacOS) will benefit from this aspect of the work.
Microsoft Installer Support
Microsoft is moving towards providing improved package management facilities in Windows 2000.
This aspect of the work will make the ActivePerl installer compatible with the new MSI DB, which is
an important requirement for easy management of the Perl installation process on Windows 2000
systems.
Globalization
The Unicode support that Larry Wall created for Perl extends to Perl operations but not to system
calls. Windows NT supports Unicode at the system-call level, and it would be natural to provide a
way to enable the Unicode variants of the system calls. This allows users to create files that have
names comprised of Unicode characters, for example.
This aspect of the work covers extension of the existing Unicode support to all Win32 system calls
in the Perl core, for such things as file names, environment variables, command-line arguments, etc.
This functionality will only be available on Windows NT and Windows 2000 systems (not on
Windows 98 or similar).
The implementation for this is Windows-specific, but the interface to enable it from Perl will be
general and portable to all platforms that support Unicode. This interface will be decided based on
discussion with the development community. The implementation will be built over Perl's existing
abstraction for system calls, which means other platforms that need to support Unicode system
calls can follow the same model if they wish to do so.
It must be noted that support for Unicode will have no effect on the default behavior of Perl. It will
continue to be enabled only when explicitly requested by the script with a pragma. Other existing
internationalization features like locales will continue to work as they have done before.
PerlScript Performance Enhancements
Caching and cloning of compiled scripts in memory will significantly boost the performance of
PerlScript running under IIS/ASP.
The implementation of this aspect will utilize the facilities for creation of multiple interpreters, but will
be otherwise independent of the Perl core.
Q: Is this going to be a custom version of Perl for Microsoft?
A: In a word, no. We will always use the mainstream version of Perl as our core technology. All potential
work we undertake to do on the mainstream Perl source code will be achieved through open development
with the community.
Q: Why is Microsoft doing this?
A: Microsoft knows first hand that Perl is an important tool for their customers, since they are a heavy user
of Perl internally. They want Perl to work well on the Windows platforms and to take advantage of platform
features on Windows.
Some people have expressed fears about a potential "embrace and extend" manoeuvre by Microsoft. We
would like to reassure the Perl community that we see no danger of this ever happening with Perl. Perl's
development model is based entirely on open discussion of changes, and is one of the most important
reasons for the dynamic evolution that Perl has enjoyed over the years. To change this would be
counter-productive to any commercial entity that may have a stake in Perl's success.
It seems more like Microsoft is FUNDING them to write Win32-related Perl code for the official Perl release... I don't think we have to worry about anything...
/olle
Re:`Server-side' and `Scripting' as pejoratives (Score:1)
Good or Bad? (Score:5)
Perl and Java (Score:1)
Make it incompatible with other platforms.
Times change, Microsoft doesn't
Re: O, were it so simple... (Score:1)
Aren't "non-clued suits" already using it? Isn't the fact that Perl is so widely used and accessible one of the reasons Microsoft is doing this?
WinPerl will not be totally compatible with our current incarnations of Perl, and that will be the problem.
Correct me if I'm wrong, but isn't one of the licenses Perl is distributed under the GPL?
If so, then aren't any of the modifications that ActiveState/Microsoft makes to the Perl interpreter itself also covered under the GPL (which, I believe, they state in their FAQ)? That is, we can pick it apart, correct "intentionally broken" functions and the like?
I don't know how or if the GPL covers modules, so it could be possible that ActiveState/Microsoft could write a proprietary Win32 module and restrict our ability to dissect it, but as someone else said, don't write scripts using that module and you should be okay.
I mean, gimme a break. If you write cross-platform code, then don't use any modules that give advantages to one OS over the other -- I could be wrong, but I think MacPerl has some of those (especially for mac Toolbox functions). If you want to write platform-specific code (optimized for performance on a given OS/application combination) then use their stuff and be bound by whatever restrictions they come up with.
Isn't that one of the meanings of TMTOWTDI?
Jay (=
What the heck.... please clarify... (Score:1)
Wouldn't Microsoft's proposed implementation of Perl extensions damage this methodology?
Ever hear of boycott? (Score:1)
there's already lots of Win32 bindings (Score:1)
Re:Looking at the FAQ... (Score:1)
Yeah, that's pretty much it. The whole point of Unicode was that it's not supposed to depend on the font, but here we are with the same problems as Big5, JIS, SJIS, EUC, etc., had. In other words, there's not a lot of point in shifting to Unicode for people using these character codes.
Re:Looking at the FAQ... (Score:2)
Sorry I wasn't a bit clearer...
The idea is, you have two or three characters that look very similar. One is Chinese, one is Japanese, one may be Korean. They look similar, but usage differs (ie, they have different meanings). Because of the ridiculous Unicode proposal, they are all unified into the same code point (ie have the same character code, which means there is no way to distinguish them through Unicode alone).
Now, the problem is, depending on the Unicode font used, you might get the Chinese character, you might get the Japanese character, or maybe the Korean character - but there is no way to be sure beforehand which one it is going to be without breaking the "universal" concept of Unicode.
The whole point of Unicode was that code points were supposed to be kept separate between languages - so an A in English and an A in French (as well as German, Dutch, etc. etc. etc.) are all supposed to have different code points, because usage differs between these languages. (This is just an example; I'm not positive that all these languages have a different code point for A). This concept unfortunately breaks down for CJK fonts.
I hope that was a little easier to understand...
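To make the unification point concrete, here is a minimal sketch in Python (chosen purely for illustration; nothing in this thread uses it). It assumes U+76F4 as the example character, one that is often cited as being drawn differently in Chinese and Japanese typography. The helper names `describe` and `tag_run` are hypothetical. The sketch only shows that the encoded text is identical in both cases, so any language distinction has to travel outside the character codes.

```python
import unicodedata

# U+76F4 is assumed here as an example of a unified ideograph; it is often
# cited as having different preferred glyph shapes in Chinese and Japanese.
UNIFIED = "\u76f4"

def describe(ch):
    """Return the code point and the Unicode character name for one character."""
    return "U+%04X %s" % (ord(ch), unicodedata.name(ch))

def tag_run(lang, text):
    """Hypothetical helper: carry the language out-of-band, next to the text."""
    return {"lang": lang, "text": text}

japanese = tag_run("ja", UNIFIED)
chinese = tag_run("zh", UNIFIED)

print(describe(UNIFIED))
# The encoded text is identical; only the tag could tell a renderer
# which glyph shape to prefer.
print(japanese["text"] == chinese["text"])
```

Which glyph actually appears still depends on the font the renderer picks, which is exactly the complaint above.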
Looking at the FAQ... (Score:4)
One other thing - the press release says that Unicode is greatly desired by Asian customers. This is nothing but marketing bullshit. Anyone in Asia who works with text processing knows that the current implementations of Unicode (including 2.0) are almost pitifully inadequate. And before anybody leaps forward to defend Unicode, please study up on the CJK problem before doing so. Unicode causes so many problems in its present state that it may simply be easier to continue using present "standards", at least in Asian countries. I'm quite simply disgusted with the way countries that don't use an ASCII superset have been treated with regard to Unicode.
(The CJK problem I'm referring to above, for those who can't be bothered looking it up for themselves, is that the present implementations of Unicode allow pretty much only for 16-bit characters, which is nowhere near enough to contain the number of characters required for Chinese-based fonts - ie, China, Japan and Korea (CJK). The idiots in charge of Unicode then decided that "similar" characters would have to use the same code point, thereby defeating the whole point of Unicode - that is, for CJK characters, the appearance of a character can vary depending on the font used, even though Unicode is supposed to define separate characters based on differences in usage between languages. To put it simply, a Unicode font is useless for DTP or other areas where the appearance of a particular character must be clearly defined. I know, there is a 31-bit version of Unicode, but nobody has made any serious attempt at defining code points outside of the 16-bit code space.)
With this development, it would seem that Microsoft is going to ride roughshod over Asian markets by saying that Unicode is the complete solution to all our problems. Well, I say, stuff that where the sun don't shine, Billy boy.
BTW, my sig has been the same since I first registered at
Embrace and Extend? (Score:2)
good! (Score:2)
I doubt they'll keep the source proprietary--that would seem to be self defeating in this case.
there is no general purpose programming language (Score:2)
I don't think there is any such thing as a "general purpose programming language".
All programming languages involve tradeoffs. Some are easy to implement, some have lots of features, some can be compiled into very high performance code, some are backwards compatible with others, some are easy to understand for a particular community, some catch a lot of errors at compile time, some catch a lot of errors at runtime, some use lots of resources, some are well-suited to development projects with lots of developers, some are well suited to development projects with only a single developer, etc.
Somewhere in that space, Perl, Python, Java, Tcl, Eiffel, Fortran, Lisp, Scheme, C, C++, and all those other languages each have their place. If you pick the best language for each job, it is my experience that there is actually very little overlap between those different languages. Each of them has its community, and few of them are in danger of going away because some other language supposedly superseded them.
Re:let's not be hypocrites... (Score:2)
Sure, Microsoft is blah blah blah [insert favorite pejoratives here]. As a matter of fact, they are not the devil incarnate and some of their stuff does work pretty well. You can thank them for the fact that the Win32::ODBC module is the backbone of many a fine Web site (or didn't you notice that ODBC, a Microsoft project, is a protocol that despite some faults Doesn't Suck).
I'm also appalled that so many of the commenters here didn't even bother to check basic facts about Perl and its history in the DOS/Win realm. I've been using it since I was running everything under DOS 5.0 in 1992 (still have the old
I've also used the various ActiveState Win32 Perl releases for years with excellent results. For example, I use it to munch 2-gigabyte raw dumps from databases and produce crosstabs and statistics two orders of magnitude faster than the databases can do so themselves. One of my main reasons in moving to Linux, actually, was to double that performance, but it's certainly not bad under NT.
I'm glad tchrist showed up to correct some of the more egregious misunderstandings about Perl. The evolution of Perl between 1994 and 1997 from basically a fairly limited domain of text processing to a platform for all kinds of useful things, through the addition of object oriented structures, the revision of the library approach into the full-blown module environment we now have, and so forth, ought to be a classic study in how computing tools can evolve in a positive sense rather than a bloatware sense.
If Microsoft wants to hook into that in a bigger way, that's a very good thing. Perl makes my life in the NT context bearable. However, I don't want to see it fork from the standard distribution in any significant way. Larry understates the importance of reunifying the code bases between the standard development tree and the ActiveState version with the One Perl initiative. And he also plays down the fact that it was not a guaranteed win. But nonetheless it did work out.
So now consider the real story: Microsoft has conceded where the locus of authority is for Perl. They are promising in public to play nice. Their tendency not to do so is well understood and their activities will be thoroughly scrutinized.
In other words, this is precisely the opposite of what happened with Java. And it is no surprise, given that Perl is open source and Java is not.
And that is the most important lesson to be drawn from this.
I don't mind Microsoft throwing in some things to optimize local processing, as long as they are themselves open source and don't fork my code. Because the bottom line is that if I can't run the basic stuff across platforms, their optimizations will disappear from my code first. That is market discipline Microsoft has never had to face, and they may have to grow up (maybe just a bit, but grow up nevertheless) as a result.
Perl is a remarkable accomplishment. Learning more about its history, philosophy and development is a rewarding experience. And, golly guess what, it's all out there readily available to be discovered.
-------
maybe he funds Perl, but
Re:No suprises (Score:2)
Could be good, could be bad, could be both (Score:2)
IF the MS team working on this project "respects" the concept that (most) Perl programs should be able to (pretty much) run on either Win32 or *nix...IF they don't try to make key parts of the core Win32 Perl package MS property...IF all they are doing is adding functionality that exists in *nix versions of Perl that currently does not in Win32 versions....and IF, as it appears, they are trying to help make it compatible with this Win2000 thing....
Then hopefully this is a good thing.
The problem will be if they try to subvert it into "WinPerl" by adding tons of amazing new functionality and properties that are proprietary to MS and closed. I haven't got a problem with adding functionality to Perl for Win32 that takes advantage of OS features. As long as it stays open. As it is, the *nix version of Perl has this edge with all of its *nix functionality built in.
The simple fact is there are differences between the two OSes that limit usage of a lot of Perl programs between platforms, and if this fixes it, great. Maybe we'll see more *programs* being written in Perl, Tcl/Tk, etc. etc. for Windows.
let's not be hypocrites... (Score:4)
Re: Unicode support (Score:2)
Perl wasn't designed with Unicode in mind, but as Mr. Wall is fond of saying, Perl is really great at text processing... Unicode is a natural addition.
I don't think this is a big deal. Is it possible for Microsoft to take the Perl idea and pervert it like the wondrous things they've done with Java? I highly doubt it. They are just adding Windows functionality and access points for those who need them.
I won't.
Re:time to think quickly... (Score:2)
Re:`Server-side' and `Scripting' as pejoratives (Score:2)
What about the Larry Wall interview? (Score:5)
I think the quote was:
"I'm not directly affiliated with ActiveState, but I've worked with them, and I think the problems they've solved far outweigh any problems they've created. You've got to understand their market has always been the Windows space, where you're actually doing people a favor by charging them money for things, because that's the only way to keep from confusing them. Linux users are smarter than this, of course, but some Linux users aren't quite smart enough to realize Windows is a different culture, and Perl, being a postmodern language that is sensitive to context, will look different in a different culture."
Don't get me wrong... I'm not a big Microsoft fan. But it seems to me that if even more people use Perl, that will be good for the community in general. And it wouldn't violate any of the "sacred principles" of Perl/Linux advocates in the process....
Re: O, were it so simple... (Score:3)
You know, if stopping M$ was as simple as me not using (or paying for *cough*) their products, then guess what? They would not be where they are today. IE would not be so popular, there would be more *nix games, et cetera, et cetera..
The problem is not me (or dare I say us) using it, the problem is the rest of the world, especially the non-clued suits using it and bringing it to the fore. WinPerl will not be totally compatible with our current incarnations of Perl, and that will be the problem.
...dave
Re:don't use it then (Score:2)
Where did you get this idea? All the FAQ and press release say is that there will be access to Win32-specific features, and there have always been such extensions in ActiveState Perl. You can write portable perl and it will work in ActiveState just like on any other platform, or you can write Win32-specific code and it will work only on Win32. You can do the same thing on UNIX; don't tell me you've never seen a Perl script that would only work on UNIX, 'cause there are plenty of them. It's as easy as `/usr/bin/whatever`.
I've written a lot of ActiveState Perl, some of it portable, some Win32-specific. It's impossible to use the type of features they are talking about accidentally. You always have to use some Win32:: module. What could be more explicit than that?
LJS
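The point about explicitness can be sketched outside Perl as well. The following is a minimal illustration in Python, not ActiveState's mechanism: it assumes `winreg`, a standard-library module that ships only with Windows builds of Python, as a rough analogue of a Win32:: module. The function names `load_optional` and `registry_available` are hypothetical.

```python
import importlib
import sys

def load_optional(module_name):
    """Return the named module if it can be imported here, otherwise None."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None

# winreg is only present on Windows builds of Python, much as a Win32::
# module is only usable on Win32. Asking for it by name is the explicit step.
winreg = load_optional("winreg")

def registry_available():
    """True only where the platform-specific module actually loaded."""
    return winreg is not None

if registry_available():
    print("platform-specific path on", sys.platform)
else:
    print("portable path on", sys.platform)
```

Code that never asks for the platform-specific module stays on the portable path, so the dependency cannot creep in by accident.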
Re:`Server-side' and `Scripting' as pejoratives (Score:2)
Nostalgically relevant, consider pi, pix, px, and pc on old BSD systems for a Pascal environment. I can present you with Pascal source, and ask what it is. You cannot tell me. It could be run under a pure interpreter, a bytecode compiler-and-interpreter, or further translated into some machine's assembly language. All are still interpreted.
Oh, they claim, they claim. But see the old article [perl.com] by Heinz Lycklama. Everyone else who did POSIX compliance did so to create something genuinely useful for their customers. Why use persnickety technicians when you get to hoodwink the courts to ordain you POSIX in a bait-and-switch game? I leave it to the readership to judge Microsoft on their own.--tom
Re:Uneducated Slashdotters...Beef Take 1 (Score:2)
The advent of freely available, plug-and-play substitutes for our standard Unix tools has been a great relief to victims of Big Bad Bill. These include vi (vim, actually, but that's fine), Perl, the entire UWIN environment, the Cygwin suite, and of course, Perl Power Tools [perl.com], which also work on non-{Microsoft,Unix} systems.
Some folks feel that we shouldn't port our tools to the evil one lest it thereby become easier to exist under his yoke, and thus perhaps decrease the chance of someone upgrading to Unix. But hold on there. Imagine that you're living in a fertile oasis of plenty, and on a dayhike, come across your friend toiling for his business in the searing desert that surrounds you. He is parched and crying out for aid. You invite him back to your oasis to refresh him, but he says that his job is there in the infernal sands. What do you do? Do you just walk away and abandon him? Of course not! You give him a sip or seven of that cool water you've brought with you from your safe haven.
So too should it be with our tools.
--tom
++TMTOWTDI (Score:3)
I'm curious how it will avoid the insanely inefficient process start-up penalties one incurs under Microsoft, as well as how the non-shared data pages will be fast without being copy-on-write. The multiple-interpreters-in-one-process work will also help everyone.
I'd say to reserve one's fears until there's something to fear. ActiveState can't after all be all bad: they even list the Perl Power Tools [perl.com] on their pages to help people sentenced to tool-deprived systems. :-)
--tom
`Server-side' and `Scripting' as pejoratives (Score:5)
First, to relegate Perl to nothing more than CGI is a tremendous disservice. Perl is a general-purpose programming language whose process, file, and text manipulation facilities have made it the programming language of choice for tasks involving quick prototyping, system utilities, software tools, system management tasks, database access, graphical programming, and world wide web programming--just to name a few. Systems administrators, network administrators, and web administrators on all platforms especially love it because of its potential to automate virtually everything they need to do.
The second misconception to disabuse oneself of is this whole `scripting' notion as being somehow different from `programming'. It's not. They're quite the same thing, at least as used in the vernacular now that JCL scripts and uucp chat scripts are largely (and thankfully) gone. And before you mumble something about `interpreted', you should think about how, contrary to popular misconception, not merely Perl but all programming languages are interpreted. The only question is, at what level?
In the normal case, the Perl compiler compiles source code into parsetrees of Perl Pseudocode (PP), and hands those off to the PP interpreter to execute (one could say `interpret') these trees.
In other cases, the Perl compiler compiles source code into parsetrees of Perl Pseudocode (PP), and hands those off to a code generator, which generates bytecodes. These bytecodes are then later loaded by a special module that converts them back into PP trees, which are then handed off to the PP interpreter to execute (one could say `interpret').
In still other cases, the Perl compiler compiles source code into parsetrees of Perl Pseudocode (PP), and hands those off to a code generator, which generates C source code, which is handed off to the C compiler, which generates assembly source code, which is handed off to the assembler to produce object code, which is handed off to the linker to create a linked binary image, which is then handed off to the kernel to execute (one could say `interpret') at some later date, whose instructions are then often handed off to the firmware to execute (one could say `interpret').
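The "same source, different level of interpretation" idea can be sketched in a few lines. This is an illustration in Python rather than Perl, and it says nothing about Perl's actual internals: the same arithmetic source is run once by walking its parse tree and once by compiling it to bytecode for the Python virtual machine. The function names are hypothetical, and only four operators are handled.

```python
import ast
import operator

# Only a handful of arithmetic operators are supported in this sketch.
_BINOPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Pow: operator.pow,
}

def walk(node):
    """Evaluate a parsed arithmetic expression by walking its tree."""
    if isinstance(node, ast.Expression):
        return walk(node.body)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINOPS:
        return _BINOPS[type(node.op)](walk(node.left), walk(node.right))
    raise ValueError("unsupported syntax in this sketch")

def run_by_tree_walk(source):
    """Path 1: parse to a tree, then interpret the tree directly."""
    return walk(ast.parse(source, mode="eval"))

def run_by_bytecode(source):
    """Path 2: compile to bytecode, then let the virtual machine interpret it."""
    code = compile(source, "<sketch>", "eval")
    return eval(code)

print(run_by_tree_walk("2 ** 31 - 1"), run_by_bytecode("2 ** 31 - 1"))
```

Both paths start from identical source text and produce the same value; what differs is only which layer does the interpreting.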
In all cases, the Perl compiler runs an optimization pass, just like any other compiler. For example, the expression $x = 2 ** 31 - 1 would be computed at compile-time, since it's a constant expression. But the Perl compiler is rather more clever than just that, sometimes inlining certain subroutines, ignoring unreachable code, doing loop hoisting, etc.
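As a rough sketch of what computing a constant expression at compile time means, here is a toy constant folder. It is written in Python over Python's own syntax trees, so it is an analogy and not Perl's optimizer; the names `ConstantFolder` and `fold` are hypothetical, and only integer operands with four operators are folded.

```python
import ast
import operator

# Operators this sketch is willing to fold.
_FOLDABLE = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Pow: operator.pow,
}

class ConstantFolder(ast.NodeTransformer):
    """Replace integer-only binary operations with their computed value."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first (bottom-up)
        fn = _FOLDABLE.get(type(node.op))
        left, right = node.left, node.right
        if (fn is not None
                and isinstance(left, ast.Constant)
                and isinstance(right, ast.Constant)
                and isinstance(left.value, int)
                and isinstance(right.value, int)):
            folded = ast.Constant(value=fn(left.value, right.value), kind=None)
            return ast.copy_location(folded, node)
        return node

def fold(source):
    """Parse one expression and return its tree after constant folding."""
    tree = ast.parse(source, mode="eval")
    return ConstantFolder().visit(tree).body

whole = fold("2 ** 31 - 1")   # collapses to a single constant node
partial = fold("x + 2 * 3")   # only the constant subtree is folded
print(whole.value, partial.right.value)
```

The expression made entirely of constants collapses to one node before anything runs, while the expression involving a variable keeps its addition and only has the constant part precomputed.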
I hope that's all clear now. :-)
That part is certainly true! And I hope that part is, too.--tom