Perl Programming

Perl 5.6.1 Released, My Precioussss...

Pudge tells me that perl 5.6.1 is released. Tell your boss you won't get any work done today, you have to, er, upgrade your personal knowledgebase of evolving regularly expressional technology. Then test every one of the bugfixes, like "a\nxb\n" =~ /(?!\A)x/m. Pick your favorite new feature or bugfix from the announcement and tell us about it.
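
For anyone who wants to check that particular fix by hand, a one-liner along these lines should do it. On a patched perl the match should succeed, since the only x in the string is not at the absolute start of the string -- that is my reading of the \A semantics, not text from the announcement:

    perl -le 'print(("a\nxb\n" =~ /(?!\A)x/m) ? "match" : "no match")'
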
  • by Anonymous Coward
    Wrong..... ASP is an architecture for dynamically creating pages on the fly to be served by a web server. VBScript is an HTML-embedded language for server-side processing. It is the "little brother" of full-blown VB, which is a programming language. I personally prefer JavaScript for ASP myself. And despite what people keep saying, you do not really use perl for ASP, you use PerlScript, which installs with the ActiveState version of perl (installing standard perl and trying to do ASP in perl will only get you frustrated, not results)
  • by Anonymous Coward
    An exception to Java's "There's Only One Way To Do It" philosophy tends to be the Vector class, where it's more like "There Better Be Another Way To Do It!"
  • by Anonymous Coward on Monday April 09, 2001 @03:44AM (#305574)
    This reminds me of how Microsoft decided to take out the line "It makes a grown man cry" from the song Start me Up during their Windows 95 campaign. Can't imagine why.
  • We have Stephen Cole Kleene [brighton.ac.uk] to thank for both recursion theory and regular expressions. According to a blurb in Friedl's Mastering Regular Expressions [oreilly.com] (pg. 60):

    "The seeds of regular expressions were plated in the early 1940s by two neurophysiologists, Warren McCulloch and Walter Pitts, who developed models of how they believed the nervous system worked at the neuron level.

    "Regular expressions became a reality several years later when mathematician Stephen Kleene formally described these models in an algebra he called regular sets. He devised a simple notation to express these regular sets, and called them regular expressions.

    I'm still looking for reprints of Kleene's papers on regular expressions, but they seem to be hard to come by. Maybe I'm just looking in the wrong places.

  • Perl isn't about doing things other languages can't do. Perl is about doing things other languages can't do easily. Chances are, if you are doing a lot of string/text manipulation in C (especially if you are doing a lot of matching and replacing) you could get your job done a lot faster and easier in perl. A lot of 100+ line C programs can be broken down to a few lines of Perl.
    Everybody I know who tried Perl's regular expression engine never went back. The C RE engine (that the manpage from 1994 claims is alpha quality) just doesn't have the rich feature set that the perl RE engine has. Perl's RE engine is constantly stress tested and obscure bugs are fixed, I'd dare say it is more stable than the C RE engine as well.

    Perl also has a very comprehensive (if a little incomprehensible) package repository called CPAN. If you are looking to do something that somebody might have done before, chances are they DID do it before, and released the sources for free to a nice searchable database.

    All this is just the tip of the iceberg. You really have to learn Perl (there are some great books available) to form your own opinions on it.

    Down that path lies madness. On the other hand, the road to hell is paved with melting snowballs.
  • So there I am, innocently trying to install Bundle::libnet on a new machine using "perl -MCPAN -e shell", and CPAN.pm has decided that before I install Bundle::libnet, it should first download the new perl distribution in its entirety and upgrade my system to perl 5.6.1.

    Which to me seems a bit overzealous on the part of CPAN.pm. Some might even describe it as Microsoftian in its evil.

    Is there any way to turn this "feature" of CPAN.pm off? Or am I stuck installing modules by hand until such time as I decide I want to upgrade to latest and greatest? (Not so bad for individual modules, but installing Bundles by hand is a royal PITA.)
    --
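
    If upgrading CPAN.pm (as another poster suggests further down) isn't convenient, the shell's prerequisites policy is worth a look. This is a sketch from memory, so whether it actually stops the perl-upgrade behaviour in your particular CPAN.pm is not guaranteed:

      cpan> o conf prerequisites_policy ask
      cpan> o conf commit
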
  • If you look into the history of perl, you'll find that it was written because awk(1) and sed(1) ran out of steam. They're not bad languages and I use them a lot. But Perl does much, much more. It's also vastly more extensible.

    The other reason is that Things can often be much simpler to do in Perl than in awk or sed. It all depends upon which modules you use.

    -Dom
  • by Dom2 ( 838 ) on Monday April 09, 2001 @04:14AM (#305579) Homepage
    It's braindead because:

    * It forks another process to do work that could easily be done inside perl itself. Doing the work in perl makes your script much faster.

    * He used a string instead of a list in the system() call. This is a potential security risk and slows things down even more, because the shell must be invoked instead of the program being executed directly by perl.

    * He didn't use s2p(1) to create the perl script. Larry wrote it for a reason!

    I saw a similar piece of code where I work recently that spawned 3 separate copies of mkdir, separately. And Perl has a builtin mkdir() function. Needless to say, this error was corrected.

    -Dom
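
    A minimal sketch of the string-versus-list distinction (the $dir variable here is made up), plus the builtin that makes a forked mkdir unnecessary:

      system("mkdir -p $dir");        # string form: a shell is spawned and re-parses $dir
      system('mkdir', '-p', $dir);    # list form: no shell, arguments go straight to the program
      mkdir $dir, 0755 or die "mkdir $dir: $!";   # builtin: no subprocess at all (one level only)
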
  • Is that a Gormenghast reference I see before me? Sounds a lot like Swelter...
  • by YuppieScum ( 1096 ) on Monday April 09, 2001 @04:03AM (#305581) Journal
    Has to be
    Infinity is now recognized as a number.
    Now we can write the code for the Infinite Improbability Drive...
  • by Masem ( 1171 ) on Monday April 09, 2001 @04:00AM (#305582)
    To be truthful, someone finds a bug in the regex engine roughly twice a month. Yes, they tend to be in the most complex of regexes, but they are still there and still need to be fixed. Plus memory issues and the typical details you get into with any program.
  • Everybody I know who tried Perl's regular expression engine never went back.

    Mark me down as the exception.

    I had a problem perl's RE stuff doesn't handle well. I had a fixed, per-program set of regular expressions (200+ of them, in priority order) and a lot of input strings that matched at least one of the regular expressions (one of the REs is .*, so there is always a match). I had to find the best match for each string, and the per-string runtime had to be extremely low.

    The only way perl does this is to read the input string, (maybe) call study, and then run the REs one by one in priority order, stopping when you get the match. In other words, something similar to lex, but just a little too dynamic and otherwise different to use lex.

    I wrote a big C++ program that made an NFA from all the regular expressions, transformed it to an annotated DFA, and then shoved strings through it. The cost to check a string was proportional to the string length, but (aside from cache effects) independent of the number and complexity of the regular expressions.

    Once in a while even fine pre-built tools like Perl aren't good enough, and you have to hand craft one of your own.

    That said, much of my regex work is in Perl or lex.
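
    For contrast, a rough sketch of the linear scan described above (the approach that turned out too slow per string); @res here is a stand-in, priority-ordered list with /.*/ as the final catch-all:

      my @res = (qr/^GET /, qr/\.html?$/, qr/.*/);   # invented patterns, highest priority first
      sub first_match {
          my ($string) = @_;
          for my $i (0 .. $#res) {
              return $i if $string =~ $res[$i];      # cost grows with the number of patterns
          }
      }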

  • After seeing that benchmark, I got very curious cause I thought it was VERY odd. I read up on the web and noticed that several had commented that the test is VERY favorable for Perl and Awk due to its "heavy use of associative arrays and strings". I can't comment on that because I don't have the source code for any of those but I do know that the results are very weird.

    You can get it from the author's web page [bell-labs.com]. The Markov chain one was written in C by him a very, very long time ago, as a prank net.news poster before the re-org (before ANSI C, or Perl, existed). I do think this is an updated version though.

    A small app that has two nested loops and calls a method, passing two integers as parameters and returning an integer as the result.

    Well that's a fine benchmark, if that is really the kind of code you expect to write. The markov chain one at least does some I/O and uses real data structures. Neither are all that close to code I tend to write, but the markov one is quite a bit closer (I normally have data structures, and I/O for example).

    Also note that this book was published in Feb 1999, so the book was prob. done six months before, with software possibly months older. The Java for the x86 may well have had Hot Spot stuff, but the Irix version may not have. Also, as I recall, I/O sucked big time for C++ (surprising to me, I expected it to win), and for Java as well (also surprising; I didn't expect Java I/O to rock, but I expected it not to be so bad).

    P.S. if you program go get the book. It really is good. It is one of the few that has a chapter on debugging (even if it is pretty elementary).

  • That's what I guessed.. I downloaded the source code and the Java is horrible. It uses all the built-in slow string stuff like StreamTokenizer that is the first thing you learn to NOT use when you use Java and want performance.

    In other words it used the library, and the library sucks? The C++ code uses the STL for all the heavy lifting. The Perl code does it in a straightforward manner. They all have the look of "get the job done" rather than "use a lot of tricks to make it as fast as can be". The C++ code doesn't pre-size its vector (actually it has been a while, and I don't recall if it uses a vector).

    That is pretty much a good aim for a book focused at non-experts. Or even busy experts :-)

    I ran both the Java and the Perl samples and it was 5 secs for Java, 3 for Perl. So so much for a 9 time speed difference.. I thought it was full of shit anyway.

    Not surprising, compiler technology marches on. I expect the C++ I/O has improved a lot as well.

    I could optimize the Java code a little bit if anyone wants to challenge this for a speed contest. I'm pretty sure I could half the Java execution time with an hour of work..

    The question is what would they all do with a little expert tweaking? I remember thinking that the C++ version would run faster just by changing the maps to hash_maps (assuming the implementation has them, most do).

    I have no idea about the Perl version. Normally when I have to speed up Perl code I rewrite it in C++. I only have a vague idea about the Java since the only "fast" thing I wrote in it was an SSH client, and it beat out a few C (I assume) implementations on a Win95 machine, but blows dead goats on the BSD systems.

  • Ah, but Perl supplies the undef() operator for just this reason: In the event that a structure may be circular, you can always explicitly blast it at the end of the scope.

    C provides free() for just that reason; In the event that a structure may have been allocated, you can always explicitly blast it when you are done.

    That doesn't make C garbage collected, does it? I mean, it still cleans up auto variables by itself. Or C++, which cleans up dynamically allocated memory stored in auto_ptr variables.

    Or do all three languages fall short? (admittedly Perl falls far less short)

    Just because a language has some garbage collecting features doesn't mean you can live by poor coding standards (like not cleaning up after yourself).

    No, garbage collection's sole reason for existence is to clean up for you. If it doesn't do that then it isn't garbage collection. Period. It might be useful, but it still falls short of being garbage collection.

    Besides, a full mark-and-sweep algorithm is quite an additional load on the processor. (one more reason why Java is slow.)

    Mark and sweep is expensive, but there has been twenty years of research in GC since mark-and-sweep was state of the art. Modern GCs can actually be cheaper than reference counts! Regrettably they tend to be a little erratic in when they fire. The real-time GC systems (there are some) are something I don't know a lot about, but I expect they are more expensive.
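
    For the record, the sort of cycle being argued about, and the manual cleanup the undef() camp has in mind (a toy example, nobody's real code):

      my $node = { name => 'loop' };
      $node->{self} = $node;     # the reference count can never fall to zero on its own
      # ... use the structure ...
      undef $node->{self};       # break the cycle by hand before $node goes out of scope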

  • by stripes ( 3681 ) on Monday April 09, 2001 @09:00AM (#305587) Homepage Journal
    Perl just added weak references. This is an idea that's been known in the academic programming language design community for over a decade, but until now, it hasn't shown up in a mainstream language.

    Doesn't Java count? It has at least 3 kinds of weak references (at least as of Java 2, maybe before). I think they differ both in how weak they are, and in how they react to destruction (at least one fires before the object is destroyed giving time to save to disk, or make a stronger reference; at least one fires after destruction). One or more of the weak ref types may prevent collection until the system is actually low on memory.

    A structure with backlinks, such as a tree in which each node has a back pointer to its parent, is a great use for weak links. If you do that in Perl now, you can get a memory leak, because it looks like a circular reference and the reference counts won't go to 0.

    That is one way to fix circular references. Another way would be to have a better garbage collector which can collect up circular references (that is in fact the traditional difference between reference counting which fails on circular references, and garbage collection which works).

    This isn't a slam on Perl, it is nice that it now has this feature. Ref counts also have nice features (they aren't faster than a good GC, but they find the garbage they do find at more reliable times, so destructors can be relied on to go off at useful times -- there are some hybrid systems that do both to get both advantages, like Lucent's Alef).

    The more common use of weak references (in languages with proper GC systems) is to have a cache. The cached item stays in memory until a GC sweep takes it away, if it is needed after that it has to be recalculated or read from disk again or something.

  • The name comes from formal language theory. A rather terse introduction can be found in this PDF [ohio-state.edu]. Check out page 8 and see if anything looks familiar.

    As for why this particular type of grammar/language is called regular, I guess it's just because it has the strictest rules of the four.
    --

  • Ah, but Perl supplies the undef() operator for just this reason: In the event that a structure may be circular, you can always explicitly blast it at the end of the scope.

    Just because a language has some garbage collecting features doesn't mean you can live by poor coding standards (like not cleaning up after yourself). Besides, a full mark-and-sweep algorithm is quite an additional load on the processor. (one more reason why Java is slow.)

  • ASP and Perl don't even try to solve the same problems. ASP is an HTML-embedded language for server-side processing (much like JSP or PHP). Perl is a programming language. You can embed Perl in your web pages if you want to, but you can't really compare the 2 languages without getting more specific. There are many ways to embed Perl into an HTML page for server-side interpretation -- and, yes, it's usually more powerful than ASP is in my experience.

    The wheel is turning but the hamster is dead.

  • The Gap did dM's "Just Can't Get Enough" for their leather line last year, and if you went to depechemode.com they had a disclaimer that they did not support using animal products at all. Twisted Sister balked that their record company was letting the infamous John Rocker (Atlanta Braves) use their "I Wanna Rock" song every time he came out [hmmm]--

    I think the record companies are having fun with their 100% legal rights in these cases. The more they do this, the more incentive bands will have to steer clear of the RIAA and folks.

    The wheel is turning but the hamster is dead.

  • If this is the "latest and greatest," and incorporates all the bug fixes etc., then how come 6 months after I compiled Perl 5.6 on my own machine, all the OS distros I could find (including, if I'm not mistaken, the just-released Mac OS X) were still shipping with some 5.00x release?

    This is not a troll. I really want to know if there's some reason why more people aren't using Perl 5.6.x?

    --
  • Ah. Then on that score, I stand corrected. I'm almost certain that the Public Beta shipped with 5.005.
    --
  • Who needs hammers? I was told that there are rocks to make you happy!

    Seriously, I use awk and sed in shell scripts, but they can be a pain for complex tasks. Also, Perl's RE set is much more robust than that of awk and sed. Learn perl -- then complain about it.
  • Not least of its advantages are that it's cross platform. You can run the same perl program (with only a few caveats) on Solaris, linux, Windows, OS/2 and even DOS.

    Apart from that, the mindshare of perl is not insignificant. Chances are, someone's had the same problem you're trying to address so you might be able to find a ready-made solution already.
    --

  • But remember that regexps in Perl and other languages are not 'regular'. They have extended features which aren't available in true regular expressions, such as backreferences.
  • Mebbe cos they're so damn complicated they induce regular bowel movements? ;-)
  • chop(@list) in list context returned the characters chopped in reverse order

    The characters chopped in reverse order, not the list itself. People probably didn't notice this much because chop is mainly used in scalar context, to see how many characters are chopped, which doesn't depend on the order of the returned list of chopped characters (choppees?).

  • You might write a faster program in C.

    You will program faster in Perl (or Python, or Rexx, or Ruby, etc..etc..etc...)

    See this [ira.uka.de] for an interesting study.

  • ...are my favorite changes. No matter what program I write in Perl, there are ALWAYS regular expressions. That seems to be the most common thread in all my stuff, IMO...

    Memory leak fixes are always nice, too...

    Jethro
  • Now, if only

    tr/Delorean/Heart of Gold/g;

    would work... sigh... 8^)

    Jethro
  • You are correct that s/// is substitution, but I thought tr/// would be more appropriate, since I was thinking you would convert the Back To The Future car to the HoG, not make a substitution...

    YSMV... (your semantics may vary) and horribly OT...

    Jethro
  • Despite your lack of tact in your obvious troll of a reply, Mr/Mrs/Miss Anonymous Coward, I will respond.
    * fork() emulation has been improved in various ways, but still continues to be experimental. See the perlfork manpage for known bugs and caveats.
    Again, when is M$ coming out with their "fork"? I am not talking about an experimental (buggy?) version.

    Ah, it doesn't matter anyway. Call it a troll, but M$ is just going to redefine fork the way they redefine all the rest of the "standards" they implement.

    Perhaps you should RTFA.

    Jethro
  • I like the versatility.

    It is not only extremely useful for CGI (though arguably not as efficient as some "out of the box" alternatives -- e.g. mod_perl), but it is also very useful for many other things. I use it quite a bit as a cross-platform scripting language. It is nice to take things written on my BSD box, and run them on my desktop with ActiveState Perl.

    There are modules for everything, from processing PDF to controlling your X10 stuff.

    It doesn't get any better than that, IMHO...

    So, when is M$ coming out with "fork" for Windows? 8^)

    Jethro
  • Are you sure you weren't compiling standard C? A variable declared in the initialization section of a for-loop has the scope of the whole function in that case.
  • Posted from my Mac running MacOSX

    [localhost:~] tns% perl -v

    This is perl, v5.6.0 built for darwin

    Copyright 1987-2000, Larry Wall

    ...

  • I like the sort fix, too. Bumped into that bug just last week...thought I'd blown a synapse.

  • You are actually describing something called a Context Free Grammar, which is used to represent a Context Free Language. Regular Languages are a proper subset of the Context Free Languages, so they are not the same.

    To be clear, Perl's regular expressions are actually not regular at all, in the true mathematical sense of the word. Backreferences are one feature that immediately breaks the definition of the word regular. However, the term is catchy, and has since been (incorrectly) applied to any form of pattern matching.

    For those interested, I suggest picking up a book on automata theory to learn more about all this, from a theoretical standpoint.

    I have Introduction to the Theory of Computation by Michael Sipser, ISBN 0-534-94728-X
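
    A concrete case of the point above: a backreference recognizes the "repeated word" language, the textbook example of a language no finite automaton accepts (the pattern is purely illustrative):

      print "yes\n" if "abc abc" =~ /^(\w+) \1$/;   # matches: the second word repeats the first
      print "yes\n" if "abc abd" =~ /^(\w+) \1$/;   # no match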
  • chop()
    chop(@list) in list context returned the characters chopped in reverse order.
    This has been reversed to be in the right order.

    You just know this is one of those bugs that would've had people scratching their heads for hours trying to work out why their arrays were suddenly in reverse order; trying to debug their code with copious foreach loops and print statements.

    --

  • Considering you can write ASP in Perl I would say that you're not properly informed. You can do ASP in JScript, Perl, VBScript, etc.

    LOL, I've got quite a few projects that use Perl to generate JavaScript to generate DHTML.

    You can really tell that Perl was made to deal w/ text and data (hence the name: Practical Extraction and Reporting Language aka Pathologically Eclectic Rubbish Lister). It's one of the things that make the language so nice to work w/ in a web/database environment. This can have great benefits for reducing development time.

    Another great strength of Perl is CPAN [cpan.org]. Perl has a great community behind it.

    Above all, I just think Perl is a fun language. It's not full of rules and restrictions. It's very loose and tends to let you find out what the best way to do something is. It's not always the right tool for the right job but it fits nicely most of the time.

  • Okay, we can go into the "there's 9,000,000 ways to do it" argument here if you want, but I don't see large corporations hinging their products on Perl. In Java or C++ there is OWTDI, not TIMTOWTDI ;)

    Really? Hmm.. I work for a world wide Networking/Communications company with 160,000+ employees and we use Perl a lot.

    We're not the only ones either:



    You're right that they don't hinge all of their products on it but I don't think that's what Perl is meant for. It's all about the right tool for the right job.

    I'm only going to stick to the theoretical here, but I have found some Perl code that simply defies common sense in production-level, production-quality code. These little 5 lines of code that make up the bulk of a COMPLEX order processing routine are just nonsense to debug. That is like a height of mismanaged and ill-thought-out programming. Perl allows this so much more easily than most languages. Say you can write bad code in any language and you're right. Say you can write the most incomprehensible nonsense in Perl and you're even more right!

    I understand that Perl's flexibility allows for problems. With flexibility comes the need for responsibility and common sense. It's really not Perl's fault. I'd rather have that flexibility when I need it rather than have it stripped away. That's one of the main reasons I use Perl more than Java. Again, they both have their place when used responsibly (is this starting to sound like a beer commercial? ;)

    I think the programmer of the system enjoyed trying to make everything as complex as his twisted little mind could. I'm no slouch at programming really complex business systems that have insane requirements. But I just get so flustered when I inherit some project that has code that would win an Obfuscated Perl Contest hands down. It's annoying, and I spent a day trolling news lists and looking over my Camel book trying to figure out how in the hell some stuff works. Blah. Behaviors change from version to version, there is no standards body controlling perl (not that the C/C++/Java people do much better..), and a ton of other factors just lead to this dislike for perl in a real business environment that has complex business needs. Perl for everything? Right. Not.. Everything has its place. The high end of business programming is not it for Perl.


    I hate having to deal w/ code like that but I still think it's worth it to have such a wonderful language. It's a shame that most people let this kind of thing reflect directly on Perl instead of the programmer.

    As far as changes go, that's just the way it goes. For a language to evolve, changes are inevitable. Java 1.1 and Java 1.2 illustrate this. I've been told about horror stories with C++ as well (I'm not proficient enough to tell you if that's true or not).

    As I said before, I don't think the "highend of business" is where Perl belongs either. I think it fits nicely being the beautiful duct tape that it is.
  • Okay, so it's just my cynical rant because I got a badly managed and coded project.. ;-P

    You know, I have a theory about why there is so much bad perl code out there. I believe it's usually a first language. Newcomers have no idea what it's like to manage someone else's code much less organize large projects. Until that can be experienced, there's no real incentive to keep the code clean.

    The other half of it is just a bunch of twisted, sadistic weirdos ;)
  • END {perl}; no strict interpretation; wait for 6.0;
    @O'Reilly::Larry_Wall{Hacks};


  • Actually, Perl benchmarks faster than Java for real-world problems. Rob Pike & Brian Kernighan's "The Practice of Programming" has a bake-off between various languages, implementing a Markov chain algorithm. The results are (stolen from an earlier /. discussion):

    Language   Time (Pentium II 400MHz)   Lines of source code
    C          0.30 sec                   150
    Java       9.2 sec                    105
    C++        1.5 sec                     70
    Awk        2.1 sec                     20
    Perl       1.0 sec                     18

    Now I don't know about you, but for the majority of my programs I'm happy with a 3x speed decrease to avoid the hassles of dealing with C (And I'm more than happy to have a 1/3 increase by avoiding C++!!!!!) I'm not willing to pay an 18x decrease.

    They all have the look of "get the job done" rather than "use a lot of tricks to make it as fast as can be".

    All of the programs could be optimized I'm sure, I know the Perl experts have offered suggestions on the Perl version. However, the test was comparing programs as written by people familiar with programming but not necessarily experts in the language. It's not valid to compare expertly written programs, because most programmers aren't experts. Very few programs are optimized in real life, simply because there are always more important things which need doing.

  • How does Perl compare to C with respect to speed? It's damn slow. How does Java compare to C with respect to speed? It's damn slow. What are the advantages then?

    The advantage for both Perl and Java over C is that you can *write* code much faster in those languages and still be sure that you don't have stray pointers and memory leaks, potential unchecked buffers etc. This means that you have more time to concentrate on writing good logic and using good algorithms, which in turn means that in the end, you end up with a much "better" application that works just as well. What you lost in speed of execution you gained in a nice architecture and with good algorithms.

    Of course a really kick-ass C coder is just as productive in C, but those coders are extremely hard to find and very expensive. So in the real world with tight deadlines where computers are cheaper than coders, Java and Perl will give you a better run for your money than C will.

    That's not to say that C and C++ don't have their places - they do. I wouldn't code a game in Perl, for example, nor would I code an operating system in Java. But I would also never code an e-commerce site in C (actually I have, but that was in 1996 :) Use the right tool for the right job.

    BTW.. Perl's real strength from a purely language point of view is probably its close relation to regular expressions, which makes string handling very easy to do and extremely hard to read for other coders who are reading your code. You can do more in one line of Perl (string handling) than you can with 1000 lines of C code in many cases.
  • I have a very hard time believing that benchmark. I'd like to know what virtual machine was used and if the Java coder who wrote the implementation had any idea of what he was doing.

    For non-graphical applications, Java tends to be quite close to C++ in speed. I'm talking about stuff like math and those kinds of things. They are JITted to almost the exact same code that a C++ app would have been compiled to in the first place. I also have a hard time understanding how Perl would be faster than C++ and AWK almost as fast as C++. There must be something seriously screwy in some of those implementations.

    I'm sure there are plenty of benchmarks to show very different results. For example, many CGI benchmarks show Perl and Java to be very similar in speed and only slightly slower than C based CGI programs. Go figure..

    About not being able to pay a 18x decrease.. Well.. If your response time on the web server is 0.1 seconds, does it matter if it's 18 times slower than the optimal? Especially, if the code is now much easier to maintain and extend.. As long as the application is fast enough, it doesn't matter IMHO.
  • After seeing that benchmark, I got very curious cause I thought it was VERY odd. I read up on the web and noticed that several had commented that the test is VERY favorable for Perl and Awk due to its "heavy use of associative arrays and strings". I can't comment on that because I don't have the source code for any of those but I do know that the results are very weird.

    So.. I decided to write my own little - very simple - test just to get a ballpark feel of language speeds. A small app that has two nested loops and calls a method, passing two integers as parameters and returning an integer as the result. I wrote a C, a Perl and Java version and compiled and ran these. My machine is an Athlon 500 with 128 megs of ram running a Microsoft OS. The Perl in use is ActiveState's 5.005_003, the Java VM is JDK 1.3.0_01-beta and I compiled the C app with both gcc and Visual C++ (no difference in speed for those). The result is that the Java app is fastest with a run time of 3 seconds and that includes loading java.exe, granted it's only a marginal overhead. The C comes at a very close 2nd with 4 seconds. The Perl version was so slow that I had to decrease its iterations to 100 times less than those of the C and Java version. Then it took 7 seconds. I guess it would have taken 700 seconds otherwise but I couldn't wait that long.

    Now I know this proves very little. For example, that Java is faster than C probably only shows that the HotSpot could optimize the code VERY well as it ran and thus be faster. I know that a normal Java app wouldn't do this well compared to C.. But in this simple case, it did BETTER. Perl.. well.. the app ran so slowly that I decided to INLINE the method call.. Get rid of it and just say $x = $i * $j; instead. It STILL takes 300 seconds - 100 times slower than the Java version.

    The *ONLY* thing this proves is that benchmarks can be tilted in any direction you want. Unfortunately I can't post the sourcecode because it triggers the "junk" filter. I can email them to whoever wants them tho.. There's nothing special in them anyway and anyone could do the same test to see themselves quite quickly.
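
    Since the source couldn't be posted, here is a guess at the shape of the Perl version described above -- two nested loops calling a sub that takes two integers and returns their product. The loop bounds are invented:

      sub mult { my ($m, $n) = @_; return $m * $n }
      my $x;
      for my $i (1 .. 1_000) {
          for my $j (1 .. 1_000) {
              $x = mult($i, $j);
          }
      }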
  • That's what I guessed.. I downloaded the source code and the Java is horrible. It uses all the built-in slow string stuff like StreamTokenizer that is the first thing you learn to NOT use when you use Java and want performance. It uses the Vector class but doesn't initialize it to any special size so the default is used, which is way too small in this case. Since java.util.Vector is implemented with arrays and copies the old array to a newly allocated one if it runs out of space, it's horribly slow if it needs to resize itself. I could go on forever.. STILL.. I ran both the Java and the Perl samples and it was 5 secs for Java, 3 for Perl. So so much for a 9 time speed difference.. I thought it was full of shit anyway. Feel free to test the stuff yourself if you want.. I could optimize the Java code a little bit if anyone wants to challenge this for a speed contest. I'm pretty sure I could half the Java execution time with an hour of work..
  • Heh, yeah, that IS a good example. I wasn't actually thinking about that but yeah.. Just *7* lines.. And completely unreadable. :) And btw, I'm no Perl coder and I would be the last to do Perl advocacy but for string handling, it really rocks as far as speed and code compactness goes. Personally I prefer to do string handling object oriented but that's just me and I wasn't always like that.
  • Yeah that's a troll :-)

    % perl -v

    This is perl, v5.6.0 built for i386-openbsd (with 2 registered patches, see perl -V for more detail)

    Copyright 1987-2000, Larry Wall

    Perl may be copied only under the terms of either the Artistic License or the
    GNU General Public License, which may be found in the Perl 5.0 source kit.

    Complete documentation for Perl, including FAQ lists, should be found on
    this system using `man perl' or `perldoc perl'. If you have access to the
    Internet, point your browser at http://www.perl.com/ [perl.com], the Perl Home Page.

    % uname -sr
    OpenBSD 2.8
  • Just that, patching against 5.6.0 will complain about existing lib/CGI.pm ... just delete it and it will get patched right (or so I hope, compiling right now ...)

    - german

  • Just to report that even when compiling 5.6.1 with -O3 -fomit-frame-pointer (gcc-2.95.3/glibc-2.2.2) all tests are passed, while with 5.6.0 some of them failed (about 1.5%).

    Now

    All tests successful.

    u=0.68 s=0.3 cu=46.56 cs=8.14 scripts=255 tests=13098

    Great work!

    - german

  • Because he could have achieved exactly the same effect with:

    $_ =~ s/foo/bar/;

    And not closed/opened/closed/opened file handles and lots of other pointless stuff. :o)
  • by xixax ( 44677 )
    I am still waiting for Micro$oft to release ActiveAwk and IntelliSed

    Xix.
    P.S. Line from fave braindead contracted Perl program I have seen this year:
    system("sed -f $sedfile < $infile > $outfile");
  • Because they're expressions describing `regular languages,' any language (collection of strings) which can be recognized by a finite automaton.

    -_Quinn
  • I just happen to be reading this article on my Powerbook running OS X...

    % perl -v


    This is perl, v5.6.0 built for darwin

    Copyright 1987-2000, Larry Wall

    Actually I believe perl 5.6 was in the Darwin distribution a few weeks after it was released.

  • Can a perl buff outline the main advantages or is it just the peer group effect?

    Well, as you can already see from some of the threads, some people don't like Perl at all. It's just a matter of time until a major Perl vs. Python flamewar breaks out.

    I think that inclination toward Perl has a lot to do with your background. As much as I like it, I have to admit that a lot of the syntax that so many people complain about can seem a bit quirky -- all of that punctuation and funny variable names, and all kinds of idiomatic idiosyncrasies.

    But what many Perl-bashers don't realize is that almost none of this was invented the first time in Perl: Most elements of Perl syntax have a precedent in languages and tools that are common and familiar if you happen to use and administer Unix every day. Shell script languages, sed, awk, grep and many other regular expression tools, and C -- all of these have idioms that you find again in Perl.

    So if you live Unix every day, and these idioms all come naturally to you, then Perl doesn't seem that unnatural at all. This is the way I feel about Perl -- it is very grokkable, its syntax flows out smoothly, it just seems to feel like the right way to express these things.

    But then there are the people who just can't get a feel for Perl at all. In many cases, I've found that these people aren't as familiar with the standard Unix tools either, so the idioms of Perl that seem so ordinary to me are entirely alien to them.

    Incidentally, I am very amused by the posts in this thread that complain about regular expression syntax, as if they are pointing out a weakness of Perl. Regular expressions are a mathematical concept and powerful computer science tool that have been known about for decades. And as hard as the syntax may be to learn, Perl follows tried and true idioms that have proven their worth in numerous tools and programming languages: the various greps, awk, Tcl, emacs, the C regex libraries and the POSIX standards. These all certainly have their differences, but they converge on a syntax for regular expressions that is very nearly a standard; and Perl has the most sophisticated version of them all. Maybe, just maybe, this common syntax is not so obtuse after all; maybe, just maybe, it's the most efficient and compelling way to express something that is intrinsically quite powerful and complex.
  • What an incredibly ignorant statement.

    Perl 5.6.1 on Windows platforms is currently in beta and should be available soon from ActiveState [activestate.com].

    In the future, you should only comment on subjects you are actively educated in.

  • It was just out of curiosity, given that regular expressions are a bit arcane.

    I think this is the single biggest misunderstanding of Perl.. People call perl line-noise because of all the shift-numbers scattered throughout.

    The fact of the matter is that the regular expression package is a very standard mechanism of pattern matching (though there are slight syntactic deviations from one implementation to another). If you don't understand the importance or power, then you've obviously never taken a compilers course, and have little room to critique a language. Additionally, sed, awk, and even C or Java _do_ have regular expression packages. If Perl expressions are incomprehensible, then you're no better off there (with possibly the added benefit of seeing a class name that denotes what's going on).

    The fact of the matter is that the traditional C/Java-style string searches DO exist in perl. You have printf with %s includes, you have "index", "rindex", etc. But they suck for complex operations (in both performance and readability).. Once you actually spend the time to learn expression matching it becomes a powerful tool and actually makes it easier to read complex parsing code. (just think of how difficult UNIX/DOS would be without "foo.*")
    As for the other 'confusing notations', such as $hash_ref_name->{field_name}[ $index ]: this was a necessary evil to maintain string interpolability. Meaning, we aren't restricted to 'res = sprintf( "value = %s, %i\n", name, age)'. We can now just say $res = "value = $name, $age". Again, once you learn the rule-set, it makes string-based operations significantly easier.

    The problem is that it's now being used for things other than text, so these get in the way.

    In fact, Perl has one of the easiest-to-read variants of the regular-expression language, since it allows liberal use of white-space and comments. If someone throws together a 5-line regular expression, then it's no better than a C coder that put everything on the same line with little or no white-space. It's just a matter of programming style (which admittedly most perl coders lack).

    The adage "best tool for the job" applies.. If you're doing text manipulation (which often includes web page generation), then Perl is indispensable. If all you're doing is back-end database manipulation, maybe Python, VB or Java is better suited. I wouldn't advocate Bash programming because the same arguments that apply to Perl are doubly so in many other shell languages.

    Lastly, as to the stability of perl. As long as you don't resort to volatile C code or "experimental features" of the reg-exes, then you're pretty stable (aside from memory leaks). I've rarely ever had perl core without one of these two factors. Yes there are well-documented ways that you can abuse regular expressions (the 50 million year search time due to exponential back-tracking, for example), but that's part of learning the language.

    -Michael
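
    An illustration of the whitespace-and-comments point, using the /x modifier (the pattern itself is only an example):

      my $date = qr/
          (\d{4})   # year
          -
          (\d{2})   # month
          -
          (\d{2})   # day
      /x;
      print "$1-$2-$3\n" if "2001-04-09" =~ $date;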
  • with "for(int i..." in standard C++, i's scope is for the duration of the loop. in MSVC, you've just declared a new variable, so putting two for loops in a row with the same variable initialization will cause compiler errors which you *shouldn't be seeing*.

    IIRC g++ (the gnu c++ compiler) does something similar when the Wshadow switch is enabled.

    --

  • It's the regexps. Definitely the regexps.

    C jokes are about pointers to structures of arrays of arrays of pointers to structures.

    Perl jokes are, as we just saw,
    "a\nxb\n" =~ /(?!\A)x/m

    The regexps are much more useful in day-to-day life.

  • ...little discrepancies between Microsoft's idea of C++ and the ANSI standard -- a famous example is Microsoft's interpretation of the scope of variables declared inside the conditional section of a for loop. with "for(int i..." in standard C++, i's scope is for the duration of the loop. in MSVC, you've just declared a new variable, so putting two for loops in a row with the same variable initialization will cause compiler errors which you *shouldn't be seeing*.
    That error's not so obscure... I've run into it several times when a prof used VC++ as the compiler for our assignment code.
  • I have learnt Perl but haven't got around to getting hooked to it. My friends all seem to be quite happy to be using ASP. Can a perl buff outline the main advantages or is it just the peer group effect?
    Perl does so much more than just CGI stuff. The Swiss-Army Chainsaw I believe it is sometimes called.

    Oh, and you can use Perl for ASP, just go to http://www.activestate.com [activestate.com]

  • Why in the name of Larry Wall are you installing Bundle::libnet? Are you going to be packaging your own libnet .tar.gz distribution?

    What you really want is http://www.cpan.org/modules/by-module/Net/libnet-1.0703.tar.gz [cpan.org].

    -Gerard
    http://www.lanois.com/perl/ [lanois.com]

  • I use perl all the time. There are three things that really make it useful for me.
    1. Versatile regular expressions
      Really great for slicing and dicing those text strings. Just about any pattern I can imagine can be parsed. Probably more than I imagine, 'cause I've only learned about 50% of the regular expression syntax.
    2. Associative Arrays
      I'd never seen associative arrays (hashes) until Perl, but they're great, they make it so much easier to organize information. I believe LW himself said something like "if you're not using associative arrays in your Perl script you're probably doing something wrong." I really didn't grok the significance of hash tables (the underlying mechanism) until I encountered AAs in Perl. Now I love hash tables.
    3. Everything Else
      which includes the fact that it's interpreted (no compilation), easy conversion between types (or maybe it's really no types), anonymous variables (or whatever you call those $_ things, they're really convenient once you get used to them), almost all the neat little system functions of C, and a flexible/redundant syntax (but with the potential for greater obfuscation) which makes perl more like a human language than a computer language, but not everyone will think that last item is a good thing.
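
    A small illustration of why the hashes earn that praise -- counting word frequencies takes three lines (the sample sentence is arbitrary):

      my %count;
      $count{$_}++ for split ' ', "the quick brown fox jumps over the lazy dog";
      print "$_ => $count{$_}\n" for sort keys %count;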
  • Upgrade CPAN.pm, the latest version does not force you to upgrade Perl anymore.
  • Now we can write the code for the Infinite Improbability Drive...

    Better yet! We can now implement RFC2795 [isi.edu] (the Infinite Monkey Protocol Suite) in Perl! Who wants to start a page on sourceforge? :)

    Shayne


  • A couple months ago, I would just have said "hmph, he's bluffing!"

    After DeCSS, I realize that if 7 obscure lines of code can stand for all the random looking strings and C++ string handling routines, then Perl is the way to go for text[/signal?] processing.

    I wonder if learning Perl could help my language theory professor to implement a revolutionary Natural Language Processing compiler in her lifetime.

  • Assuming you're serious, if you look up [dictionary.com] "regular", you'll find the following definition:

    In conformity with a fixed procedure, principle, or discipline.

    Regular expressions provide a regular, or ordered, way to specify patterns.

  • Now, if only

    tr/Delorean/Heart of Gold/g;

    would work... sigh... 8^)

    I think you mean
    s/Delorean/Heart of Gold/g;

    The tr/// operator is translation, and s/// is substitution.
    --
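
    The difference in one breath (made-up strings): s/// swaps a matched substring, while tr/// maps characters one-for-one -- which is why tr/Delorean/Heart of Gold/ would scramble letters rather than rename the car:

      (my $car   = "Delorean time machine") =~ s/Delorean/Heart of Gold/;   # "Heart of Gold time machine"
      (my $shout = "Delorean time machine") =~ tr/a-z/A-Z/;                 # "DELOREAN TIME MACHINE"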

  • Questions like this are why spending 4 years in university might be valuable. It helps to all be speaking the same language when you head out to solve a big problem.
  • This is what happens when you use a compiler written before the standard was solid. MS has patched quite a bit but I'm sure that some of the bugs in MSVC are hard to fix in a patch. Let MS come out with its next full version before you get too down on compliance or non-compliance. GCC needed a few versions before it was up to par I'm sure.
  • Weak references did go into Java 1.2, but they're there mostly to deal with the mess associated with finalization. Perl has different issues.

    Finalization in Java involves calling a destructor function from the garbage collector. This is ugly, because you have no idea when it will be called. It might be called from an unexpected thread, or when resource locks are set that the destructor needs. So Java finalizers have to be very carefully written, and can be a source of weird, hard-to-reproduce bugs.

    Perl destructors ("DESTROY") are called when the reference count goes to 0. This occurs at a time defined by the source code that owns the object, not some random time determined by garbage collection. So the uglier issues of asynchronous finalization are avoided. Reference counts are clean conceptually; the remaining problems are overhead and memory leaks.

    Reference counts create a leak problem for circular reference structures. Weak pointers offer a mechanism for solving that problem. It's not airtight; you can still write programs that leak memory. But weak pointers help. Basically, anything that's a "backlink" probably should be a weak pointer.

    One of my back-burner projects is a paper called "Pointer Patterns". It's on how to use strong and weak references in a way that eliminates the possibility of both memory leaks and dangling pointers. Basically, you have to be clear on which references "own" stuff and which don't. I was writing about C++, but something on Perl may be in order.

    (I've been writing a proposal for "Strict C++", a set of restrictions that eliminate the possibility of buffer overflows and dangling pointers without too much run-time overhead. Basically, you have to use the STL, and you have to use some things like auto_ptr to allocate anything. "new" and "delete" are not used. Some additional compile-time checks are made. Iterators are allowed, and checked. Most of the checking can be optimized out without loss of safety. C arrays and arithmetic on raw pointers are not allowed, but you can get the same effect with checkable iterators. The overall effect is the safety of Perl or Java without the overhead. But that's another topic.)

  • by Animats ( 122034 ) on Monday April 09, 2001 @08:24AM (#305645) Homepage
    Perl just added weak references. [cpan.org] This is an idea that's been known in the academic programming language design community for over a decade, but until now, it hasn't shown up in a mainstream language.

    A structure with backlinks, such as a tree in which each node has a back pointer to its parent, is a great use for weak links. If you do that in Perl now, you can get a memory leak, because it looks like a circular reference and the reference counts won't go to 0. If the backlinks are weak links, the tree will be deleted when all external references to it go away.

    The Perl implementation of weak links turns weak links to an object into "undef" when all the regular (strong) links go away. It's not clear if this can result in an object being deleted while in use. When you dereference a weak link, does that bump the reference count? Unclear. Details like this really matter in threaded programs, where one thread might drop an object in use by another thread. The Perl documentation is hazy on this.

    I'd like to use it, but my Perl code has to run on servers that don't run the latest Perl. It's going to be useful in future, once newer Perl implementations are widely deployed. Once it's standard, widely used tree structures like HTML::Element should use it.
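
    A sketch of the backlink case with weak references; the node layout is invented, and on 5.6.x the weaken() function comes from the WeakRef module on CPAN (later perls carry it in Scalar::Util):

      use WeakRef;                     # or: use Scalar::Util qw(weaken);
      my $parent = { children => [] };
      my $child  = { parent => $parent };
      push @{ $parent->{children} }, $child;
      weaken($child->{parent});        # the backlink no longer keeps $parent alive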

  • Yeah, Perl is actually okay for some tasks and I'll admit that most incoherent perl code is just a fault of the code writer and a lack of standardization on the one true way to write perl.

    Okay, we can go into the "there's 9,000,000 ways to do it" argument here if you want, but I don't see large corporations hinging their products on Perl. In Java or C++ there is OWTDI, not TIMTOWTDI ;)

    When you talk about structured programming environments, it simplifies the process of having a coding standard and everything if you don't have to nail down all of the frivolous details of a language.

    It's just silly when you have to become granular about rather basic programming constructs and how to use them (IMO).

    I'm only going to stick to the theoretical here, but I have found some Perl code that simply defies common sense in production-level, production-quality code. These little 5 lines of code that make up the bulk of a COMPLEX order processing routine are just nonsense to debug. That is like a height of mismanaged and ill-thought-out programming. Perl allows this so much more easily than most languages. Say you can write bad code in any language and you're right. Say you can write the most incomprehensible nonsense in Perl and you're even more right!

    I think the programmer of the system enjoyed trying to make everything as complex as his twisted little mind could. I'm no slouch at programming really complex business systems that have insane requirements. But I just get so flustered when I inherit some project that has code that would win an Obfuscated Perl Contest hands down. It's annoying, and I spent a day trolling news lists and looking over my Camel book trying to figure out how in the hell some stuff works. Blah. Behaviors change from version to version, there is no standards body controlling perl (not that the C/C++/Java people do much better..), and a ton of other factors just lead to this dislike for perl in a real business environment that has complex business needs. Perl for everything? Right. Not.. Everything has its place. The high end of business programming is not it for Perl.

    okay I feel much better now..

    /me quietly goes back to his hole in the wall and codes.

    Jeremy

  • Okay, so it's just my cynical rant because I got a badly managed and coded project.. ;-P

    I can't help it; now I have this illogical aversion to perl.

    But I'm not really against it, honestly. Like I said, it's just a matter of procedure and a little common sense and consistency, and any project can use it.

    I really don't let it reflect on perl, because I understand that how elegant and clearly understood the code is comes down to the programmer's skill (even using neat tricks you prolly won't see in other languages, I've seen clearly nice and easy-to-read Perl).

    I guess those late nights several weeks in a row brainwashed me. :) hehe.. I am still competent enough, and when I need a quick program to do this or that I'll whip out perl (I love ActiveState perl on Windows)

    I also like being able to run it on any platform (heh I almost think Perl is more cross platform than Java :)

    Anyways, I was just ranting and being a lil nonsensical. I really don't hate Perl or go out of my way to see it come to an early doom :)

    Jeremy

  • by jallen02 ( 124384 ) on Monday April 09, 2001 @04:18AM (#305648) Homepage Journal
    Considering you can write ASP in Perl I would say that you're not properly informed. You can do ASP in JScript, Perl, VBScript, etc.

    Given that all of those are available to ASP, use whichever you like to your heart's content.

    The most common language in ASP is VBScript. However, I have seen them all used to varying degrees of coherency and success. (The perl parts being the least coherent and most painful to fix... odd) Anyways, apples and oranges. Can't compare ASP to Perl. Try comparing VBScript to Perl. Then we can get a good discussion about two languages going.

    Jeremy

  • Regular expressions match grammars of the form:

    S --> a

    or

    S --> aA

    or

    S --> lambda

    Where S and A are any non-terminals, and 'a' is a terminal. Lambda represents the null string.

    Example:

    S --> aaA
    A --> bA | b

    regex = aabb*

    I hope this helps. To gain further insight, do a net-search on "DFA".

    NAS
  • I don't think you read your book correctly. What I posted was the definition for right-linear, regular grammars.
  • Geez, take a joke, moderators.

    Not posting anonymously because I'm standing behind what I'm saying. Please take that into account, moderators, if you consider this post.

    Thank you.

    --

  • Think of it this way: in Latin a regula [tufts.edu] is a "ruler" or, abstractly, a "rule." The adjective regularis [tufts.edu] means "pertaining to rules" or "having rules" (maybe "rule-based"). A liber regularis is not a "regular book" but a "rule book", so you could think of regular expressions (expressiones regulares) as "rule-expressions," "expressions of rules" (i.e. the rules for what is considered a match)
  • It was just out of curiosity, given that regular expressions are a bit arcane. Thanks for the info.
  • I won't correct you, because I haven't a clue what you just posted :-)
  • by cyber-vandal ( 148830 ) on Monday April 09, 2001 @04:08AM (#305655) Homepage
    Why are regular expressions called that? Regular in what way?
  • Unfortunately, that "fix" is illegal in ANSI C++. Further, no diagnostic is required by a compiler when parsing your "fix"; a conforming implementation may fail silently and horribly.
    Can you please explain? Why is this #define illegal in ANSI C++? And what do you mean by 'no diagnostic is required by a compiler when parsing your "fix"'? I don't quite follow.

    Care about freedom?
  • I have to agree with that.

    At first though I was a little bit intimidated by Perl, having heard that it was ugly and hard to understand.

    It turns out that it isn't, and for a very good book on beginning with Perl one should definitely think about "Learning Perl" by Randal Schwartz.

    Very narrative structure, interesting read. That book changed my life ;-)

    It may not go in all the little details, but for me it was enough to get me started and start doing nifty things right away.

  • Oh, did they include "improbability" also? Or do you have to express it as (1-"probability") in the whole code? What a mess...
  • a famous example is Microsoft's interpretation of the scope of variables declared inside the conditional section of a for loop. with "for(int i..." in standard C++, i's scope is for the duration of the loop
    Well, Cletus, I think you'll find that C++ used to use Microsoft's interpretation [new-brunswick.net] (look at question 34.2), and that Microsoft is using a suboptimal, but reasonable, approach to compatibility [mvps.org] with old source code.
  • by ChaoticCoyote ( 195677 ) on Monday April 09, 2001 @04:37AM (#305662) Homepage
    Not to mention the sense of power you feel when you type some code that is incoherent gibberish to 99.99% of the world but is actually a useful perl program.

    ...is an example of why I haven't much use for Perl. I gave up running line noise when I stopped programming in APL... ;)


    --
    Scott Robert Ladd
    Master of Complexity
    Destroyer of Order and Chaos

  • by Bistromat ( 209985 ) on Monday April 09, 2001 @03:48AM (#305664)
    with a language as utterly inexplicable as perl, i wonder how programmers *find* bugs, especially in later releases, where the bugs are that much more obscure.

    "Say, Cletus, ah jist noticed that "a\nxb\n" =~ /(?!\A)x/m doesn't properly eval-yoo-ate for certain prop'rties of 'nxb'.

    then again, i suppose it's about as easy as finding all the little discrepancies between Microsoft's idea of C++ and the ANSI standard -- a famous example is Microsoft's interpretation of the scope of variables declared inside the conditional section of a for loop. with "for(int i..." in standard C++, i's scope is for the duration of the loop. in MSVC, you've just declared a new variable, so putting two for loops in a row with the same variable initialization will cause compiler errors which you *shouldn't be seeing*.

    --nick
  • Why are regular expressions called that? Regular in what way?

    "Regular" - formed, built, arranged, or ordered according to some established rule, law, principle, or type.

    "Expression" - a mathematical or logical symbol or a meaningful combination of symbols.

    Put them together, and it's fairly straight-forward terminology that comes from computer science.

    (I remember first studying them, outside of Perl, in my Discrete Structures CS class.)
  • "UNIVERSAL::isa()"

    A bug in the caching mechanism used by "UNIVERSAL::isa()" that affected base.pm has been fixed. The bug has existed since the 5.005 releases, but wasn't tickled by base.pm in those releases.

    In other words, ALL YOUR BASE.PM BELONGED TO ISA().

    But it's fixed now.
  • Nobody is happy using ASP/VBScript.
  • Perl just added weak references. This is an idea that's been known in the academic programming language design community for over a decade, but until now, it hasn't shown up in a mainstream language.

    Python 2.1, currently in beta, has them too. See the development docs [sourceforge.net].
  • I have learnt Perl but haven't got around to getting hooked to it. My friends all seem to be quite happy to be using ASP. Can a perl buff outline the main advantages or is it just the peer group effect?
  • A regular expression is called that because it can generate the strings in a regular language.

    And a regular language is the collection of strings that can be generated by a finite state automaton.

    i think... somebody will correct me if I'm wrong.
  • here is a better definition [mala.bc.ca] of regular languages / expressions, from Dr. Doom himself.

  • Unfortunately, that "fix" is illegal in ANSI C++. Further, no diagnostic is required by a compiler when parsing your "fix"; a conforming implementation may fail silently and horribly.

    There are much more interesting errors in MSVC. Incorrect scoping of namespaces with a using declaration, incorrect template instantiation, and lack of support for template friends are just a few.

    Hang out on the comp.lang.* newsgroups for a few days and I'm sure you'll come across plenty of implementation bugs. It's true in comp.lang.c++, at least.
  • by Salieri ( 308060 ) on Monday April 09, 2001 @03:39AM (#305683)
    It just occurred to me, how many departments does slashdot HAVE? I've seen hundreds and hundreds already. Where do you find managers and resources for an operation that large?

    --------------------------------
  • I believe the largest factor in Perl's popularity is Larry Wall's personality. He's just silly.

    Programming Perl has a joke or two on every page, making it seem like fun to learn that crazy language.

    Not to mention the sense of power you feel when you type some code that is incoherent gibberish to 99.99% of the world but is actually a useful perl program.
    -----------------------------
    kaaaameeeeeeehaaaaaameeeeeha!
    -----------------------------
