Perl Programming

Perl 6 Grammars and Regular Expressions

Posted by CmdrTaco
from the zztop-wants-a-perl-necklace dept.
An anonymous reader writes "Perl 6 is finally coming within reach. This article gives you a tour of the grammars and regular expressions of the Perl 6 language, comparing them with the currently available Parse::RecDescent module for Perl 5. Find out what will be new with Perl 6 regular expressions and how to make use of the new, powerful incarnation of the Perl scripting language."
This discussion has been archived. No new comments can be posted.

  • Grammar (Score:4, Insightful)

    by dprust (316840) * on Monday November 08, 2004 @12:23PM (#10755965)
    It is good to see PERL focussing on what makes it great. There is no other language, IMHO, that handles text input as well as PERL does. Adding this level of processing just makes it even more powerful.
  • by winkydink (650484) * <sv.dude@gmail.com> on Monday November 08, 2004 @12:36PM (#10756117) Homepage Journal
    What does Perl6 offer a satisfied Perl5 user? Is it faster? Smaller?

    To this user, the last several releases (5.x) have looked more like opportunities for continuing royalty streams for perl authors (new versions of old books) than significant releases.

  • by TheFlyingGoat (161967) on Monday November 08, 2004 @12:40PM (#10756171) Homepage Journal
    Those of us that use Perl as more than just system duct-tape know it's a programming language. Perl 6 will make that even more clear by being based on OO fundamentals rather than being a procedural language with OO tacked on top of it. This is just another debate that makes the OS community look like a bunch of freaks and zealots... just like the GNU/Linux thing. Get over it and start focusing on what the software does, not how to classify/name it.
  • by Matthew Weigel (888) on Monday November 08, 2004 @12:51PM (#10756304) Homepage Journal
    ...then it will stop being a good duct tape for systems.
  • by imnoteddy (568836) on Monday November 08, 2004 @12:53PM (#10756329)
    I can understand a desire for adding grammars that are more powerful than regular expressions in Perl 6 but it opens up a whole new can of worms.

    The grammars appear to fall into the class called "context-free grammars" (CFGs). Some CFGs are ambiguous in the sense that a given "sentence" can be derived from the rules in more than one way. Traditional tools such as yacc/bison tell you where there is ambiguity in your rules - even then it isn't always easy to remove the ambiguity (trust me on this). If the Perl 6 system doesn't help the programmer debug the grammar he/she will not be happy when the parsing doesn't work as expected.
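
    To make the ambiguity point concrete, here is a minimal Perl 5 sketch. The toy grammar (E -> E '-' E | NUMBER) and the helper name count_parses are made up for illustration and have nothing to do with how Perl 6 rules are implemented; the code just counts how many parse trees the grammar allows for one token list.

```perl
use strict;
use warnings;

# Hypothetical toy grammar (illustration only):  E -> E '-' E | NUMBER
# count_parses returns the number of distinct parse trees that cover
# tokens $i..$j of the array referenced by $tokens.
sub count_parses {
    my ($tokens, $i, $j) = @_;

    # Base case: E -> NUMBER
    return 1 if $i == $j && $tokens->[$i] =~ /^\d+$/;

    my $count = 0;
    # Try every '-' strictly inside the span as the top-level operator.
    for my $k ($i + 1 .. $j - 1) {
        next unless $tokens->[$k] eq '-';
        $count += count_parses($tokens, $i, $k - 1)
                * count_parses($tokens, $k + 1, $j);
    }
    return $count;
}

my @tokens = split ' ', '7 - 2 - 1';
my $trees  = count_parses(\@tokens, 0, $#tokens);
print "$trees\n";   # 2: (7 - 2) - 1 = 4 versus 7 - (2 - 1) = 6
```

    Two trees for one sentence means two different answers, which is the kind of thing yacc/bison would flag as a conflict before you ever ran the parser.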

    In addition, the article ends the description of features with "And much more...". It appears that Perl 6 grammars are more powerful than CFGs. If they can simulate a Turing machine...

  • by pragma_x (644215) on Monday November 08, 2004 @01:18PM (#10756586) Journal
    Actually, I for one tried to learn perl once.

    I tried to absorb the syntax docs one afternoon, but it gave me nightmares. Literally. It was as if the C-programming-part of my brain was in conflict with the oddball operators and constructs presented in the perl language. Ever since I've been haunted by the perverse unreadability of it all. I liken the experience to attempting to think in brainfuck [c2.com].

    So by this I know for a fact that perl is Not My Thing(tm).

    Now, a more objective reason as to why perl isn't generally a good language (but not completely bad) is evident in the very syntax of perl itself.

    Useful code shouldn't be as inscrutable as its compiled counterpart since that defeats the whole point. Perl is a language, so it follows that it is a communication medium. By that it should be able to communicate something to a party outside just the author and the perl interpreter. It can accomplish this, but not without the reader having to go through the mental gyrations of what could be best called linguistic decompression. The language has this tendency to impose this extra step to yield the information communicated. Simply put: it gets in its own way.

    Now, useful programs on the other hand follow a different set of criteria of course. I've used perl-coded stuff online all the time, and enjoy its reliability and speed.

    I'll give credit to the fact that perl is compact, terse, to the point and has a reputation for string manipulation. It fills this niche rather well, and is a prime example of being "the right tool for the right job".

    IMO, "the right job" for perl is about 2% of all programming tasks out there. This is evident by the fact that even though perl was the prominent CGI language of the mid-nineties, it lost the overwhelming majority of that interest with alarming speed.

    As for python, I'm sure it fills a niche too... whatever the hell it is.
  • Re:Adoption (Score:5, Insightful)

    by Black Perl (12686) on Monday November 08, 2004 @01:26PM (#10756687)
    Yes, you did misinterpret the message. Eric Raymond was a former Perl programmer, and is now a Python programmer. He was saying that Python's native-code-binding facility is superior to Perl's XS, and it would benefit Perl to adopt it. He mentions that Python benefitted from adopting Perl's regex syntax. Nowhere does he say or imply it was "grudgingly" done.

    By the way, not long after he wrote that, Perl coders started using the Inline:: modules like Inline::C [cpan.org], which are very easy to use, instead of XS. I do not know if this was an adoption of Python's technique, but I don't think so.
  • Re:Perl goodness (Score:3, Insightful)

    by Black Perl (12686) on Monday November 08, 2004 @01:41PM (#10756861)
    Just so people know, Perl gets its reputation for being line noise largely from its early adoption of regular expressions. For example:
    s/^\/\\\$(.*)$/\/\/$1\//;
    But now this syntax has made it into just about every other language. And so now you can accidentally program a web browser in any language.
  • Perl perspective (Score:2, Insightful)

    by Anonymous Coward on Monday November 08, 2004 @01:47PM (#10756936)
    I find that reaction to Perl by people familiar with highly structured languages is common. I think this is because Perl has things like weak typing and overly flexible syntax, things that make experienced programmers vomit in their own mouths. But what's great about Perl is that you CAN have strict grammar, and you CAN have strong typing, if you desire. It's just not required.

    This makes Perl very strong as a teaching tool for beginner programmers. They can start out writing loose, messy code that gets the job done, and slowly work towards stronger, more structured language. People that wouldn't have the logical discipline to jump directly into C++ can use Perl to learn logical programming one concept at a time, and eventually be ready to grasp required C++ concepts like objects. But you'll find that people raised on Perl, even if they primarily program with C++ or Java, always come back to Perl for the quick scripting or text-processing tasks.
  • by adamruck (638131) on Monday November 08, 2004 @01:47PM (#10756937)
    I think you went about things the wrong way. Why would you ever look at the nitty gritty syntax rules first when trying to learn a language? First do some simple examples to get the general feel of the language. Then learn the nitty gritty stuff as required.

    "IMO, "the right job" for perl is about 2% of all programming tasks out there."

    76 percent of all statistics..... You get the point. You really don't have any valid point here; every language is designed to do certain things, and people will use it for those things and more. Trying to say what's the best language out there is stupid. Trying to say what percent of projects perl should be used on is also stupid.

    "It can accomplish this, but not without the reader having to go through the mental gyrations of what could be best called linguistic decompression."

    Have you tried to program in a logic language lately? Have you tried to program in a functional language lately? Have you tried to program in anything but your standard imperative/OO language lately? There are tons of styles of languages, and each one requires its own linguistic decompression. Which one feels more natural is a matter of opinion.
  • by TimToady (52230) on Monday November 08, 2004 @02:07PM (#10757144)
    If you assume that languages can't scale both up and down simultaneously, please don't try to design any computer languages near me. Thank you.
  • Re:Perl goodness (Score:5, Insightful)

    by jandrese (485) * <kensama@vt.edu> on Monday November 08, 2004 @02:10PM (#10757171) Homepage Journal
    There are two things about regular expressions:
    1. Perl chose a keystroke-efficient syntax that makes them unreadable to anybody who doesn't know how to read them. It also made them very compact and easy to write for anybody who does know how to read them. They look very intimidating, but underneath they are usually easier to understand than the C-like perl code surrounding them.
    2. They are amazingly useful. Seriously, if you have never learned about Regular Expressions you owe yourself a lesson in how they work and what they do. I've seen people spend days working on stuff that can be written (more efficiently!) in a regular expression in a matter of minutes. Pattern matching is the sort of thing that every general purpose language should have; it is a shame that the basic Regular Expression library that comes with most Unixes is such a piece of crap. Who wants to deal with the arcane invocation method, the extremely limited syntax, or the syntactic sugar like: "[[:digit:]]{2}:[[:space:]][[:space:]]*[[:alpha:]]*" when you could write "\d{2}:\s+\w*"?
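
    For anyone who wants to try the two spellings side by side, here is a small Perl 5 sketch. Perl accepts the POSIX bracket classes as well as its own shorthand, so both patterns can run in the same script. The sample line is made up, and note one assumption worth checking for your own data: \w also matches digits and underscore, so it is a little broader than [[:alpha:]].

```perl
use strict;
use warnings;

# POSIX bracket classes, as accepted by Perl inside [...]
my $posix = qr/[[:digit:]]{2}:[[:space:]][[:space:]]*[[:alpha:]]*/;

# Perl shorthand classes for (roughly) the same idea
my $short = qr/\d{2}:\s+\w*/;

# Made-up sample input for illustration
my $line = '42:  abc_9';

my ($m1) = $line =~ /($posix)/;
my ($m2) = $line =~ /($short)/;

print "posix: <$m1>\n";   # posix: <42:  abc>
print "short: <$m2>\n";   # short: <42:  abc_9>
```

    The difference in the two matches comes entirely from \w versus [[:alpha:]]; swap in [a-zA-Z]* if you need the shorthand version to stop at the underscore too.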
  • Re:Perl goodness (Score:3, Insightful)

    by RangerRick98 (817838) on Monday November 08, 2004 @02:25PM (#10757379) Journal
    A lot of people don't seem to know that you don't have to use slashes as your delimiters. I use curly braces, myself, which would make the example you gave a little clearer:
    s{^/\\\$(.*)$}{//$1/};
    And of course, you could always use the x modifier (i.e., s{}{}x) to split the regular expression across multiple lines and document it.
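
    A rough sketch of what that looks like in Perl 5, using the same substitution with the x modifier. The sample string is made up; the only thing to watch is that the comments inside the pattern contain no braces, dollar signs or at-signs, since the pattern is still brace-delimited and still interpolated.

```perl
use strict;
use warnings;

# Made-up input: a slash, a literal backslash, a dollar sign, then text.
# Single quotes, so the string really is  /\$HOME
my $text = '/\\$HOME';

(my $out = $text) =~ s{
    ^ /      # a leading slash
    \\       # a literal backslash
    \$       # a literal dollar sign
    (.*)     # capture the rest of the line
    $        # end of the string
}{//$1/}x;

print "$out\n";   # //HOME/
```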
  • by runderwo (609077) * <runderwo@ma i l . w i n . org> on Monday November 08, 2004 @02:48PM (#10757719)
    "Perl is a language, so it follows that it is a communication medium. By that it should be able to communicate something to a party outside just the author and the perl interpreter."
    Perl source does communicate, with people who know Perl. That's like saying English is a useless language because it is constructed ad-hoc and because the complainer has never been bothered to learn it. The fact that some people find English difficult makes English no less useful to people who most easily express or comprehend ideas in it.
    "IMO, "the right job" for perl is about 2% of all programming tasks out there."
    Nice statistic. Where's your breakdown of all programming tasks, and the reasoning for the other 98% why Perl is not the right tool for the job?
    "This is evident by the fact that even though perl was the prominent CGI language of the mid-nineties, it lost the overwhelming majority of that interest with alarming speed."
    That has nothing to do with Perl the language, and everything to do with the shift towards languages which are designed to execute within a web server process without forking. mod_perl fills this hole, but as a general purpose language it is not as tightly integrated with a web server environment as something like PHP or ASP.
  • Re:Perl goodness (Score:4, Insightful)

    by ajs (35943) <ajsNO@SPAMajs.com> on Monday November 08, 2004 @04:34PM (#10759420) Homepage Journal
    "Perl chose a keystroke-efficient syntax that makes [regular expressions] unreadable"

    No, it most certainly did not. Regular expressions as they exist in Perl today are a direct descendant of POSIX regular expressions which derive from the original work done by Ken Thompson (which resulted in the grep program, which stands for "global regular expression print"). That syntax further dates back to the giants in the field of computational theory, and was specialized only slightly for text matching.

    grep, awk, sed, ed, vi, emacs, and dozens of other programs and languages for Unix used this notation before Perl came along and adopted it, so let's not pretend that this syntax is somehow Perl's doing.

    The extended regular expression syntax of today IS perl's doing and in almost all cases it has been a process of making regular expressions both more powerful and more readable, culminating in Perl6's rule syntax which is highly readable by comparison.
  • by ajs (35943) <ajsNO@SPAMajs.com> on Monday November 08, 2004 @04:43PM (#10759553) Homepage Journal
    You have to divide further. Let me illustrate:

    Reasons to convert to Ponie (Perl 5 on Parrot):

    • Access to code written in other high-level languages without glue code.
    • Just in time compilation to machine code (no interpretation unless you eval a string at run-time!)
    • Cleaner access to C and C++ libraries without glue code.


    Reasons to convert from Ponie to Perl 6:

    • Vastly superior OO model, especially when trying to interface to multiple large object trees.
    • Debuggability improvements throughout the language.
    • Rules are far more powerful than regular expressions.
    • Lazy evaluation of lists powers huge efficiency improvements.
    • Subroutine definition is much more powerful.
    • Named parameter passing is no longer ad-hoc and is available for all subroutines by default.
    • Much, much more.
  • Re:Perl goodness (Score:3, Insightful)

    by bedessen (411686) on Tuesday November 09, 2004 @02:44AM (#10763850) Journal
    When you see a regular expression like that it's a good indicator that the person that wrote it wasn't very familiar with how to write good REs. The above suffers from "leaning toothpick syndrome." If you are trying to match the '/' character, then don't use it as the delimiter of the RE. For example, compare the following REs, which are equivalent:
    m/\/\/(.*)\/\//i
    and
    m,//(.*)//,i
    Using ',' for the delimiter of the RE means you don't have to backslash-quote the forward slash to use it in a match.

    This is just basic RE stuff and has nothing to do with perl. In fact, perl gives you many tools such as the "/x" modifier that allow you to vastly clarify the meaning of long and complicated REs, to the point of having indentation, extra whitespace, and even embedded comments. It's not perl's fault that many people only have a cursory knowledge of REs, and so they tend to write terrible REs.
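
    A quick way to convince yourself the forms are equivalent, sketched in Perl 5 with a made-up test string. The third match is an extra illustration of the /x style mentioned above, with brace delimiters chosen only so the slashes and the comments can sit inside the pattern without escaping.

```perl
use strict;
use warnings;

# Made-up sample text for illustration
my $text = 'before //Some Text// after';

# Slash delimiters: every literal slash has to be escaped.
my ($with_slashes) = $text =~ m/\/\/(.*)\/\//i;

# Comma delimiters: same pattern, no escaping needed.
my ($with_commas) = $text =~ m,//(.*)//,i;

# Brace delimiters plus /x: whitespace and comments inside the pattern.
my ($with_x) = $text =~ m{
    //      # opening pair of slashes
    (.*)    # the captured text
    //      # closing pair of slashes
}xi;

print "$with_slashes | $with_commas | $with_x\n";   # Some Text | Some Text | Some Text
```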
