
Perl 6 Grammars and Regular Expressions

Posted by CmdrTaco
from the zztop-wants-a-perl-necklace dept.
An anonymous reader writes "Perl 6 is finally coming within reach. This article gives you a tour of the grammars and regular expressions of the Perl 6 language, comparing them with the currently available Parse::RecDescent module for Perl 5. Find out what will be new with Perl 6 regular expressions and how to make use of the new, powerful incarnation of the Perl scripting language."
This discussion has been archived. No new comments can be posted.

  • Re:Big problem (Score:3, Informative)

    by WWWWolf (2428) <wwwwolf@iki.fi> on Monday November 08, 2004 @01:36PM (#10756115) Homepage

    The idea of :p5 is not just that you can take Perl 5 code and modify it to make it work.

    The idea is that if you don't bother to write a zillion-rule grammar to match whatever you're trying to match, you can still use the P5-style regular expressions you know and love. It's another case of Not Swatting A Fly With The Nuke.

  • Re:Big problem (Score:5, Informative)

    by Speare (84249) on Monday November 08, 2004 @01:37PM (#10756138) Homepage Journal
    Um, ALL PERL CODE IS TREATED AS PERL5 CODE unless you use a specific Perl 6 keyword in your script. Perl 6 interpreters will not require you to modify your Perl 5 scripts AT ALL.

    Therefore, it's only Perl 6 scripts that want Perl 5 regular expression syntax which would use the :p5 modifier.

    Don't get your knickers in a bunch.

  • by Speare (84249) on Monday November 08, 2004 @01:41PM (#10756180) Homepage Journal
    From what I've seen, it's more amenable to modular libraries and structured design. As for basic scripting where you may not even use a "package" statement, you probably won't care.
  • by winkydink (650484) * <sv.dude@gmail.com> on Monday November 08, 2004 @01:52PM (#10756317) Homepage Journal
    I've heard/seen that "this upgrade won't affect your existing code" song and dance before. In my youth, I actually believed it.
  • Re:Grammar (Score:5, Informative)

    by Mr. Muskrat (718203) on Monday November 08, 2004 @02:02PM (#10756391) Homepage

    Perl never was an acronym. It's a backronym.

    Perl is the language and perl is the interpreter. Just remember "only perl can parse Perl" and it's easy to keep straight.

  • Re:Big problem (Score:5, Informative)

    by Zaak (46001) on Monday November 08, 2004 @02:06PM (#10756432) Homepage
    Meaning that it is not backward compatible without modifying your source code.

    Thus spake Larry Wall in Apocalypse 5:
    ...we took several large steps in Perl 5 to enhance regex capabilities. We took one large step forwards with the /x option, which allowed whitespace between regex tokens. But we also took several large steps sideways with the (?...) extension syntax. I call them steps sideways, but they were simultaneously steps forward in terms of functionality and steps backwards in terms of readability. At the time, I rationalized it all in the name of backward compatibility, and perhaps that approach was correct for that time and place. It's not correct now, since the Perl 6 approach is to break everything that needs breaking all at once.


    And unfortunately, there's a lot of regex culture that needs breaking.

    And from Apocalypse 1:
    It would be rather bad to suddenly give working code a brand new set of semantics. The answer, I believe, is that it has to be impossible by definition to accidentally feed Perl 5 code to Perl 6. That is, Perl 6 must assume it is being fed Perl 5 code until it knows otherwise.

    In other words, it is backwards compatible, it isn't backwards compatible, and when you install Perl 6, you are installing both.

    TTFN
  • Re:bad example? (Score:3, Informative)

    by TheSnakeMan (59408) on Monday November 08, 2004 @02:17PM (#10756576)
    Next time, I swear I'll read the comments:

    # Perl 6 :w modifier surrounds all tokens with "automagic" whitespace,
    # which basically means it will match what most people would call
    # "words"

  • Re:Adoption (Score:4, Informative)

    by kavau (554682) on Monday November 08, 2004 @03:00PM (#10757074) Homepage
    I don't know the context of the quote, but to me it reads more like this: "Python benefited greatly from adopting Perl technology in the past. I hope the Perl guys will be as open-minded as we are."

    Not much hypercompetition there, if you ask me. But then, it might as well be me who misunderstood the quote.

  • Re:Pet Project (Score:2, Informative)

    by qualico (731143) <worldcouchsurfer@ g m ail.com> on Monday November 08, 2004 @03:08PM (#10757156) Journal
    Hmmm...make sure you don't make spelling mistakes in that languaje of your new kernel.

    Guess Mark wants to resolve a host name. Here is a working link:

    http://www.ozonehouse.com/mark/blog/code/PeriodicTable.html [ozonehouse.com]
  • by TimToady (52230) on Monday November 08, 2004 @04:42PM (#10758640)
    The intent is that grammars default to recursive descent, but that it be possible to ask for various kinds of optimizations via pragma. The grammar for parsing Perl 6 itself will be a hybrid between top-down and bottom-up techniques to maximize both speed and flexibility.
  • Re:Big problem (Score:4, Informative)

    by ajs (35943) <ajs@nOsPam.ajs.com> on Monday November 08, 2004 @05:22PM (#10759240) Homepage Journal
    As others have pointed out, Perl 6 interpreters (at least the default one that is Parrot-based) will hand your code off to Ponie [poniecode.org] or something like it by default. You will have to start your program with the module keyword or the use 6 statement to force Perl 6 behavior, or use a special binary (e.g. something like /usr/bin/perl6).

    The :p5 modifier is not there for backward compatibility so much as to allow the programmer to choose the model of regular expression to use. There are trade-offs. Here are two Perl 5 regular expressions:
    m{[a-z][A-Z]+}
    m{^(?:\w+\d|\S+(?:\'s)?)$}
    which are written in Perl 6:
    m{<[a-z]><[A-Z]>+}
    m{^[\w+\d|\S+[\'s]?]$ }
    Note that Perl 5 syntax is actually a bit nicer for the first one, so you can continue to use Perl 5 syntax there. In the second case, the new bracket-operator is very handy for enclosing sub-expressions that don't have to be remembered in the positional variables (the same as the Perl 5 (?:...) operator). You can even mix them:
    $r1 = rx:p5{[a-z][A-Z]+};
    $r2 = rx{[\w+\d|\S+[\'s]?]};
    $r3 = rx{^[<$r1>|<$r2>]$};
    Perl 6 is about making the things that you're going to need to do the most often much easier and much more supportable in very large projects. Relax and enjoy it, it's going to be a great ride.
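    For anyone who wants to poke at the Perl 5 side of this without a Perl 6 interpreter, here is a rough sketch in Python, whose re module accepts the same Perl 5-style syntax for these particular patterns. The names R1, R2, R3 and matches are made up for illustration, and building R3 by string interpolation is only an approximation of what <$r1> and <$r2> do in a Perl 6 rule.

```python
import re

# The two Perl 5-style patterns from the comment, kept as plain strings.
R1 = r"[a-z][A-Z]+"
R2 = r"\w+\d|\S+(?:\'s)?"

# Combine them like $r3: a single non-capturing alternation, anchored at
# both ends. (?:...) groups without filling a positional capture, which is
# the job the bare [...] brackets do in Perl 6.
R3 = re.compile(rf"^(?:{R1}|{R2})$")


def matches(text: str) -> bool:
    """Return True if the whole string matches either sub-pattern."""
    return R3.match(text) is not None


if __name__ == "__main__":
    for sample in ["aBC", "Larry's", "hello world"]:
        print(sample, matches(sample))
```

    Since R3.groups is 0, nothing is captured anywhere in the combined pattern, which is the point of using non-capturing groups for sub-expressions you don't need to remember.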
  • Re:Perl goodness (Score:4, Informative)

    by ajs (35943) <ajs@nOsPam.ajs.com> on Tuesday November 09, 2004 @01:27AM (#10763299) Homepage Journal
    Larry Wall could have chosen a different syntax

    There really aren't many choices. The current regular expression syntax is the only form I've seen tried, with only minor variation.

    as he has done somewhat with the Perl 6 expressions

    Perl 6 regular expressions have almost exactly the same syntax as Perl 5. The parts that are new are not regular expressions. Cosmetic differences (like [] vs <[]>) are fairly ignorable syntactically. It would be like saying that Perl 5 will use // as the comment character instead of # (not that it will, just an example).

    All of the inline comments and whitespace are part of Perl 5 extended expressions, though the word-matching on whitespace is new to Perl 6.
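    As a rough illustration of the extended-expression style (not Perl 6's word-matching modifier), here is a sketch using Python's re.VERBOSE flag, which plays the same role as Perl 5's /x: whitespace in the pattern is ignored and # starts a comment. The names WORDS and first_two_words are invented for the example.

```python
import re

# In verbose mode, layout and comments inside the pattern are ignored,
# so whitespace in the input still has to be matched explicitly with \s+.
WORDS = re.compile(
    r"""
    (\w+)   # first word
    \s+     # the gap between words, spelled out by hand
    (\w+)   # second word
    """,
    re.VERBOSE,
)


def first_two_words(text):
    """Return the first two whitespace-separated words, or None."""
    m = WORDS.match(text)
    return m.groups() if m else None


if __name__ == "__main__":
    print(first_two_words("hello   world again"))
```

    Having to write the \s+ by hand is the part that the Perl 6 word-matching behaviour is meant to take over.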

    POSIX on the other hand ignored most of that historic syntax and instead chose their horribly bloated keyword syntax.

    That's not really part of the regular expression syntax. Having [[:digit:]] as an alias for Perl's \d is hardly a different syntax so much as sugar. The fundamentals of POSIX regular expressions are the fundamentals of all modern regex syntaxes:
    • alphanumerics are literals
    • backslash is a character escape
    • parens are used for grouping
    • *, +, ? and {} are repeat count specifications
    These are the fundamentals of Perl regular expressions, POSIX, and all of the other modern regular expression engines and in turn have only a few small differences from the basic regular expressions which Unix started with.
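    The four fundamentals listed above carry over to other engines more or less unchanged. A minimal sketch in Python's re module, assuming a made-up version-string format (PATTERN and parse are hypothetical names, not from any library):

```python
import re

# Each of the four fundamentals appears once:
#   "v"        an alphanumeric is a literal
#   "\." "\+"  backslash escapes a metacharacter
#   ( ... )    parens group (and here also capture)
#   + {1,3} * ?  repeat count specifications
PATTERN = re.compile(r"v(\d+)\.(\d{1,3})(?:\+(\w*))?")


def parse(text):
    """Return (major, minor, tag) as strings, or None if text doesn't fit."""
    m = PATTERN.fullmatch(text)
    return m.groups() if m else None


if __name__ == "__main__":
    for sample in ["v5.8", "v6.0+beta", "v5x8"]:
        print(sample, parse(sample))
```

    The \d here is the shorthand that POSIX spells [[:digit:]]; Python's re does not accept the POSIX bracket names, so the sketch sticks to the backslash form.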

    I've often thought that the ease with which regular expressions can be accessed within Perl was a blessing and a curse. So many people like yourself seem to think that Perl championed regular expressions, when in fact it just followed AWK's lead in integration between C and Ken Thompson's regular expression implementation (which in turn inspired the version that was written from scratch by Henry Spencer and used by Larry for Perl).

    If you have a new syntax in mind, I suggest introducing it and seeing how it does. Modern regular expressions are an incremental improvement on classical set notations, and have served us well to date, but I'm sure someday someone will see a better way.
