Perl Programming

Exegesis 7 Released (Perl 6 Text Formatting)

Posted by michael
from the folding-spindling-mutilating dept.
chromatic writes "Perl.com has just published Exegesis 7, Damian Conway's explanation of how text formatting will work in Perl 6 (and now, in Perl 5, thanks to his Perl6::Form module). Think of it as Perl 1 for the 21st century. Also, Parrot 0.1.0, the virtual machine for Perl 6 and several other dynamic languages, was released on Leap Day -- ever wanted to program in an object oriented assembly language?"
This discussion has been archived. No new comments can be posted.

  • by Saven Marek (739395) on Thursday March 04, 2004 @08:44PM (#8470813)
    The long lost art of Good Documentation. There's been quite a case made lately (read ESR's CUPS rant for an example) for software that doesn't need documentation, when its method of use is made obvious merely by its design. I think for consumer software that's just meant to be used one or two ways, sure, that's a good idea.

    But for something like Perl, it's all in the documentation. Here's to writers like Damian Conway not only providing summaries for new releases, but writing the original documentation! If only it paid well!

    That being said, O'Reilly would sell a good deal fewer books if the original docos were all they're cracked up to be. Guess it doesn't have to be that good! There's nothing like getting a new fresh O'Reilly title in the mail.

    Mac desktops, OSX hints, scripts and more [67.160.223.119]
  • by Anonymous Coward on Thursday March 04, 2004 @08:46PM (#8470834)
    Programming in assembler allows the programmer to create machine instructions tailored to a specific processor. This allows her to do things that are beyond the capabilities of any JIT optimizer or bytecode interpreter. If it assembles to VM bytecode, it's not assembler.
  • Re:VM's (Score:4, Informative)

    by erikharrison (633719) on Thursday March 04, 2004 @09:01PM (#8470960)
    Well, actually VMs are the way of the past - in research circles the VM has been around forever.

    However, for what it's worth, Parrot's relationship to the JVM and the .Net VM is rather small. JVM/.Net are designed from the ground up to support systems languages (like Java and C#). They optimize for static typing and languages where most complexity happens at compile time. Parrot is a VM for languages like Perl, Python, and Ruby (and TCL, and Lisp, etc.) whose typing is weaker, and where a runtime eval is a moderately common occurrence.

    What specifically about the JVM puts you off? Or is it the host language that bothers you?
  • by No_Weak_Heart (444982) on Thursday March 04, 2004 @09:14PM (#8471044)
    Form follows format. From the end of this Exegesis, some highlights:
    "Report generation was one of Perl's original raisons d'etre. Over the years we've found out what format does well, and where its limitations lurk. The new Perl 6 form function aims to preserve format's simple approach to report generation and build on its strengths by adding:
    • independence from the I/O system;
    • run-time specifiable format strings;
    • a wider range of useful field types, including fully justified, verbatim, and overflow fields;
    • the ability to define new field types;
    • sophisticated formatting of numeric/currency data;
    • declarative, imperative, distributive, and extensible field widths;
    • more flexible control of headers, footers, and page layout;
    • control over line-breaking, whitespace squeezing, and filling of empty fields; and
    • support for creating plaintext lists, tables, and graphs.

    And because it's now part of a module, rather than a core component, form will be able to evolve more easily to meet the needs of its community. For example, we are currently investigating how we might add facilities for specifying numerical bullets, for formatting text using variable-width fonts, and for outputting HTML instead of plaintext."

    'cause i'm lazy, that's why
  • by TimToady (52230) on Thursday March 04, 2004 @09:20PM (#8471082)
    I recommend they just wrap up whatever concepts they have now and start moving toward an alpha.
    Which is precisely what we've been doing for some time now. Apocalypse 12 will be out very shortly, and it will look like a lot of new concepts, but they're mostly concepts we've been aiming at for a long time now. Get this through your noggin--it's not the conceptualizing that's the hard part. The "wrapping up" is where almost all the effort goes, because that's where the hard work of design is. Anybody can come up with a list of new features. We've had the RFCs for three years, and you know what a mess they were...
  • Re:ruby! ruby! (Score:3, Informative)

    by geniusj (140174) on Thursday March 04, 2004 @09:20PM (#8471090) Homepage
    I am also a huge fan of Ruby. However, Perl 6 is going to benefit everyone. Ruby will be able to target the Parrot VM, as will languages like Python and TCL. What does this mean? As I understand it, this means that anywhere Parrot is installed, your bytecode can be run, no matter what language it was written in. This also means you'll be able to do things like use Perl modules from Ruby, or use Python modules from Ruby, or use Ruby modules from Python, etc.

    Parrot is very exciting. I personally can't wait. :)
  • by Anonymous Coward on Thursday March 04, 2004 @09:25PM (#8471117)
    Yeah, except for the 75% of its operators that are cribbed straight from Perl...
  • Re:Camel book :] (Score:3, Informative)

    by dpuu (553144) on Thursday March 04, 2004 @09:42PM (#8471281) Homepage
    You can read about the new regex syntax in exegesis 5 [perl.org] (and its corresponding apocalypse)
  • OO Assembly? (Score:4, Informative)

    by gidds (56397) <slashdot@@@gidds...me...uk> on Thursday March 04, 2004 @10:06PM (#8471430) Homepage
    ever wanted to program in an object oriented assembly language?

    No, but if I wanted to, I could already [sun.com], thanks.

  • Re:VM's (Score:3, Informative)

    by voodoo1man (594237) on Thursday March 04, 2004 @10:30PM (#8471581)
    JVM/.Net are designed from the ground up to support systems languages (like Java and C#). They optimize for static typing and languages where most complexity happens at compile time. Parrot is a VM for languages like Perl, Python, and Ruby (and TCL, and Lisp, etc.) whose typing is weaker, and where a runtime eval is a moderately common occurrence.
    I have to disagree about typing. Python and Lisp (don't know about the others) type systems aren't "weak," but rather dynamic, meaning that they keep type information around at runtime. Java (don't know about C#) is weakly typed in comparison, as you can cast any class to Object and vice-versa, throwing away the type information, and an incorrect cast will only give an error at run-time (and then it may be difficult to track down if the original class type and the one cast to are closely related). This doesn't happen in dynamically typed languages, where you don't need to cast ("coerce" is a more appropriate term for dynamic type systems) objects to a generic type just to put them somewhere.
  • by jstarr (164989) * on Thursday March 04, 2004 @11:06PM (#8471832)
    Not quite. Weak is not the opposite of dynamic, but the opposite of strong. Type systems may be either weak or strong, and may be either dynamic or static.

    A weak type system will allow implicit type conversions, even those that are 'lossy' or improper. For example, converting a float to an int without requiring a cast. Or, more importantly, treating memory references (pointers) identically to integers. Pointer arithmetic is an abuse of a weak typing system.

    Strong typing requires explicit casts and will throw errors where casts do not appear. Java, Lisp, Python are all strongly typed. Haskell is _really_ strongly typed. When you cast an object to type Object in Java, you are losing type information, but you are doing it _explicitly_.

    C, Pascal, and Java are statically typed. Variables are created with a specific type in the code, not on demand. Python and Lisp are dynamically typed -- a variable's type is determined at run-time.

    For example, in C:

    int foo( int a, int b );

    declares a function that returns type 'int' and takes two arguments a, b, both of types 'int'.

    In Python:

    def foo( a, b ):

    declares a function that may or may not return a value (and whose type is known only at run-time) and takes two arguments, which may be of any type (although, internally, the program likely assumes a type).

    There are some quirks in the type systems of many languages. In Java, for example, "str" + 3 doesn't have any normal meaning, but the developers have defined any + operation involving a string as concatenation. In Python, and in most languages, such an expression produces an error, either at compile time (static typing) or at run time (dynamic typing).

    However, all combinations are possible and type systems are a fertile area of research.
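
A minimal Python sketch of the strong-but-dynamic combination described above. The name foo is taken from the comment; the helper describe and the sample arguments are made up for illustration.

```python
def foo(a, b):
    # No declared types: the arguments are only checked when they are used.
    return a + b


def describe(a, b):
    """Hypothetical helper: report what foo(a, b) does for these arguments."""
    try:
        result = foo(a, b)
    except TypeError as exc:
        # Strong typing: Python will not silently convert between str and int.
        return "TypeError: " + str(exc)
    return type(result).__name__ + ": " + repr(result)


print(describe(1, 2))            # two ints are added
print(describe("str", "3"))      # two strings are concatenated
print(describe("str", 3))        # mixed types are rejected at run time
print(describe("str", str(3)))   # an explicit conversion is accepted
```

The same function body works for ints and for strings (dynamic typing), while the mixed call fails until the conversion is spelled out (strong typing).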
  • by Matchstick (94940) on Thursday March 04, 2004 @11:48PM (#8472136)
    The reason Parrot is register based has nothing whatsoever to do with reduction in the number of opcodes. In fact, the Parrot VM opcode count is quite large, largely because of combinatorial issues. If every x86 opcode that used mod/reg/rm addressing was actually counted as (# mod/reg/rm combinations) opcodes, that would be roughly analogous to what's going on in Parrot.

    Parrot uses registers for speed, that's "all".
  • Re:Uh... (Score:5, Informative)

    by orthogonal (588627) on Thursday March 04, 2004 @11:52PM (#8472171) Journal
    I thought about this for five seconds in pseudo assembler, then my brain started leaking...

    Leaking is especially apropos, because you should be thinking about encapsulation -- keeping implementation details from leaking --, not inheritance.

    I actually did a toy program or two -- very toy, class assignments -- in assembly after I knew C++, and consciously employed an Object Oriented outlook in designing the programs.

    This really is easier than it might seem at first: the second biggest hurdle -- and the most important first step -- to OO design is to always think of the entities in your program as objects with responsibilities.

    (The biggest hurdle is discovering -- and I use the word "discovering" intentionally, because it's an iterative process of exploring your problem domain -- where to "carve nature at the joints", or where one object ends and the next starts. Alas, a further discussion of this hurdle is beyond the scope of this comment.)

    Given that you keep in mind that you're dealing with objects, and that OO requires you to do so polymorphically -- that is, you want to be able to do the same sorts of things with objects of different sizes and shapes -- you'll quickly find that you need a level of indirection, some stand-in for the actual object, a proxy that is itself the same size and shape for every kind of different object. In C (and C++ and Java) that "same-ness" proxy is a pointer; in Perl, it's a hash, for which the language conveniently handles the pointing; in assembly it's a pointer too, or given assembly's inherent weak typing, a memory address.

    Just as the real first parameter to every C++ member function is the (hidden and implicit) this pointer, any object-oriented assembly is going to have to pass an object pointer to any functions called on that object. The object pointer will be the address of the actual object, and the object's state, instead of residing in numerous functions -- as you'd do in non-OO structural programming -- will reside in the object, at that pointed-to memory.

    Finally, and most tedious, is the need for one function called on the object to access other member functions of the object. Essentially, we need a way to determine which of several possible functions named foo should be the foo called for a particular object. C++ generally implements this as an array of pointers to functions, Perl by means of a hash map. Implementation details are implementation details, but essentially you need to specify some ordered list of (addresses of) functions when the object is created. A naive (and inefficient) implementation that would look like very late binding (and weak typing) to a C++ programmer would be simply to have each object include in its state an array of addresses; a better solution would, as usual in computer science, involve a few more levels of indirection.
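
The mechanics described above (an explicit object pointer plus a per-object table of function addresses) can be sketched in Python without using its class syntax. This is only an illustration under assumptions: the shapes, the field names, and the call helper are all invented here, and dicts stand in for raw memory and address tables.

```python
# Each "object" is a dict of state plus a reference to a shared table of
# functions, standing in for a block of memory and an array of addresses.

def circle_area(obj):
    return 3 * obj["r"] * obj["r"]   # crude pi = 3 keeps the arithmetic exact

def square_area(obj):
    return obj["side"] * obj["side"]

def describe(obj):
    # One member function reaching another also goes through the table.
    return obj["kind"] + " with area " + str(call(obj, "area"))

CIRCLE_TABLE = {"area": circle_area, "describe": describe}
SQUARE_TABLE = {"area": square_area, "describe": describe}

def new_circle(r):
    return {"table": CIRCLE_TABLE, "kind": "circle", "r": r}

def new_square(side):
    return {"table": SQUARE_TABLE, "kind": "square", "side": side}

def call(obj, method):
    # Look the function up via the object, then pass the object explicitly,
    # much as the hidden `this` pointer is passed in C++.
    return obj["table"][method](obj)

for shape in [new_circle(2), new_square(3)]:
    print(call(shape, "describe"))
```

Here the table is shared per kind of object rather than copied into every object, which is the "one more level of indirection" the naive per-object array lacks.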

    The point I'm trying to make is this: Object Orientation isn't so much a property of one language or another -- although some languages support it far better than do others --, as it is a property of the way you think about the problem domain and about programming in general. It's an outlook, a mindset, a world-view, and it's maintaining that world-view, much more than worrying about the implementation details, that matters.

    Good Object Oriented programmers can -- and do -- write OO code regardless of the language they're writing in. Programmers who still don't get OO will write bad, pointless OO even in languages that support OO the best. And really good programmers know when to use OO, what parts of it to use, and when not to use OO.
  • by ChaosDiscord (4913) on Thursday March 04, 2004 @11:55PM (#8472187) Homepage Journal

    Perhaps the biggest thing I took from this is that the increasingly specialized and unused fixed width formatting functionality in Perl is moving out of the core language and into a powerful module. Those of us who don't need it will never need to worry about it; those who need it will find it actually improved over previous versions. Finally "write" will be free in Perl to mean something slightly less crazy.

    That said, I can think of one important use for fixed width formatting: email reports. Sure, you can use HTML, but you really ought to also provide a plain text version. Many (arguably crappy) web mail clients will only display the plain text, ditto for us crotchety old command line mail tool users. With a bit of care you can have your table-heavy report in shiny HTML and your functional text version all in one MIME-encoded message. (Remember that "reports" includes things like your email receipt from an online store.)
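
As a rough illustration of the kind of plain-text receipt meant here, the sketch below lays out fixed-width columns with Python's built-in format specifications. It is not Perl6::Form or Perl's format; the function name, column widths, and order data are invented for the example.

```python
def format_report(rows):
    """Render (item, qty, price) rows as a fixed-width plain-text table."""
    lines = ["{:<12}{:>5}{:>10}".format("Item", "Qty", "Price")]
    lines.append("-" * 27)
    for item, qty, price in rows:
        # Left-justify the name, right-justify the numbers, two decimals for money.
        lines.append("{:<12}{:>5d}{:>10.2f}".format(item, qty, price))
    total = sum(qty * price for _, qty, price in rows)
    lines.append("{:<17}{:>10.2f}".format("Total", total))
    return "\n".join(lines)


order = [("Camel book", 1, 49.95), ("Llama book", 2, 39.95)]
print(format_report(order))
```

The resulting string could serve as the text/plain part of a multipart message, alongside an HTML rendering of the same rows.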

  • Re:Me either ... (Score:1, Informative)

    by Anonymous Coward on Friday March 05, 2004 @12:15AM (#8472307)
    Imagine writing a game and being able to do the rendering engine in c, the game logic in python/perl/ruby and the AI in scheme or lisp.
    You can do this already. Go talk to some game developers...most of the rendering engines seem to be written in C++ these days though.
  • Re:Me either ... (Score:3, Informative)

    by aled (228417) on Friday March 05, 2004 @12:39AM (#8472425)
    I don't know if you are talking about invoking scripts from Java, in which case you have a lot of alternatives, from beanshell [beanshell.org] to jython [jython.org] (the Python implementation in Java), and most of them could be run through BSF [apache.org] to have a uniform API.
    DynamicJava [koala.ilog.fr], on the other hand, is an interpreter of a superset of Java.
    I don't know what you find annoying about compiling and executing. That's the norm for most programs. Java programs are Just In Time Compiled [mindprod.com], but that is done transparently by the virtual machine and is faster than interpreting.
  • by ajs (35943) <ajs@@@ajs...com> on Friday March 05, 2004 @01:31AM (#8472747) Homepage Journal
    I asked the same thing recently on the p6l mailing list [google.com]. Larry responded [google.com] with some interesting news. First off, I had not realized that he had taken a lot of time off last year for health reasons.

    Second, he posted (as you can see from the link above) a full outline of Perl 6's specifications-to-be and explained that he's been spending a lot of time on A12. That's right, he's skipping over A7 (delegated to Damian) and A8-A11 (which he'll return to later) and doing the chapter on objects. This is an important part of the language, and really did need to be covered before the rest could be fleshed out. It seems that he expects most of the rest of the spec to be about as much work as A12 is alone, and he claims that's just a few days or weeks at most away from being finished.

    That said, keep in mind that the cryptic things you see on p6l are the result of reading code snippets written in a language that doesn't exist yet. Every time Larry steps in and explains things, the picture gets a bit clearer (partly because Larry is a great communicator but partly because he's quite capable of and willing to cut away a lot of noise and render some signal from its remains).

    Perl 6 is, as far as I can tell, a lovely evolution of Perl... it's perhaps orders of magnitude more evolved than I would have suggested as the next step, but looking at the good work it has resulted in for Parrot, I'm not sure I'd turn back.
  • Re:OO Assembly? (Score:3, Informative)

    by ajs (35943) <ajs@@@ajs...com> on Friday March 05, 2004 @01:41AM (#8472794) Homepage Journal
    There are several problems with that theory. First off, the Java VM is designed to run Java and not much else. This means that if you want to write Java, but in an assembly-like way, you're all set. Parrot, by contrast, is designed to run any dynamically typed programming language, so you can write whatever you like for it.

    Also, the JVM is not really a very good development environment. It is, for the most part, only used as a back-end. Parrot is designed from the beginning to be programmed in because there will be many times that it makes sense for a library to target Parrot with chunks in C for performance reasons.

    To this end Parrot provides a paper-thin abstraction called IMCC [perl.org] which allows you to write code without having to worry about register spilling, etc.

    Give Parrot a try, and when you have I guarantee playing with stone knives and bear-skins won't seem nearly so appealing anymore.
  • by ajs (35943) <ajs@@@ajs...com> on Friday March 05, 2004 @02:15AM (#8472940) Homepage Journal
    Very few people will ever learn Perl 6, just as very few learned Perl 5. What they will do is absorb the basics of the language from examples they've seen and write a basic subset. This is what happens to all languages.

    The good things about Perl 6 are a) you have a rich and powerful language at your disposal that provides everything from dynamic grammar construction to advanced functional programming concepts and b) parrot will allow you to use libraries written in any other Parrot client-language.
  • by straybullets (646076) on Friday March 05, 2004 @05:25AM (#8473523)

    What they will do is absorb the basics of the language from examples they've seen and write a basic subset

    This is very interesting since it's exactly what I do!

    I consider myself a fairly poor coder, but still Perl helps me to do a lot of things very fast, and to a point I can say it pays a lot of my bills! I could do most of this in VB but Perl is fun and intuitive.

    Sometimes I need a new feature and Perl will then help me learn something new, thus making my coding skills a little better.

    I realize that Perl is a good glue but not very "industrial". It's fun and all, but when I look at the Rexx coder across the room, I wonder if Perl will not eventually end up like Rexx, something mostly useless & readable only to old timers... And all the Perl 6 fuss with smart ass names (apocalypse, exegesis, jeez ...) isn't going to make me feel more confident about it!!

    Wait & see ...

  • by master_p (608214) on Friday March 05, 2004 @10:41AM (#8474782)
    I don't get it, since I have never worked with Perl, so I am asking the Perl programmers out there: what is so special about plain-text reporting? There are fine tools out there that can produce beautiful HTML reports that plain text will never be comparable to. I understand that text reports may have their uses in the back office, but that accounts for a small percentage of overall reporting.

    Furthermore, 'reporting' is not a feature of a programming language. The same report package could be done with C++, for example. Will Perl 6 bring something *really new* in the programming languages department?
  • by ajs (35943) <ajs@@@ajs...com> on Friday March 05, 2004 @01:56PM (#8476968) Homepage Journal
    1. Javadoc isn't well suited to man-pages
    you mean like "man ls"? i thought we were creating documentation websites...

    Sure, that too [perlpod.com], or PDFs or command-line extraction from installed modules or whatever. There should be no difference. Docs are docs, and it should all come from the same source, no?

    Me: Javadoc is not just simple text[...]
    You: could you explain further[...]

    Sure, here's an example from the Javadoc site:
    /**
     * A class representing a window on the screen.
     * For example:
     * <pre>
     * Window win = new Window(parent);
     * win.show();
     * </pre>
     *
     * @author Sami Shaio
     * @version %I%, %G%
     * @see java.awt.BaseWindow
     * @see java.awt.Button
     */
    Ok, start with the fact that there's embedded HTML. Mistake. HTML makes it bulky to write docs. HTML is also not abstract enough to easily convert into, say, TeX or PS without a full rendering engine to get it right.

    Second, it's not free-form, so my doc might not fit into the Way that Javadoc wants me to approach such things.

    Let's look at POD for roughly the same thing (I am going to start at an odd place because the above was just a snippet of a presumably complete document):

    =item Window

    A class representing a window on the screen.
    for example:

        Window win = new Window(parent);
        win.show();

    Ok, now you might have at the bottom of that document:

    =head1 AUTHOR

    Sami Shaio

    =head1 VERSION

    $Id$

    =head1 SEE ALSO

    L<java.awt.BaseWindow>, L<java.awt.Button>

    =cut

    Are you seeing the differences here? Free-form to accommodate the vast array of needs, high level of abstraction from both the content AND the medium for ease of translation to just about any format (Perl ships with translators to nroff/man, HTML, LaTeX, and plain text, but there are dozens of other converters out there).

    This difference suffuses the Java and Perl mindsets. Java has its own way of doing things and doesn't need to bow to local system standards. Perl finds a way to do its own thing in powerful ways while also adapting to the local culture.

    You also mentioned Jakarta, which, while very cool, is a sort of rebellious subculture within the whole of the Java community. I would point you back at Jakarta for an example of how the model that Perl, Python, Ruby and many others follow has been forced upon Sun's Java community because it's the right solution.

    Again, I want it to be clear: I respect Java, the JVM and Java's documentation. It's just that I feel there are even better tools available (and no reason you can't document your Java code using POD!)
  • by ajs (35943) <ajs@@@ajs...com> on Friday March 05, 2004 @02:07PM (#8477056) Homepage Journal
    "You couldn't write a compatible perl implementation using just the manual pages"

    That's both true and false. I would consider any program which implemented everything that was in the Perl documentation to be "compatible" (you have to understand what a behemoth undertaking that would be...), but Perl has never had a formal specification, and as such... just like every other language for which that's the case... there is a great body of code that relies on bugs, undocumented features and other implementation-specific aspects of the reference implementation. Your compiler would be "compatible" with Perl, but not with perl (note caps) and so would be of use only to those who were willing to test against it or modify existing code to work with it.

    Perl 6 is evolving a complete specification to address that among other concerns, but... what exactly were YOU trying to solve for? Did you want to write a Perl 5 compiler, or were you just looking for something to complain about?

    Perl 5 DOES do what you asked... it documents (perlop, perlref, perlsyn, etc.) all the way down to the gory details of the C implementation (perlguts, perlapi, overload, etc.) what happens when you $a=$b+$c, so I don't see what the problem is.
  • by Just Some Guy (3352) <kirk+slashdot@strauser.com> on Friday March 05, 2004 @02:13PM (#8477149) Homepage Journal
    If you like POD, then Python's doctest module [python.org] will blow your mind. Basically, you include example code snippets in your documentation to demonstrate how the code is supposed to work. Then, the doctest module finds all of those snippets, evaluates them, compares them to your example results, and reports on any differences.

    So, not only can you easily document your code, but you can trivially insert automated test cases for later verification. Good stuff, that.
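
A small sketch of what that looks like in practice. The function add and its examples are invented for illustration; doctest.testmod() is the standard-library entry point that collects and runs the snippets.

```python
import doctest


def add(a, b):
    """Return the sum of a and b.

    The examples below are documentation and test cases at once:

    >>> add(2, 3)
    5
    >>> add("per", "l")
    'perl'
    """
    return a + b


if __name__ == "__main__":
    # testmod() finds the >>> snippets in this module's docstrings, runs them,
    # and reports any whose actual output differs from the text shown.
    results = doctest.testmod()
    print(results)
```

If someone later changes add so that an example no longer holds, the mismatch shows up as a failed example rather than as silently stale documentation.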

    PS: I'm at home, sick with a fever, and jacked up on cold medicine. Proofreading is beyond my ability today. Please take any grammar mistakes with a grain of salt. :)

  • by TimToady (52230) on Friday March 05, 2004 @04:55PM (#8478905)
    Equivalent Perl 6:
    class A {
        has $.a is rw;
    }

    class B is A {...}

    $b = B.new;
    $b.a = 1;
    say $b.a;
  • by thoughtstream (140380) on Friday March 05, 2004 @05:18PM (#8479140)

    I seriously doubt that the task "Takes text on standard input, outputs word frequency counts sorted first by count, then alphabetically" can be handled with 5 lines and with equal clarity.

    while (readline STDIN) {
        for my $word (split) { $freq{lc $word}++ }
    }
    my @sorted = sort { $freq{$b} <=> $freq{$a} || $a cmp $b } keys %freq;
    for my $word (@sorted) {
        print "$word: $freq{$word}\n"
    }

    And in Perl 6 you could write almost exactly the same thing:

    while (readline $STDIN) {
        for split -> $word { %freq{lc $word}++ }
    }
    my @sorted = sort { %freq{$^b} <=> %freq{$^a} || $^a cmp $^b } keys %freq;
    for @sorted -> $word {
        print "$word: %freq{$word}\n"
    }

    Or (since we're notionally talking about Perl 6 formatting here) in Perl 6 you could also write it more idiomatically as:

    slurp $STDIN ==> split ==> @words;
    for @words { %freq{lc $^word}++ }
    @sorted = sort %freq: by=>[ {.key} is descending, {.value} ];
    print form "{]]{6}]]}: {[[{60}[[}", @sorted>>.value, @sorted>>.key;

    and have your output prettified as well.
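
For comparison, here is a sketch of the same task in Python, following the task description quoted above (highest count first, ties in alphabetical order). The function name and the sample text are made up for the example.

```python
from collections import Counter


def word_frequencies(text):
    """Return (word, count) pairs, highest count first, ties alphabetical."""
    counts = Counter(word.lower() for word in text.split())
    # Negating the count sorts counts descending while words sort ascending.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


sample = "the camel and the llama and The parrot"
for word, count in word_frequencies(sample):
    print(f"{word}: {count}")
```

To read from standard input as the task states, pass sys.stdin.read() instead of the sample string.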
  • by TimToady (52230) on Friday March 05, 2004 @08:14PM (#8480955)
    No, Larry is personally in favor of letting users add Unicode operators. Big difference.
  • by chromatic (9471) on Saturday March 06, 2004 @01:56AM (#8482921) Homepage

    We're not going in numerical order anymore; we're going in order of importance. Apocalypse 12 should come out in the next few weeks. That's everything objects.
