Perl Features of the Future - Part 1
Kevin writes "This story highlights some of the features being included in Perl 6. 'There will be substantial changes in the move from Perl 5 to Perl 6. We've been hamstrung for a while by the need to maintain backward compatibility all the way back to Perl 1. There are some things we want to remove, because they seemed like good ideas when they were introduced but they're more trouble than [they're worth] now.'"
Goodbye "my", hello UTF-8? (Score:2)
In serious programming, most variables are local in scope, while few or none are global. Perl variables default to global, so you end up having to fight the default with every variable you create: "my $computer", "my $documents", "my $images", "my $ms_bob", (sorry)....
I hope Perl 6 will turn its back on past booboos and make the things that clearly ought to be default the default, even at the cost of backward compatibility.
And if they really have guts, I hope they'll do what Wayne Gretzky always advocated and "skate to where the puck will be" by making "use UTF-8" the default from now on, and "use bytes" the exception (legacy mode). Windows, Mac, and the major Linux distributions are all converting to Unicode pretty fast now.
Re:Goodbye "my", hello UTF-8? (Score:1, Interesting)
Maybe "my" is a poor choice of word (same with "use vars", "our", or "local"), but automagically creating variables is asking for trouble -- really unreadable trouble too.
Also, perl 5.8 (and I'm pretty sure earlier versions) has full UTF-8 support.
Re:Goodbye "my", hello UTF-8? (Score:1, Flamebait)
By prefacing the variable name with the type:
int foo
Oh.
Oh, wait, it's perl. No types!
"int foo;" plus some line noise probably implements DeCSS. Or does an rm -f. Who knows? Hey, it's perl. It's not like you can read it.
Re:Goodbye "my", hello UTF-8? (Score:5, Interesting)
I'm so f**king tired of hearing how perl is hard to read.
IT'S NOT HARD TO READ UNLESS YOU MAKE IT HARD TO READ!!! And this is true for ANY and EVERY language out there. I can read perl all day long without problems, as long as it wasn't meant to be hard to read, but if you put a C program in front of me it might as well be some made-up language that doesn't work; I wouldn't be able to tell the difference.
I've come to the conclusion that those of you that say perl is hard to read either a) don't have a single solitary clue about perl at all, b) are trying to stir people up, or c) are trying to convince everyone else that your favorite language is "better" for each and every circumstance, which isn't true of any language at all, not even perl.
Hey, it's Slashdot. I'm voting for all three.
Re:Goodbye "my", hello UTF-8? (Score:5, Insightful)
I think every computer scientist worth their pay should realize that language advocacy stinks. Every language has its niche--a reason why some guy sat back and said, "I need to write a shortcut language to do this" or "wouldn't it be great if I could have a better correlation between the way I think/design and the language I write in", and it evolves from there. Perl was designed as a glue language, heavily modeled after awk/sed and other unix tools, and on the concept of following natural language and "having more than one way to do/say it". So you need to have a good feel for the language pieces, as you do when you become a master of English and understand different connotations and methods of stating something.
That being said, it is unfortunate that because there are large groups of people who either A) get religious about the language they choose or B) choose not to learn other languages to the point of knowing their true niche, we take every language and bloat it out and take it out of its scope. And in turn that makes it that much harder to grasp each new language's niche, because you have to sort through a bunch of crap that tries to make every language the universal language.
Well, it's human nature I guess. Easy to point out as a problem but not easy to fix...but remember that the next time you are about to tell your coworker that they should "write that in _____" instead of answering their question. Or, be careful when you complain that "____ is bloated or is too hard to understand" because you are just adding fuel to the fire my friend. It is better by far to state why you chose a certain language on a certain project than to be a universal advocate of "_____".
Advocacy is a clear mark of inexperience.
Re:Goodbye "my", hello UTF-8? (Score:4, Insightful)
I meet none of a, b and c. Here's my opinion:
Python is easier to read than the equivalent Perl code, even if the latter is well-indented. Now before you close your mozilla tab, or mark this as "Flamebait", please hear me out.
Yeah, I agree that bad code is bad code, regardless of the lang.
But compare these two equivalent statements of good code:
pythonNumber = 1
my $perlNumber = 1;
Do the "$", the "my" and the ";" look necessary, or extraneous and confusing? They are the latter to me.
What about $_ and @_ ? Those don't seem very sensible. Nor does "<>", the backtick "`" or several other common Perl paradigms such as the fact that it makes a big difference whether you have single quotes or double quotes around a string.
Yes, I agree that some of Perl's "hard-to-read" reputation is deserved, but not all of it.
Why do reasonable people think Perl is hard to read? Because it has lots of unneeded, non-alphanumeric characters and there are lots of conventions that don't make sense (e.g. '<>' meaning a line of standard input).
Re:Goodbye "my", hello UTF-8? (Score:2)
Anyway isn't it hard to read tabs vs spaces in someone else's Python code? Distinguishing different white space characters from each other can sometimes be a pain (telnet, vi?).
I find Perl easier to read than Java.
Compare the source code:
http://developer.java.sun.com/developer/qo
http://use.perl.org/article.pl?s
With Perl I can stare at a few lines of text till it makes sense (and that's usually fast if the programmer hasn't purposely obfuscated stuff).
With Java you often have to scroll up and down for a program that does the same thing (and then if you're familiar with other languages you'd ask: why so many lines to do this?).
The scrolling is not that big a problem if the program works. But if you're looking for a bug in someone else's program having to scroll a lot makes it harder.
Re:Goodbye "my", hello UTF-8? (Score:2, Insightful)
"my" makes it a local variable. It's only necessary if you want that.
"$" is used to denote a scalar variable. Technically, it is extraneous. (It's part of perl's shell heritage.) On the other hand, some programmers intentionally add several extraneous characters to every variable (Hungarian notation), so it must be a taste thing.
";" is necessary in languages which don't consider whitespace to be significant. Some people consider significant white space to be annoying and/or dangerous.
What about $_ and @_ ?
"$_" and "@_" are pronouns. They usually disappear in the same way "self" does. (Actually, in perl6, "self" is spelled "$_".) Perl, imitating natural language, has more pronouns than most computer languages.
"<>" is an idiom. In natural languages, commonly used idioms tend to get shorter. "<>" is an idiom for "read the next line from the file(s) named on the command line, or from standard input if there are no files on the command line." How much code would it take you to write that in python?
Perl is refreshingly concise and expressive.
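As one data point for the question above: Python's standard `fileinput` module covers roughly the same ground as `<>`. This is only a minimal sketch; the helper name `read_lines` is made up for illustration.

```python
import fileinput


def read_lines(paths):
    """Yield lines from each file in `paths`, or from standard input if `paths` is empty."""
    # fileinput.input() falls back to standard input when it is given no file names.
    with fileinput.input(files=paths) as stream:
        for line in stream:
            yield line.rstrip("\n")
```

In a script, `for line in read_lines(sys.argv[1:])` would behave much like a Perl `while (<>)` loop, so the answer is "a few lines", not one token.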
Re:Goodbye "my", hello UTF-8? (Score:2)
OK, perhaps the semicolon is unnecessary. The language could use end of line as a statement terminator. Concede that one to Python.
The dollar sign, well it does have its uses, consider string interpolation:
print "The number is $num.\n";
which is not nearly as concise in Python, especially for interpolating a large number of variables into a string. The $ also means that Perl functions can be called without parentheses, as in the example above. In Python, every function call must have () around its arguments, and you could argue this is 'unnecessary' or 'extraneous'. (There is some special case sugar for 'print' in Python but I don't think you can use it for your own functions.) In Perl you have some extra syntax for variables but it allows you to use less syntax for function calls and string interpolation: swings and roundabouts.
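For comparison, here is a sketch of the same interpolation in Python. The `%` form with a dictionary was available at the time of this thread; f-strings only arrived later (Python 3.6), so the second form is an anachronism here. The variable names are illustrative.

```python
num = 42
name = "Perl"

# Classic %-formatting with a mapping: each name is looked up in the dict.
classic = "The number is %(num)d, the language is %(name)s." % {"num": num, "name": name}

# f-string (Python 3.6+): names are interpolated in place, close to Perl's "$num".
modern = f"The number is {num}, the language is {name}."

print(classic)
print(modern)
```

Both lines print the same text; the older form is where the "not nearly as concise" complaint comes from.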
The 'my' part is not just extra syntax, it is semantic. It declares the scope of the variable. One of the biggest problems with Python IMHO is the impossibility of specifying the scope of a local variable - I don't know whether this has been fixed in the latest Python releases.
Partly agree about the other stuff, but I think that having two different quote characters mean different things is sensible. You wouldn't argue that C is a broken language because it makes a big difference whether you have () or {} around bits of code.
Re:Goodbye "my", hello UTF-8? (Score:1)
The counter-question: exactly what is the python variable you mentioned? What is its scope? What kind of information does it contain? foo = 1 tells you it's a variable assignment somewhere in the program. my $foo = 1 says it's a new, local and scalar variable.
This isn't a black-and-white issue, you know.
Re:Goodbye "my", hello UTF-8? (Score:2)
Hard to read Perl [5] (Score:4, Informative)
If it's not hard to read, then why are the designers of perl 6 putting so much effort into making it a lot easier to read than perl 5?
Quoting Larry Wall from the Apocalypses [perl.org]:
I agree. I was unduly influenced by Ada syntax here, and it was a mistake.
Just for a point of reference, I'm a perl programmer who doesn't fit your categories (a), (b), or (c), but still finds perl code hard to read fairly often.
With all that said, I'll close with one more quote from the Wall:
Re:Goodbye "my", hello UTF-8? (Score:1)
To quote Larry Wall: "You can write assembly in any language."
Remember, the Obfuscated Perl competition took its inspiration from the Obfuscated C competition. If you want to talk really obfuscated you should consider INTERCAL! ;-)
Re:Goodbye "my", hello UTF-8? (Score:3, Interesting)
The only problem is when people who write quick scripts decide to try and make real programs, and the result is unreadable spaghetti. The same thing used to happen when sysadmin "gurus" strung together unreadable combinations of shell, sed, and awk...
If you want to do "serious" programming, "use strict;" is your friend. Or waste time with C or Java.
Re:Goodbye "my", hello UTF-8? (Score:2)
It's true, just be careful. I happen to know C and assembly and (name a bunch of other archaic languages) pretty well, and use perl frequently and effectively. However, I've seen people who first learned perl use perl, and, wow, they can do some ugly stuff. Sometimes they don't know what an O(n^2) algorithm means, so they use them - a lot.
As usual, perl gives you plenty of rope to hang yourself (not that I know of a language that stops bad algorithms).
Re:Goodbye "my", hello UTF-8? (Score:2)
I always use strict, because it stops me writing extremely sloppy code without noticing. It forces you to properly declare all variables. Does it really take too long to type two or three extra letters to declare your variables?
Re:Goodbye "my", hello UTF-8? (Score:2)
Or do you think that variables should not need any declaration? If so, how does the language decide what the scope of the variable should be?
Code parsing (Score:2)
Does this mean that I will be able to parse stuff out like HTML tags, and nested parentheses?
Or even catching VBScript strings, with the "" inside a string representing a single ", so I'll be able to parse out something like which currently is incredibly annoying to parse, especially if all you want to do is catch the comments at the end of the line.
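The end-of-line comment case can be handled without a regex at all. The sketch below assumes comments start with an apostrophe outside a string literal and ignores `Rem` comments; the function name `split_vbscript_comment` is hypothetical.

```python
def split_vbscript_comment(line):
    """Split a VBScript line into (code, comment), ignoring ' inside string literals."""
    in_string = False
    for i, ch in enumerate(line):
        if ch == '"':
            # A doubled "" inside a string toggles twice, so we end up still inside it.
            in_string = not in_string
        elif ch == "'" and not in_string:
            return line[:i].rstrip(), line[i + 1:].strip()
    return line.rstrip(), ""
```

The doubled-quote rule needs no special case here because two consecutive toggles cancel out.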
Re:Code parsing (Score:1, Interesting)
Strictly speaking, from a computer science perspective, matching parentheses cannot be done with regular expressions (finite automata), but you can do it in perl with backreferences.
Re:balanced parens: NO YOU CAN'T (Score:3, Informative)
There is also a CPAN module, Text::Balanced [cpan.org], which does balanced-expression matching.
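The reason a pure finite automaton cannot do this is that it would need an unbounded counter. A minimal sketch of the counting approach, with a made-up function name and handling only round parentheses:

```python
def parens_balanced(text):
    """Return True if every '(' in text has a matching ')' in the right order."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # A closing parenthesis appeared before its opener.
                return False
    return depth == 0
```

The `depth` variable is exactly the memory that a true regular expression does not have.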
Re:Code parsing (Score:3, Informative)
Parse::RecDescent is available for the 5.x series (I think it's part of the core download for 5.8, and was optional for 5.6, but am not 100% sure about that), but it seems like it's going to become a more core component when 6 arrives. If you want to be able to do this stuff, look into it -- you don't have to wait for Perl6 to start using this. It's available now, and it's *great*. :)
$my $computer (Score:2)
Out of curiosity, are you a MS programmer ?
Perl 6 is a mistake (Score:2, Flamebait)
One of the goals of Perl 6 is to make non-trivial projects possible. That's good. The way it's being done is bad. Perl was once a lightweight, extremely flexible language. Now it's become a huge ugly monster [mozilla.org]. People wanted OO, so a nasty hack was bolted on top to allow some semblance of it. Now this nasty hack is being expanded. Sure, the code's different, but the basic form is the same. Kludge upon kludge upon kludge; I'd much rather have a nice, clean, pure language [rubycentral.com] (and not one with loads of irritating whitespace [python.org] thank you very much).
The same goes for the syntax. All the switching between $, @ and % is really irritating (ask a newbie how to get at the length of the keys array of a hash inside a hash, for example), and the changes proposed for 6 are just making this worse -- it seems that Larry, in his infinite wisdom, wants to prefix every data type with a different hard-to-type character. Perl was only designed for the three data types, and adding more is a mess.
Perl 6 is a complete rewrite, but it keeps all the mess which has accumulated over the previous versions. This is not good. Sure, my const int $var = 27; may look neat (in the same way that, say, Pascal [lysator.liu.se] does), but $var isn't entirely constant, or entirely an integer, it's just a hack which makes it sort of behave like one. The whole thing is an exercise in pseudo-computer science masturbation with little real purpose except to please the managers who dislike the one thing that makes Perl special.
On a similar note is regexes. I'm an avid fan of regular expressions simply because a nondeterministic finite automata is far more flexible than linear code. However, Larry must have been smoking that cheap $2 crack when he wrote this [perl.com] . Does he want Perl 6 to be flex [gnu.org] or something?
I won't be going on to use 6. It's a nice idea, but it's completely unnecessary. It won't make large projects any easier to manage (the language is still, at heart, an almighty hack -- an impressive one, but still a hack). It won't make OO any cleaner. It won't make development any faster. To put it bluntly, Perl scripts will still look less beautiful than our friend Mr Goatse [goatse.cx]. I'd prefer to use a language [ruby-lang.org] which has always been pure synthesis of science and engineering, not some half-baked imposter [beonex.com].
Perl 6 will be nice, but I'm guessing it will be the end of Perl. It can't do what it wants to do whilst still being based upon a nasty mess. There are now other options, which provide all of Perl's power and none of the mess. Sorry, but BSD^W Perl is dying. Larry is buggering it up the ass without lubricants, just like Shoeboy is doing to Larry's daughter.
Re:Perl 6 is a mistake (Score:3, Insightful)
Wow! He managed to make an allusion to "BSD is dying" and a legitimate use of the Goatse man!
I would have skipped the gratuitous and tasteless slam at Larry Wall's daughter he ended with, though.
Re:Perl 6 is a mistake (Score:2, Funny)
While I'm here, what the hell does "nondeterministic finite automata" mean, and is Larry's $2 crack the same stuff that moderators are often on?
All in all, a top class post, worthy of being modded way beyond the +3 it is currently on.
Re:Perl 6 is a mistake (Score:5, Interesting)
They're state machines. They're in a given state, and they know how to go to the adjacent states. So given the string 'abc', if you're currently looking at the 'b' (having already seen the 'a'), you know that you'll have a valid match iff the next character is a 'c'. If it's not, you have no match. If you have 'ab[cd]', and are looking at the 'b', you know you have a match if the next char is a 'c' or a 'd'. 'c' and 'd' are then the 2 valid next states.
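The 'ab[cd]' example can be written out as a transition table. This is a sketch with invented state names, not how any real regex engine stores its states:

```python
# Transition table for the pattern ab[cd]: state -> {character: next state}.
TRANSITIONS = {
    "start": {"a": "seen_a"},
    "seen_a": {"b": "seen_ab"},
    "seen_ab": {"c": "accept", "d": "accept"},
    "accept": {},
}


def matches(text):
    """Return True if text is exactly 'abc' or 'abd'."""
    state = "start"
    for ch in text:
        # The only thing remembered between characters is the current state.
        state = TRANSITIONS[state].get(ch)
        if state is None:
            return False
    return state == "accept"
```

Note that the loop carries nothing forward except `state`, which is the "no memory" property described here.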
The nifty thing (and the limiting thing) is that true RE's require no memory. Just the knowledge of what state they're currently in. For this reason, no true RE can be written to see if a given string is a palindrome (you can write a RE to match a specific palindrome, but not an arbitrary one).
The difference between an NFA and a DFA is that NFA's allow 'null transitions'. This basically says that there can be more than one state that you leap out of when you see the next character, because you can go to these special adjacent states without seeing a character, and then leap out. There's also a proof out there that any NFA can be written as a DFA.
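A small sketch of null transitions, using the pattern a?b as an assumed example. Tracking the whole set of states the NFA could be in is the idea behind the NFA-to-DFA proof: each such set plays the role of one DFA state.

```python
# NFA for the pattern a?b. The key None marks a null transition.
NFA = {
    0: {"a": {1}, None: {1}},
    1: {"b": {2}},
    2: {},
}
ACCEPTING = {2}


def null_closure(states):
    """Return states plus everything reachable through null transitions alone."""
    closure = set(states)
    pending = list(states)
    while pending:
        state = pending.pop()
        for nxt in NFA[state].get(None, set()):
            if nxt not in closure:
                closure.add(nxt)
                pending.append(nxt)
    return closure


def nfa_matches(text):
    """Simulate the NFA by tracking every state it could currently be in."""
    current = null_closure({0})
    for ch in text:
        moved = set()
        for state in current:
            moved |= NFA[state].get(ch, set())
        current = null_closure(moved)
    return bool(current & ACCEPTING)
```

The table and state numbers are illustrative; a real engine would build them from the pattern.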
All of that said, Perl's extended RE's are not true DFA's. They have some features that cannot fit into the DFA model. This is one of Larry's reasons for wanting to make Perl's RE's into true CFG's (context-free grammars).
This model is much more powerful than RE's, but at a greater cost, since you have to have memory too. The mathematical definition of a CFG is a state machine that drags around a stack of memory. The state machine may at arbitrary times push data onto the stack, and later pop it off. It must be done in order, though, to match the math model. (If you add a second stack, you have the definition of a 'Turing machine', aka the computers on our desks.)
A CFG can be written to match arbitrary palindromes, for example (just push each letter onto the stack, and when you hit the middle, start popping off, and matching each letter. Yes, this is oversimplified. The true algorithm is left as an exercise for the reader).
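The push-then-pop idea looks roughly like this in Python. Note the simplification: this sketch finds the middle from the string length, which a stack machine reading one character at a time cannot do; it would have to guess the middle instead.

```python
def is_palindrome(text):
    """Check a palindrome by pushing the first half and popping against the second."""
    stack = []
    middle = len(text) // 2
    for ch in text[:middle]:
        stack.append(ch)
    # Skip the centre character when the length is odd.
    rest = text[middle + 1:] if len(text) % 2 else text[middle:]
    for ch in rest:
        if stack.pop() != ch:
            return False
    return not stack
```

The function name is made up; the point is only that one stack is enough memory for this job.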
Re:Perl 6 is a mistake (Score:1)
Re:Perl 6 is a mistake (Score:1)
CFG's are not implementable as a DFA with a stack. They are actually an NFA with a stack, something that has no direct tie-back to any sort of deterministic automaton. (Whereas, as you noted, NFA's can be expressed as DFA's with an exponential growth in state-space, and Nondeterministic Turing Machines can be rewritten as TM's with an exponential growth in time-complexity (I'm not saying that they *must* incur an exponential growth, mind you, just that they can... see P=NP?)... however, a non-deterministic push-down automaton cannot be rewritten as a deterministic PDA at all.)
Re:Perl 6 is a mistake (Score:3, Insightful)
Re:Perl 6 is a mistake (Score:2)
One, I'll bet a lot of people said a lot of the same stuff about Perl 5, no? I know that people still despise the OO stuff, but, hey, some people will hate any OO implementation until it's C++, when what they should really be doing is hating it until it's Smalltalk. But I wouldn't be surprised if at the Perl 4->5 transition people were complaining that Perl was losing what it was good for/at, and I think it survived that pretty well. Past performance blah blah blah, and maybe this transition is completely different (well, ok, it is pretty unarguably completely different in a lot of ways), but I think the burden of proof is on the detractors.
Two, I learned pattern matching with Perl. In time I learned to use the pattern matching in things like grep and vi, and only then did I learn the extreme usefulness of Larry's inspiration to make the special chars require escaping to be non-special, rather than the other way around.
Given that he has one huge win in pattern-matching reform, I think I'll give him the benefit of the doubt on what he's talking about doing with the new pattern matching stuff.
But the most important thing is that the design goal stays the same--"how can I make this language easier to get things done with?". I don't care enough to dig into all the flamewar on Perl 6, but I really don't think it's going to be The End Of Perl as people so often predict. It may be The End Of Perl As We Know It, but as long as it's still Larry asking the same "how can I make easy things easy and hard things possible" question, I still have faith that I will Feel Fine.
Re:Perl 6 is a mistake (Score:2)
OT Re:Perl 6 is a mistake (Score:2)
that was just for fun. (Well, actually it was also prompted by the fact that "Apocalypses" sounds weird, and so I just grabbed another pluralization rule.)
But I didn't really know the real place to use that rule, so it was nice to have the chance to learn something. Thanks.
ps In case anyone is wondering, if you are talking about a _lot_ of Apocalypses, the correct rendering is "Apocaloodles".
Seconded (Score:3, Interesting)
I too looked at Python. Like you, I decided that basing your language's syntax on differing amounts of whitespace was a really, really bad idea, not because it's ugly, but because I have enough trouble keeping tab damage under control on a single platform.
So I'm looking at Ruby. In fact, the only thing stopping me ditching PERL for Ruby tomorrow is lack of time for re-learning, given all the other new stuff I'm learning right now (J2EE, Objective-C, Cocoa, OpenGL,
Re:Seconded (Score:2, Interesting)
Perl 6'll be neat (Score:1)
But I think I'll probably check out Perl 6 when it comes out. It looks like the OO and references'll be cleaned up, the new regex stuff looks kinda neat, and hopefully Perl and Ruby and Python will all be able to coexist peacefully when they're all ported to Parrot. Need to do some fancy regexing from a Ruby program? Just write your regexing function in Perl 6 and link to it from your Ruby program!
Plus, exploring Perl is just fun
Re:Perl 6'll be neat (Score:2)
What annoys me is that Perl has no way to just store one list inside another. If I want to make a list of lists, and have value semantics, why can't I just say $a[5] = @b? It might not be that efficient to program this way, just as a vector is not normally that sensible in C++, but it would make things a lot more consistent. You could choose to work with values or with references, instead of being forced into references just to make nested data structures.
I don't know whether perl6 will support this; probably not. It might end up using only references but without the -> operator that reminds you that you are dereferencing something.
Not that hard, man (Score:1)
It's called:
$a[5] = [@b];
how hard was that? Once again, it's nice to have *control* over value/reference semantics.
Re:Not that hard, man (Score:2)
$a = 55;
$b = $a;
$b = 66;
print $a;
Clearly the original value of $a has not been changed. You don't need any voodoo with square brackets or reading 'man perltoot' to get this.
And if you do the same thing with lists, it's fine too:
@a = (55, 56);
@b = @a;
push @b, 'hello';
print join(', ', @a);
Again the original object is unchanged. Now, what if instead of variables I have another data structure (say, a hash) storing these values?
$h{a} = (55, 56);
Well there's the first problem, you can't do that, it has to be a scalar. Well, okay,
$h{a} = [ 55, 56 ];
$h{b} = $h{a};
push @{$h{b}}, 'hello';
print join(', ', @{$h{a}});
And bam, it's different. You don't have to take references to strings to store them in hashes, why should you have to take references to lists to store them? If I want a hash of lists, why can't Perl manage it?
Of course you should be able to make explicit references if you want, but they shouldn't be forced on you just for simple things like the above.
'Easy things should be easy' - I wish Perl would follow its own motto in this area. It's good that there is documentation such as perltoot to guide the new programmer through this stuff, but rather odd that none of it seems to acknowledge the possibility that this stuff is anything other than bleeding obvious.
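For what it's worth, Python makes the same choice for containers: a hash (dict) slot always holds a reference, so the aliasing shown in the Perl example appears there too, and a copy has to be requested explicitly. A short sketch:

```python
h = {}
h["a"] = [55, 56]

# Plain assignment stores a second reference to the same list (like $h{b} = $h{a}).
h["b"] = h["a"]
h["b"].append("hello")
print(h["a"])  # [55, 56, 'hello']

# An explicit copy gives value semantics (like $h{c} = [ @{$h{a}} ] in Perl).
h["c"] = list(h["a"])
h["c"].append("world")
print(h["a"])  # [55, 56, 'hello']
```

The difference from Perl is that Python has no separate list-valued variable kind, so there is no contrast with `@b = @a` to trip over.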
Re:Seconded (Score:1)
Re:Seconded (Score:2)
Re:Seconded (Score:1)
Consider calling ls LS then.
Re:Seconded (Score:2)
LS(1) OpenBSD Reference Manual LS(1)
NAME
ls - list directory contents
I see no acronym, dumbass.
Re:Seconded (Score:1)
The only place it's spelled PERL is in the header -- the same place as the LS you quoted. My sincerest apologies for taking your post at face value.
Still, if you're going to be that pedantic, why not use ldc instead of ls?
Re:Seconded (Score:2)
"perl - Practical Extraction and Report Language"
Supporting the acronymic derivation of the name.
The man page for ls, in contrast, doesn't claim that it stands for anything.
Re:Perl 6 is a mistake (Score:2)
* I'll take that as a compliment
* Your Perl or Ruby code probably could use some tidying up before someone else tries to read it
Re:Perl 6 is a mistake (Score:2)
- No decent closures. They say that in Python 2.2 proper closures have been added, but it still doesn't seem possible to construct them, because the 'lambda' operator only allows an expression, not a statement, so you can't do anything non-trivial inside your lambda expression. For example, I often write Perl code passing round functions, eg:
my $count;
my $f = sub { ++$count };
my $g = sub { print "hello\n" if $count == 5 };
But there doesn't seem to be any way to construct these closures in Python.
- The other thing I miss is Perl's labelled loop blocks, so that instead of 'next' and 'last' always referring to the inner loop you can say 'next LABEL' and 'last LABEL'. This can often make code more readable and eliminate the need for Pascal-like condition variables and contorted code to check them. Also I think 'next' and 'last' are clearer names than C's 'continue' and 'break', but that's a matter of taste
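On the closures point, the limit on `lambda` is real, but nested `def` functions can play the role of the Perl `$f` and `$g` shown above. A sketch, assuming Python 3: the `nonlocal` statement used here did not exist in the Python versions of the time, where the usual workaround was to keep the counter in a mutable container. The function name `make_counter_pair` is made up.

```python
def make_counter_pair():
    """Return (f, g) sharing one count, mirroring the Perl $f and $g closures."""
    count = 0

    def f():
        nonlocal count  # Python 3 only; needed to rebind the outer variable.
        count += 1
        return count

    def g():
        # Reading the shared variable needs no declaration.
        return "hello" if count == 5 else None

    return f, g


f, g = make_counter_pair()
for _ in range(5):
    f()
print(g())  # hello
```

Here `g` returns the string instead of printing it, which is a small departure from the Perl version made to keep the example easy to check.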
Perl section (Score:2)