 
			
		
		
	
		
		
		
		
		
		
			
				 
			
		
		
	
    
	Regex Golf, xkcd, and Peter Norvig 172
			
		 	
				mikejuk writes "A recent xkcd strip has started some deep academic thinking. When AI expert Peter Norvig gets involved you know the algorithms are going to fly. Code Golf is a reasonably well known sport of trying to write an algorithm in the shortest possible code. Regex Golf is similar, but in general the aim is to create a regular expression that accepts the strings in one list and rejects the strings in a second list. This started Peter Norvig, the well-known computer scientist and director of research at Google, thinking about the problem. Is it possible to write a program that would create a regular expression to solve the xkcd problem? The result is an NP hard problem that needs AI-like techniques to get an approximate answer. To find out more, read the complete description, including Python code, on Peter Norvig's blog. It ends with this challenge: 'I hope you found this interesting, and perhaps you can find ways to improve my algorithm, or more interesting lists to apply it to. I found it was fun to play with, and I hope this page gives you an idea of how to address problems like this.'"
		 	
		
		
		
		
			
		
	
FWIW, the Regex Golf game (Score:5, Interesting)
http://regex.alf.nu/ [regex.alf.nu]
Some favor trickiness, some favor just listing possibilities, but it's fun. I'm at 3651.
Re: (Score:3, Informative)
Cookies required? (Score:2)
Be sure to allow cookies, else the site won't work.
Cookies are an essential part of websites nowadays, dontcha' know...
Re: (Score:2)
Re: (Score:2)
Yep, lots of fun, thank You.
Got 3912 but I'm really confused by the "Alphabetical" level , I could craft a regex to discriminate both set that scores at 262, however I suppose it is considered cheating since I do not implement any underlying "principle" here because I cant recognize it. Any clues of what was the intent of this level?
some answers for Regex Golf game (Score:2)
Aren't there any solutions posted anywhere? This is a good exercise if you want to spend time puzzling out the more arcane features of regular expression evaluators. Some of these aren't fully working solutions.
Is there a regex... (Score:4, Funny)
... that can find frost psits?
Re: (Score:1)
Meta-meta-meta-meta.. (Score:2)
RegExps (Score:5, Interesting)
Regexp's are a programming language unto themselves.
I'm currently doing some temp IT work for schools while my promised job becomes available and it's eye-opening. The web-filtering is all reg-exp based but nobody understands how it works.
They just copy/paste an example and change the parts of the URL that they can see to match the one they want. They barely bother to test the impact, past the site they need becoming "unfiltered" or "filtered" as necessary (i.e. no implication of knock-on effects on other sites with similar names). Let's not even mention the use of "." without the escape character for them to mean a literal period (but, obviously, it means "any character" in a regexp).
I talked to them about changing their template regexp because, from the start, I could see that it wasn't really up to the job and just met if not opposition then at least apathy about the problem.
Until someone brought an iPad into the helpdesk where a site that was supposed to be unfiltered was filtered - because nobody had considered what happens if you use "http://example.com" instead of "http://www.example.com". I was the one to spot it, and tell them that it's because their regexp was very basic.
The good thing was, the other tech on the team was young and keen to learn and I was able to give them a quick rundown of regexps and we crafted an alternative template for them to use that would take account of the situation without, for instance, the blocking of "microsoft.com" affecting "antimicrosoft.com".
But it is amazing how many people I know that work in IT have no idea how to program, no idea how to handle regexps, and just work on a "copy a working example" basis.
Re: (Score:3, Funny)
It would appear that apostrophes are too.
Re: RegExps (Score:3)
Re: (Score:2)
Indeed.
So why isn't there one after reg?
(If you don't know, ask the boatswain. He was on the forecastle last time I saw him).
Re: RegExps (Score:2, Insightful)
Calling backslashes "whacks" disqualifies you from being involved in such discussions.
Re:RegExps (Score:4, Insightful)
...and then there's the people who think they understand regexes, but try to use them to decide context-free grammars.
Re: (Score:2)
Just wait till you play with Privoxy. It uses different sorts of regex for different parts of the URL pattern. You then have two additional problems, rather than just one  ;-)
Re:RegExps (Score:5, Funny)
But it is amazing how many people I know that work in IT have no idea how to program, no idea how to handle regexps, and just work on a "copy a working example" basis.
You will be truly amazed by the number of people who copy a not-working example...
Re: (Score:2, Informative)
If regular expressions are a programming language, then they are not a very powerful one. The languages they recognize are the so-called regular languages, which are the least expressive category in the Chomsky hierarchy of languages (note the difference between the language of regular expressions and the languages recognized by regular expressions). See the Wikipedia articles on regular languages [wikipedia.org] and the Chomsky hierarchy [wikipedia.org] for details.
Re: (Score:1)
But it is amazing how many people I know that work in IT have no idea how to program
Is it? Programmers routinely make 30% more. Why should an IT person, with their entire forte of knowledge and skills, ever need to take on someone else's and even if they did, why would they keep their low salary.
It is akin to an auto mechanic that knows how to use a CNC machine to make replacement parts. Sure it would be convenient to hire a $10/hr mechanic that can do $40/hr G-code, but what is in it for the mechanic?
Re: (Score:3)
A $40/hr G-code monkey would be underpaid.
Re: (Score:1)
This is pretty common. It is not some new, strange thing only you have seen:
http://en.wikipedia.org/wiki/Cargo_cult_programming
People go through life with blinders on, because life is complicated and our minds try to ignore things that it figures aren't important. A lot of people think details of one sort or another aren't complicated or relevant, then complain when things don't work for some reason. That's why empirical reasoning and the scientific method is cool; it works against this sort of logical f
^https?://([a-z0-9\-]+\.)*foo\.com(/|$) (Score:5, Interesting)
For matching URLs from a domain, here's a regex we came up with that covers some special cases. Hopefully Slashdot doesn't mangle it too badly.
https?://([a-z0-9\-]+\.)*foo\.com(/|$)
That covers:
https as well as http
"subdomains" like maps.google.com as well as www.google.com and google.com
It's not fooled by google.com.hacker.ru
Re: (Score:2)
That may be troublesome... Note that while DNS comparisons are supposed to be performed case-insensitively, uppercase letters are valid characters.
If your regex-based web filter doesn't canonicalize the domain name first and/or use a case-insensitive regex match, users may be able to circumvent the filter with http://www.foo.com/ [foo.com] (or even http://foo.com/ [foo.com]).
Of course, browsers these days do so much "helpful" manipulation on anything typed in the URL bar, I'm not sure you can slip a mixed-case domain through t
Yeah, NC for case insensitive, separate from regex (Score:2)
That's true, the regex is used with the non-case-sensitive flag. We use it with mod_rewrite, so it would be[NC] something like:
RewriteCond %{HTTP_REFERER} https?://([a-z0-9\-]+\.)*foo\.com(/|$) [NC]
In for example Perl the regex would be used within  //i (or another delimiter for clarity).
Re: (Score:2)
Yep, pretty much what I crafted. Their problem was that their ()'s lacked the power to express that it had to have something-dot, or nothing at all (specifically the "nothing at all" part, so foo.com wasn't caught by a regexp for *.foo.com
ioccc 2013 US president matching code (Score:5, Interesting)
The International Obfuscated C Code contest had a winning entry that could flag the names of US presidents as republican or democrat. [ioccc.org]
Quoting: "This one-line C program accepts as a first command-line argument the last name of any of the last 31 US Presidents (from Franklin Pierce onwards), in lower case, and prints out their political affiliation. Use "republican" as the 2nd command-line argument, and "democrat" as the 3rd (or equivalent strings of your choice)."
De-obfuscated, it is a boolean expression acting on a string s,
I wonder whether you can make a regexp that is shorter than this and accomplishes the same thing.
Re: (Score:3)
Re:ioccc 2013 US president matching code (Score:5, Informative)
Re: (Score:1)
Can someone explain, how are they using 1[..] as the name of an array? I tried making a simplified program and gcc kicked out the approriate error.
Re: (Score:1)
In C, arrays are really just pointers to the first element, and array[index] is defined through pointer arithmetic: It just means *(array+index). Addition is commutative, so you could just as well write index[array], because that is defined as *(index+array), which is the same as *(array+index).
Re: (Score:1)
In C, arrays are really just pointers to the first element
No.
, and array[index] is defined through pointer arithmetic: It just means *(array+index). Addition is commutative, so you could just as well write index[array], because that is defined as *(index+array), which is the same as *(array+index).
Yes.
Re: (Score:1)
Could you be less helpful?
Re: (Score:2)
I just saw this grossly wrong claim and since it was AC's i didn't bother to come up with a detailed explanation on why exactly it is wrong, but I didn't want it to become +5 Informative either. A likely thing to happen it seems.
Since some more people asked or complainedI did reply, dear AC [slashdot.org]
Re: (Score:3)
I think the subtletyâZ the objector had was that arrays and pointers are slightly different which is true In this context, an array is a pointer with potentially compiler allocated backing memory for the data while the pointer might not. A pointer will also have an address while the pointer used in array definitions won't have an address. Old compilers used to treat them identically but then again they used to treat pointers as integers as well. Modern compilers tend to know enough about the CPUs a
Re: (Score:2)
arrays and pointers are not 'slightly different'. I replied in detail somewhere else in the thread, if you care about details.
Re: (Score:2)
int array[32];
and
int* blah = &(array[0]);
will both point to the exact same element in memory.
``array'' doesn't point anywhere, it's an object in memory of size (32*sizeof (int)) and of type int[32]. [yes, C does have objects.]
``blah'' is another object, of size (sizeof (int*)) and of type int*
Obviously these two things are not remotely "the same" (this was the original claim(*)).
With this out of the way:
If you do an equivalence operator on them, they'll be equal.
In fact (array == blah) will be true in this case, but this is only because there is syntactic sugar involved, and the actual comparison done is (&array[0] == blah).
*(int*)((int)blah+(3*sizeof(int)))
No, this is utter gib
Re: (Score:2)
Spoilers available at 6.5.2.1^W6.3.2.1#3
(FTFY)
I am fully aware of that subscripting in turn is syntactic sugar; I agree however that it wasn't the best way to put it. It was sufficient for the point I was trying to make, anyway.
and 6.3.2.3.
Well it explicitly says there that the results are implementation-defined -- i.e. outside the scope of what C defines.
Kind of supporting my point, thanks.
Re: (Score:1)
main(int i, char **s){
puts( s[ 1-~!(*(int*)s[1] % 4796 % 275 %i) ]);
}
because a[b] is the same as a+b is the same as b+a is the same as b[a].
The "not" forces a boolean expression, which evaluates to 0 or 1. The bitwise not (~) flips the bits. For 1 the result is -2 (two's complement representation of -2 is 1111...1110). For 0 the result is -1 (two's complement...). So the index evaluates to 1-(-1)=2 or 1-(-2)=3 and the puts prints the second or third argument to the program, depending on the r
Terrifying (Score:2)
"Find it interesting"? I find it fucking terrifying. There's no way I'll be following that link. That's in bat country.
Re: (Score:2)
No, actually it is far from terrifying. Just some brute force generation and application of simple, short, regex fragments. No "AI-like", no "algorithms going to fly".
Re: (Score:3)
Re: (Score:2)
How about that. Most intelligence is just multiple layers of pretty simple stuff.
If you don't believe that, why not? To me it seems quite likely.
Re: (Score:3)
Well now you've just ruined my terror. I'll have to go read up on the up-coming UK legal hilarity to restore it.
Missed the point (Score:2)
I'd have been extremely surprised if solving for the shortest regex golf pattern for a pair of lists weren't NP hard. And greedy approaches are fairly obvious.
The point is, that's what makes it analogous to golf. The optimal solution is your hole in one. Some greedy algorithm solution is your par. Those aren't the interesting areas, they're just the end points.
For those who couldn't work it out for the,selves, here are the rules of regex golf:
How many characters you use in your regex is your score
Lower is b
whitelisting with regexp (Score:4, Funny)
There's really no better way to scan a log file for odd log entries than to write a big regexp that filters out whitelisted entries. Lets you find log entries you're NOT expecting. (and occasionally, log entries that not even the developers are expecting)
Editing them of course is a royal pain, (not to mention debugging!) so I usually write a script to compose the regexp. I just checked on one of mine, and it composes a 17,000+ character single-liner that scans my wired.log file.
I've got a smaller one that keeps an eye on secure.log for anomalies.
Re: (Score:2)
for what it's worth:
http://vftp.net/virtual1/temp/comb [vftp.net]
the same technique can be applied to any log file that you can manage to parse consistently. devs that don't understand the need to escape delimiters in text fields can make your day go bad.
Re: (Score:2)
Since SED and GREP only like to work on single lines, in some applications I will use TR to change linefeeds to some other character (like  /x01) and do my replacements and then TR them back to linefeeds.  I know SED supports pattern buffers but I have yet to take the time to play with it.
I forget. I know I've ran into problems in the past but I just grabbed an older build from an earl
This problem has been studied for decades (Score:5, Informative)
There's a field called Grammar Induction, and the problem of learning regular languages, aka regular inference, can be considered a subfield. People have been working on this since the '50s. Applications include learning DTDs for XML/wrapper induction, and all kinds of problems in bioinformatics and natural language processing.
There's a strong link with the graph coloring problem, see
http://www.cs.ru.nl/~sicco/papers/alt12.pdf [cs.ru.nl]
In this field, the focus is generally on learning FSAs, but these can easily be transformed into regexps. There's work on learning regexps directly, see
http://www.informatik.uni-trier.de/~fernau/papers/Fer05c.pdf [uni-trier.de]
Enjoy.
Re: (Score:2)
The 70's research brought us Lex and Yacc. Bison and Flex just carry on the traditions.
NP-hard? (Score:1)
Sure, it's NP-hard, but it's also nonrecursive via Rice's theorem.
Calling this NP-hard is like calling the Andromeda Galaxy "bigger than the average rhino".
Re: (Score:1)
For clarification, this is referring to the problem of writing a program that identifies regex finders. The problem of finding the shortest regex is likely NP-hard, but it's not as obvious as the article attempts to imply.Giving a shitty algorithm to solve a problem, and then saying that it would also solve SET COVER, does not imply that any algorithm for solving the first problem can be used to solve SET COVER via a Karp reduction.
Unless Pyhon has changed recently. (Score:1)
Python code? I would expect Perl or similar if parsing was your emphasis, or Lisp or some AI language if AI was your emphasis, but not a standard block-structured language.
Re:Unless Pyhon has changed recently. (Score:4, Informative)
Re: (Score:1)
This is about regexes, that is, hard to decipher code that looks like line noise. Obviously Perl is the only language that matches.
Re: (Score:2)
Additionally, Python accepts in-line comments in Regexes. That alone makes it an above-average choice.
Re: (Score:2)
Very cool, thanks.
Re: (Score:2)
Then you're a moron.
He's trying to communicate his neat ideas with as many people as he can, not optimize the shit out of something (in execution time, or length of code, or whatever). Python has become the standard language for communicating with a broad base of people who are not idiots, but not necessarily "real programmers" either. This is a good thing: it has basically replaced pseudocode.
Re: (Score:3)
could make google more "googlely" (Score:3)
Obligatory XKCD (Score:1)
Obligatory XKCD.
Oh, never mind.
Re: (Score:3)
ah the old 1 or 2 problem (Score:1)
if a_1=nopaper
if a_2=paper
//problem with this solution though is that it tends to be gender specific,
//and that is the problem with two compared variables in a statement it can easily lead to a 3turd condition.
Is the problem actually NP-complete? (Score:4, Interesting)
My reading of Norvig's blog post is that he suggests his specific approach (stapling together short regexps with ORs) requires solving the NP-Complete Set Cover problem, but he doesn't actually say anything about whether the core problem (match everything in list A and nothing in list B) does; it's conceivable that e.g. some sort of dynamical programming approach could do the job more efficiently than Norvig's algorithm does. Does anyone know whether the root problem (to avoid having to do the optimization, just phrase it as 'is there a separating regexp for the sets A and B of length less than k?') is specifically known to be NP-complete?
I want this on my business cards (Score:2)
This started Peter Norvig, the well-known computer scientist, director of research at Google and wearer of brightly colored shirts, thinking about the problem.
Now I got to get me some brightly colored shirts.
Re: (Score:2, Insightful)
That other people have hobbies different than you is going to be quite common. If you get this angry every time you run into someone with a different hobby then I fear for your long term mental and physical health.
Take deep breaths. It is going to be ok.
Re: (Score:1)
Take deep breaths. It is going to be ok.
This is what I really can't stand in a teacher. Don't lie to your student. Don't sugar coat the hard truth. It's not going to be okay. He's going to die. Dead; no longer living. It may take time. It might be not that painful, but there's a fair chance that it's going to be awful. Just tell him the truth now. Don't say
He will have to face it:
It might seem to him to be okay to be angry, about people with different hobbies, bu
Re: Regex this (Score:5, Funny)
Some of us programmers have families, a life, don't watch anime, and aren't basement dwelling nerds.
Umm  ... I just spent the last hour playing regex golf with my wife and kids.
Hard AI (Score:5, Insightful)
Regex me a list of folks that have time to sit around fucking off their life-time in order to write a regex to work on the XKCD "problem", and folks that don't.
The field of study known as "AI" has been stagnant, for about 50 years now. One of the field's many problems is the lack of a good definition for intelligence.
Despite lacking a definition, we have working examples intelligent systems in the real world - humans.
Humans are very good at partitioning sets by descriptive differences, and they discover these descriptive rules largely by themselves.
We don't know what intelligence is yet, but if we keep looking at problems and trying to figure out the human approach, eventually we'll have enough contrasts and similarity to partition sets based on differences in intelligence.
In other words, the more problems we solve, the more data we can use to formulate rules that define intelligence.
That's a pretty important and useful goal.
(And belaboring the obvious: If we had even simple AI constructs we could automate much of out work force, freeing us up for more leisurely pursuits. Whether this leads to a post-scarcity utopia [wikipedia.org] or unemployment/welfare apocalypse depends on your political affiliation.)
Re: (Score:3)
That's not true. The field hasn't been stagnant at all. The problem is you're missing most of the field in your assumption of what it is.
AI is not all about making "computer programs that can think" (a.k.a "Strong AI"), it's more about creating systems that can adapt to their environment in order to improve their chance of success. If their environment is getting lists of names and they adjust, without the direct input of a person, to better match them according to some scoring system, that is AI.
It can use
Re: (Score:2)
AI is not all about making "computer programs that can think" (a.k.a "Strong AI"), it's more about creating systems that can adapt to their environment in order to improve their chance of success.
Take a few moments and see if you can come up with a definition of Intelligence - one that can be used as a metric to see whether a topic falls within the field of study.
Abstract Algebra [wikipedia.org] has a precise definition, so it's pretty easy to tell whether something falls within the realm of Abstract Algebra. There are corner cases and some overlap, but the field is pretty-well delineated.
In contrast, AI is all over the map. Anything and everything can be put under the term, including lots of stuff that more rightl
Re: (Score:2)
I gave a definition for AI already. In fact, you quoted it. It's not the most precise one, but I don't have to provide a precise one. All I was doing was demonstrating that your implied definition was far, far too narrow.
There are many things that are considered AI that don't match the range you suggested for it. Genetic algorithms, neural networks, language learning systems, robots that take information from their environment, discard it, and drive into walls. They're never* going to think, but they are ri
Re:Hard AI (Score:5, Interesting)
Re: (Score:2)
See that big key on the right of your keyboard? The one that doesn't look like any of the ones on the left?
It needs love. Caress it every now and then.
Just like that.
Re: Hard AI (Score:2)
Re: (Score:2)
Re: (Score:2)
And for certain programs you could certainly do that. The thing that is often forgotten about the Halting Problem when it is brought up in AI discussions is its application to things like the Goldbach Conjecture. If a Halting Problem algorithm existed it could be used to determine the truth of the Goldback Conjecture. That it is proven it cannot doesn't mean that the Goldbach Conjecture is like a
Re: (Score:2)
One of the field's many problems is the lack of a good definition for intelligence.
Well, I don't mean to sound like I know everything, I certainly don't. But in my mind, 'intelligence' is the ability to make sense of Nature (or another way to say this is to be able to sense Nature). Those that don't have a good definition for 'intelligence' more than likely are living in a way that is not in tune with Nature.
They're probably dorking around with some regex, trying to sort a list of stupid from silly. Not saying that you, sir, are (as from the tone of your response, it appears that yo
Re: (Score:2)
Since you brought it up, I should point out an interesting continuum I've noticed.
On one end, we have stoner fuck-ups who think everything is an amazing manifestation of Nature (or Jesus, or Buddha, or whatever the fuck) that they're constantly and uniquely (maybe along with a few of their friends) in touch with. They can't really ever describe why, or convince anyone of anything, but they know that they are in touch with something transcendent, and if other folks are wearing blinders, so what? This is perf
Re: (Score:2)
is this person a stoner fuck-up, or is this person an unrecognized Feynman?  ...
Well, which one are you?
We're all in the middle, sir. What, I think, you're referring to - regarding the extremes - are simply those that are trying, and those that are imagining trying (they try, but without effort). Richard Feynman said himself that he was a normal person, working with limited a capacity. What he had was what we call intelligence, and it was based on his ability to make sense of Nature. It was how he understood things at a really small scale that impressed us. Ultimately, his father started him thinking alo
Re: (Score:2)
(And belaboring the obvious: If we had even simple AI constructs we could automate much of out work force, freeing us up for more leisurely pursuits. Whether this leads to a post-scarcity utopia [wikipedia.org] or unemployment/welfare apocalypse depends on your political affiliation.)
Or what it means to be rich and poor will be redefined. Having performances by skilled musicians in your home all the time used to be the mark of the fabulously wealthy, but now virtually anyone can afford to do it (provided you're satisfied with radio or recordings).
Re: (Score:2)
One of the field's many problems is the lack of a good definition for intelligence.
Intelligence is the ability to draw correct conclusions from incomplete information. There, you have it.
Re: (Score:2)
How about:
Intelligence is the ability to draw probably correct conclusions from incomplete information.
Re: (Score:1)
He works at Google on AI, and this problem is both amusing and interesting. Having two monitors is totally normal.
Slashdot really has gone to the dogs, hasn't it. Would be great if there were some sort of technical test one could take here; the results of which could be used to filter out all the - for want of a better word - users.
Please, no, don't filter all users out (Score:5, Interesting)
Re: (Score:3)
Re: (Score:2)
Re: (Score:2)
Because people who are encouraged to create new solutions for fun may comes up with something that someone getting whipped into coding may not.
More solutions, more creative ones, can mean the Terminators will be 20% faster at face-recognition and may not be so inaccurate at shooting.
Kids learn well by playing games, machines may learn to learn faster out of humans playing games.
Re: (Score:1)
It's a special case of automatic code optimisation: Since regular expressions should be reduce-able to logical gates, it stands to reason a minimized equivalent boolean function could be decompiled into a regex.
On it's own, having compilers\interpreters that optimize regexes is beneficial. But how about you turn the table over and ask yourself this, "Can I design a high, expressive, programming language that could be fully optimised without sacrificing human readability and productivity?" As far as I can te
Re: (Score:2)
Re: (Score:1)
There are sorts of "problems" that may seem a waste. When Einstein was pondering Relativity, I'm sure it would have been thought of in the same manner. The point is, sometimes great things arise from "trivial" pursuits. This could also be construed as a implementation of a search algorithm that can take a relatively broad search term and produce a more generalized search pattern, i.e. now you cannot specify give me presidential candidates A, B, C, D, E, F according to whether or not they won an election.
Th
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
No, I don't particularly find those interesting and don't really care all that much
You don't eat or drink? I see your point, but how is coming up with better ways to feed the multitudes with limited land and resources not a puzzle? See, the thing that I cannot stand about this article is that it is going to attract the attention of some really smart people (maybe you are one of these) and take your time, where if you would direct your gigantic brain and all of it's magnificent abilities (and no, I'm not being sarcastic here) to problems that aim to make life better for all, then your ow
Re: (Score:2)
Re: (Score:2)
If you want, you can probably buy some tractors and water filters right now and have them shipped to poor villages.
The length at which I'd like to point out the intricacies of growing food and filtering water in areas of the world that are heavily populated and dry, far exceed my want to type, and I'm sure your want to read. Let's just say that sending over a tractor and a filter would be as close to solving the problem that they're having as sending over a scalpel and air-mask to your house, when you need heart surgery.
The techniques needed to make fertilizer or purify water by boiling are well known.
Are they? If they're so well-known, let's hear your take on it. How do you make fertilizer? How do
Re: (Score:2)
Have you ever noticed how everyone else in the world is not you? You might find it enlightening to think about that.
Re:The Motion Picture (Score:4, Funny)
We fans know the first one is really "Star Trek: The Tone Poem".
Re: (Score:2)
The Motion Picture does not pass the regex:  /M | [TN]|B/
1) no word ends with 'M'
2) no word starting with 'T' or 'N' is preceded with a space
3) no B at all
So, I venture that "The Motion Picture" *is* considered the subtitle for the first Star Trek movie.
spoiler: r.e. presidents vs losers regexp (Score:2)
From Norvig's blog:
To avoid a contradiction and achieve Randall's intent, eliminate all winners from the set of losers:
In [293]: losers = losers - winners
The code on Norvig's blog is pretty interesting.
This one was worth my coffee break time today.
I might be missing something here, but the list of winners and the list of losers in US presidential elections both contain Richard Nixon. How can a regexp match ALL the winners and NONE of the losers in that case?