Regex Golf, xkcd, and Peter Norvig

Posted by samzenpus on Sunday January 12, 2014 @04:20PM from the problem-to-solve dept.

mikejuk writes "A recent xkcd strip has started some deep academic thinking. When AI expert Peter Norvig gets involved, you know the algorithms are going to fly. Code Golf is a reasonably well-known sport of trying to write an algorithm in the shortest possible code. Regex Golf is similar, but in general the aim is to create a regular expression that accepts the strings in one list and rejects the strings in a second list. This started Peter Norvig, the well-known computer scientist and director of research at Google, thinking about the problem. Is it possible to write a program that would create a regular expression to solve the xkcd problem? The result is an NP-hard problem that needs AI-like techniques to get an approximate answer. To find out more, read the complete description, including Python code, on Peter Norvig's blog. It ends with this challenge: 'I hope you found this interesting, and perhaps you can find ways to improve my algorithm, or more interesting lists to apply it to. I found it was fun to play with, and I hope this page gives you an idea of how to address problems like this.'"
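
For readers who want to see what "accepts one list and rejects the other" means in practice, here is a minimal Python sketch of a checker. It is not taken from Norvig's post; the function name `verify` and the toy word lists are made up for illustration, and it assumes a regex "accepts" a string if it matches anywhere inside it.

```python
import re

def verify(regex, winners, losers):
    """Return True if regex matches every winner and none of the losers.

    Assumption: "accepts" means re.search finds a match anywhere in the string.
    """
    missed = [w for w in winners if not re.search(regex, w)]
    false_hits = [l for l in losers if re.search(regex, l)]
    return not missed and not false_hits

# Hypothetical toy lists, not the xkcd lists.
winners = ["apple", "grape", "maple"]
losers = ["banana", "cherry", "mango"]

print(verify("p", winners, losers))  # every winner contains "p", no loser does
print(verify("a", winners, losers))  # fails: "banana" and "mango" also contain "a"
```

The golf part is then a search for the shortest regex for which a check like this succeeds.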

This discussion has been archived. No new comments can be posted.

Re:FWIW, the Regex Golf game (Score:3, Informative)

by Garridan ( 597129 ) writes: on Sunday January 12, 2014 @05:09PM (#45933809)

Or, if you want to try your hand at meta regex golf [stackexchange.com] there's a place for that, too.

Re:RegExps (Score:2, Informative)

by Anonymous Coward writes: on Sunday January 12, 2014 @05:25PM (#45933915)

If regular expressions are a programming language, then they are not a very powerful one. The languages they recognize are the so-called regular languages, which are the least expressive category in the Chomsky hierarchy of languages (note the difference between the language of regular expressions and the languages recognized by regular expressions). See the Wikipedia articles on regular languages [wikipedia.org] and the Chomsky hierarchy [wikipedia.org] for details.

This problem has been studied for decades (Score:5, Informative)

by DogPhilosopher ( 1149275 ) writes: on Sunday January 12, 2014 @05:34PM (#45933955)

There's a field called Grammar Induction, and the problem of learning regular languages, aka regular inference, can be considered a subfield. People have been working on this since the '50s. Applications include learning DTDs for XML/wrapper induction, and all kinds of problems in bioinformatics and natural language processing.
There's a strong link with the graph coloring problem, see
http://www.cs.ru.nl/~sicco/papers/alt12.pdf [cs.ru.nl]
In this field, the focus is generally on learning FSAs, but these can easily be transformed into regexps. There's work on learning regexps directly, see
http://www.informatik.uni-trier.de/~fernau/papers/Fer05c.pdf [uni-trier.de]
Enjoy.

Re:ioccc 2013 US president matching code (Score:5, Informative)

by hankwang ( 413283 ) writes: on Sunday January 12, 2014 @06:16PM (#45934199) Homepage

It takes the first four bytes of the president's name and converts them to an int, then applies three modulo operations (%4796 %275 %4). How the author figured out that those three operations would do the job, who knows? Maybe by brute-forcing the parameter space.
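
A rough Python sketch of the computation described above, for anyone who wants to play with it. This is not the IOCCC entry itself: the function name `name_hash`, the little-endian byte order, and the zero-padding of names shorter than four bytes are all assumptions made for illustration.

```python
import struct

def name_hash(name, last_modulus=4):
    """Chain the modulo operations over the first four bytes of name.

    Assumptions: ASCII input, little-endian 32-bit signed int (as on x86),
    and zero-padding when the name is shorter than four bytes.
    """
    first4 = name.encode("ascii")[:4].ljust(4, b"\0")
    (n,) = struct.unpack("<i", first4)
    # ASCII bytes are below 128, so n is non-negative and Python's %
    # agrees with C's % here.
    return n % 4796 % 275 % last_modulus

# Only the first four bytes matter, so these two inputs collide.
print(name_hash("Bush") == name_hash("Bushnell"))
```

The result is always in the range 0 to 3, which is small enough to index into a short table of output strings.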

Re:Unless Python has changed recently. (Score:4, Informative)

by bunratty ( 545641 ) writes: on Sunday January 12, 2014 @06:37PM (#45934329)

Python is very similar to Perl, and also has most of the characteristics that make Lisp a good language to program AI algorithms.
