The State of Natural Language Programming
gManZboy writes "Brad Myers (and co) of the Human-Computer Interaction Institute at Carnegie Mellon have written an interesting paper about the state of natural language programming. They point out that well-understood HCI principles aren't finding their way into relatively new languages like Java and C#."
Programming in english sucks anyway (Score:5, Insightful)
Closest I've found to Natural Language... (Score:2, Insightful)
But natural language requires more typing than, say, C syntax.
A EQUALS B
A = B
But does the thought process get sped up? If so, one needs to know how the gains and losses affect overall development.
I don't buy it. (Score:5, Insightful)
jeff
Re:Programming in english sucks anyway (Score:4, Insightful)
Hmmmm (Score:3, Insightful)
On that site, there's http://www.alice.org/whatIsAlice.htm which says
So, this is just like Visual Basic. I know that can't be true, or else Microsoft would be marketing VB as NLP. So what am I missing?
Oh NO! Not Again! (Score:3, Insightful)
Multiply x by y to get something or the other
Natural language programming. (Score:5, Insightful)
An interesting read.
People think in their languages. (Score:4, Insightful)
I'd imagine that a "natural language" system could be developed with different approaches based on the native tongue of the programmer, but I would think this would damage the benefits of commonality that other languages now enjoy.
I didn't RTFA... (Score:4, Insightful)
That's about as far as I got. I guess he didn't really express his ideas in the same way that I wanted to think about them.
Which nicely illustrates the point that there's always a "semantic gap" associated with natural languages, which builds up because people have different ways of thinking. The semantic gap is even wider when one of the entities being communicated to happens to be a machine. There's a reason why traditional programming languages are precise and exact...it's so that the gap is reduced - the machine will do exactly what you tell it to do...even then we have a disconnect between what the programmer's thinking, and the code that he's writing.
Natural Language isn't for Serious Programming (Score:4, Insightful)
Good point! (Score:5, Insightful)
Natural language, while easier for beginners, would make for horribly inefficient code and would be undesirable for any sizeable application.
What is the PURPOSE of natural language? (Score:3, Insightful)
Why would any programmer want to code in English? To me this:
myvar++
makes more sense than:
increase the variable myvar by one please
Do we really want people who can't understand something as simple as "myvar++" to be programming in the first place? Seems to me we NEED a barrier to entry. There're enough lousy programmers out there already.
The problem is (Score:2, Insightful)
The fact is that people don't care what's academically sound, or what people have "proven" is the best way to do things. In fact, the things people do care about are directly contradictory with what's academically "best". It isn't some kind of head-slapping coincidence that the new popular languages ignore "natural programming". It's the market speaking, and it's saying "we don't want natural programming languages".
Wow. (Score:5, Insightful)
We've had a lot of posts about "OH NO! COBOL!" Yes, yes, I agree with you -- pretending to be English usually results in awkward and unnatural syntaxes. One of the advantages of a formal syntax like most programming languages is that it clicks the brain into a different mode. (How many of you can read sigs like 2b||~2b? I thought so.)
But that's not really the paper's main aim. It makes a couple of notes that all of us, particularly those of us in language design, could benefit from.
1. People tend to deal with collections in the aggregate far more often than they step through them an item at a time. The example given was "set the nectar of all the flowers to 0." Look past the syntax for a moment and look at how simple that is.
2. Debugging the traditional way sucks. Did anyone actually read that bit at the end about the 'Why?' questions, and look at the screenshots? Holy crap. That's really impressive.
Of course, I may be biased, because the points made in the article are basically the same that underlie a language I'm currently designing.
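To make point 1 concrete, here is a minimal Python sketch of "set the nectar of all the flowers to 0" written both ways. The Flower class and the set_all helper are hypothetical names invented for this illustration; they do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Flower:
    name: str
    nectar: int


def set_all(items, attribute, value):
    """Set one attribute to the same value on every item in a collection."""
    for item in items:
        setattr(item, attribute, value)


def make_garden():
    # Sample data, made up for the example.
    return [Flower("rose", 5), Flower("tulip", 3), Flower("daisy", 8)]


# Item-at-a-time style: the programmer manages the index and the stepping.
stepwise = make_garden()
for i in range(len(stepwise)):
    stepwise[i].nectar = 0

# Aggregate style: one statement that reads close to the English sentence.
aggregate = make_garden()
set_all(aggregate, "nectar", 0)
```

Both gardens end up identical; the only difference is how much bookkeeping the reader has to look past to see the intent.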
Deterministic vs. nondeterministic (Score:3, Insightful)
The real problem is a lack of strong domain models for most real world situations. That is, if you're starting a project to emulate something happening outside of a computer, then there's a very large likelihood that you're going to have to build your own object model to describe the situation to the desired level of accuracy. Once you have that model, it's easy enough to say "do this until that happens", but there's a world of difference between that point and staring at a blank screen at the beginning of a project.
There's been some progress (depending on who you ask) to make this easier for those who aren't full-time programmers, such as UML and related design tools, but even these are mainly limited to building a high-level template of the final result so that a human can manually implement all of the details.
This may or may not be avoidable. Vernor Vinge (author and CompSci professor) refers to the "Age of Failed Dreams" where humans eventually concede that some things just aren't possible. Expecting a current deterministic Turing device to be programmable at the level where people interact with each other may very likely be one of those areas.
Re:Programming in english sucks anyway (Score:3, Insightful)
Is this a good idea? (Score:5, Insightful)
Right now that happens - only the program gets generated by programmers (sometimes outsourced to India!)
Unfortunately, what the user says they want, and what they really want are usually very different things. Natural Language Programming really doesn't solve that problem.
The critical piece is the Designer, who sits between the end user and the programmer, and asks the tough questions: "Do you really want that? Let me explain the implications of what you just asked for." "How critical is that piece of functionality that you just added on a whim, but it just added 3 years to the project plan?" "You're asking for the data to be selected this way, but really there's no use for that - have you considered selecting the data this other way?" etc.
Re:Remember Apple Script (Score:2, Insightful)
I think you really mean the opposite of what you said. The syntax of natural language is bogglingly complex. You can express the syntax of even Perl with a few kilobytes of EBNF. Noam Chomsky tried to come up with formal syntax rules for spoken languages and utterly failed (though his work is what led to BNF and company).
Re:Oh NO! Not Again! (Score:2, Insightful)
(Or SQL for that matter. We managed to finally obsolete COBOL more or less, but SQL is still with us.)
Anyhow, the idea behind the COBOL Natural Language push was to eliminate the need for programmers, or at least greatly reduce their training. However, it was found that somebody with some training could translate business logic into something that the computer could understand better than amateurs could. In other words, with some training, productivity was much higher than when some manager tried to be precise enough.
Further, regular human speech is vague. Learning not to be vague takes some training itself. And many have discovered that English is not well geared to being precise. Rather than bend English to be precise, it appears better to replace English with something meant for precision. In other words, there is a bigger payback in efficiency in bending the human to be more like the computer (or at least to use better language structure) than the other way around.
Further, non-programmers often have a horrible time with "normalization". They tend to duplicate stuff in ways that come back to haunt the design down the road. In the real world making copies of papers is the norm, for example. If you do this in the computer, then you have to remember to change all the copies and know where they are.
Thus, if doing it right requires training on issues beyond the language itself, then it is worth integrating a more precise language into that training rather than futzing with English.
Otherwise, it is like putting training wheels on motorcycles. It pays to get the prerequisites right first.
Maybe on a small scale, natural language will be okay. However, scaling this to production systems would probably be a mess. Even with languages more precise than English, we still make some ugly bugs. The problems would probably jump an order or two of magnitude if a hacky amateur system based on the naturally vague English language is applied.
What is natural language? (Score:3, Insightful)
I understand that the goal is to have the user just tell the computer what to do in English. The problem is that English is not precise and is too ambiguous. I don't know if I would want to fly on an airplane if I knew the computer on board was programmed in English.
Re:Good point! (Score:1, Insightful)
The next issue is who would want to use it. How often have you read some technical instructions for anything relatively complex and not been entirely sure exactly what to do? Even stuff written by professionals is not always exactly clear. Ever been confused reading a textbook? I have. That won't fly with the computer. Any code one writes would have to be super verbose and probably written and rewritten several times to be suitably clear. Not that the compiler could not give intelligent help like, "I don't understand what you mean by 'then go to the next node'; did you want the left or the right?" but still, while it might be easy for beginners, it would be a pain for large projects; I doubt it could really ever be useful.
Which brings us back to the only thing you can do: compromise. Limit the vocabulary and select a particular definition for each keyword (or one for each of a limited set of contexts). As soon as you do this, though, the language becomes less "natural" and you get COBOL; I doubt many want to go back there.
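The compromise described above can be sketched in a few lines of Python. Everything here is invented for illustration: the three-keyword vocabulary, the COMMANDS table, and the run function are assumptions, not any real language. Each keyword has exactly one sentence shape and one meaning, and anything else is rejected with a question-like error.

```python
import re

# One fixed sentence shape and one meaning per keyword (hypothetical vocabulary).
COMMANDS = {
    "set": (r"set (\w+) to (-?\d+)", lambda old, n: n),
    "increase": (r"increase (\w+) by (-?\d+)", lambda old, n: old + n),
    "multiply": (r"multiply (\w+) by (-?\d+)", lambda old, n: old * n),
}


def run(program, variables=None):
    """Execute one command per line; unknown phrasing raises ValueError."""
    variables = dict(variables or {})
    for raw in program.strip().splitlines():
        line = raw.strip().lower()
        keyword = line.split()[0] if line else ""
        if keyword not in COMMANDS:
            raise ValueError(f"I don't understand what you mean by {raw!r}")
        pattern, update = COMMANDS[keyword]
        match = re.fullmatch(pattern, line)
        if match is None:
            raise ValueError(f"I know {keyword!r}, but not this phrasing: {raw!r}")
        name, amount = match.group(1), int(match.group(2))
        # Variables that have not been set yet start at 0 in this sketch.
        variables[name] = update(variables.get(name, 0), amount)
    return variables


result = run("set myvar to 4\nincrease myvar by 1\nmultiply myvar by 3")
```

Here result maps myvar to 15, while a looser sentence such as "increase the variable myvar by one please" raises ValueError. That rigidity is exactly the loss of "naturalness" the comment is talking about.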
standard flaw in research like this (Score:3, Insightful)
Great, I accept that a new language can make toy problems easier.
However, I think the situation is very different when you have a real programmer working on a real program. Writing a real application, like a word processor or a web browser, is difficult no matter what language you do it in -- and I would argue that the difficulty doesn't vary much between languages. In fact, I would further argue that many of these research languages, while making toy problems easier, would actually make "real" programming substantially harder, because the semantics of the language are not as formalized and thus more difficult to remember and deal with.
I'm certainly not opposed to advances in language theory and design -- our modern-day large applications would be essentially impossible to write if all we had to work with was machine language. But to be a major advance, a new language should focus on making real problems easier for real programmers, not making toy problems easier for non-programmers.
Re:Natural Language isn't for Serious Programming (Score:1, Insightful)
What do you mean by "serious programming"? This strikes me as someone who learned assembly mocking those who know C, who in turn mock those who use Java.
Re:AppleScript, anyone? (Score:3, Insightful)
Re:Is this a good idea? (Score:5, Insightful)
Re:Programming in english sucks anyway (Score:5, Insightful)
We learn math pretty friggin' soon after learning to spell. And at that point, you never write "one plus one equals two" ever again, if you ever did.
The fact is that computer language has a closer relationship mentally to math than to English. So why not use mathematical language? It's even better, as it's not tied to a single human language (who wants to translate "plus" into Swahili?)
IMHO, modern commercial languages do suck for not learning from their more academic peers (Java, I'm looking in your direction - inner classes rock, but they're no excuse for all the missing stuff).
But for "englishoid" languages, I learned VB first and then Java, and I have to say the first thing I loved about Java was the unambiguous { and } blocking.
I'm all for using natural English in the parts of programming where it belongs - like flow control, function calls, class definitions, etc. But having unambiguous "this is a block" characters helps mental consistency, and using mathematical syntax helps people understand the more mathematical concepts (although the C = as assignment thing is an abomination -
Re:Programming in english sucks anyway (Score:4, Insightful)
but _WHICH_ natural language? (Score:3, Insightful)
I know that natural language is creeping into UIs in specialized search engines. If you know where to look, you will find natural language search features on Fidelity.com and perhaps other financial websites. These are much more carefully bounded problems than the broad challenge of allowing a user to express a solution or algorithm for an arbitrary problem a computer could be programmed to do in, say, C, but using ordinary speech. The article cited is interesting and it might make life better for us programmers, but I am not getting my hopes up that more than incremental change to computer languages is around the corner.
Specialised languages are not just for programming (Score:5, Insightful)
Programming is also something that is easier to express in a specialised language. Sure, we can make some things more human-readable, but does that make them easier to understand? The hard part of programming isn't reading/writing the code so much as knowing what structures and concepts to use. Making programming more like natural language will not really make programming easier; you still need skill and practice. Using the music analogy again: I don't play music and can't read a music score (the language of music). If Beethoven's fifth (if he ever had a fifth) was rewritten in a natural language it would not make it easier for me to play; I'd still need a whole lot of practice with a piano or whatever to play effectively. Relative to acquiring the piano skills, I expect learning to read sheet music would be relatively simple.
Where natural languages might help is in system design and requirement capture. Still, however, I think that most often things go wrong because when people are expressing their thoughts in a natural language they use very woolly thinking and use vague terms.
Re:Programming in english sucks anyway (Score:3, Insightful)
This is true for primitives.
But for even slightly more complex tasks ("Find all the people who live in New York, and add their votes."), natural language is about as good. As the tasks get more complex, natural language ("Look for your friend in this picture, then see who's standing next to him.") quickly describes things that are formal language headaches.
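For comparison, here is a rough Python sketch of the first task only, "Find all the people who live in New York, and add their votes." The Person record and the sample data are made up for the example. The picture task has no comparably short formal version, which is the poster's point.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    city: str
    votes: int


# Sample data, invented for illustration.
people = [
    Person("Ann", "New York", 3),
    Person("Bob", "Boston", 5),
    Person("Cy", "New York", 4),
]

# "Find all the people who live in New York, and add their votes."
new_york_votes = sum(p.votes for p in people if p.city == "New York")
```

With this data new_york_votes is 7. The formal version is about as long as the English one, which fits the "about as good" claim for tasks at this level.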
Flowcharts (Score:4, Insightful)
Natural language does not... (Score:3, Insightful)
Natural language compilers? Are they kidding??? (Score:3, Insightful)
We humans don't even talk logically at times (logically in the mathematical sense). We say one thing, we mean another one. One of the most difficult things for new students is to get used to the strictly mathematical nature of computer languages. Computer thinking requires every bit to have its special meaning in the universe. Most people choke on that. The most capable programmers are those that can hold a mental model of the application, its various parts and as a whole. These types of people can translate requirements to code very efficiently, because they can reason about a program's state better since they remember the whole program and they can immediately recognize the consequences of any programming decision.
And when one becomes familiar enough with the way the computer works, then the verbosity really gets in the way.
What we need is a development environment that can reason about the state of the program. That's the root of all problems. Embedding state information in a program is something I haven't seen in any language. Most languages, if not all, work on the assumption that anything can happen anytime, and they don't have state constraints, thus allowing the programmer to make mistakes that could be caught at compile time.
It's hard enough to find a programming job... (Score:3, Insightful)
The world will always need people who understand that asking for the last digit of Pi isn't a worthwhile request.
"Computer, sort this list of names, then beat me at chess without moving your queen, then formulate a method of reversing entropy." "Computer, tell me a joke."
If natural language aims at letting users tell the computer what to do in the terms they think about their tasks, the computer needs to be aware/intelligent to understand the requests. Otherwise there's always going to be a manual describing what you can and can't ask and how/how not to ask it. And people won't read manuals, they'll write programs that don't work.
And then, you and I will *finally* get programming jobs. :)
Comment removed (Score:4, Insightful)
Re:Perl version (Score:1, Insightful)
This is as natural as it gets.
Re:Perl version (Score:3, Insightful)
Of course, what this is really doing is:
So unless there's a you method in package love, this will cause a runtime error. The following would be a little more consistent with the other examples, but less like English:
...and if you want this to run with the strict pragma in effect, you'll have to quote the string "perl", or use a scalar variable $perl.
Re:Specialised languages are not just for programm (Score:3, Insightful)
In a way, the languages of mathematics and music are natural languages. No one sat down one day and enumerated all of the rules for mathematical expressions; the notation evolved to suit the needs of mathematicians and has retained the flexibility that results from such evolution, much like "social" languages.
It's hard for programming languages to "evolve" in the same sense since they aren't "for humans, by humans", but we do try new language designs and find that some work better than others.
Some of the more "dynamic" languages go some way to enabling this kind of evolution. If I try to use an unusual construct in a mathematical expression, I'd probably follow it with a statement in English or mathematics explaining the meaning. If it was a useful construct, others will adopt it and slowly the explanation will become unnecessary. Likewise, in some languages we can define new constructs (within certain boundaries, of course) and tell the compiler what is meant by them in simpler terms, usually by writing some kind of function. Over time, popular constructs will be adopted as core features in newer languages. One example that springs to mind is the foreach construct, which does vary from language to language but arose because it was very common to want to visit each element in a list in turn and perform some operation on it. Modern languages have become a lot more expressive so this kind of evolution will probably become more common.
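As a small illustration of that path from user-defined construct to core feature, here is a Python sketch. The for_each function is a hypothetical helper written for this example, standing in for what programmers had to build by hand before "visit each element" became built-in syntax.

```python
def for_each(items, action):
    """User-defined 'visit every element' construct built on an index loop."""
    index = 0
    while index < len(items):
        action(items[index])
        index += 1


names = ["ada", "grace", "alan"]

# Stage 1: the construct is an ordinary function explained "in simpler terms".
collected = []
for_each(names, lambda name: collected.append(name.upper()))

# Stage 2: the same idea adopted as a core language feature.
built_in = []
for name in names:
    built_in.append(name.upper())
```

Both lists come out the same; the built-in form simply no longer needs the explanation that the function body provides.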
Re:People think in their languages. (Score:2, Insightful)
English speakers (at least the grammatical ones) are familiar with a handful of verb inflections -- singular vs. plural; present or past tense -- but Old English actually inflected the nouns of a sentence as well, to indicate the subject and the predicate. You could say either "Dick hit Jane" or "Jane hit Dick" and the noun inflection, not the word order, determined who actually got hit. I'm no linguist, but I believe there are contemporary languages with similar features.
Well yes, of course there are contemporary languages with similar features.
That's a vast conceptual shift, that a person's name can be said differently according to whether he's the subject or the object of an action.
No it isn't. What you're missing here is that the underlying argument structure of natural languages is pretty universal: inflection and word order are just two different ways of encoding the same agent-patient-theme hierarchy. A parser could extract the same semantic information from two languages which on the face of it appear to be quite different. The hypothetical natural language programming system would interpret sentences based on their argument structure rather than their syntax per se.
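A toy Python sketch of that claim, under heavy assumptions: the "-NOM" and "-ACC" suffixes are invented markers, not real Old English morphology, and both parsers handle only three-word sentences. The point is just that two surface encodings can map to one argument structure.

```python
def parse_word_order(sentence):
    """Roles come from position: agent, verb, patient."""
    agent, verb, patient = sentence.split()
    return {"verb": verb, "agent": agent, "patient": patient}


def parse_case_marked(sentence):
    """Roles come from made-up case suffixes; word order is free."""
    roles = {}
    for word in sentence.split():
        if word.endswith("-NOM"):
            roles["agent"] = word[: -len("-NOM")]
        elif word.endswith("-ACC"):
            roles["patient"] = word[: -len("-ACC")]
        else:
            roles["verb"] = word
    return roles


by_order = parse_word_order("Dick hit Jane")
by_case = parse_case_marked("Jane-ACC hit Dick-NOM")
by_case_reordered = parse_case_marked("Dick-NOM Jane-ACC hit")
```

All three results are the same dictionary, with Dick as agent and Jane as patient, so anything downstream of the parser would not need to know which encoding the sentence used.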
There are no natural ways of thinking; thought patterns are learned and largely cultural.
Says who? Certainly not modern cognitive science, or indeed modern linguistics. What is a "thought pattern", anyway? If it's any technique for thinking which is learned, then, by definition, anyone who isn't already familiar with it can...learn it. Problem solved.
For "natural programming" to really work, the system would have to be tailored to each native language and culture it would serve.
Obviously the system would have to be tailored to each language, but this would just be a matter of rewriting the parser and vocabulary (and in fact it should be possible to share a lot of the parsing infrastructure between languages). There's no reason to expect any "cultural" issues would extend further than this.
As a linguist, I'm very sceptical of the general utility of NLP. While it's perfectly possible to parse natural language, it's an impossible problem to reliably extract meaning from it. This is because most meaning in a natural language sentence is not encoded -- it is inferred by the hearer based on assumptions about the attitudes, knowledge and intentions of the speaker. Clearly, for a computer to do this it would require a massive encyclopaedic knowledge, and (more or less) a fully functioning human-like mind. I doubt we'll be seeing this any time soon.