Turing Test Passed 432
schwit1 (797399) writes "Eugene Goostman, a computer program pretending to be a young Ukrainian boy, successfully duped enough humans to pass the iconic test. The Turing Test, which requires that a computer be indistinguishable from a human, is considered a landmark in the development of artificial intelligence, but academics have warned that the technology could be used for cybercrime. Computing pioneer Alan Turing said that a computer could be understood to be thinking if it passed the test, which requires that a computer dupe 30 per cent of human interrogators in five-minute text conversations."
Turing Test Failed (Score:2, Insightful)
The test itself failed and is meaningless.
Re:Thirty percent? (Score:5, Insightful)
Re:Thirty percent? (Score:5, Insightful)
By random chance you would detect the computer 50% of the time, so that should be the goal.
Still, 30% as "passing" seems unreasonably low.
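The 50% figure is easy to sanity-check with a quick simulation (a hypothetical sketch, not part of any actual trial): a judge who knows nothing and just flips a coin between "human" and "machine" will detect the machine about half the time, which is why chance-level detection, not 30%, is the natural bar for indistinguishability.

```python
import random

def coin_flip_judge():
    """A judge with no information guesses 'human' or 'machine' uniformly."""
    return random.choice(["human", "machine"])

# Hypothetical setup: every conversation partner is actually the machine,
# so each "machine" verdict counts as a detection.
random.seed(0)
trials = 100_000
detections = sum(coin_flip_judge() == "machine" for _ in range(trials))
rate = detections / trials
print(f"random-guess detection rate: {rate:.3f}")  # close to 0.5
```

So a program that is only detected 70% of the time (fooling 30%) is still doing much worse than a coin flip would suggest is "indistinguishable".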
Re:Turing Test Failed (Score:5, Insightful)
Pretending to be someone who doesn't speak English natively is a bit of an underhanded way to pass. The point of the test is to have a conversation for 5 minutes, not 5 minutes of "oh, I can't understand you because I'm from Ukraine".
Outdated test (Score:4, Insightful)
A pretty low requirement (Score:4, Insightful)
I feel like the requirements for the Turing test have been consistently lowered over the years to match what would be considered realistic to achieve, rather than to demonstrate that a computer can, as Alan Turing seemed to believe, actually be said to be "thinking."
Time to move the goalposts! (Score:2, Insightful)
"Well, 30% isn't very impressive."
"Well, but people expect online correspondents to be dumb."
"Well, nobody ever thought the Turing test really meant anything."
Whether you "believe in" AI or not, progress is happening.
There will always be people who refuse to believe that a computer can be intelligent "in the same sense that humans are". Eventually, though, most of us will recognize and accept that intelligence and self-awareness are mostly a matter of illusion, and that there's nothing to prevent a machine from manifesting that same illusion.
Unbounded tape (Score:5, Insightful)
Not Really Passed... (Score:5, Insightful)
It convinced 33% of judges it's a 13-year-old Ukrainian. Since the test wasn't run in Ukrainian, you can't really say it proved that it had human-level language skills. Poor syntax, grammar, not understanding the question, etc. would be excused by the judges because the "kid" doesn't know English well.
Since the program claimed to be 13, it also did not actually have to understand most of the things there are to talk about. Or anything, really. As an Englishman you wouldn't expect a Ukrainian teen to know anything about your life in England, and in turn the computer could make up all kinds of things about its life in Ukraine and you'd have no clue.
So this isn't really AI, it's a take on the Eliza program of the mid-1960s that hides the computer better.
Now if the test had been in Ukrainian, and happened in Odessa or Kiev; or even in Russian and in Moscow; tricking 33% into thinking your computer is a 13-year-old Ukrainian boy would be really fucking hard. It would be an amazing accomplishment.
Re:Outdated test (Score:5, Insightful)
A good Turing test has an equal mix of humans and AI, and rewards the best in both.
Humans who pass as human, or as bots.
Bots that pass as Bots or as Human.
And has equal numbers of those shooting for each goal.
Half your entrants are trying to convince you they are human, the other half that they are AI, and half of each are lying.
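The balanced design above can be sketched in a few lines (a hypothetical harness, with made-up names like `Entrant` and `score`, just to show the bookkeeping): four equal cells of entrants, and a win whenever the judge's verdict matches what the entrant was trying to appear to be.

```python
import random
from dataclasses import dataclass

@dataclass
class Entrant:
    kind: str    # what it really is: "human" or "bot"
    target: str  # what it is trying to appear to be

def make_pool(n_per_cell: int) -> list[Entrant]:
    """Equal numbers in all four (kind, target) cells, shuffled."""
    pool = []
    for kind in ("human", "bot"):
        for target in ("human", "bot"):
            pool.extend(Entrant(kind, target) for _ in range(n_per_cell))
    random.shuffle(pool)
    return pool

def score(entrants, verdicts):
    """An entrant 'wins' when the judge's verdict matches its target."""
    return sum(v == e.target for e, v in zip(entrants, verdicts))

pool = make_pool(5)  # 20 entrants, 5 per cell
# A judge who always says "human" lets exactly the 10 human-targeting
# entrants win, no matter what they really are.
print(score(pool, ["human"] * len(pool)))  # 10
```

Note how a lazy one-answer judge scores exactly half the entrants as winners either way, so nobody can game this design just by exploiting a default assumption.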
Re:A pretty low requirement (Score:5, Insightful)
I'd say we keep raising the bar.
"If a computer can play chess better than a human, it's intelligent."
"No, that's just a chess program."
"If a computer can fly a plane better than a human, it's intelligent."
"No, that's just an application of control theory."
"If a computer can solve a useful subset of the knapsack problem, it's intelligent."
"No, that's just a shipping center expert system."
"If a computer can understand the spoken word, it's intelligent."
"No, that's just a big pattern matching program."
"If a computer can beat top players at Jeopardy, it's intelligent."
"No, it's just a big fast database."
Re:An autist chat simulator duped 100% of people. (Score:5, Insightful)
And a chair is not a chair. It seems you are not living in the real world. Except for high functioning autism, autism is a severe mental dysfunction.
Re:Thirty percent? (Score:4, Insightful)
Most humans _are_ stupid. AI on their level would not be useful at all.
Garbage (Score:3, Insightful)
All it showed, like any other Turing Test, is the gullibility of the subjects.
1) "Ukrainian" speaking English
2) 13 years old
Right there you have set up an expectation in the audience of subjects for a limited vocabulary, no need for grammatical perfection, little need for slang, and a lack of education. Now add in "star wars and matrix" and you have reduced the topics of discussion even more to the ones the programmers know best.
This thing could never have answered a "why" question, and it was under no pressure to be able to create a pun, both of which are easy things any older, educated human could do.
Garbage test, garbage results.
As usual.
Re:Turing Test Failed (Score:5, Insightful)
Re:A pretty low requirement (Score:2, Insightful)
Re:Turing Test Failed (Score:5, Insightful)
What has been conducted precisely matches Turing's proposed imitation game. I don't know what you mean by a "full-blown Turing test"; the imitation game is what it has always meant, including the 30% bar (because the human has three options: human, machine, don't know). Of course, it is nowadays not considered a final goal, but it is still a useful landmark, even if we have a long way to go.
That's the trouble with AI: the expectations are perpetually shifting. A few years beforehand, a hard task is considered impossible for computers to achieve, or at least many years away. Then it's passed and the verdict promptly shifts to "well, it wasn't that hard anyway and doesn't mean much", and a year from now we take the new capability of machines as a given.
Re:A pretty low requirement (Score:4, Insightful)
...and your brain, during a game of Jeopardy, is what if not a search engine?
Of course, (at least) advanced deductive capabilities are also important for general intelligence. That's the next goal now. (Watson had some deductive capabilities, but fairly simple and somewhat specialized.) We gotta take it piece by piece, give us another few years. :-)
Re:Turing Test Failed (Score:5, Insightful)
People were fooled (really, really fooled) by Eliza way back in the day. It doesn't mean squat.
No. They weren't. I speak as somebody who's had a go with Eliza: you could spot that it was a computer program within a couple of minutes if you wanted to. It's more likely that people were suspending their disbelief than being really fooled.
Re:Turing Test Failed (Score:3, Insightful)
Turing never ruled out this sort of conversation...
Probably because he expected people to have some fucking common sense.
Re:Turing Test Failed (Score:5, Insightful)
What has been conducted precisely matches Turing's proposed imitation game.
NO, it DEFINITELY does NOT. For just one example, it tries to get around the "natural language" stipulation by pretending to be someone who doesn't fully know that language, and uses a simplified version instead.
That is a very clear attempt to subvert the rules.
I could go on, but it isn't necessary. It wasn't a real Turing test. We can leave aside the other nuances because the first criterion wasn't met.
The 'test' was fixed (Score:5, Insightful)
What has been conducted precisely matches Turing's proposed imitation game.
While they may have matched the letter of it, they subverted the spirit of the test. This quote [independent.co.uk] from the programme maker in particular is highly suggestive that they lowered the standards:
To illustrate what I mean by lowered standards, imagine if I set up the same test, with 10 entries, and I tell the judges some of them are 2 year old babies playing on the keyboard. Armed with this information, some of the judges are likely to interpret even gibberish as typed by a human and it is not too farfetched to get more than 30% of them to agree.
This "result" is bollocks and a pure publicity stunt conveniently on falling on the 60th anniversary of Turing's death.
I want to see the actual transcripts which do not appear to have been released so far, which in itself is highly suspicious.
Re:The 'test' was fixed (Score:5, Insightful)
Interrogator: In the first line of your sonnet which reads "Shall I compare thee to a summer's day," would not "a spring day" do as well or better?
Witness: It wouldn't scan.
Interrogator: How about "a winter's day," That would scan all right.
Witness: Yes, but nobody wants to be compared to a winter's day.
Interrogator: Would you say Mr. Pickwick reminded you of Christmas?
Witness: In a way.
Interrogator: Yet Christmas is a winter's day, and I do not think Mr. Pickwick would mind the comparison.
Witness: I don't think you're serious. By a winter's day one means a typical winter's day, rather than a special one like Christmas.
I think the problem is that the way Turing was picturing the test, the human interrogators would be as smart as Turing and his friends: people who actually know how to ask probing questions. When you look at the conversation above, you see that he had in mind a program that does things decades beyond what chatbots can do today. Everybody is dissing the Turing test, but if it has a problem, it's that Turing overestimated people, in assuming that they actually know how to have conversations of significance.
I still think there is something deeply significant about the Turing test, but in the one that I'm picturing, the interrogators must all be broadly educated experts on natural language processing with specific training in how to expose chatbots. And there should be money on the line for the interrogators: a $1000 bonus for each correct identification, a $2000 penalty for each incorrect identification, and no penalty for "not sure". If the majority of such experts can be fooled by an AI under these circumstances, then I think we should all be impressed.
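The payoff numbers above aren't arbitrary; they set a break-even confidence. A quick check (a sketch of the proposed scheme only, with a made-up `expected_payoff` helper) shows that committing to a guess only pays when the judge's confidence p satisfies 1000*p - 2000*(1 - p) > 0, i.e. p > 2/3, so anyone merely coin-flipping is pushed toward "not sure".

```python
def expected_payoff(p: float, bonus: float = 1000, penalty: float = 2000) -> float:
    """Expected dollars from committing to a guess held with confidence p."""
    return bonus * p - penalty * (1 - p)

break_even = 2000 / (1000 + 2000)  # p = 2/3
print(f"{expected_payoff(0.5):.2f}")  # -500.00: a coin-flip judge loses money
print(f"{expected_payoff(0.9):.2f}")  # a confident, mostly-right judge profits
```

With those stakes, identifications that do get made carry real information, which is exactly what the 30%-of-casual-judges criterion lacks.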