Wikipedia Used for Artificial Intelligence 177

Posted by Zonk on Sunday January 07, 2007 @02:25PM from the great-it-has-finally-become-self-aware dept.

eldavojohn writes "It may be no surprise but Wikipedia is now being used in the field of artificial intelligence. The applications for this may be endless. For instance, the front of spam fighting is a tough one and it looks as though researchers are now turning towards an ontology or taxonomy based solution to fight spammers. The concept is also on the forefront of artificial intelligence and progress towards an application passing the Turing Test and creating semantically aware applications. The article comments on uses of Wikipedia in this manner: '"... spam filters block all messages containing the word 'vitamin,' but fail to block messages containing the word B12. If the program never saw B12 before, it's just a word without any meaning. But you would know it's a vitamin," Markovitch said. "With our methodology, however, the computer will use its Wikipedia-based knowledge base to infer that 'B12' is strongly associated with the concept of vitamins, and will correctly identify the message as spam," he added.'"
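The B12 example in the quote above can be sketched in a few lines. This is only an illustration, not the researchers' method: the `CONCEPTS` table is a hypothetical, hand-written stand-in for a Wikipedia-derived knowledge base, and the association strengths, `BLOCKED_CONCEPTS`, and `THRESHOLD` are made-up values.

```python
# Hypothetical stand-in for a Wikipedia-derived knowledge base:
# each term maps to related concepts with an association strength in [0, 1].
# All entries and numbers here are invented for illustration.
CONCEPTS = {
    "b12": {"vitamin": 0.9, "nutrition": 0.6},
    "vitamin": {"vitamin": 1.0},
    "meeting": {"business": 0.7},
}

BLOCKED_CONCEPTS = {"vitamin"}  # concepts the filter is configured to block
THRESHOLD = 0.5                 # assumed minimum association strength


def blocked_concepts_in(message):
    """Return the blocked concepts that any word in the message maps to."""
    hits = set()
    for raw in message.lower().split():
        word = raw.strip(".,!?;:'\"")
        for concept, strength in CONCEPTS.get(word, {}).items():
            if concept in BLOCKED_CONCEPTS and strength >= THRESHOLD:
                hits.add(concept)
    return hits


def is_spam(message):
    """Flag a message when at least one word maps to a blocked concept."""
    return bool(blocked_concepts_in(message))
```

A plain keyword filter looking for "vitamin" would miss "Cheap B12 today!", while the concept lookup flags it because "b12" maps to the blocked concept. As several commenters note below, the same inference also raises the risk of false positives.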


This discussion has been archived. No new comments can be posted.


Wikipedia needs work for spam filtering.... (Score:2, Insightful)

by MoHaG ( 1002926 ) writes: on Sunday January 07, 2007 @02:31PM (#17499066) Homepage

With the example of using Wikipedia for spam filtering as mentioned in the post, maybe more articles need to be written on spam-slang for Viagra....

Comment removed (Score:3, Insightful)

by account_deleted ( 4530225 ) writes: on Sunday January 07, 2007 @02:35PM (#17499108)

Comment removed based on user account deletion

Save me! Math. (Score:1, Insightful)

by Anonymous Coward writes: on Sunday January 07, 2007 @02:38PM (#17499148)

"The applications for this may be endless. For instance, the front of spam fighting is a tough one and it looks as though researchers are now turning towards an ontology or taxonomy based solution to fight spammers. "

So what happened to Bayesian filters as our saviour [slashdot.org]?

Re:uh oh, there goes wikipedia (Score:5, Insightful)

by WilliamSChips ( 793741 ) writes: <full.infinity@gma[ ]com ['il.' in gap]> on Sunday January 07, 2007 @02:50PM (#17499286) Journal

You don't think there are hundreds of thousands of zombifiable computers in the United States? And what about people with business connections in China or Korea?

Since when (Score:4, Insightful)

by trifish ( 826353 ) writes: on Sunday January 07, 2007 @02:54PM (#17499316)

Since when does a database + automated search (keyword patterns and relations) = artificial intelligence?

Just make spam a crime! (Score:4, Insightful)

by D4C5CE ( 578304 ) writes: on Sunday January 07, 2007 @02:58PM (#17499358)

However many academic papers and spam filters throw their ever-more-elaborate algorithms at this issue, it is an arms race that cannot be won by the "good guys", as long as lawmakers keep pretending that technology alone could prevent the effects of sociopathic behavior: unsolicited bulk messages won't go away unless sending them is severely punishable and vigorously prosecuted in all nations that contribute to the problem. This should have happened more than a decade ago, but now the world is simply running out of storage, bandwidth and CPU cycles much too quickly to afford waiting another decade (or even a year) for serious, intransigent anti-spam legislation that is long overdue.

Re:Wikipedia needs work for spam filtering.... (Score:1, Insightful)

by Anonymous Coward writes: on Sunday January 07, 2007 @03:01PM (#17499392)

By spam slang, do you mean stuff like V1AGRA or V14GR4 or V1I1A1G1R1A?
If so, I'm pretty sure that's a pattern recognition problem.
As long as the AI knew the correct spelling of Viagra, it would be able to recognise the characters of the word Viagra in V1I1A1G1R1A.
Also, you could train an AI to recognise 1 as I or L, so that when the text V14GR4 appears, it knows what Viagra is, realises the text looks like it, and raises the probability of the text being spam.
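A rough sketch of that idea, under stated assumptions: the `LEET` substitution table and the `KNOWN_WORDS` set are hypothetical and far too small for real use. Each token is tried two ways, once with digits mapped to look-alike letters (covers V1AGRA and V14GR4) and once with non-letters stripped as padding (covers V1I1A1G1R1A).

```python
# Assumed look-alike substitutions; a real filter would need a larger table
# (the comment above also suggests treating 1 as L, which is not handled here).
LEET = str.maketrans({"1": "i", "4": "a", "3": "e", "0": "o", "@": "a", "$": "s"})

# Hypothetical list of correctly spelled words the filter watches for.
KNOWN_WORDS = {"viagra"}


def candidates(token):
    """Return the plain-text readings of a possibly obfuscated token."""
    lowered = token.lower()
    # Reading 1: digits and symbols stand in for letters (V14GR4 -> viagra).
    substituted = lowered.translate(LEET)
    # Reading 2: digits and symbols are padding (V1I1A1G1R1A -> viagra).
    stripped = "".join(ch for ch in lowered if ch.isalpha())
    return {substituted, stripped}


def looks_like_known_word(token):
    """True if either reading of the token is a watched word."""
    return bool(candidates(token) & KNOWN_WORDS)
```

A match would only be one signal that raises the spam probability, not a verdict on its own, since legitimate mail can contain the same words.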

More abstract phrases would be harder to classify, but there is a link to slang words for stuff like http://en.wiktionary.org/wiki/Wikisaurus:penis#English [wiktionary.org]

so stuff like "got wood?" etc could in theory be classified.

Re:uh oh, there goes wikipedia (Score:3, Insightful)

by Walt Dismal ( 534799 ) writes: on Sunday January 07, 2007 @03:17PM (#17499548)

I agree that using Wikipedia opens up the knowledge base to strategic contamination. Any party with a vested interest could alter certain information and bias AIs using it. That is why I think the Israeli approach cited will run into problems.
In my own research I've looked at the problem of AI knowledgebase contamination and know that unless a truth validation system is employed, it is all too easy to condemn the poor AI to reasoning with flawed data. And it's very difficult to design a good validation mechanism. Can you use 'common' knowledge and opinion to check against? Well, the masses aren't always right. There are a lot of falsehoods floating around the Internet. Collecting a pool of information from various sources requires effort to cross-check and evaluate.
Of course humans face the same problem, and a lot of people reason with incomplete, incorrect, invalid data. Which might explain why the dollar is dropping versus the Euro. :)

Not very "intelligent" (Score:5, Insightful)

by iamacat ( 583406 ) writes: on Sunday January 07, 2007 @03:28PM (#17499632)

There are lots of legit e-mails discussing vitamins, Viagra or even penis enlargement, this post included.

Re:Wikipedia needs work for spam filtering.... (Score:5, Insightful)

by Metasquares ( 555685 ) writes: <slashdot@met[ ]uared.com ['asq' in gap]> on Sunday January 07, 2007 @04:31PM (#17500176) Homepage

Infer too much and the false positive rate skyrockets, though...

Re:Since when (Score:3, Insightful)

by maxwell demon ( 590494 ) writes: on Sunday January 07, 2007 @06:06PM (#17501094) Journal

What part of human/animal intelligence is not detecting, storing, and applying patterns and relations?

The creative part?

Comment removed (Score:2, Insightful)

by account_deleted ( 4530225 ) writes: on Sunday January 07, 2007 @06:55PM (#17501504)

Comment removed based on user account deletion
