Block Spam Bots With Free CAPTCHA Service

Posted by timothy on Wednesday November 12, 2003 @01:55PM from the screwtape-and-pals-shriek dept.

Chirag Mehta writes "I just released a freeware service called BotBlock (barebones demo) that lets site owners copy/paste a few lines of PHP code and insert a CAPTCHA image-verification system into any web form. Form spamming by bots is on the rise. While remedies exist for MT blogs, a more efficient solution is to use image verification or text identification. Used for a while by sites like Yahoo! (scroll to bottom) and Hotmail, and patented in 2001 by AltaVista, CAPTCHAs are now being used more widely. PARC also came up with two algorithms, BaffleText and Pessimal Print. The technology has always existed, but until now it required site owners to install image libraries and understand how to generate images that cannot be OCR'ed. With BotBlock it is like inserting a page counter."

This discussion has been archived. No new comments can be posted.

What about blind people? (Score:5, Interesting)

by FattMattP ( 86246 ) writes: on Wednesday November 12, 2003 @01:58PM (#7454692) Homepage

What about people who are blind or visually impaired? Does your implementation take that into account?

- Re:What about blind people? (Score:2, Informative)
  
  by Phoenix Dreamscape ( 205064 ) writes:
  
  They have one that generates sounds. You're in trouble if you're blind and deaf, though.
  - Re:What about blind people? (Score:2)
    
    by FattMattP ( 86246 ) writes:
    
    They have one that generates sounds.
    
    Where on his demo page [chimetv.com] does it allow me to hear a sound? There's no mention of it on the main page [chimetv.com] either.
- Re:What about blind people? (Score:5, Interesting)
  
  by Glass of Water ( 537481 ) writes: on Wednesday November 12, 2003 @02:33PM (#7455081) Journal
  
  What they should do is use a question, written out in regular HTML text that is easy for a human to answer but hard for a computer. Example: What color is the sky on a cloudless day? Another example: My name is Joe Frank Smith. What are my initials?
  Think those are easy for basic AI bots? Then try them with one of the existing online bots [alicebot.org].
  Seems like the problem with this (as opposed to generating pictures) is that it's hard to generate question/answer pairs where there is a one-word or obvious single answer. You don't want to use yes/no questions or questions where the answer is a word in the question ("Which is heavier, lead or cotton?").
  
  - Re:What about blind people? (Score:1)
    
    by J x ( 160849 ) writes:
    
    Wouldn't it be amusing (chilling?) if, in an effort to circumvent your proposed security measure, spammers stumbled upon true AI ?
    - Re:What about blind people? (Score:2)
      
      by Glass of Water ( 537481 ) writes:
      
      The way I see it, we win either way (unless these spammer-created bots become our new overlords).
      - Terminator 4: Rise of the Spambots (Score:2)
        
        by Channard ( 693317 ) writes:
        
        The way I see it, we win either way (unless these spammer-created bots become our new overlords).
        And if they do, the worst they'll do is try to sell us penis enlargement pills, which is still preferable to a Terminator style apocalypse.
  - Re:What about blind people? (Score:3, Interesting)
    
    by Jerf ( 17166 ) writes:
    
    What they should do is use a question, written out in regular HTML text that is easy for a human to answer but hard for a computer. Example: What color is the sky on a cloudless day?
    
    I'm afraid I'd have to recommend against using that question for blind people.
    
    Might want to pick your examples a bit more carefully ;-)
    
    (Not that it's absolutely impossible they'd know the answer, but it's mere meaningless trivia to someone who has been blind from birth; I don't think I'd remember it.)
    
    Think those are easy f
    - Re:What about blind people? (Score:2)
      
      by hbo ( 62590 ) * writes:
      
      I'm already getting SPAM that gets through SpamAssassin's Bayesian filter. They include lots of non-spammish words as white text on a white background. Then they break up the SPAM spew with unbalanced, bogus closing tags. For example:
      
      "En</figure>large yo</allowed>ur me</plastic>mber!"
      
      which helpful HTML renderers will print in glorious spamavision. (As Slashdot's did until I enclosed the example in an ecode block.)
      
      Your point is well taken. If you come up with a suite of questions, the spa
      - Re:What about blind people? (Score:2)
        
        by Carnildo ( 712617 ) writes:
        
        Either the filter will learn the bogus tags, or SpamAssassin will get a spam test that assigns a high score to the tags.
        
        Re:What about blind people? (Score:2)
        
        by hbo ( 62590 ) * writes:
        
        It would have to be the latter, since the tag text could be any dictionary word whatsoever, except some currently open tag.
        
        Assigning a high score merely to "bogus" closing tags would be bad too, because of XML. You could score a large number of poorly formed (in the XML sense) tags as suspect. Doing so for only one or two might catch fat-fingered, but otherwise innocent coders. 8)
        
        Re:What about blind people? (Score:2)
        
        by Carnildo ( 712617 ) writes:
        
        Does any e-mail software use XML rather than HTML for formatting e-mail?
      - This can work at a low level (Score:2)
        
        by jtheory ( 626492 ) writes:
        
        If you come up with a suite of questions, the spammer can come up with a suite of responses.
        
        You (and parent poster) have some good points here. Something you're missing, though -- you're still thinking in terms of a large service that can be reused by lots of websites.
        
        Suppose the system only offered the framework, and you had to provide (and rotate) the questions yourself for your own website. I'm thinking of writing a filter question into my forms, since I hate those text recognition things (my eyesig
        
        Re:This can work at a low level (Score:2)
        
        by Carnildo ( 712617 ) writes:
        
        The simple fact that you're doing the forms yourself will stop 99.9% of all spambots. A spambot usually doesn't download the page and fill it in, it takes a list of pages known to have submission forms of a known type (usually found by a google search) and submits pre-filled forms to them. Since you're doing a custom form, a spammer would need to find your form, and then spend the time to tell his spambot how to fill it out -- a much less productive use of time than finding more customers to spam for.
      - Re:What about blind people? (Score:2)
        
        by herrvinny ( 698679 ) writes:
        
        What about running the email through SpamAssassin, then strip out all HTML tags and run the message itself through it? That should kill it. Or just switch to text email.
        
        Re:What about blind people? (Score:2)
        
        by hbo ( 62590 ) * writes:
        
        That's possible, but difficult. The bogus tags themselves reveal why that's so. They are not valid HTML, but they have the form of valid closing tags. Though I don't know the pre-XML (read fairly current) HTML spec very well, and being too lazy to look it up at this hour, I nevertheless seem to recall that it says browsers should ignore tags they don't recognize. In any event, browsers are notoriously liberal about what they will render, so as to make the "user experience" nicer, and the job of standardizat
        
        Re:What about blind people? (Score:2)
        
        by silentbozo ( 542534 ) writes:
        
        Well, there is the final solution - whitelisting. Unfortunately, like the atomic bomb, it may render the battlefield unfit for human consumption...
      - Re:What about blind people? (Score:1)
        
        by Ed Avis ( 5917 ) writes:
        
        In this case the obvious cure is to render the 'HTML' to plain text first and then do spam-checking on that. Of course if you use a lame mail reader that really wants to display the lovely red colours and FONT SIZE="+9" then you still have a mismatch between what is checked and what is displayed, but not such a big one.
    - Re:What about blind people? (Score:2)
      
      by dubious9 ( 580994 ) writes:
      
      perl -pe 's/My name is (\w)\w* (\w)\w* (\w)\w*\. What are my initials\?/$1$2$3/g'
      (Try it on your question. Be sure to type the question precisely.)
      
      What is the perl code for arbitrary questions? The spam programmer doesn't have access to your question. Nobody has programmed a bot that can correctly answer arbitrary questions. There is no current way to de-obfuscate (er.. clarify?) this problem. All everybody has to do is write a unique question that a normal person would understand.
      
      Then you are on t
      - Re:What about blind people? (Score:2)
        
        by Jerf ( 17166 ) writes:
        
        I'd reply, but I already have.
        
        BTW, before criticising this 'solution', be sure you understand what an arms race is. I know you could further obfuscate it. But you could also further de-obfuscate it. And believe me, with a halfway intelligent system I can keep pace with you; for instance, if I write my cheating spammer so it brings things to my attention in real time as it can't figure them out, I can build a solution bank pretty quickly, not quite as quickly as you can create new challenges (well, maybe, if
    - Re:What about blind people? (Score:2)
      
      by Glass of Water ( 537481 ) writes:
      
      Might want to pick your examples a bit more carefully ;-)
      Uh, Oh! It's harder than I thought!
      Your criticism of generating question/answer pairs is insightful. Don't forget that the bots can also learn to read the pictograms (I think there's a paper on this linked off the captcha.org home page). Whatever type of Turing test you come up with, there are likely to be holes in it.
      I'm also aware that even a small hole can be just as bad as a big one. I guess the question is whether you can have enough of
      - Re:What about blind people? (Score:2)
        
        by herrvinny ( 698679 ) writes:
        
        *Nods I agree. Arms races are fine, they may even be beneficial, because in this race, each side works harder and harder to increase the capabilities of a computer. That can only be a net good, because someday something good is going to come out of all this anti-spam research. But for now, we have to concentrate on this arms race. As long as we can keep a small advantage over spammers, keep them reacting to us, we hold the advantage. Some military general once said that you have to keep the enemy reactin
  - Re:What about blind people? (Score:5, Insightful)
    
    by herrvinny ( 698679 ) writes: on Wednesday November 12, 2003 @08:28PM (#7459594)
    
    The problem is generating all those sentences. The sentences have to vary; they can't all be "My name is Barney Big Purple Dinosaur. What are my initials?" and "My name is Einstein Mozart Bach Quartet. What are my initials?", because then a spammer could just use regular expressions to handle them. Even Java introduced an easy-to-use regex package a few versions ago. Another problem is that you would have to generate literally billions of them, because a spammer may theoretically just hit a service with billions of requests, and who's to say whether the requests are real or not? And then the ultimate problem: how are we going to generate all these questions? With a computer, of course, but then how does a computer generate billions of these things so that only a human and not a computer can interpret them? At that point, you're approaching true AI. And if we had AI, forget the spam problem: just have the AI process each and every email.
    
    - Re:What about blind people? (Score:2)
      
      by Glass of Water ( 537481 ) writes:
      
      Yeah. That's definitely the challenge.
      I really didn't mean to use the same format question and just change the insignificant bits. It just so happens that the examples I chose are bad. I really mean you have to have a supply of question/answer pairs where the answer is obvious and not contained in the question.
      That this is a problem only AI can solve has not been demonstrated. It's clear that it's a hard problem, though.
      Maybe you could come up with a model for simple things that people understand
- Re:What about blind people? (Score:1)
  
  by dbullock ( 32532 ) writes:
  
  That's what alt tags are for.
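
Ed Avis's suggestion earlier in this thread (render the 'HTML' to plain text first, then spam-check that) can be sketched in a few lines. This is only an illustration using Python's standard `html.parser`, not how SpamAssassin actually works; the names `TextOnly` and `visible_text` are made up for the example. It drops every tag, including bogus closing tags like the ones in hbo's sample, and keeps the character data. It does not deal with the white-on-white trick, which would need style information.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects character data and drops every tag, known or bogus."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for each run of text between tags.
        self.parts.append(data)


def visible_text(markup):
    """Return roughly the text a lenient HTML renderer would display."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()  # flush any text still buffered at end of input
    return "".join(parser.parts)


if __name__ == "__main__":
    sample = "En</figure>large yo</allowed>ur me</plastic>mber!"
    # The reassembled text is what a filter should score.
    print(visible_text(sample))
```

Running the filter on the output of `visible_text` rather than on the raw message means the tokens it sees are whole words again, so the tag-splitting trick stops hiding them.
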
much better (Score:3, Informative)

by capoccia ( 312092 ) writes: on Wednesday November 12, 2003 @01:59PM (#7454708) Journal

much better than blacklists and captcha is a bayesian filter.

blacklists are inaccurate: blacklisted words can be misspelled and pass through.

captcha discriminates against the disabled and cuts them off from online discussions.

James Seng has crafted a good bayesian filter for movable type [james.seng.cc].
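
For readers who have not seen one, here is a rough sketch of what a Bayesian comment filter does. It is a generic naive Bayes classifier with add-one smoothing, written for illustration; it is not James Seng's code, and the class name `TinyBayes` and the training sentences are invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class TinyBayes:
    """Minimal naive Bayes spam/ham classifier (illustrative only)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.docs[label] += 1
        self.counts[label].update(tokenize(text))

    def spam_probability(self, text):
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        total_docs = self.docs["spam"] + self.docs["ham"]
        words = tokenize(text)
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed class prior.
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            total_words = sum(self.counts[label].values())
            for word in words:
                # Add-one smoothing keeps unseen words from zeroing the score.
                seen = self.counts[label][word] + 1
                score += math.log(seen / (total_words + len(vocab) + 1))
            log_score[label] = score
        # Turn the two log scores into P(spam | text); clamp to avoid overflow.
        diff = max(-700.0, min(700.0, log_score["ham"] - log_score["spam"]))
        return 1.0 / (1.0 + math.exp(diff))


if __name__ == "__main__":
    f = TinyBayes()
    f.train("cheap pills buy now", "spam")
    f.train("buy cheap watches now", "spam")
    f.train("great post thanks for sharing", "ham")
    f.train("I disagree with your point about filters", "ham")
    print(f.spam_probability("buy cheap pills"))
    print(f.spam_probability("thanks for the great post"))
```

Unlike a blacklist, the word weights come from the training data, so a misspelled variant that keeps showing up in spam gets learned as well.
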

okay class, pencils down (Score:2, Interesting)

by Phoenix Dreamscape ( 205064 ) writes:

Some of the examples on their site take a lot more time and mental effort than just looking at a word and typing it. I would be very bothered if I had to take one of those little tests just to fill out a form.
- Re:okay class, pencils down (Score:2)
  
  by AllUsernamesAreGone ( 688381 ) writes:
  
  Even better then: it not only stops spammers, it ensures that only people with a real need actually fill the form in.
  
  Maybe it could be modified so that only people with >120 IQ can fill in the form too.... hmmmm.....
- - Re:what about accessibility? (Score:1)
    
    by GreenHell ( 209242 ) writes:
    
    <img alt=november oscar india delta echo alpha>
    
    There's two problems with that:
    
    First, no alt text is provided in the linked-to implementation.
    
    Secondly, by doing so you've just eliminated the usefulness of the image as a spam bot blocker. I mean, how long would it really take someone to fix up the code on their spam bot to check for alt text and swipe the first letter of each word in it to deal with that kind of situation?
    
    The entire point of the image was that it couldn't be read by machines, by pro
    - YHBT (Score:1)
      
      by recursiv ( 324497 ) writes:
      
      The entire point of the image was that it couldn't be read by machines, by providing alt text you've just removed that restriction and the image's usefulness along with it.
      
      The poster knew this. It was either a joke or a troll, or both.
- Re:what about accessibility? (Score:1)
  
  by lynx_user_abroad ( 323975 ) writes:
  
  Works just fine for me. Of course, the text browser has to be tied into a graphical imager (like Gimp) to display the one small image, but it was surprisingly intuitive.
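
GreenHell's objection above is easy to make concrete. The sketch below assumes a hypothetical page that puts the codeword in the image's alt text as phonetic-alphabet words, with the attribute value quoted; the helper names are invented. A bot needs nothing more than the standard library to read it back.

```python
from html.parser import HTMLParser


class AltCollector(HTMLParser):
    """Records the alt text of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.alts = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.alts.append(alt)


def guess_codeword(page_html):
    """Take the first letter of each word in the first alt text found."""
    collector = AltCollector()
    collector.feed(page_html)
    collector.close()
    if not collector.alts:
        return None
    return "".join(word[0] for word in collector.alts[0].split()).upper()


if __name__ == "__main__":
    page = '<form><img src="captcha.png" alt="november oscar india delta echo alpha"></form>'
    print(guess_codeword(page))
```

Any accessible alternative has to be a separate challenge that is itself hard for a machine, such as the audio variants discussed elsewhere in the thread, rather than a plain-text copy of the answer.
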
Botcheck (Score:2)

by BrookHarty ( 9119 ) writes:

I tried to sign up with a forum this weekend, and I couldn't make out the letters; I couldn't tell the zero from an "O". Only a minor problem, it still has a few bugs to be worked out. But it's nice to have real-time authorization instead of waiting for an email to authorize the account.

Also lots of services, are there any good free downloadable php addons?
Blatant Plug (Score:2, Informative)

by gavinroy ( 94729 ) * writes:

For my GPL'ed PHP Captcha software:

http://sourceforge.net/projects/session-captcha/ [sourceforge.net]
Patented? (Score:3, Interesting)

by orthogonal ( 588627 ) writes: on Wednesday November 12, 2003 @02:18PM (#7454930) Journal

patented in 2001 by AltaVista

If AltaVista patented it, does BotBlock license the patent? Or will this service be rather short-lived?

- Re:Patented? (Score:2)
  
  by Goo.cc ( 687626 ) * writes:
  
  That's what I want to know. It would seem that this software is violating the patent.
I'm neither blind nor deaf, but... (Score:2, Interesting)

by jcwren ( 166164 ) writes:

...the images here [captcha.net] are absolutely unreadable. If I had to use this to subscribe to a site or forum, or fill out a form, I'd just say "screw it", and wander on down the 'net.
- Re:I'm neither blind nor deaf, but... (Score:2)
  
  by Curien ( 267780 ) writes:
  
  You sure? Looks fine to me. Takes a small bit of effort, perhaps, but it's definitely readable.
- Re:I'm neither blind nor deaf, but... (Score:2)
  
  by recursiv ( 324497 ) writes:
  
  absolutely unreadable? try reloading. i haven't found one yet that was remotely challenging.
BotBlock looks breakable (Score:1)

by JukkaO ( 199949 ) writes:
Not that I really looked at how configurable this is, but...

...seems to me this BotBlock thingy wouldn't be that hard to decode, judging by the example, at least.
- The font is fixed-width with black outlines on each letter
- The background consists of single-color filled ellipses and/or circles.
- Clicking the image gives you a new pic with the exact same codeword.
Ssooo, I bet it's feasible to figure out where the codeword starts on the pic. And since the font is easy I guess you can figure out each of the
The new Turing test? (Score:2)

by G4from128k ( 686170 ) writes:

It seems like all these clever bot deflectors are really intelligence tests of one form or another. That they discriminate against the blind, non-English-speakers or people with lower IQ is a shame. Bot makers will now work hard to OCR given classes of text-image-disruption algorithms or answer given classes of common sense questions. This means we will have an arms race of smarter bots and tougher tests.

At some point the tests will be so tough and the bots will be so good that many people will be thw
- - Re:The new Turing test? (Score:2)
    
    by herrvinny ( 698679 ) writes:
    
    Then we'll ban all IP's from foreign countries. I do it already with email.
Unique CAPTCHA Implementation (Score:3, Informative)

by madstork2000 ( 143169 ) writes: on Wednesday November 12, 2003 @06:32PM (#7458302) Homepage

I'm working on another version, which I believe is unique at this point. (At least I didn't find anything like it on Google a few weeks ago).

See a sample at the link below. (DISCLAIMER: This site is a small self-run hosting company, has "sales" links, and is of a commercial nature. So if you're going to get all pissed off because I am trying to feed my kids, please do not click through. The sample does not collect or log anything outside of what Apache routinely collects.) http://webshowhost.com/main.php?smPID=PHP::ui_human_verify.php&caseFlag=SAMPLE [webshowhost.com]

What makes this implementation unique is that the user must identify both color and characters in the pattern. It combines multiple levels of recognition: the user must understand the concept of COLOR as well as the characters. This should make it particularly difficult for SPAM bots to decipher, since color is very subjective. I am posting this here mainly to establish prior art (as I have not seen any test use these concepts before) in case some joker tries to patent this variety of CAPTCHA.

My variety integrates into a toolkit I've developed, but basically it uses ImageMagick's montage to fuse pre-rendered image bitmaps into a single JPEG.

It is obviously weak in the sense that it discriminates against blind folks and illiterate folks. On the bright side it has definitely eliminated ALL of my spam!

If you're interested in this, contact me at captcha1@webshowpro.com [mailto] ** Note you'll have to verify yourself with the prototype system to send mail to that account.

I'll do my best to provide you with the relevant code. I don't have time at this point to lead a project (as my company is a one-man show barely scraping by at this point). So my apologies in advance if I cannot support the code to your satisfaction.
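
The server-side bookkeeping for a color-plus-characters test might look something like the sketch below. This is a guess at the general shape, not the poster's toolkit: the color list, the function names, and the choice to skip look-alike letters such as I and O are all assumptions, and the step that actually renders the image is left out.

```python
import random

# Assumed palette; the real system's colors are not stated in the post.
COLORS = ("black", "red", "blue")
# Letters without I and O, which are easy to confuse with 1 and 0.
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ"


def make_challenge(rng, length=4):
    """Return (strings_by_color, target_color, expected_answer).

    One random string is drawn per color; the user is asked to type
    only the characters shown in the target color.
    """
    strings = {
        color: "".join(rng.choice(ALPHABET) for _ in range(length))
        for color in COLORS
    }
    target = rng.choice(COLORS)
    return strings, target, strings[target]


def check_answer(expected, submitted):
    """Compare the submission with the stored answer, ignoring case and padding."""
    return submitted.strip().upper() == expected


if __name__ == "__main__":
    rng = random.Random()
    strings, target, expected = make_challenge(rng)
    print("render these strings, overlapped:", strings)
    print("ask the user for the", target, "characters")
    print(check_answer(expected, expected.lower()))
```

The expected answer stays on the server (for example in the session), and only the rendered image and the color instruction are sent to the browser.
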

- Re:Unique CAPTCHA Implementation (Score:2)
  
  by madstork2000 ( 143169 ) writes:
  
  I forgot to mention I am working on a version for blind folks that works pretty much the same way, but instead of stitching together images, it will stitch together sound bites of the alphabet to make the pass phrase. To help avoid confusion I started with "A - Alpha", "B - Bravo", "C - Charlie", etc., though I don't have enough done to test how average users respond to this format.
  
  There has not been much demand, so I have not made much progress since my initial tests.
  
  Overall it will be a little weaker i
- Re:Unique CAPTCHA Implementation (Score:4, Insightful)
  
  by Carnildo ( 712617 ) writes: on Wednesday November 12, 2003 @08:27PM (#7459584) Homepage Journal
  
  A few things to keep in mind:
  1) Colorblind people (10% of the male population of the world). By far the most common form of colorblindness is red/green, so as long as you stick with easily-distinguished colors like black, red, and blue, you should be fine. You could probably add yellow and a medium grey to the mix, but yellow can be hard for normal people to read, and on some monitors, grey can be mistaken for black.
  2) Increase the overlapping of the characters a bit. Right now, the characters can usually be separated out by color into three images, at which point a spambot can simply pick the one that matches the color of the instruction image.
  3) You can make an audio CAPTCHA harder for computers to recognize by adding noise to the sound, or by using recordings of a person with a strong accent (or better still, a variety of accents)
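
Point 2 above can be illustrated with a toy example. Assume the image is reduced to a grid where each cell holds a one-letter color code (R, B, K for red, blue, black, and "." for background); that representation and the function name are invented for the sketch. If every pixel keeps a clean color, a bot only has to keep the cells that match the instructed color.

```python
def isolate_color(rows, target, background="."):
    """Blank out every cell that is not the target color."""
    return [
        "".join(cell if cell == target else background for cell in row)
        for row in rows
    ]


if __name__ == "__main__":
    # Toy "image": each character is the color code of one pixel.
    image = [
        "RRB.K",
        "R.BBK",
        "RRB.K",
    ]
    for line in isolate_color(image, "B"):
        print(line)
```

With heavier overlap, pixels of the target color get covered by other colors, so the isolated layer has gaps and each character's shape is harder to recover.
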
  
Not a perfect solution (Score:4, Insightful)

by Eric Savage ( 28245 ) writes: on Friday November 14, 2003 @02:46PM (#7475621) Homepage

Even if you had an image that was 0% readable by OCR, image verification only stops "pure bot" spamming. It does not stop someone writing a helper or proxy app that presents them with a list of 1000 images that they type out in a very efficient manner. This could mean the difference between a million and a thousand spams per hour, but that's still a thousand spams per hour. And if you dismiss this as something that nobody would bother to do, you obviously don't know anything about spammers...
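
The thousand-per-hour figure is plausible as back-of-envelope arithmetic. The sketch below assumes a person typing one image every 3.6 seconds, which is an illustrative number and not a measurement; the function name is made up.

```python
def solves_per_hour(ms_per_image, operators=1):
    """Images typed per hour by `operators` people at a steady pace."""
    ms_per_hour = 3_600_000
    return operators * (ms_per_hour // ms_per_image)


if __name__ == "__main__":
    print(solves_per_hour(3600))                # one person, 3.6 s per image
    print(solves_per_hour(3600, operators=10))  # ten people at the same pace
```
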
