Spam

DSPAM v3.2 Released 157

Nuclear Elephant writes "After four months of development DSPAM v3.2 has been released, bringing many new enhancements and filtering technologies. These include distributed computing support, implementation of Bill Yerazunis' Sparse Binary Polynomial Hashing algorithm (from CRM114), and v1.2 of Bayesian Noise Reduction. Other enhancements include SQLite support and many significant performance enhancements for PostgreSQL. DSPAM's official release is next week, but you can download the preview release now. Users of the project have also contributed towards creating a new logo for this release."
This discussion has been archived. No new comments can be posted.

  • Are most people using a Bayesian filter (DSPAM, CRM114, or SpamBayes) along with SpamAssassin (rule-based)? Or do you just use the Bayesian filter?

    I see that most of these bayesian filtering programs mention that they can be used with SpamAssassin. Is it usually best to run both for DoublePlusGood(TM) spam catching?
    • Re:second post? (Score:5, Interesting)

      by hayds ( 738028 ) on Sunday October 17, 2004 @04:18AM (#10549259)

      I would have thought that running 2 bayesian filters would cause more trouble than good. The first filter would be ok as it would be trained like usual.

      The second filter would probably have problems because it would only see a small subset of all your mail as the first filter would have removed most of the spam. The second filter's sample would therefore be skewed and it would have far less data to accurately classify spam.

      Just my thoughts on the subject anyway...

      • It depends on when the actual filter step occurs. For example, SA (SpamAssassin) by default only marks the message. The actual deletion (by the second filter) -could- occur after SA gets through it. Basic example: SA. Other filter. Other filter Act or SA Act in either order.

        The funny part is if the second filter includes headers as part of its bayesian filtering, the second filter could become biased based on spamassassin's results :P
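One way to avoid the bias described above is to strip SpamAssassin's verdict headers before the message reaches the second filter. This is a minimal sketch, assuming the first filter marks messages with headers whose names start with `X-Spam-` (for example `X-Spam-Status` and `X-Spam-Flag`); the function name is hypothetical and is not part of DSPAM or SpamAssassin.

```python
from email import message_from_string


def strip_spamassassin_headers(raw_message: str) -> str:
    """Return the message text with every X-Spam-* header removed.

    Assumption: the upstream filter marks mail only via X-Spam-* headers.
    """
    msg = message_from_string(raw_message)
    # Copy the key list first, because we delete while iterating.
    for name in list(msg.keys()):
        if name.lower().startswith("x-spam-"):
            # Deleting a header removes all occurrences of that name.
            del msg[name]
    return msg.as_string()


if __name__ == "__main__":
    sample = (
        "From: a@example.com\n"
        "X-Spam-Status: Yes, score=7.1\n"
        "Subject: hi\n"
        "\n"
        "body\n"
    )
    print(strip_spamassassin_headers(sample))
```

The second filter then learns only from the message itself rather than from the first filter's opinion of it.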
    • I use dspam alone on my system and it does a very good job of pulling about 80 spams a day from my inbox. I can't say that nothing gets through, but little does. few enough that I don't mind too much, maybe one or two a week.

      What does surprise me is that sometimes obvious spams seem to get through, i.e. every now and then a 419 comes through and I'd have thought it would be well trained on those by now. Nevertheless, it works a lot better for me than SpamAssassin did, and it requires less (or easier) mainten
    • Re:second post? (Score:3, Interesting)

      by atrus ( 73476 )
      I'm running Postfix with RBLs: SpamCop, SpamHaus, and SORBS. It automatically rejects all e-mail coming from listed IPs. This brings me down to 1 spam a day. If your IP is blocked, tough, find a new ISP (these lists tend to be self-expiring rather than 'permanent ban' types, which is good).
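For readers unfamiliar with how an RBL check works, here is a rough sketch of the lookup a mail server performs: the client IP's octets are reversed, the list's DNS zone is appended, and the resulting name is resolved. The zone `dnsbl.example.org` is a placeholder, not a real list, and the helper names are hypothetical; this is not how Postfix implements it internally.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query: reversed IPv4 octets + list zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the list has an A record for this IP.

    A failed lookup (no such name) is treated as "not listed".
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # Only builds the name; no network access is needed for this line.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

A server consulting several lists simply repeats the lookup once per zone and rejects the connection if any of them answers.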
      • Re:second post? (Score:5, Informative)

        by flynn_nrg ( 266463 ) <mmendez@NoSpAM.gmail.com> on Sunday October 17, 2004 @05:58AM (#10549465) Homepage Journal

        It's your server and hopefully you'll never have to suffer the 'collateral damage' of living near a spammer (network neighbourhood wise). It has happened to me a couple of times. The first time I actually spent time sending my reply from my gmail account, and told the guy about it. The second time I didn't even bother.

        Netblock blacklisting is a really poor solution. In some cases a single spammer causes a /24 and then a /16 to be blocked. It doesn't make sense to me. OTOH, I discovered some time ago that blocking Windows boxes works wonderfully, and it's extremely easy to do with OpenBSD's pf :-)

        Btw, do you understand that changing ISP may not be an option?

        • I discovered some time ago that blocking Windows boxes works wonderfully, and it's extremely easy to do with OpenBSD's pf :-)

          How can the pf detect a windows box ?

        • I use CBL and RBLDNS -- neither block entire netblocks, the IPs in them are static IPs which either run open proxies or fail other checks. Very, Very effective.

        • by khasim ( 1285 )
          Netblock blacklisting is a really poor solution.

          It is the only solution when the ISP will do nothing to stop the spammer on their network.

          In some cases a single spammer causes a /24 and then a /16 to be blocked.

          That is rather difficult without the ISP's assistance (or them repeatedly ignoring the complaints).

          Btw, do you understand that changing ISP may not be an option?

          Sometimes that is true. In which case, you should get on the phone and make sure that your ISP understands that they have customer
      • Re:second post? (Score:3, Informative)

        by Neophytus ( 642863 ) *
        Spamcop and Spamhaus I agree with. SORBS demand payment for removal of clean servers (albeit not to them). That just doesn't chime when people spam through an isp's smtp server and get caught.
        • A valid point. I'm trying to stick to more dynamic lists, so I may reconsider SORBS. SpamCop accounts for 90% of my blocks, while SpamHaus picks up the other 9%.
    • Re:second post? (Score:4, Interesting)

      by kalman5 ( 667417 ) on Sunday October 17, 2004 @05:33AM (#10549410)
      What I dislike is centralized antispam. What is spam for me might not be spam for you. I was using the antispam filter in Thunderbird, but at least in previous versions it was not good, so I switched to K9 ( http://keir.net/k9.html ). Is there nothing like K9 around for Linux? K.
  • by hayds ( 738028 ) on Sunday October 17, 2004 @04:10AM (#10549237)

    I am using DSPAM on a qmail/vpopmail server and I find that it's great in terms of accuracy. Most of my users have never had a false positive and many haven't seen a spam after a couple of weeks of training.

    The problem that I have with DSPAM is the integration side. I'm not sure how it goes with other mail systems, but integrating it with vpopmail was a major pain. It seems easy, you just put the command in the dotfiles, but in practice getting it to work was quite a trial. Even now it doesn't integrate properly with the web administration, etc., despite some scripting and minor code changes.

    Because of this I've been thinking of switching to SpamAssassin simply because of its integration with qmail-scanner. Has anyone else had similar problems or been in a similar situation and found a good solution?

    • I'll admit I don't really understand your post.

      All these new spam removal programs are all very well and good but from an end user's point of view, all I would like to know is:

      How long am I going to have to put up with emails like this?

      Hi. This is the qmail-send program at somewhere.com.
      I'm afraid I wasn't able to deliver your message to the following addresses.
      This is a permanent error; I've given up. Sorry it didn't work out.

      info@somewhere.com
      This address no longer accepts mail.

      --- Below this li

      • by Anonymous Coward
        What you want is ClamAV:
        http://clamav.sourceforge.net/
      • by hayds ( 738028 ) on Sunday October 17, 2004 @05:23AM (#10549388)

        This is a legit message from someone's mail system. You are receiving this because someone has been infected with a virus. Their computer is sending messages from your email address, and some of these messages are going to non-existent mail addresses. Because they are spoofing your mail address in the From:, you are receiving all the bounces.

        So technically, this isn't spam or junk mail. It's someone's email system doing what it's supposed to, returning 'your' email because the recipient didn't exist.

        Unfortunately, probably not much you can do about this without blocking all such legit system messages.

        • by Anonymous Coward
          Administrators really shouldn't configure their systems to return mail that contains viruses. Most of these are sent from spoofed addresses anyway and don't make it to the system that is actually infected. They just annoy people who are not responsible for the original message. On top of that, it generates an unnecessary amount of traffic, and I really just consider this to be spam.

          • This is very true, but as an end user unfortunately there's not much you can do. For messages like this to stop, it's not only your mail administrator that needs to block bounces like this but every mail administrator on the net. And it's not like that's gonna happen anytime soon :(
          • Maybe, but then they have to distinguish, to around 100%, what is a virus and what isn't. Currently, they just have to know, i.e. if the mailbox doesn't exist anymore; to selectively bounce they would need to examine the message carefully for known virus signatures.
          • High volume of improper bounces like this is reason for blacklisting by many of the BL maintainers. (SpamCop, et al.)
        • Unfortunately, probably not much you can do about this without blocking all such legit system messages.

          Here's a crazy idea: if you crypto-sign all messages you send, it should be possible to check the signature in bounced messages and filter any unsigned bounced messages.
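The idea above can be sketched with a keyed hash standing in for a full cryptographic signature: every outgoing message carries a short tag derived from its Message-ID and a private key, and a bounce is accepted only if the tag quoted in it verifies. Note the swap: this uses HMAC rather than public-key signing, and the key, tag length, and function names are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical private key known only to the sending system.
SECRET_KEY = b"replace-with-a-private-key"


def make_tag(message_id: str, key: bytes = SECRET_KEY) -> str:
    """Derive a short hex tag to embed in an outgoing message."""
    digest = hmac.new(key, message_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def bounce_is_ours(message_id: str, tag: str, key: bytes = SECRET_KEY) -> bool:
    """Check the tag a bounce quotes against the one we would have issued."""
    expected = make_tag(message_id, key)
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    tag = make_tag("<20041017.1234@example.com>")
    print(bounce_is_ours("<20041017.1234@example.com>", tag))
```

A bounce for a message the system never sent cannot carry a valid tag, so it can be filed as junk without discarding genuine delivery failures. This only works when the bouncing server returns enough of the original headers to recover the tag.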

        • but somewhat besides the point.

          I have to disagree with you on whether it's spam, however. Just making up statistics here, but I'd guesstimate that the sender address of >99.99% (probably even more) of all virus emails is forged and probably points at an innocent third party. That means that the message from the virus scanner is completely and utterly worthless to the recipient (i.e. the "sender" of the virus email). That makes it "junk" or "spam" in my book.

          You're right that there isn't much you can do
          • Just clearing something up. The message that he was receiving wasn't a reply from a virus scanner, it was a bounce. I totally agree with you that virus scanners that reply to addresses that are 'sending' viruses are a total waste of time as the sender addresses are always forged.

            In this case though, the receiving server is not replying to tell him that he has sent a virus, it's telling him that he's sent an email to a nonexistent user. Obviously a message like this can be very useful if you have mistyped an a

            • I get so many 'Bounced Messages' containing spam and viruses every day that I automatically delete all bounces anyways. There is, in effect, no real disadvantage to just filtering them all out because SMTP/POP3/IMAP is unreliable anyways - not by design, but in reality.

              You are not guaranteed to get a delivery error message emailed back to you for each and every delivery error anyways, so you may as well not ever expect one.

              --jeff++

          • That means that the message from the virus scanner is completely and utterly worthless to the recipient (i.e. the "sender" of the virus email). That makes it "junk" or "spam" in my book.

            A good point. However, from what I understand, this message is generated by the MTA and not the virus scrubber. So exactly what are you suggesting?

            Maybe MTAs shouldn't alert the sender that the address they used doesn't exist (user no longer has an account, mistyped address, etc.)? That works for this situation.

            • Simple solution that works for both sides of the issue: Bring back FingerD.

              Your client can finger the email address automagically before sending and show a nice warning if it doesn't think it exists, and then the MTA can finger the sender address to make sure it's valid. This way obviously spoofed spam gets dropped. Of course people could still spoof valid addresses, but it would prevent some spam.
        • "Unfortunately, probably not much you can do about this without blocking all such legit system messages."

          Which many of us do routinely. So why bother sending faked "virus warning" messages at all, if the only effect is to worry some people with clean computers, and get the rest of us to block anything with "postmaster" in the header of the email.

        • So technically, this isn't spam or junk mail. It's someone's email system doing what it's supposed to, returning 'your' email because the recipient didn't exist.

          Technically, if they are bouncing messages back to me when I didn't send the original message, it is unsolicited email.

          Any mail that wasn't delivered because it was a virus shouldn't bounce - everyone *knows* that viruses spoof addresses. If it isn't delivered because a filter decided it was spam, it shouldn't bounce, IMO, as spam usually forges addre

        • I trained my spam filter on bounces as well as regular messages. It got a little confused at first but soon got the hang of distinguishing real bounces from spam/virus bounces.
      • If you've received 30 of those, then you aren't using any kind of trainable spam filter at all. Mark it spam once, and there's no need to see the other 29.
      • Well, I have a clamav running on my mail server and it sorts out virus emails as well as bounces containing them.

        Robert
      • COMPLETE COPY OF NETSKY VIRUS

        Make the mail admin install the qmail-send.mimeheaders [orfika.net] patch -- it causes bounces to bounce back only the headers of email with MIME attachments. As google provides, my qmail patchlist is quite long, actually [dasbistro.com]. :-)

        I'm moving over to Postfix these days -- it seems to do everything qmail does but without the need to recompile every time I want a change.

      • http://cou.ch/bounce.txt

        Perl script to handle spam bounces...

      • How long am I going to have to put up with emails like this?


        If you install amavisd-new and clamav, you won't have to put up with it at all. amavisd-new is a generic mail proxy that calls both spamassassin for spam filtering, and clamav for virus scanning. If you really want you can get it to call dspam as well. It also can use a huge number of other virus scanners if you prefer them. I now get zero viruses using clamav and zero false positives.
  • Is DSPAM... (Score:3, Funny)

    by DLR ( 18892 ) <.moc.liamg. .ta. .lahtnesorld.> on Sunday October 17, 2004 @04:26AM (#10549281) Journal
    ...any better than CSPAN?
    • ... to CPAN!!
    • Ok, someone moderated the parent as a troll. I didn't think I had to say this but, IT'S A JOKE PEOPLE. It's even been modded Funny just in case the clue bus doesn't stop at your terminal. Sheesh!

      Yes, the above counts as "humor" too. :) Have a nice day.

  • 3.2? (Score:1, Redundant)

    by fyonn ( 115426 )
    it seems to me that we're on 3.2 preview release 1. not 3.2 release which is scheduled for the 20th to the 22nd. is this post a bit early?

    dave
  • Check out the download page [nuclearelephant.com]

    Here's what it shows.


    October 1, 2004 3.2 Release Candidate 1
    October 8, 2004 3.2 Release Candidate 2
    October 14, 2004 Devel Frozen - Critical Changes only
    October 15, 2004 3.2 Preview Release 1
    October 20, 2004 Devel Absolutely Frozen. Release to packagers.
    October 22, 2004 3.2-STABLE Official Release


    ONLY the 3.2 Preview Release 1 is currently out!
  • by Anonymous Coward on Sunday October 17, 2004 @04:35AM (#10549301)
    From TFA, "around 99.95% (1 error in 2000)"

    I'm sick of spam filters bragging about their overall error rate. All of them do OK at getting rid of the bulk of spams and saving the bulk of time.

    The really important differentiating factor is how many false positives they mistakenly accuse of being spam.

    The consequences of a spam message getting through are minimal: under a second of time, on average, to skip it.

    The consequences of a non-spam getting blocked can be huge - loss of a customer - a mom not knowing her kid is in trouble.

    I wish the spam filters focused entirely on reporting how few false positives they produce.

    • by Scaba ( 183684 ) <joe@NOspAM.joefrancia.com> on Sunday October 17, 2004 @05:44AM (#10549432)
      The consequences of a non-spam getting blocked can be huge - loss of a customer - a mom not knowing her kid is in trouble.

      Dear Mom,

      I hope this email finds you well. All is fine here, out in your garage. As you know, I love working on my cars. I'm currently replacing the engine block in my '76 Trans Am. Well, wouldn't you know it, but just moments ago, this 550 lb engine block fell on my legs and I cannot stand up, and in fact, am probably bleeding to death. Luckily, I have my cell phone handy and so am able to send you this email - the marvels of technology!! Anyway, I know you only check your email about twice weekly, but when you do, please send help.

      Your loving son,

      Dexter

    • by Anonymous Coward
      Even false positives are not important. If I get 1000 spams a day, but only 40 legitimate mails, then marking everything as spam is 96% correct. If 35 of my mails are easily identified as legitimate (a procmail rule could do it: a closed and filtered mailing list), then marking those as good and everything else as bad is 99.5% correct. Note that all 5 personal mails I would get are still marked as spam.
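The arithmetic above is easy to check. The sketch below uses only the figures given in the comment (1000 spams, 40 legitimate mails, 35 of them from a mailing list); the variable and function names are illustrative.

```python
def percent(part: float, whole: float) -> float:
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


SPAM_PER_DAY = 1000
LEGIT_PER_DAY = 40
MAILING_LIST = 35  # legitimate mails a simple rule can recognise
PERSONAL = LEGIT_PER_DAY - MAILING_LIST
TOTAL = SPAM_PER_DAY + LEGIT_PER_DAY

# Strategy 1: mark every message as spam.
accuracy_mark_all = percent(SPAM_PER_DAY, TOTAL)

# Strategy 2: pass the mailing list, mark everything else as spam.
accuracy_whitelist = percent(SPAM_PER_DAY + MAILING_LIST, TOTAL)

# Share of legitimate mail still wrongly marked under strategy 2.
false_positive_rate = percent(PERSONAL, LEGIT_PER_DAY)

print(f"mark everything as spam: {accuracy_mark_all:.2f}% correct")   # 96.15
print(f"whitelist the list only: {accuracy_whitelist:.2f}% correct")  # 99.52
print(f"legitimate mail lost:    {false_positive_rate:.1f}%")         # 12.5
```

The second strategy scores above 99% overall while still losing every personal message, which is why a headline accuracy figure says little without the false positive rate beside it.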

      the big question for me is: how many mails do I need to check for false detection? and here is the dspam issue: it
      • Why isn't that relative in your address book? / Why don't you have whitelisting set up?
        • "Why isn't that relative in your address book? / Why don't you have whitelisting set up?"

          (a) Because it would whitelist any emails sent from a virus-infected computer that that person had previously sent an email to.

          (b) Because people like that change their address all the time. "Hi! I'm on AOL now -- see my new address?"

          (a+b) Because people like that never sign their emails, nor do they use different email-addresses for personal, public, shopping, and mailing-lists.

          I think his point was that you need
          • With a 99.9% accuracy on spam filters, and better performance on false positives, it just isn't worth the time. On the occasional chance that you are sent something from an address that isn't in your address book, and also happens to be a false positive, the chance of it also being vitally important are slim. And if it is vitally important, the sender will in all probability chase you when you don't respond.

            There's a story about a CEO that used to sweep his pile of memos into the waste bin every morning

    • I'm running a mailserver with postfix, dspam, squirrelmail, courier pop/imap, amavis and Postfix Admin where I also integrated the DSPAM phpControlCenter.

      DSPAM has currently given my 0 false positives.
      The clue with dspam is to start with a clean database for each user and let them start to 'sort out their spam'. For imap it's stupidly simple. Everyone has two folders "spam" and "notspam", where you can drag&drop an email to the right folder. A script picks up any emails in each folder every hour and do
    • If you reject spam at SMTP time, then the person sending the mail will know right away. You could even send back a report as to why the mail was marked as spam.

      If you look at how spamassassin works, for example, it's a lot of little things. You can actually send back what each of those little things were, by sending back SA's report.
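A rejection at SMTP time can carry several lines of explanation, because SMTP allows multi-line replies: every line but the last uses `code-text`, and the last uses `code text`. The sketch below only formats such a reply; the report lines in the example are invented and do not reproduce SpamAssassin's actual report format.

```python
def smtp_reject_reply(code: int, lines: list[str]) -> str:
    """Format a (possibly multi-line) SMTP reply.

    Continuation lines use "<code>-<text>", the final line "<code> <text>",
    and every line ends with CRLF.
    """
    if not lines:
        raise ValueError("an SMTP reply needs at least one line of text")
    formatted = []
    for index, text in enumerate(lines):
        is_last = index == len(lines) - 1
        separator = " " if is_last else "-"
        formatted.append(f"{code}{separator}{text}")
    return "\r\n".join(formatted) + "\r\n"


if __name__ == "__main__":
    # Hypothetical report lines, for illustration only.
    reply = smtp_reject_reply(554, [
        "Message rejected as spam",
        "score 7.1, threshold 5.0",
    ])
    print(reply, end="")
```

Because the sending server receives the rejection during the session, a legitimate sender gets the explanation from their own mail system, and no bounce is ever sent to a forged address.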

    • The danger of false positives with modern filters is much overrated.

      People are getting used to there being mail filters in the system and know that email is not perfectly reliable. This can be due to mechanical reasons - a mail filter discarding the message, or due to human reasons - the message got lost in a pile 10,000 spams, since the user doesn't have a spam filter, or it may be an executive with email overload who gets 2000 legitimate messages every day.

      Therefore, if someone sends an important messa

  • Filters? (Score:3, Funny)

    by Anonymous Coward on Sunday October 17, 2004 @05:22AM (#10549385)
    Me've always found that the best filter still is the humble (and the not so humble) human :p
  • by Anonymous Coward on Sunday October 17, 2004 @05:54AM (#10549458)
    a few months ago those features were available, too. while dspam is great at filtering mail, I faced two crucial problems, which forced me back to spamassassin. I haven't heard that they fixed any of those:
    - the database did grow huge. when my single user server with 128 mb had to use a 512 mb spam token database, performance was terrible. even with the tools included I could not do anything to fix the issue.
    - dspam knows only yes or no, there is no usable value that gives you some grey-area information. As a result, I had to check all those spam postings for false positives. SpamAssassin, on the other hand, has that spam score of 0..10, so I can check 0..4, where 0 is OK (few false negatives) and 1..4 is spam (few false positives), and I can directly delete thousands of mails in 5..10 without looking at them.

    I won't go back to dspam unless someone can offer specific help for those issues. I believe everyone will face them sooner or later.
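The score-based triage described above amounts to three buckets. A minimal sketch, assuming the cut-offs the poster mentions (below 1 is delivered, 1 up to 5 is reviewed by hand, 5 and above is deleted unseen); the function name and bucket labels are hypothetical.

```python
def triage(score: float, review_from: float = 1.0, delete_from: float = 5.0) -> str:
    """Sort a message into a bucket by its spam score.

    The default thresholds follow the comment above and are not
    recommendations from any particular filter.
    """
    if score >= delete_from:
        return "delete"   # confident spam: never looked at
    if score >= review_from:
        return "review"   # grey area: checked for false positives
    return "inbox"        # treated as legitimate


if __name__ == "__main__":
    for score in (0.2, 3.0, 7.5):
        print(score, triage(score))
```

A filter that returns only yes or no collapses the middle bucket into one of the other two, which is the complaint here: everything marked as spam then has to be reviewed.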
    • by Anonymous Coward
      There are a lot of things you can (and should) do to keep small databases in DSPAM when disk is an issue. The problem is some of this is in the FAQ rather than the docs...but you can change your training mode to TOE (which only trains on error), set up merged groups (which uses a global db and then each user only stores corrections, almost as accurate), do some creative purging, and if you're really paranoid about disk, turn off some features like chained tokens (although i don't think it's necessary).

      As
    • "the database did grow huge. when my single user server with 128 mb had to use a 512 mb spam token database, performance was terrible. even with the tools included I could not do anything to fix the issue."

      Did you run the nightly and weekly purge scripts, as documented? (purge.sql for your DBI driver)

      Did you also change the model to TUM from the default? ( MUCH more accurate results over TOE or TEFT in our case, and we get a lot of spam!)

      "dspam knows only yes or no, there is no usable value tha

    • the database did grow huge... ...performance was terrible.

      Did you try TOE mode? Instead of analyzing everything, it just uses the errors. That means significantly less utilization of your data backend. From the FAQ:

      Switch to TOE Mode. DSPAM v2.10 supports TOE (Train-On-Error) mode, which only performs writes to the database in the event that a misclassification has occurred (or if a user has fewer than 4000 innocent messages in corpus). Train-on-error mode should make a significant reduction in the numbe
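The rule quoted from the FAQ reduces to a single decision: write to the database only when the filter got the message wrong, or while the user's corpus of innocent mail is still small. The sketch below expresses just that decision; it is not DSPAM's code, and the function and parameter names are assumptions (only the 4000-message figure comes from the quote).

```python
def should_write_to_database(
    predicted_spam: bool,
    actually_spam: bool,
    innocent_count: int,
    min_innocent: int = 4000,
) -> bool:
    """Decide whether a train-on-error filter records this message.

    predicted_spam: the filter's verdict.
    actually_spam: the true class, e.g. from a user correction.
    innocent_count: legitimate messages already in the user's corpus.
    """
    misclassified = predicted_spam != actually_spam
    still_bootstrapping = innocent_count < min_innocent
    return misclassified or still_bootstrapping


if __name__ == "__main__":
    # A correct verdict on a mature corpus causes no database write.
    print(should_write_to_database(True, True, innocent_count=10_000))
```

Since a well-trained filter is right most of the time, most messages then cause no write at all, which is where the saving in database size and load would come from.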
  • by Axoiv ( 747887 ) on Sunday October 17, 2004 @06:02AM (#10549478)
    Does DSPAM inform the sender that his/her e-mail has been filtered out?
    • by hayds ( 738028 ) on Sunday October 17, 2004 @06:08AM (#10549490)
      No. Since spammers mostly use fake addresses, it's pretty pointless trying to send mail back to them. All that would achieve would be that you would receive all the bounces back and you'd get double the junk mail.
      • I suppose DSPAM can inform the receiver then, that an e-mail was filtered out? Somebody would need to know, if it was one of those filtered out by mistake.
        • Yes, there is a web-based CGI that comes with it which you can use to view your statistics. It also has a quarantine where any filtered messages are stored so you can view and retrieve any false positives.
      • Just say no to bounces! Bounces suck when some spammer decides to use your domain as a return address and you get all these stupid bounces. Ditto with all those stupid "your email was rejected 'cuz of a virus" when someone else was impersonating my email address.

        What you should do, however, is reject the message in the SMTP session. My mail server issues a 554 during SMTP if you send me a spam or a virus. That way, legitimate senders will still get a notification of the delivery failure (generated by th

  • Asking slashdot:
    Which provider do you think does the best effort to filter/fight spam and uses the most state of the art techniques for that? The german freemailer GMX I use now is good, but I wonder if others do better.
    And I wouldn't mind paying for never receiving spam again. Is Apple .mac email service any good? I have a Mac and sure could make use of some of the other features they offer ...
    • For me GMail's spam filter has beaten the rest (so far). YMMV.
    • I had to turn off the GMX Spam filter because it was blocking messages from a mailing list I am on.

      I tried marking the messages as 'not spam' based on the sender, but every single message has a different - unique - sender so that failed. To top it all, I could not even remove the 30-odd senders from the list again.

      Now it is down to Mozilla's spam blocker again. It has virtually zero false-positives, but misses too many (30%) spam messages.

      There are times when I'd love to have a baseball bat and a list
  • Platforms... (Score:2, Interesting)

    by Anonymous Coward
    The DSPAM site mentioned that it can be compiled on Mac OSX, but what about Winblows? I only have one box (go ahead and laugh) and it is an older Pentium III Winblows machine. I'd like to have a separate box to act as a mail server but it just isn't currently feasible (translation: I'm broke.) Is there any way they can compile DSPAM for Win9X?
  • by Anonymous Coward
    this is one heck of a product, and I think it would be used more if there were a very verbose install of the current version on various platforms (similar to obsd version on site).

    think- spamassassin, clam, spammassassin howto or something similar but it has to be VERY verbose to bring in the crowds (newbies).

    my 2c

    AC

  • Here's another spam solution:

    If we had a respected national leader who could often talk to millions of people, that person could change the culture. The leader could tell everyone never to buy anything or even respond to unsolicited email advertising.

    It might take years, but eventually it would not be economic for spammers to operate, particularly since spam filters would continue to improve.

    The only person who could do this in the U.S. now would be Oprah Winfrey. She has an enormous following,
  • by JohnGrahamCumming ( 684871 ) * <slashdot@ j g c . o rg> on Sunday October 17, 2004 @08:52AM (#10549874) Homepage Journal
    Why does DSPAM get front page treatment when the latest POPFile [getpopfile.org] release (which now handles POP3, IMAP, SMTP and NNTP filtering) and has an XML-RPC external interface, supports different databases, etc. etc. gets rejected as a story?

    Perhaps it's because I don't tend to make super-wild claims about POPFile's accuracy? Or come up with cool marketing names for the internal technology?

    POPFile's the only Bayesian filter that can:

    1. Do more than spam vs. anti-spam and
    2. Filter POP3, IMAP, SMTP and NNTP (that's right Usenet news)

    Do I have an axe to grind with Jonathan and DSPAM? No, it's a cool project. Does it annoy me that /. has recently turned into some combination of Freshmeat and PC Magazine? Yes.

    John.
    • Does POPFile work with Exchange's native format? The baastard Exchange admins (yes, there are more than one!) at my office refuse to turn on POP or IMAP because "it's too dangerous" even though Exchange supports IMAP over SSL.

    • Do I have an axe to grind with Jonathan and DSPAM? No, it's a cool project. Does it annoy me that /. has recently turned into some combination of Freshmeat and PC Magazine? Yes.

      Do I like to ask questions aloud and then answer them myself? You bet.

      ;)

    • Does it annoy me that /. has recently turned into some combination of Freshmeat and PC Magazine? Yes.
      Then why are you still a subscriber? Notice the asterisk:
      JohnGrahamCumming (684871) *
      Although I agree with your points maybe the first step would be to vote with your wallet until Slashdot starts to improve. Right now you're just rewarding them for heading in a direction of which you disapprove.
    • Recently? Are you new or something?

      By the way I love PopFile.
    • I suspect different editors have different biases. DSpam has received a lot of hype for mostly the reasons you describe. I'd never heard of popfile before your post (though there's a ton of popular open source tools I've never heard of) so my guess is some editors are more willing than others to post stories about lesser-known tools. Re-submit the story in another week and hope you get lucky.
  • by Anonymous Coward
    ... I want a spam filter that automatically forwards all spam to the abuse@ mailbox for the domain from the spammer.

    Once the admins start getting hundreds of thousands of spam complaints in their abuse boxes PER DAY. Then maybe they'll start to think of ways to fix this problem.
  • Before filtering (Score:2, Informative)

    by Phatmanotoo ( 719777 )

    I got nothing against content-filtering measures, as long as one is aware that this should be just the last layer of defense against spam. Think about it, if your SMTP has already swallowed the spammer's email content, you have already lost precious bandwidth.

    Especially if you host your own SMTP, you should put up a layered system of defenses: RBL lists, maybe tarpitting, white/graylisting, and then content filtering.
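Of the layers listed above, greylisting is the simplest to sketch: the first time a given (client IP, sender, recipient) triplet is seen, the server answers with a temporary failure, and it accepts the message only when the same triplet retries after a minimum delay. The class below is an in-memory illustration under those assumptions; the 300-second delay and the reply texts are arbitrary choices, not values from any specific mail server.

```python
class Greylist:
    """Minimal in-memory greylist keyed on (client IP, sender, recipient)."""

    def __init__(self, min_delay_seconds: float = 300.0) -> None:
        self.min_delay = min_delay_seconds
        # Maps each triplet to the time it was first seen.
        self.first_seen: dict[tuple[str, str, str], float] = {}

    def check(self, client_ip: str, sender: str, recipient: str, now: float) -> str:
        """Return the SMTP reply for this delivery attempt."""
        key = (client_ip, sender, recipient)
        first = self.first_seen.get(key)
        if first is None:
            # Unknown triplet: remember it and ask the client to retry.
            self.first_seen[key] = now
            return "451 4.7.1 Greylisted, please try again later"
        if now - first < self.min_delay:
            # Retried too quickly: keep deferring.
            return "451 4.7.1 Greylisted, please try again later"
        return "250 OK"


if __name__ == "__main__":
    greylist = Greylist()
    attempt = ("192.0.2.1", "a@example.com", "b@example.org")
    print(greylist.check(*attempt, now=0.0))
    print(greylist.check(*attempt, now=600.0))
```

A real mail server queues and retries after a 4xx reply, while much bulk-mailing software does not, so the check costs almost no bandwidth because the message body is never transferred on the deferred attempt. A production version would also expire old entries and persist them across restarts.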

  • by Synn ( 6288 )
    Has anyone used DSPAM with xmail?
  • by hey ( 83763 )
    The paper uses the term "GPLware". I haven't seen that before. I might use it. Of course, we remember "freeware", "shareware", etc.
  • To prevent someone from doing something illegally while the spammers continue to do whatever they want?

    Shouldn't they pay for the costs when they are caught?
