Microsoft Bots Effectively DDoSing Perl CPAN Testers

Posted by timothy on Monday January 18, 2010 @08:48AM from the stuck-in-a-rut dept.

at_slashdot writes "The Perl CPAN Testers have been suffering issues accessing their sites, databases and mirrors. According to a posting on the CPAN Testers' blog, the CPAN Testers' server has been aggressively scanned by '20-30 bots every few seconds' in what they call 'a dedicated denial of service attack'; these bots 'completely ignore the rules specified in robots.txt.'" From the Heise story linked above: "The bots were identified by their IP addresses, including 65.55.207.x, 65.55.107.x and 65.55.106.x, as coming from Microsoft."

This discussion has been archived. No new comments can be posted.

So how do we DDoS Microsoft? (Score:5, Funny)

by drinkypoo ( 153816 ) writes: <drink@hyperlogos.org> on Monday January 18, 2010 @08:50AM (#30806824) Homepage Journal

Anyone know which pages on Microsoft's front-facing sites are most computationally intensive, and yet always dynamically generated? :D

- Re: (Score:2, Interesting)
  
  by Anonymous Coward writes:
  
  Bing? ...But that would only help them to DDoS Bing.
  - Re: (Score:3, Funny)
    
    by jisatsusha ( 755173 ) writes:
    
    All that'd serve to do is make them look more popular than ever. Traffic up 300%! Sounds like a good mar
    - Re: (Score:3, Funny)
      
      by Anonymous Coward writes:
      
      That's exactly what I said. Don't you dare leech the score from me, jackass!
- Re: (Score:3, Insightful)
  
  by Lennie ( 16154 ) writes:
  
  http://blogs.msdn.com/
  
  I've seen it fail many times
  - Re: (Score:3, Funny)
    
    by mmontour ( 2208 ) writes:
    
    Mission accomplished. I got this on the second link that I clicked.
    We are currently unable to serve your request
    We apologize, but an error occurred and your request could not be completed.
    This error has been logged. If you have additional information that you believe may have caused this error please report the problem here.
- Re: (Score:3, Insightful)
  
  by SharpFang ( 651121 ) writes:
  
  No, we just make mistakes writing our Perl programs for automatically downloading stuff from MSDN. Like, download() unless success, and forget to set success=true;
- Re:So how do we DDoS Microsoft? (Score:5, Informative)
  
  by jlp2097 ( 223651 ) writes: on Monday January 18, 2010 @09:31AM (#30807138) Homepage Journal
  
  Not necessary. A Bing Program Manager has already commented on the CPAN Testers blog entry [perl.org] upon which the article is based:
  Hi,
  I am a Program Manager on the Bing team at Microsoft, thanks for bringing this issue to our attention. I have sent an email to barbie@cpan.org as we need additional information to be able to track down the problem. If you have not received the email please contact us through the Bing webmaster center at bwmc@microsoft.com.
  As said below, never ascribe to malice that which can be adequately explained by stupidity. (Insert lame joke about MSFT being full of stupidity here).
  
  - Re: (Score:2)
    
    by John Hasler ( 414242 ) writes:
    
    Seems like the CPAN admin has already solved the "issue".
  - Re:So how do we DDoS Microsoft? (Score:5, Funny)
    
    by kulnor ( 856639 ) writes: on Monday January 18, 2010 @09:42AM (#30807224)
    
    Well, with Barbie(TM) on the case, this should be quickly resolved (unless she's too busy with G.I.Joe(TM))
    
    - - Re: (Score:3, Funny)
        
        by __aaclcg7560 ( 824291 ) writes:
        
        I thought Ken(tm) was interested in G.I. Joe(tm) these days. :P
      - Re: (Score:3, Funny)
        
        by budgenator ( 254554 ) writes:
        
        The unobtainable fruit is always thought to be the sweetest.
  - Re:So how do we DDoS Microsoft? (Score:5, Insightful)
    
    by Anonymous Coward writes: on Monday January 18, 2010 @10:55AM (#30807924)
    
    "as we need additional information to be able to track down the problem."
    IP addresses aren't enough? You're MS--if you can't fix the problem and IP addresses are given, damn, that's just sad. You're a freaking massive multi-billion dollar tech company, and this is the best you can do?
    No wonder Chinese hackers own our asses.
    Then again, it took Comcast 9 months to fix a security hole in customer accounts (which would have required adding an s to http to make pages SSL'd), and the only reason it was "fixed" was because they did their annual website makeover and changed their entire system to something Flash based. And that was after I had contacted a VP, VP's security, been referred to web security, talked to web security 3x, and talked to a manager. The last 3 groups verified the problem. By that point it had been referred to their web applications team, who sat on it.
    Lovely world we live in.
    
    - Mod parent up (Score:4, Insightful)
      
      by Lonewolf666 ( 259450 ) writes: on Monday January 18, 2010 @11:55AM (#30808568)
      
      While he could be more polite, it is indeed embarrassing for Microsoft if they cannot check their own network
      a) for the existence of computers with given IPs
      b) what these computers are doing
      I think that deserves an "insightful" that cancels out the "flamebait".
      
    - Re: (Score:3, Funny)
      
      by gbjbaanb ( 229885 ) writes:
      
      IP addresses aren't enough? You're MS--if you can't fix the problem and IP addresses are given, damn, that's just sad. You're a freaking massive multi-billion dollar tech company, and this is the best you can do?
      I've seen and used Vista. The answer to your question is "yes".
  - Re: (Score:3, Insightful)
    
    by Penguinisto ( 415985 ) writes:
    
    As said below, never ascribe to malice that which can be adequately explained by stupidity. (Insert lame joke about MSFT being full of stupidity here).
    Given the back-story on the whole Danger data loss affair [arstechnica.com], stupidity is the FIRST thing I'd ascribe to Microsoft these days...
  - Re:So how do we DDoS Microsoft? (Score:5, Insightful)
    
    by Short Circuit ( 52384 ) writes: <mikemol@gmail.com> on Monday January 18, 2010 @11:23AM (#30808224) Homepage Journal
    
    A quick guess? Identifying unique sites by domain name, rather than by IP address, and either the bot or server not respecting HTTP 301 redirects.
    With Rosetta Code, I once had www.rosettacode.org serving up the same content as rosettacode.org. My server got pounded by two bots from Yahoo. I could set Crawl-Delay, but it was only partially effective; one bot had been assigned to www.rosettacode.org and another to rosettacode.org, and they were each keeping track of their request delay independently. I've since corrected things such that www.rosettacode.org returns an HTTP 301 redirect to rosettacode.org, and was eventually able to remove the Crawl-Delay entirely.
    I've since worked towards only serving up content for any particular part of the site on a single domain name, and have subdomains such as "wiki.rosettacode.org" redirect to "rosettacode.org/wiki", and "blog.rosettacode.org" to "rosettacode.org/blog". Works rather nicely, though it does leave me a bit more open to cookie theft attacks.
    YMMV; As I said, that was a quick guess.
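The canonical-host redirect described in this comment can be sketched roughly as follows. This is only an illustration in Python, not the actual Rosetta Code setup: the host names come from the comment, while the `ALIASES` table layout and the `canonical_redirect` function are hypothetical names introduced here.

```python
from urllib.parse import urlsplit, urlunsplit

# Alias host -> (canonical host, path prefix). The hosts mirror the comment;
# the table layout itself is a hypothetical illustration.
ALIASES = {
    "www.rosettacode.org": ("rosettacode.org", ""),
    "wiki.rosettacode.org": ("rosettacode.org", "/wiki"),
    "blog.rosettacode.org": ("rosettacode.org", "/blog"),
}


def canonical_redirect(url):
    """Return (301, location) for an alias host, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in ALIASES:
        return None
    target_host, prefix = ALIASES[host]
    # Keep path, query and fragment so existing inbound links keep working.
    location = urlunsplit(
        (parts.scheme, target_host, prefix + parts.path, parts.query, parts.fragment)
    )
    return 301, location
```

With every alias answering 301, a crawler that tracks its request delay per host name should end up with a single host to pace itself against, which is the effect the comment reports.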
    
    - - Re:So how do we DDoS Microsoft? (Score:5, Insightful)
        
        by Short Circuit ( 52384 ) writes: <mikemol@gmail.com> on Monday January 18, 2010 @12:58PM (#30809372) Homepage Journal
        
        The REAL solution to your problem is for everyone to abandon the dumb-as-shite "www" prefix.
        Why bother with www.example.com and example.com? Get rid of it. Anyone who still puts "www." on their business cards is a dufus.
        
        REAL solutions to immediate problems don't depend on the rest of the world changing to suit my needs. Also, the fact remains that there are links out there that point to "http://www.rosettacode.org/w/index.php?something_or_other", not all of those links will (or can) change, and I would be an absolute fool to knowingly break them, if I want people to visit RCo via referral traffic.
        
      - Re:So how do we DDoS Microsoft? (Score:5, Interesting)
        
        by dissy ( 172727 ) writes: on Monday January 18, 2010 @01:28PM (#30809784)
        
        Every once in a while, I still see sites that don't serve up unless you include "www." in the address - but it's like I said - a dufus.
        
        Looks like someone hasn't read RFC 1178 and enjoys breaking interoperability.
        Your method also breaks email by redelegating MX records one sub domain above where the control should be and MX's point to, thus breaks delegation of sub domains.
        
  - Re:So how do we DDoS Microsoft? (Score:5, Interesting)
    
    by jc42 ( 318812 ) writes: on Monday January 18, 2010 @11:35AM (#30808346) Homepage Journal
    
    As said below, never ascribe to malice that which can be adequately explained by stupidity. (Insert lame joke about MSFT being full of stupidity here).
    Yeah, though this particular sort of stupidity has been going on for a long time, and not just at Microsoft (though they seem to be the worst culprit).
    I run a couple of sites that, among other things, have links to return the "content" in a list of different formats (GIF, PNG, PS, PDF, ...). Periodically, the servers get bogged down by search sites hitting them many times per second, trying to get every file in every format. The worst cases seem to come from microsoft.com and msn.com, though it happens with other search sites, too. Actually, the first attempts I saw at "deep search" like this came from googlebots around 10 years ago, though they quickly backed off and haven't been a serious problem since then. MS-origin "attacks" of this sort have been happening every few months, for nearly a decade.
    I've generally handled them with a couple of techniques. One is to check the logs for successive requests from the same address, and insert sleep() calls with progressively longer sleeps as more messages arrive. The code prefixes the "content" with a comment explaining what's happening, in case a human investigates.
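A rough sketch of that progressive-delay idea, in Python. The comment does not say how the delays grow or how "successive" is judged, so the doubling schedule, the constants and the names `delay_for` and `notice` are all assumptions made for illustration.

```python
import time
from collections import defaultdict

WINDOW_SECONDS = 10.0  # assumed: requests closer together than this count as successive
BASE_DELAY = 0.5       # assumed: delay applied to the first repeat request
MAX_DELAY = 30.0       # assumed: cap so a worker is never held indefinitely

_last_seen = {}              # address -> time of its previous request
_streak = defaultdict(int)   # address -> number of successive requests so far


def delay_for(addr, now=None):
    """Return how many seconds to sleep before answering a request from addr."""
    now = time.monotonic() if now is None else now
    previous = _last_seen.get(addr)
    _last_seen[addr] = now
    if previous is None or now - previous > WINDOW_SECONDS:
        _streak[addr] = 0
        return 0.0
    _streak[addr] += 1
    # Double the delay with each successive request, up to the cap.
    return min(BASE_DELAY * 2 ** (_streak[addr] - 1), MAX_DELAY)


def notice(delay):
    """Text to put in front of the content so a human investigating sees the reason."""
    return "Response delayed by %.1f s because of rapid successive requests from your address." % delay
```

A request handler would call `time.sleep(delay_for(addr))` and, when the delay is non-zero, prefix `notice(...)` to the output in whatever comment syntax the requested format allows.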
    Another technique is to look for series of "give me this in all your output formats" requests, verify that it's a search bot, and add the address to a "banned" list of sites that simply get a message explaining why they aren't getting what they asked for, plus an email address if they want to get in contact. So far nobody at any search site has ever used that address. I did once get a response from a guy who was studying sites with such multi-format data, for a school project, to see how the various output formats compared in size and information content. I took his address off the banned list, and suggested that he add a couple-second delay between requests, and he finished his project a few days later.
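The "all your output formats" check could look something like the sketch below, again in Python. The format list is taken from the comment; the threshold, the `record_request` function, the ban message and the contact address are hypothetical, and the step of verifying that the client really is a search bot is left out.

```python
from collections import defaultdict

FORMATS = {"gif", "png", "ps", "pdf"}  # formats named in the comment
SWEEP_THRESHOLD = len(FORMATS)         # assumed: every format of one item counts as a sweep

# Placeholder wording and address, not the author's actual message.
BAN_MESSAGE = "Requests from your address are blocked for flooding; contact webmaster@example.org."

_seen = defaultdict(set)  # (address, item) -> set of formats requested so far
banned = set()            # addresses that now only receive BAN_MESSAGE


def record_request(addr, item, fmt):
    """Record one request and return True if addr is banned afterwards."""
    if addr in banned:
        return True
    fmt = fmt.lower()
    if fmt in FORMATS:
        _seen[(addr, item)].add(fmt)
        if len(_seen[(addr, item)]) >= SWEEP_THRESHOLD:
            banned.add(addr)
    return addr in banned
```

Removing an address from `banned`, as the author did for the student, is then a one-line operation.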
    I suspect that the googlebot folks may have read my explanation of the delays and added code to spread their requests out over time, since that's what their bots seem to do now. But I never heard from them. They must have gotten complaints (and bans) from lots of web sites when they started doing this, so they probably realized quickly that they should add code to prevent such flooding of sites.
    
  - Re: (Score:3, Insightful)
    
    by Alpha830RulZ ( 939527 ) writes:
    
    You know, it's easy to poke fun at the Microsofty, but is it possible that he was just trying to find out what was being hit so that he could figure out who in his organization he should contact? Maybe there is some uber technical way he could have figured this out, or maybe he should have RTFB, but his response sounded well intentioned and responsive. What would you prefer? The microsoft of old?
    - - Re: (Score:3, Insightful)
        
        by MstrFool ( 127346 ) writes:
        
        Same reason other folks can't: they are human. Look, I despise MS for a variety of reasons and am one of the rabid anti-MS folks. But honestly, they do enough that is legit to gripe about, no need to blow a mistake like this out of proportion. Considering all they do, it was inevitable that this would happen at some point. Shit happens; anyone who codes has had a mega-whoops at one point or another, and if they haven't, they are cookie-cutter coding and not risking creativity. Hate them for needlessly locking the g
    - - Re: (Score:3, Interesting)
        
        by drinkypoo ( 153816 ) writes:
        
        Instead we have Slashtroglodytes screaming about conspiracies by MSFT.
        Just for the record, since you're commenting under a thread I started, I do not believe that there was a conspiracy to attack CPAN. I think there is a conspiracy to continue accidentally attacking CPAN. The information provided ought to be more than sufficient to figure out what is going on. Remember, any time two people work to screw a third out of something, it's a conspiracy by definition.
  - - Re:So how do we DDoS Microsoft? (Score:5, Funny)
      
      by Spatial ( 1235392 ) writes: on Monday January 18, 2010 @11:23AM (#30808232)
      
      How horrible are your employees at their jobs when they require the assistance of their victims to fix the problem?
      [Every IT worker on Slashdot looks around nervously]
      
    - - Re:So how do we DDoS Microsoft? (Score:5, Insightful)
        
        by raju1kabir ( 251972 ) writes: on Monday January 18, 2010 @01:00PM (#30809394) Homepage
        
        Different system's doesn't really apply but what if the site's robots.txt is slightly different (different newlines or something) which is causing an unforeseen error?
        There is a spec for robots.txt. If someone's not following it, then it's their fault. Given Microsoft's past history, I know where I'd point the finger absent any more concrete information.
        
  - - Re:So how do we DDoS Microsoft? (Score:5, Insightful)
      
      by mounthood ( 993037 ) writes: on Monday January 18, 2010 @01:19PM (#30809652)
      
      As said below, never ascribe to malice that which can be adequately explained by stupidity.
      Must be really easy to just beat you in the face, and say “Ooops, I’m sorry, I’m so st00pid! *drool*” I call bullshit on that rule.
      My rule: Don’t make judgements at all (either way), about things that you just don’t know.
      How about: Don't mistake organizational stupidity for individual stupidity. This isn't the case of a single bad coder making a mistake; this is an organization that has chosen how much effort to apply. How much testing and review? What failsafes, logging and active monitoring? Will options for feedback be accessible and responsive? Stupidity and malice aren't mutually exclusive for an individual, and certainly not for an organization.
      
    - Re:So how do we DDoS Microsoft? (Score:5, Insightful)
      
      by Chris Burke ( 6130 ) writes: on Monday January 18, 2010 @01:28PM (#30809778) Homepage
      
      I've never liked that saying because of the implication that malice and stupidity are exclusive.
      Dumb and mean are often found together.
      
- Re:So how do we DDoS Microsoft? (Score:5, Funny)
  
  by Anonymous Coward writes: on Monday January 18, 2010 @09:42AM (#30807218)
  
  As much spam as I get from ir@infousa.com , I wish that someone would DDOS that damned company. If I knew of a way to get extra spam to ir@infousa.com I would probably do it so that company could get a taste of its own medicine. ir@infousa.com sent me unsolicited spam and it drives me nuts. Thanks for nothing, ir@infousa.com . It makes me want to call the company at (402)593-4500 and complain, but I don't have time. I guess I'll email them at ir@infousa.com instead. maybe.
  
  - - Re:So how do we DDoS Microsoft? (Score:4, Insightful)
      
      by Zarf ( 5735 ) writes: on Monday January 18, 2010 @10:29AM (#30807608) Journal
      
      Clue: Subtle joke, deserves 'funny' moderation ;)
      Subtle + Slashdot = FAIL
      
- Re:So how do we DDoS Microsoft? (Score:5, Insightful)
  
  by PetoskeyGuy ( 648788 ) writes: on Monday January 18, 2010 @10:08AM (#30807434)
  
  Why make things worse? Block the ip address or range and notify the admins. This isn't a chan mob.
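For illustration, a range check along those lines might look like this in Python. The three prefixes are the ones quoted in the story; reading each "x" as a full /24 network is an assumption, and `is_blocked` is a hypothetical helper name.

```python
import ipaddress

# Prefixes from the Heise quote in the story; treating each "x" as a /24 is an assumption.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("65.55.207.0/24"),
    ipaddress.ip_network("65.55.107.0/24"),
    ipaddress.ip_network("65.55.106.0/24"),
]


def is_blocked(addr):
    """Return True if addr falls inside one of the blocked networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in network for network in BLOCKED_NETWORKS)
```

In practice the same list would more likely go into a firewall or web server rule; the point here is only how little is needed to match a range rather than single addresses.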
  
- - Re:So how do we DDoS Microsoft? (Score:5, Insightful)
    
    by WinterSolstice ( 223271 ) writes: on Monday January 18, 2010 @11:20AM (#30808190)
    
    Actually, your statement works better with 'INSERT LANG HERE'...
    I'm always surprised by how people seem to think that any language has a monopoly of some sort on sloppy and/or lazy coders. Been doing IT a long time, and the one thing that never changes is the sloppy/lazy code issue. It even predates programming, you know - look at infrastructure around the world for examples of "just toss something out there, hope it works".
    
- - Re: (Score:3, Funny)
    
    by spongman ( 182339 ) writes:
    
    let's hope they don't store it compressed...
Oh! *Literally* Microsoft bots! (Score:2)

by Culture20 ( 968837 ) writes:

Until I read the summary I thought it was another article about windows botnets and was wondering why the "microsoft" was tacked on since windows is the default OS assumption. Of course it would be interesting if these were new CPAN mirrors that MS was setting up.
- Re:Oh! *Literally* Microsoft bots! (Score:5, Informative)
  
  by Ardaen ( 1099611 ) writes: on Monday January 18, 2010 @09:56AM (#30807350)
  
  Probably not, if you look at other incidents: http://cmeerw.org/blog/594.html [cmeerw.org] it appears they just like to push the limits.
  
Testers blog link... (Score:2)

by flyingfsck ( 986395 ) writes:

Sooooo, let's all go to the testers blog and DDOS that too. Dumbass...
- Re: (Score:2)
  
  by nicolas.kassis ( 875270 ) writes:
  
  If he can handle the msnbots, he probably can handle the slashdot crowd.
I've seen it before (Score:5, Interesting)

by LordAzuzu ( 1701760 ) writes: on Monday January 18, 2010 @08:54AM (#30806860)

I manage some networks in my home city in Italy, and in the past year I've often seen strange traffic coming from some of their IP addresses. Guess they were exploited by someone a long time ago, and didn't even notice it.

- Re:I've seen it before (Score:4, Interesting)
  
  by beadfulthings ( 975812 ) writes: on Monday January 18, 2010 @10:51AM (#30807866) Journal
  
  It's interesting to read this, as I've had some random and somewhat incomprehensible port scans coming from an IP address identified as one of theirs. If you're just an insignificant slob, you can't write to their abuse address, either; you'll get bounced. I simply blocked that particular IP address. Let them worry about who's gotten to them.
  
Check the blog... (Score:5, Funny)

by strredwolf ( 532 ) writes: on Monday January 18, 2010 @08:58AM (#30806900) Homepage Journal

Looks like Microsoft's Bing managers are on it. They'll make it worse in no-time flat. :)
BTW, the difference between a DDOS and a Slashdotting? You know why your site went down -- you got linked!

- Re:Check the blog... (Score:5, Funny)
  
  by Anonymous Coward writes: on Monday January 18, 2010 @09:22AM (#30807082)
  
  BTW, the difference between a DDOS and a Slashdotting?
  The DDOS bots actually read TFA.
  
- - Re:Check the blog... (Score:5, Insightful)
    
    by jc42 ( 318812 ) writes: on Monday January 18, 2010 @11:55AM (#30808564) Homepage Journal
    
    They admitted they were powerless to solve their own problems without help from their victims.
    Heh. It's another "damned if you do; damned if you don't" scenario. Usually, people criticise Microsoft for developing software without bothering to consult or test with actual customers. Now we have a manager of a MS dev group that actually does communicate (though not exactly with "customers"), and acts on what they say, so he's criticised for needing help from his "victims".
    Ya can't win that game.
    But the fact is that if you're developing server-side web software, you need to test it against real-world sites, not just the toy sites you've set up in your lab. And we all know the "Sorcerer's Apprentice" sort of bug that produces a runaway test that tries to do something as many times as it can per second until it's killed. Good testers will be on the lookout for such events, but it's understandable that they might fail occasionally.
    Among web developers, MS does have a bit of a reputation for hitting your new site with a flood of requests, trying to extract everything that you have (even the content of your "tmp" directory which your robots.txt file says to ignore). There are lots of small sites that block MS address ranges for just this reason.
    It should be considered good news that there's at least one MS manager who understands all this, and is willing to talk to the "victims" and fix the problems. Now if only they could fix the next-level problem: this sort of thing happens repeatedly, and their corporate culture seems to have no way to prevent it from happening again.
    
    - Re: (Score:3, Informative)
      
      by schon ( 31600 ) writes:
      
      They admitted they were powerless to solve their own problems without help from their victims.
      Heh. It's another "damned if you do; damned if you don't" scenario.
      Um, no. Not unless you're a rabid MS apologist.
      Usually, people criticise Microsoft for developing software without bothering to consult or test with actual customers.
      True.
      Now we have a manager of a MS dev group that actually does communicate (though not exactly with "customers"), and acts on what they say, so he's criticised for needing help from his "victims".
      Umm, exactly how did he act on what they said? According to the quote, they explicitly didn't act, which is the problem people are complaining about.
MS ineptitude? (Score:2, Insightful)

by Anonymous Coward writes:

From TFA:
Hi,
I am a Program Manager on the Bing team at Microsoft, thanks for bringing this issue to our attention. I have sent an email to nospam@example.com as we need additional information to be able to track down the problem. If you have not received the email please contact us through the Bing webmaster center at nospam@example.com.
I mean, what additional information is needed wrt "respecting robots.txt" and "not letting loose more than one bot on a site at a time"?
Bing. Meh.
- Re:MS ineptitude? (Score:4, Interesting)
  
  by ShecoDu ( 447850 ) writes: on Monday January 18, 2010 @01:14PM (#30809590) Homepage
  
  I remember reading that the MSNBOT reads the "Robots.txt" file, but cpantesters has a lowercase filename:
  http://static.cpantesters.org/robots.txt [cpantesters.org]
  http://static.cpantesters.org/Robots.txt [cpantesters.org] doesn't exist, so basically MSNBOT only respects the robots.txt on case insensitive operating systems.
  
  - Re:MS ineptitude? (Score:4, Interesting)
    
    by John Hasler ( 414242 ) writes: on Monday January 18, 2010 @02:23PM (#30810500) Homepage
    
    The standard clearly specifies lower case. However, if you are correct there's a simple way to send bingbots one way and all other bots another: create Robots.txt and robots.txt with different contents.
    
Probably just a bug. (Score:5, Insightful)

by tjstork ( 137384 ) writes: <todd@bandrowsky.gmail@com> on Monday January 18, 2010 @08:59AM (#30806910) Homepage Journal

I know everyone likes to assume that Microsoft is being evil here, but wouldn't the more realistic assumption be that they were just being incompetent?

- Re:Probably just a bug. (Score:5, Insightful)
  
  by Lloyd_Bryant ( 73136 ) writes: on Monday January 18, 2010 @09:06AM (#30806976)
  
  I know everyone likes to assume that Microsoft is being evil here, but wouldn't the more realistic assumption be that they were just being incompetent?
  Sufficiently advanced incompetence is indistinguishable from malice. For additional examples, see Government, US.
  The simple fact is that ignoring robots.txt is effectively evil, regardless of the intent. It's not like robots.txt is some new innovation...
  
  - Re:Probably just a bug. (Score:4, Insightful)
    
    by gmuslera ( 3436 ) writes: on Monday January 18, 2010 @09:15AM (#30807026) Homepage Journal
    
    They are not ignoring robots.txt; they probably just read that file in their own slightly different, but in the end incompatible, format. As with every other file.
    
    - Re:Probably just a bug. (Score:5, Informative)
      
      by Rogerborg ( 306625 ) writes: on Monday January 18, 2010 @09:30AM (#30807122) Homepage
      
      You're probably new here, but if you'd RTFA, you'd see that:
      It seems their bots completely ignore the rules specified in the robots.txt, despite me setting it up as per their own guidelines on their site
      Come to think of it though, isn't this what happens to most people who try to interoperate with Microsoft?
      Amusingly, if I Google for "bing robots.txt" [google.co.uk] I get a link to a bing page titled "Bing - Robots.txt Disallow vs No Follow - Neither Working!" which has already been elided from history by Microsoft [bing.com]. Classy.
      
      - Re:Probably just a bug. (Score:5, Funny)
        
        by afidel ( 530433 ) writes: on Monday January 18, 2010 @09:56AM (#30807338)
        
        I wonder if it's a CR/CRLF bug =)
        
        
        Re:Probably just a bug. (Score:4, Interesting)
        
        by mR.bRiGhTsId3 ( 1196765 ) writes: on Monday January 18, 2010 @11:21AM (#30808206)
        
        That would be tremendously amusing. I can see the headline now. Bing robots DDoS attack every Unix hosted site by assuming Windows linefeeds.
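Whether line endings played any part here is pure speculation in the thread, but a parser that is indifferent to them is easy to sketch. The example below uses Python's standard `urllib.robotparser`; the `parse_robots` helper, the sample rules and the example.org URLs are made up for illustration and say nothing about how Bing's crawler works.

```python
from urllib.robotparser import RobotFileParser


def parse_robots(text):
    """Parse robots.txt text regardless of whether it uses LF, CRLF or CR line endings."""
    parser = RobotFileParser()
    # str.splitlines() splits on \n, \r\n and \r alike, so no line keeps a stray \r.
    parser.parse(text.splitlines())
    return parser


# The same hypothetical rules, once with Unix and once with Windows line endings.
RULES_LF = "User-agent: *\nDisallow: /tmp/\n"
RULES_CRLF = RULES_LF.replace("\n", "\r\n")

unix_rules = parse_robots(RULES_LF)
windows_rules = parse_robots(RULES_CRLF)
```

Both parsers should give identical answers from `can_fetch()` for any URL, which is the behaviour a site owner would expect from a well-behaved crawler.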
        
      - Re:Probably just a bug. (Score:5, Insightful)
        
        by schon ( 31600 ) writes: on Monday January 18, 2010 @10:09AM (#30807448)
        
        It has nothing to do with the RTFA.
        their own guidelines on their site
        As anyone who has ever read MS documentation can tell you, you need to read it, then implement a test, so you can see what it really expects, then adjust your test, then try it until it works.
        Their problem is that they expected MS documentation to actually describe the expected behaviour.
        
    - Re: (Score:2)
      
      by drspliff ( 652992 ) writes:
      
      Well, the last I heard Bing spider was looking for `Robots.txt` rather than `robots.txt` which would explain the file being "ignored" in this case.
      - Re:Probably just a bug. (Score:4, Informative)
        
        by Goaway ( 82658 ) writes: on Monday January 18, 2010 @10:36AM (#30807696) Homepage
        
        I'm sure you heard that, but it's not actually true in any way.
        
  - Re: (Score:2)
    
    by ztransform ( 929641 ) writes:
    
    The simple fact is that ignoring robots.txt is effectively evil, regardless of the intent. It's not like robots.txt is some new innovation...
    Since when did Microsoft feel existing standards were something to honour? How many times have its browsers changed behaviour? Re-defined entrenched URL standards (you cannot specify username/password in an Internet Explorer URL but this is a legal standard form of URL)?
    It stands to reason Microsoft would take no notice of anything your website has to say.
    Unless.. of course.. Microsoft define a certificate type that can sign your Microsoft-specific format exception list after payment on an annual licens
    - Re:Probably just a bug. (Score:4, Insightful)
      
      by blueZ3 ( 744446 ) writes: on Monday January 18, 2010 @10:36AM (#30807710) Homepage
      
      What's amusing about the issue in the kb is that the problem that they're "solving" by breaking the username/password in a URL standard is NOT a problem with username/password URLs, but a problem with how IE displays the URLs. In other words, rather than fixing the behavior of IE's address and status bars to display such URLs correctly, they just stopped supporting them.
      Incompetence at that level isn't just indistinguishable from malice, it IS malicious.
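As a small illustration of the URL form under discussion: the generic URI syntax (RFC 3986) does define a userinfo component before the host, although it deprecates putting a password there. The Python sketch below uses the standard `urllib.parse.urlsplit`; the `split_userinfo` helper and the example hosts are hypothetical.

```python
from urllib.parse import urlsplit


def split_userinfo(url):
    """Return (username, password, hostname) as parsed from a URL."""
    parts = urlsplit(url)
    return parts.username, parts.password, parts.hostname


# A plain credentials-in-URL example.
credentials = split_userinfo("http://alice:secret@example.com/private")

# The display problem: everything before "@" is userinfo, so the real host is the
# part after it, even though the URL appears to start with a trusted name.
misleading = split_userinfo("http://www.trusted.example@evil.example/login")
```

The second case is the one a browser has to present carefully, which is the display issue the comment argues should have been fixed instead of dropping the URL form.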
      
  - Re:Probably just a bug. (Score:5, Funny)
    
    by Suki I ( 1546431 ) writes: on Monday January 18, 2010 @09:35AM (#30807164) Homepage Journal
    
    Try saving a copy as robots.docx and see if that works ;)
    
    - Re:Probably just a bug. (Score:5, Funny)
      
      by PinkyDead ( 862370 ) writes: on Monday January 18, 2010 @10:28AM (#30807594) Journal
      
      Microsoft don't have any tools that can effectively read that format.
      
  - - Re: (Score:3)
      
      by jimicus ( 737525 ) writes:
      
      The US Gov't has successfully operated as a going concern for 220+ years, with a proven and reliable management structure. Few, if any corporations, have been able to do that.
      Private corporations can go under with just a couple of bad years. Or even months, particularly if they're new businesses. Governments just have to raise taxes.
    - - Re: (Score:3, Informative)
        
        by Nadaka ( 224565 ) writes:
        
        Nothing you listed under the "War on Drugs" has anything to do with the war on drugs.
        The war on drugs has made America a police state where the government can seize any of your property and auction it for profit before your trial. Even if you are found innocent, or the charges are thrown out for insufficient grounds, you will not be compensated for your lost money or profit. It has made an America where more people are imprisoned than any other nation on earth. It has made a nation where the cheapest and mo
  - - Re: (Score:3, Funny)
      
      by Pharmboy ( 216950 ) writes:
      
      Wow, you must be new....to computers. I particularly liked your comment "A site could have quality links to non ignore sites." as justification for a bot to ignore robots.txt. Can I have your AOL email address so I can write you personally?
- Re:Probably just a bug. (Score:5, Insightful)
  
  by fish waffle ( 179067 ) writes: on Monday January 18, 2010 @09:06AM (#30806982)
  
  I know everyone likes to assume that Microsoft is being evil here, but wouldn't the more realistic assumption be that they were just being incompetent?
  Probably. But since incompetence is the plausible deniability of evil it's sometimes hard to tell.
  
  - Re: (Score:2)
    
    by paiute ( 550198 ) writes:
    
    I know everyone likes to assume that Microsoft is being evil here, but wouldn't the more realistic assumption be that they were just being incompetent?
    Probably. But since incompetence is the plausible deniability of evil it's sometimes hard to tell.
    "incompetence is the plausible deniability of evil"
    fish waffle, that is great sig material.
- Re: (Score:2)
  
  by mspohr ( 589790 ) writes:
  
  Occam's razor (or Ockham's razor[1]), entia non sunt multiplicanda praeter necessitatem, is the principle that "entities must not be multiplied beyond necessity" and the conclusion thereof, that the simplest explanation or strategy tends to be the best one.
  Rough translation: "Never ascribe to malice that which can be adequately explained by stupidity."
  - Re:Probably just a bug. (Score:5, Insightful)
    
    by MrMr ( 219533 ) writes: on Monday January 18, 2010 @09:19AM (#30807058)
    
    The problem is, there is no evidence that:
    Never ascribe to stupidity that which can be adequately explained by malice.
    Is invoking more entities.
    In fact, claiming that the commercially most successful software company got there through stupidity rather than malice sounds extremely implausible to me.
    
    - Re:Probably just a bug. (Score:5, Funny)
      
      by Opportunist ( 166417 ) writes: on Monday January 18, 2010 @09:38AM (#30807186)
      
      Like my grandpa said, it doesn't matter how dumb you are. As long as you find someone even dumber to sell to.
      
    - - Re: (Score:2)
        
        by horatio ( 127595 ) writes:
        
        You think Microsoft was happy every time a user got the dreaded Blue Screen Of Death?
        Yes, in a way. I never really thought about it until you asked, but it fits with their business model of forcing users into an expensive upgrade of their OS every few years. Look what has happened with XP. It doesn't blue screen [as] much, and they've met heavy resistance from folks not wanting to upgrade to Vista. (Never mind that Vista is crap.) So now they've re-packaged Vista as "Windows 7" and hope folks don't realize it looks the same and smells the same, because it basically is.
- Re: (Score:3, Insightful)
  
  by alexhs ( 877055 ) writes:
  
  these bots 'completely ignore the rules specified in robots.txt.'
  Microsoft ignoring standards is not incompetence, it's policy (NIH syndrome).
- Re:Probably just a bug. (Score:5, Insightful)
  
  by djupedal ( 584558 ) writes: on Monday January 18, 2010 @09:12AM (#30807012)
  
  > "I know everyone likes to assume that Microsoft is being evil here, but wouldn't the more realistic assumption be that they were just being incompetent?"
  
  We assume MS is evil...
  
  We know they are incompetent.
  
  We feel this is typical.
  
  We pray they'd just go away.
  
  We think this will never end...
  
- Re:Probably just a bug. (Score:5, Interesting)
  
  by Yvanhoe ( 564877 ) writes: on Monday January 18, 2010 @09:16AM (#30807034) Journal
  
  There is such a thing as criminal incompetence. If a script kiddie can be arrested for having a virus "out of control" I don't see why Microsoft engineers DDoSing a website couldn't be charged.
  
  By the way, a philosopher once said that "evil" does not exist, that most of the time it is just a kind of hidden stupidity.
  
  - Comment removed (Score:4, Interesting)
    
    by account_deleted ( 4530225 ) writes: on Monday January 18, 2010 @10:07AM (#30807428)
    
    Comment removed based on user account deletion
    
- Incompetent? (Score:2)
  
  by omb ( 759389 ) writes:
  
  Yes, Evil more so
- Or both (Score:2)
  
  by cheros ( 223479 ) writes:
  
  AFAIK, the one doesn't exclude the other.
  However, assuming evil is more fun :-)
- Re: (Score:2)
  
  by Xest ( 935314 ) writes:
  
  Yes, and I like the solution too- rather than contact Microsoft to find out what the fuck is going on, post it to Slashdot and get Slashdotted as well.
  Pure genius.
Fixing Bing's poor indexing (Score:2, Interesting)

by AHuxley ( 892839 ) writes:

It's not a bug, it's a feature to index a site with a new, rapid, powerful, direct, personalised crawler :)
http://arstechnica.com/microsoft/news/2010/01/microsoft-outlines-plan-to-improve-bings-slow-indexing.ars [arstechnica.com]
This is a normal occurrence for Bing (Score:5, Informative)

by Anonymous Coward writes: on Monday January 18, 2010 @09:11AM (#30807010)

I had a registration page - static content basically. The only thing that was dynamic was that it was referred to by many pages on the site with a variable in the querystring. Bing decided that it needed to check this one page *thousands* of times per day.
They ignored robots.txt.
I sent a note to an address on the Bing site that requested feedback from people having issues with the Bing bots - nothing.
The only thing they finally 'listened' to was placing "" in the header.
This kind of sucked because it took the registration page out of the search engines' index, however it was much better than being DDOS'd. Plus, the page is easy to find on the site so not *that* big a deal.
Bing has been open for months now and if you search around there are tons of stories just like this. Maybe now that a site with some visibility has been 'attacked', the engineers will take a look at wtf is wrong.

- Re: (Score:2)
  
  by The Cisco Kid ( 31490 ) writes:
  
  Seems like a better solution would have been to set up a test for either the User-Agent or the IP blocks that Bing was attacking your site from, and drop those requests into /dev/null - your site would still exist on 'real' search engines, and Bing wouldn't pound on your bandwidth anymore.
  - Re: (Score:2)
    
    by The Cisco Kid ( 31490 ) writes:
    
    Replying to myself: if testing the UA or the IP in the httpd itself was too much load, you could have also just nullrouted the IP blocks the Bing spider was coming from, either in the kernel table, or in your router.
Flooding... (Score:5, Informative)

by Bert64 ( 520050 ) writes: <bert@slash d o t . f i renzee.com> on Monday January 18, 2010 @09:15AM (#30807030) Homepage

I have noticed the microsoft crawlers (msnbot) being fairly inefficient on many of my sites...
In contrast to googlebot and spiders from other search engines msnbot is far more aggressive, ignores robots.txt and will frequently re-request the same files repeatedly, even if those files haven't changed... Looking at my monthly stats (awstats) which groups traffic from bots, msnbot will frequently have consumed 10 times more bandwidth than googlebot, but is responsible for far less incoming traffic based on referrer headers (typically 1-2% of the traffic generated by google on my sites).
Other small search engines don't bring much traffic either, but their bots don't hammer my site as hard as msnbot does.

Are you sure? (Score:5, Insightful)

by Errol backfiring ( 1280012 ) writes: on Monday January 18, 2010 @09:21AM (#30807070) Journal

Are we sure this traffic comes from Microsoft? Could it not consist of forged network packets? You don't need a reply if you are running a DDOS. On the other hand, why would anyone, including Microsoft, want to bring down CPAN?

- Re: (Score:3, Funny)
  
  by Anonymous Coward writes:
  
  Because they are coming out with P# and don't want the competition?
- Re: (Score:2, Informative)
  
  by Anonymous Coward writes:
  
  You only see an IP in an apache log after a successful TCP handshake. This is hard (not impossible, but really, really hard) to do with a forged IP.
- Re:Are you sure? (Score:5, Informative)
  
  by TheRaven64 ( 641858 ) writes: on Monday January 18, 2010 @09:45AM (#30807246) Journal
  
  Are we sure this traffic comes from Microsoft? Could it not consist of forged network packets?
  
  It's a TCP connection, so they need to have completed the three-way handshake for it to work. That means that they must either have received the SYN-ACK packet or be SYN flooding. If they are SYN flooding, then that would show up in the firewall logs. If they've received the SYN-ACK packet, then they are either at that IP, or they are on a router between you and that IP and can intercept and block the packets from that IP.
  You don't need a reply if you are running a DDOS.
  You do if it's via TCP. If they're just ping flooding, then that's one thing, but they're issuing HTTP requests. This involves establishing a TCP connection (send SYN, receive SYN-ACK with random number, reply ACK with that number) and involves sending TCP window replies for each group of TCP packets that you receive.
  On the other hand, why would anyone, including Microsoft, want to bring down CPAN?
  Who says that they want to? It's more likely that their web crawler has been written to the same standard as the rest of their code.
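
The handshake argument above can be tried out on a single machine. This is only a minimal sketch using Python's standard socket module on the loopback interface; the variable names are made up for illustration and it is not taken from the CPAN Testers' setup. It shows that a server only gets a peer address from accept() once the three-way handshake has completed, and that the address it sees is the client's real one.

```python
import socket

# Listening socket on loopback; port 0 asks the OS for any free port.
server = socket.create_server(("127.0.0.1", 0))

# create_connection() returns only after SYN, SYN-ACK and ACK have been exchanged.
client = socket.create_connection(server.getsockname())

# accept() hands out connections whose handshake has already completed,
# together with the peer address that the replies were sent to.
conn, peer = server.accept()

print("server sees peer:", peer)
print("client's own address:", client.getsockname())

# The address an HTTP server would log is the one the client really used.
handshake_peer_matches = (peer == client.getsockname())

conn.close()
client.close()
server.close()
```

A client that lied about its source address would never see the SYN-ACK, so it could not reach the point where a request gets logged.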
  
Too easy for Microsoft (Score:2)

by BhaKi ( 1316335 ) writes:

I suppose Microsoft can offer a simple explanation: "Our servers and other internal infrastructure are so vulnerable that they have been hacked and are being used as remote-controlled botnets."
Robots.txt (Score:2)

by anomnomnomymous ( 1321267 ) writes:

Can anyone here clarify what robots.txt stands for, as in:

Is it an 'agreement' to not scan the site at all (by a search engine bot), or is it meant to just not -display- those results in the search engine?
I'd assume, since everything on a site is more or less public, that it would be the second. And if so, I can't see anything wrong with what Microsoft's bots did.

I can see how scanning a site's content (even if you're not going to list the results in your search engine) can have some value to a company
- Re: (Score:3, Informative)
  
  by Ogi_UnixNut ( 916982 ) writes:
  
  It's the first. Whatever you specify in the robots.txt as no-follow etc... means not to spider the pages, so no scanning of them at all.
  You use it for when you only want part of your site to appear in search results, such as just the front page (for example). The rest of the site should not be touched by the bot at all.
- Re: (Score:3, Informative)
  
  by afidel ( 530433 ) writes:
  
  It's basically a rough pattern filter that the bot is supposed to follow on parts of the site not to crawl. One reason it's used is that you can have dynamically generated pages that create an infinite loop that's impossible for the bot to detect.
- Re: (Score:3, Informative)
  
  by John Hasler ( 414242 ) writes:
  
  Is it an 'agreement' to not scan the site at all...
  It is a request not to scan part or all of a site. robots.txt [wikipedia.org]
  And if so, I can't see anything wrong with what Microsoft's bots did.
  Every site does not have dozens of powerful servers and terabytes of bandwidth, nor is every site an ad-supported one that wants to maximize traffic. Common courtesy requires that a bot operator minimize his impact on any given site and honor requests not to index. Of course "courtesy" and "honor" are concepts that baff
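
What the replies above describe, a crawler asking robots.txt before it fetches anything, can be sketched with Python's standard urllib.robotparser. The rules and the example.org URLs below are invented for illustration and are not the CPAN Testers' actual robots.txt; "msnbot" is the crawler name mentioned elsewhere in this discussion.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def may_crawl(user_agent, url):
    """Return True if the parsed rules allow this agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)

# A polite bot checks before every request and skips disallowed paths.
print(may_crawl("msnbot", "http://example.org/private/report.html"))
print(may_crawl("msnbot", "http://example.org/index.html"))
```

Nothing enforces this on the server side; the check only helps if the bot operator bothers to make it.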
What the hell has become of the word "problem"? (Score:2)

by John Hasler ( 414242 ) writes:

> ...issues accessing their sites...
"Issues"? What's wrong with "problem"? "Issues" is marketing-speak. Microsoft marketing-speak.
And yes, get off my lawn.
Send the lost bots home. (Score:5, Funny)

by N1ckR ( 1289800 ) writes: on Monday January 18, 2010 @10:13AM (#30807476)

I redirect lost bots home, seems a polite thing to do. 301 www.microsoft.com
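
One way such a redirect could look, as a rough sketch on top of Python's standard http.server. The "msnbot" marker, the handler name and the redirect target are assumptions made for this example, not the poster's actual configuration.

```python
from http.server import BaseHTTPRequestHandler

# Assumed marker and target; adjust to whatever shows up in your own logs.
BOT_MARKERS = ("msnbot",)
REDIRECT_TARGET = "http://www.microsoft.com/"

def is_unwanted_bot(user_agent):
    """Case-insensitive check of the User-Agent header for a known marker."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in BOT_MARKERS)

class RedirectBots(BaseHTTPRequestHandler):
    def do_GET(self):
        if is_unwanted_bot(self.headers.get("User-Agent")):
            # Send matching bots "home" with a permanent redirect.
            self.send_response(301)
            self.send_header("Location", REDIRECT_TARGET)
            self.end_headers()
            return
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, pass RedirectBots to http.server.HTTPServer bound to a loopback port and call serve_forever().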

DDoS? Really? (Score:3, Informative)

by Siberwulf ( 921893 ) writes: on Monday January 18, 2010 @10:33AM (#30807666)

I'm pretty sure the first "D" in DDoS stands for "Distributed."

If it was really a DDoS, you wouldn't be able to filter the IP out with a simple regex (like the /^65\.55\.(106|107|207)/. from TFA).

To boot, TFA didn't even say DDoS. Maybe that's too much to expect the editors to oh... I don't know...say... RTFA or Fact-Check it?

I should drop my bar a bit, I suppose.
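
For what it's worth, the kind of regex filter referred to above could be applied to log entries roughly like this. It is a Python translation under assumptions: the trailing `\.` is added here so that the third octet has to match in full, and the function name is made up.

```python
import re

# Prefix pattern for the three reported ranges (trailing dot is an addition).
BOT_RANGE = re.compile(r"^65\.55\.(106|107|207)\.")

def from_reported_range(addr):
    """True if the dotted-quad string starts with one of the reported prefixes."""
    return BOT_RANGE.match(addr) is not None

for addr in ("65.55.207.14", "65.55.108.3", "192.0.2.7"):
    print(addr, from_reported_range(addr))
```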

No problem (Score:5, Informative)

by rgviza ( 1303161 ) writes: on Monday January 18, 2010 @10:51AM (#30807868)

ipchains -A input -j REJECT -p all -s 65.55.207.0/24 -i eth0 -l
ipchains -A input -j REJECT -p all -s 65.55.107.0/24 -i eth0 -l
ipchains -A input -j REJECT -p all -s 65.55.106.0/24 -i eth0 -l
problem solved
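
If the blocking has to happen in the application rather than in the packet filter, the same three /24 networks can be checked with Python's standard ipaddress module. This is a sketch only; the function name is invented for illustration.

```python
import ipaddress

# The three /24 networks named in the rules above.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("65.55.207.0/24", "65.55.107.0/24", "65.55.106.0/24")
]

def is_blocked(addr):
    """True if the address falls inside any of the blocked networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in BLOCKED_NETWORKS)

print(is_blocked("65.55.107.42"))
print(is_blocked("65.55.108.42"))
```

Rejecting at the firewall is cheaper, since the request never reaches the web server at all.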

- Re:No problem (Score:5, Informative)
  
  by j_sp_r ( 656354 ) writes: on Monday January 18, 2010 @11:47AM (#30808494) Homepage
  
  Linux IP Firewalling Chains, normally called ipchains, is free software to control the packet filter/firewall capabilities in the 2.2 series of Linux kernels. It superseded ipfwadm, but was replaced by iptables in the 2.4 series.
  You're a few kernels behind.
  
Complain to Upstream Providers (Score:4, Interesting)

by jchawk ( 127686 ) writes: on Monday January 18, 2010 @10:57AM (#30807954) Homepage Journal

The CPAN folks could complain to their ISP and have them drop the traffic that's coming in to their boxes.
Most ISPs will work with you to correct DDoS problems.

hello? firewall? (Score:3, Insightful)

by v1 ( 525388 ) writes: on Monday January 18, 2010 @12:15PM (#30808752) Homepage Journal

if it's a scan (TCP established stream, taxing the SERVERS, not the NETWORK) that's the problem, as opposed to a SYN flood etc, and the IP addresses are in a very small range, why aren't they just using a hardware firewall at the router and blocking the IPs? There's not a whole lot to "distributed" when it's coming from a pair of C's.
Not saying they should be DOING it, but this is not a Denial of Service, it's a Denial of Stupid.

- Re: (Score:2)
  
  by auric_dude ( 610172 ) writes:
  
  Sounds like Microsoft.CN to me.
- Re:So block those IP ranges? (Score:4, Informative)
  
  by John Hasler ( 414242 ) writes: on Monday January 18, 2010 @09:34AM (#30807152) Homepage
  
  > ...why not just block them?
  They have.
  
- - Re:So block those IP ranges? (Score:5, Insightful)
    
    by Sarten-X ( 1102295 ) writes: on Monday January 18, 2010 @09:38AM (#30807184) Homepage
    
    For ignoring robots.txt, they don't deserve any more or any less.
    
- Re: (Score:2)
  
  by ozmanjusri ( 601766 ) writes:
  
  Why? Bing?
  They have to have SOME activity.
  Sounds like there's more traffic from their bots than customers.
- Re: (Score:3, Insightful)
  
  by John Hasler ( 414242 ) writes:
  
  Robots.txt is merely advisory. Ignoring it is discourteous and oafish but not illegal.
