
Schooling Microsoft On Random Browser Selection

Posted by kdawson
from the let-me-show-it-you dept.
Rob Weir got wind that a Slovakian tech site had been discussing the non-randomness of Microsoft's intended-to-be-random browser choice screen, which went into effect on European Windows 7 systems last week. He did some testing and found that indeed the order in which the five browser choices appear on the selection screen is far from random — though probably not intentionally slanted. He then proceeds to give Microsoft a lesson in random-shuffle algorithms. "This computational problem has been known since the earliest days of computing. There are 5 well-known approaches: 3 good solutions, 1 acceptable solution that is slower than necessary and 1 bad approach that doesn’t really work. Microsoft appears to have picked the bad approach. But I do not believe there is some nefarious intent to this bug. It is more in the nature of a 'naive algorithm,' like the bubble sort, that inexperienced programmers inevitably will fall upon when solving a given problem. I bet if we gave this same problem to 100 freshmen computer science majors, at least 1 of them would make the same mistake. But with education and experience, one learns about these things. And one of the things one learns early on is to reach for Knuth. ... The lesson here is that getting randomness on a computer cannot be left to chance. You cannot just throw Math.random() at a problem and stir the pot and expect good results."

  • LAST (Score:5, Funny)

    by DamonHD (794830) <d@hd.org> on Sunday February 28, 2010 @02:36PM (#31308092) Homepage

    Hmm, there's a nice shuffle implementation in Java that Microsoft could use... Oh, wait...

    Rgds

    Damon

    • by alphabetsoup (953829) on Sunday February 28, 2010 @04:20PM (#31308916)

      Both the article and the summary mix up the concepts. Randomness and bias are related but different things. Think of a biased coin loaded in favor of heads - the heads may appear twice as often as the tails, but the distribution is still random. Here too, contrary to the summary's claim of "far from random", the results are random, just biased, and biased against IE, if I may add, which is an important fact the summary omitted.

  • by Idbar (1034346) on Sunday February 28, 2010 @02:39PM (#31308108)
    just requires something like:
    pickabrowser() {
    if (rand()>0.05) {
    use IE
    } else {
    pickabrowser()
    }
    }

    Nobody said anything about bias.
  • by Anonymous Coward on Sunday February 28, 2010 @02:46PM (#31308172)

    What's the problem? It's random enough for a browser selection screen.

    This isn't an application where a statistically random shuffle is required.

    • Re: (Score:2, Funny)

      by El Lobo (994537)
      Exactly, let's force them to solve this problem with a .000000001 floating point precision. After all, this is a critical issue.
      • by QuoteMstr (55051)

        .000000001 floating point

        There's no such thing. That number cannot be represented in IEEE binary floating point.

    • He's just bitching (Score:4, Insightful)

      by Sycraft-fu (314770) on Sunday February 28, 2010 @03:02PM (#31308322)

      It is probably a combination of two things:

      1) Hate for MS. MS is doing what some have said they've needed to do in giving users browser choice, and they've done so as to try not to promote any given one. While that makes proponents of choice happy, it makes MS haters mad. The more MS does to try and accommodate users and play fair, the less there is to hate on them for legitimately. As such haters are going to try and find nit picks to bitch about.

      2) General geek pedantry. Many geeks seem to love to be exceedingly pedantic about every little thing. If a definition isn't 100% perfect, at least in their mind, they jump all over it. I think it is a "Look at how smart I am!" kind of move. They want to show that they noticed that it wasn't 100% perfect and thus show how clever they are.

      Doesn't matter, it is what it is and as you said, random enough. This guy can whine all he likes.

      • by magsol (1406749) on Sunday February 28, 2010 @03:06PM (#31308352) Journal
        On the other hand, the devil is in the details, and one would think that a company such as Microsoft that has been owning the software market for decades now would know how to implement a randomizing algorithm correctly.
        • Re: (Score:3, Insightful)

          by AdmiralXyz (1378985)

          one would think that a company such as Microsoft that has been owning the software market for decades now would know how to implement a randomizing algorithm correctly.

          Wrong: a software company such as Microsoft that has been owning the software market for decades now knows how to use programmer time and resources effectively. Spending the extra programmer time and effort to turn a "99.99% random" process into a "100% random" one is an utter waste of both on something this trivial. Hate to break it to you, and to look-at-me-I'm-so-much-smarter-than-evil-Microsoft Rob Weir, but they're not making any mistake here.

          • To be fair, this is more like 60% random, not 99.99%.

          • by maxwell demon (590494) on Sunday February 28, 2010 @03:37PM (#31308590) Journal

            Spending the extra programmer time and effort to turn a "99.99% random" process into a "100% random"

            I don't know what you consider "99.99% random", but the difference between 20% (probability of IE turning up last in a real random shuffle) and ca. 50% (probability of IE showing up last in the implemented "random shuffle") is certainly significant enough that you can't call it "99.99% random." You might argue that it is "random enough for this," but that's of course a matter of opinion, and therefore debatable (there's no objective definition of "random enough").

            • by omfgnosis (963606)

              there's no objective definition of "random enough"

              Of course there is: "sufficiently random for the task in question". It's objective, but useless, because it depends on "sufficient" and "the task in question" to be specified, and neither is.

          • Re: (Score:3, Insightful)

            by 644bd346996 (1012333)

            They made the stupid mistake of not assigning an experienced or well-educated programmer to a project that was necessary for them to legally do business in Europe. Somewhere along the way, the legal department had to have sent the technical managers an email containing a phrase very similar to "don't fuck this up!!", and the manager ignored it and assigned a programmer who didn't go to a good CS school and thus has never heard of a Fisher-Yates shuffle or something equivalent.

            It's very understandable that s

          • by magsol (1406749)
            I don't know why you're coming down so hard on this, particularly since you are absolutely correct. My point was that Microsoft knows how to implement a trivial randomization algorithm in a trivial situation (the difference between 60% and 100% random in this situation is not only unobservable unless thousands of iterations are performed, but it's also trivial to close that gap). If they went with a less-random algorithm, it would be far more likely to have been a willful decision instead of an accidental m
          • by 1 a bee (817783)

            Agreed. The blogger's article is a bit pointless: he can't show a bias in favor of any one browser; he only shows that there is a non-random distribution. Perhaps with a bit more analysis he could have determined if any of the browsers in fact did enjoy a bias. That would have been interesting, but as it stands, the article is little more than supercilious pedantry.

          • by cgenman (325138) on Sunday February 28, 2010 @04:11PM (#31308846) Homepage

            While in Microsoft's native browser (which would happen the first time), Internet Explorer is given a full 64% chance of receiving one of the coveted 2 edge positions. Considering that antitrust courts were involved in the creation of this screen, you'd think that getting "random" right would be a development priority, especially considering it should have taken a competent programmer exactly the same amount of time to do it right as to do it wrong. If this takes even one hour of lawyer time to ponder, it would have been much cheaper to send the programmer back to fix it.

            A 50% chance of getting a particular slot that should be 20% is not "99.99% random." It's just wrong. And when you're talking about the cost of antitrust regulation, it's really, really wrong.

            I'm glad this is being brought up on Slashdot. There is a lot of misunderstanding about how to create randomness in systems. Even on a basic level, people frequently ask for "random" when they actually want jukebox random. In this case, though, it just seems like a basic misunderstanding of statistics, which is not surprising given the moderate code complexity and likelihood this screen was given to an intern or jr programmer.

          • Re: (Score:3, Insightful)

            by bmcage (785177)
            Do you have any mathematical or logical inclination?? Well I have, and seeing such a stupid way to randomize 5 entries just makes me weep! Be truthful here, would _you_ program it that way?? What are they smoking in Seattle?

            And as the article says, if you use statistics to determine if the program is random, then the answer will be ... not random. So please, don't call this random, and if you do, please, let me know which software you work on, so I can avoid it.

            I agree that all this does not matter for

        • by TerranFury (726743) on Sunday February 28, 2010 @03:22PM (#31308468)
          Whatever. They offloaded what looked like a menial task to some low-level programmer, who ran it a few times, saw it was "random" (without doing any statistical tests), and went home happy. He probably should have known the Knuth shuffle algorithm -- I remember studying it in high school CS, even -- but honestly it's not that huge a deal.
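For reference, the Knuth (Fisher-Yates) shuffle mentioned above fits in a few lines. What follows is a minimal sketch, not Microsoft's code and not the article's: the names `fisherYatesShuffle` and `browsers` and the optional `rng` parameter are made up for illustration, and `Math.random()` is assumed to be a good enough source of uniform values for a ballot screen.

```javascript
// Fisher-Yates (Knuth) shuffle: walk from the last index down, swapping each
// element with a uniformly chosen element at or before it.
// `rng` is an assumed parameter: any function returning a float in [0, 1).
function fisherYatesShuffle(items, rng = Math.random) {
  const result = items.slice(); // copy so the caller's array is untouched
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1)); // integer with 0 <= j <= i
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}

const browsers = ["IE", "Firefox", "Opera", "Chrome", "Safari"];
console.log(fisherYatesShuffle(browsers));
```

Each of the 5! = 120 orderings corresponds to exactly one sequence of choices of `j`, which is why the result is uniform when `rng` is.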
        • Re: (Score:3, Funny)

          by lennier (44736)

          On the other hand, the devil is in the details, and one would think that a company such as Microsoft that has been owning the software market for decades now would know how to implement a randomizing algorithm correctly.

          Sure!

          10 RANDOMIZE TIMER
          20 PRINT INT(RND * 5)
          30 GOTO 20

        • Re: (Score:2, Insightful)

          by Short Circuit (52384)

          It's paranoia and naiveté like yours that led me to stop hanging around here so much.

          Paranoia in that everything company X does is evil or has an inappropriate or immoral ulterior motive. Naiveté in that you don't stop to recognize that not all of the developers who work for an institution are going to output code of the caliber of its most senior, experienced and/or knowledgeable developers, nor can code review and automated tests catch all of the problems and gotchas known to computer scienc

      • Re: (Score:3, Insightful)

        by DavidShor (928926)
        Did you read the god-damn article? The results are insanely non-uniform. With a sample-size of 10,000 , IE ends up in 5th place 50% of the time. I mean, it's not evil, it doesn't benefit them. But it makes them look insanely dumb.
        • by omfgnosis (963606)

          I'm not convinced it doesn't benefit them. I'm betting that eye-tracking tests would show a disproportionate amount of attention to the last position, at least over the middle positions but possibly over the first as well. This hunch is founded on a tendency in UI design to place useful things (eg search and navigation) on the right and an expectation that users have grown accustomed to this.

      • by pla (258480)
        2) General geek pedantry. Many geeks seem to love to be exceedingly pedantic about every little thing. If a definition isn't 100% perfect, at least in their mind, they jump all over it. I think it is a "Look at how smart I am!" kind of move. They want to show that they noticed that it wasn't 100% perfect and thus show how clever they are.

        First of all, let me say that I fall in on Microsoft's side on this one - So they didn't use a shuffle that would pass muster for a licensed video poker system - So what
    • by SuperKendall (25149) on Sunday February 28, 2010 @03:16PM (#31308428)

      Here's the problem - consider the results again. Safari will almost always (almost 50% of the time) be put in the bottom two elements. In fact depending on the algorithm used it's 40-50% chance of being put in one exact slot (either choice four or five).

      When the whole point of the list is to promote browser competition, it makes no sense to accept a list which is that skewed for ANY browser result from the list. You need to have it properly shuffled so that no one browser has a statistical advantage or disadvantage - if you are going to claim it doesn't matter then why not let Microsoft set an arbitrary fixed order for the list?

      That is not what the legal injunction against them says they can do, therefore the randomness of the results DOES matter. Just as in most things in life, correctness of results is actually important.

      • by pushing-robot (1037830) on Sunday February 28, 2010 @03:34PM (#31308554)

        Safari will almost always (almost 50% of the time) be put in the bottom two elements [out of five].

        And how well did you do in statistics class?

      • Wow... the bottom 2 elements out of a possible 5 ALMOST 50% of the time.. talk about a massive skew by those evil MS guys....

        BTW... did you ever see the Dilbert cartoon with the PHB complaining that a full 40% of employee sick days were taken on Mondays and Fridays? You weren't possibly the same PHB were you?

      • Re: (Score:3, Insightful)

        by Aladrin (926209)

        I don't know how you managed to get 'insightful' mods for this.

        For a completely random algorithm: Out of 5 slots, EACH item has a 20% chance to be put in any single slot. For any 2 slots, that's 40%.

        Let's look at your statement again: "Safari will almost always (almost 50% of the time) be put in the bottom two elements. In fact depending on the algorithm used it's 40-50% chance of being put in one exact slot (either choice four or five)."

        Wow. So you're saying that it's working perfectly.

    • by ejtttje (673126)
      It's not an academic issue of minor bias: the measurements show that IE winds up in slot 5 with 50% probability. That's hugely biased.

      And although some might shrug at the rightmost slot being somehow "bad", there are well known tendencies for people to remember the first and last thing in a list. The point of this whole ballot screen is to even the playing field to avoid this psychology stuff, and they missed the mark. Not to mention how bad it looks that they supposedly hire the best and brightest, bu
  • Yeah right (Score:3, Insightful)

    by CSHARP123 (904951) on Sunday February 28, 2010 @02:46PM (#31308184)
    Showing a browser selection has been imposed on them and these geeks think MS is going to select the best approach possible for randomness. No wonder none of you are a success in business.
    • 1 - You didn't read the article, did you?

      2 - Although their method for generating uniformly-random permutations is incorrect, they are doing nothing to favor IE.

  • Milliseconds (Score:3, Interesting)

    by Lord Lode (1290856) on Sunday February 28, 2010 @02:50PM (#31308220)

    They could as well just have used the last millisecond to show the browser. I mean, it's a screen shown only once to a user. What's more random, and uniform, than the time the screen appears in milliseconds modulo 5?

    • Re: (Score:3, Informative)

      by SpinyNorman (33776)

      It's not an issue of how to get a truly random number, or of seeding a random number generator, but rather how you use a source of random numbers to randomly order a list.

      Some "reasonable sounding" methods don't actually work - e.g. attach random numbers to each list item and sort the list by these numbers (1). Microsoft used a similar method of sorting using a random comparator.

      Some simple methods that DO work are picking a random permutation or executing a bunch of random swap operations on the list.

      (1

      • Ok, sorry, had read only the summary when posting this. Should have read the full article first...

      • Some "reasonable sounding" methods don't actually work - e.g. attach random numbers to each list item and sort the list by these numbers (1).

        That method does work as long as you guarantee the uniqueness of the random numbers--in other words, a tie shouldn't mean leaving elements in place, but rather repeating the algorithm on each subset of the list where the random numbers came up equal.
        I see no advantage to it over the Fisher-Yates shuffle (which should be the default solution to the problem), but it can
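The key-and-sort variant described above, with ties reshuffled rather than left in place, might look like the following. This is only a sketch under assumptions: `shuffleByRandomKeys` and the `rng` parameter are hypothetical names, and the recursion relies on `rng` not returning the same value forever.

```javascript
// Shuffle by attaching a random key to each item and sorting on the keys.
// Items whose keys collide are shuffled again among themselves by recursion,
// rather than being left in their original relative order.
// `rng` is an assumed parameter: any function returning a float in [0, 1).
// Note: if `rng` always returned the same value, the recursion would not end.
function shuffleByRandomKeys(items, rng = Math.random) {
  if (items.length < 2) return items.slice();
  const keyed = items.map((item) => ({ item, key: rng() }));
  // The keys are fixed before sorting, so this comparator is consistent.
  keyed.sort((a, b) => a.key - b.key);

  const result = [];
  let start = 0;
  while (start < keyed.length) {
    let end = start + 1;
    while (end < keyed.length && keyed[end].key === keyed[start].key) end++;
    const group = keyed.slice(start, end).map((k) => k.item);
    // A run of equal keys is a tie: shuffle that run again on its own.
    result.push(...(group.length > 1 ? shuffleByRandomKeys(group, rng) : group));
    start = end;
  }
  return result;
}

console.log(shuffleByRandomKeys(["IE", "Firefox", "Opera", "Chrome", "Safari"]));
```

The difference from sorting with a random comparator is that each item's key is drawn once and then held fixed, so the sort sees a genuine ordering.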

    • by ejtttje (673126)
      The task requires more than picking a single random number, the task is to generate a random ordering of elements. Doing this correctly is not simply a matter of choosing a seed for the PRNG, if that is what you are referring to. Where Microsoft has screwed up is applying a randomized comparison function to a sort operation. This does not yield anything close to a uniform distribution depending on the sorting algorithm being used. (Thus the test results showing 50% probability of IE in slot 5!)

      Instead
  • by jmtpi (17834) on Sunday February 28, 2010 @02:54PM (#31308258) Homepage

    A Google search on:

    javascript array sort

    gives exactly the bogus answer that Microsoft used in the top hit. [javascriptkit.com]
    Unfortunately for Microsoft, a bing search gives the same top hit.

  • There are 5 well-known approaches: 3 good solutions, 1 acceptable solution that is slower than necessary and 1 bad approach that doesn’t really work. Microsoft appears to have picked the bad approach.

    We are still talking here about the random selection of browsers, or something more broad?

  • Looking at the outcome, IE comes off the worst with the current algorithm. Please keep it that way. Thanks from all the Web Developers.
    • Re:do not fix! (Score:4, Interesting)

      by jonbryce (703250) on Sunday February 28, 2010 @03:30PM (#31308538) Homepage

      I don't think it comes off worst at all. People usually look towards the right of the screen for the "go away and stop bugging me" button, and that's where Internet Explorer is 50% of the time.

      Remember that the yellow exclamation mark in the system tray telling people they need to reboot is an annoyance when they just want to get on with their work. Then when the computer finally does reboot and they really really want to start doing whatever it was they turned on the computer to do, they get this annoying thing about web browsers.

  • Frankly, I have no problem with this result. Quite the opposite in fact, since I think you can probably group the users into three sets, only one of which really matters in connection with the Browser Selection Screen:
    1. Those that equate "Internet" with a blue "e" and as such will pick IE regardless of its position
    2. Those that prefer another browser to IE and will pick another option regardless of positioning
    3. Those that have no clue about browsers and pick essentially at random, or belong to one of the abo
    • by westlake (615356)

      3 Those that have no clue about browsers and pick essentially at random, or belong to one of the above groups and click in error.

      It won't be a random choice.

      They will choose the browser associated with their operating system:

      Microsoft Windows = Microsoft IE 8 for Windows.

      The only people where the selection and any possible bias inherent in the implementation of the random() function are the last group, which is also quite possibly the smallest of the three sets.

      It is more likely the largest of the three s

      • by Zocalo (252965)
        If they are not picking IE at random, then they probably belong in my first group then, albeit with a change of descriptive text as whether the association is "Blue e = Internet" or with the name Microsoft is neither here nor there in context. I don't think that this group is necessarily as large as you might think though, in my experience the kind of person who falls into this set has a considerable overlap with the set of users that only upgrades their OS when they get a new computer.

        Personally, I thi
    • by jonbryce (703250)

      Those who prefer another browser to IE will have already installed one, and they won't see this screen.

  • One solution takes 3 seconds, can be done by an intern, and makes the company no money. The other solution takes a little bit of time, maybe some reading or prior knowledge and still makes the company no money. The results yielded for each solution are acceptable for the situation. Given the cost to profit it seems like Microsoft chose EXACTLY the right solution.

    This is like your community telling you that you must have a fenced in yard for your dog to be off the leash and then setting up a cheap 6-foot

    • Well, as it happens (if you read TFA) Microsoft's solution ends up being biased against Microsoft rather than for them. It's hard to say which is worse, though; in one case you work against your own market share, in the other you end up wasting time and money as the EU hauls you back in court for unfair practices.

      It seems the least cost option for Microsoft would have been to have gotten this right first time (as is always the case with software).

  • Microsoft used none of these well-known solutions in their random solution. Instead they fell for the well-known trap. What they did is sort the array, but with a custom-defined comparison function. JavaScript, like many other programming languages, allows a custom comparator function to be specified. In the case of JavaScript, this function takes two indexes into the value array and returns a value which is:

    • < 0 if the value at the first index should be sorted before the value at the second index
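The trap described in the quote can be seen by tallying where each item ends up. The sketch below uses the widely circulated one-liner form of a random comparator as a stand-in; whether it matches the ballot page's code exactly is an assumption, and `randomComparatorShuffle`, `tallyPositions` and `ballot` are illustrative names. In JavaScript the comparator is passed the two elements being compared. The size and shape of the skew depend on the sort algorithm inside the JavaScript engine, so no particular numbers are claimed here.

```javascript
// The commonly cited one-liner: a comparator that ignores its arguments and
// returns a random sign. It breaks the consistency that sort() relies on.
function randomComparatorShuffle(items) {
  return items.slice().sort(() => 0.5 - Math.random());
}

// Count how often each item lands in each position over many trials.
// counts[item][position] would be close to trials / items.length for a
// uniform shuffle.
function tallyPositions(shuffle, items, trials) {
  const counts = {};
  for (const item of items) counts[item] = new Array(items.length).fill(0);
  for (let t = 0; t < trials; t++) {
    shuffle(items).forEach((item, position) => {
      counts[item][position]++;
    });
  }
  return counts;
}

const ballot = ["IE", "Firefox", "Opera", "Chrome", "Safari"];
console.table(tallyPositions(randomComparatorShuffle, ballot, 10000));
```

Passing a correct shuffle function to `tallyPositions` instead gives a baseline to compare the rows against.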
  • Random (Score:4, Funny)

    by Greyfox (87712) on Sunday February 28, 2010 @03:33PM (#31308552) Homepage Journal
    I like this [xkcd.com] one.

    Anywhoo... So what you're telling me is that Microsoft's programmers made a mistake for a production system that 99% of freshmen CS students wouldn't make? In this case, I think you're actually giving too much credit to Freshman CS students...

  • On the down side.. it's a webpage.

    To cover the first bit.. it's a webpage - if a potential problem has been found, it can be fixed. I.e. this random selection thing? They can implement a better one.

    Though I do wish they would start by fixing the page's use of javascript to sort the results -after- their display.
    Turn off Javascript.. the order appears to always be: IE, FF, Opera, Chrome, Safari.

    In fact - if you have a slow machine like I do.. leave javascript on, and just refresh the page.. look closely,

  • Isn't the *middle* choice going to be more important than the leftmost one?

    We have a screen-centred dialog here that shows five browsers. One is going to end up smack-middle on the screen - and guess what - Windows centres the mouse pointer when it boots, which is when the screen will appear.

    People often go for middle ground. Then they go to the direction they read... So in my mind, the 3rd choice is going to be the most important, followed by 4 for LTR cultures, and 2 for RTL.

    But I'm talking out of my ass

  • I'm a senior programmer, over 10 years experience. I don't think I would have EVER imagined a solution to randomizing that would involve a sort comparison function returning a random result. This is because I have an understanding of the implicit (and in some languages, explicitly documented [sun.com]) total ordering contract imposed on the comparator function.

    But you know what? If Microsoft doesn't care for nice things like correctness, why would any big software development company? Excepting cases like Wolfram Res

  • Malice? (Score:4, Interesting)

    by xlsior (524145) on Sunday February 28, 2010 @04:18PM (#31308898) Homepage
    "Never ascribe to malice that which is adequately explained by incompetence." One thing I couldn't help but notice, though, is that Microsoft always pops IE in the number one spot for a moment *before* shuffling the browsers and showing them in randomized order... Very visible if you visit the ballot manually in IE and hit F5 a few times: http://www.browserchoice.eu/ [browserchoice.eu]
    • Re: (Score:3, Informative)

      by Bigjeff5 (1143585)

      It starts off as a list dumbass, the browsers to sort have to come from somewhere.

      Would you be upset if the first item on the list were FireFox, so FF popped up first momentarily?

      Seriously, get a life.

  • Reach for Knuth? (Score:3, Insightful)

    by fm6 (162816) on Sunday February 28, 2010 @05:00PM (#31309202) Homepage Journal

    And one of the things one learns early on is to reach for Knuth

    Knuth is for computer scientists. Not everybody who writes code meets that definition. A lot of us (and I include myself) don't even qualify as "engineers".

    For most programmers, the best way to write good "select random x from 1..n" is not to brush up on our algorithmics. That's like fabricating a car part instead of going to the auto supply. (Hey, there's a good reason the car analogy keeps popping up!) You need to rely on standard, well-tested libraries. Josh Bloch even refers to this use case as an example of why you should rely on library code [sun.com].

  • by UBfusion (1303959) on Sunday February 28, 2010 @06:54PM (#31310180)

    Those of you that are computer scientists should take a moment to consider that randomness is not the same as uniformity (as an insightful reader commented in TFA and triggered me to respond there).

    Just because the only way to produce an algorithm for uniformity is via a random number generator, this does not mean that there aren't other non-statistical approaches. Here's one:

    "The computer upon Windows installation contacts a MS site that uses a global installation counter - each new installation would increase the counter from N to (N+1) and then present a browser order according to (N modulo 5!). This is a totally deterministic process, with no randomness at all (statistical tests for randomness would fail because of the autocorrelation), which however would lead to perfect uniformity: at any given time instant, each browser would have been placed in each of the 5 positions with a percentage of precisely 20%, as required. The same kind of uniformity could be produced by using the installation serial number (licence) of Windows: since the licence key space is well-defined, the order of browsers could be also well (uniformly) defined from the serial number itself. There might be a problem with volume licences, but VLKs are a small percentage of total installations.

    However, on a single offline computer, with no knowledge of history (what ballot was presented globally) or without a licence key, programmers have to resort to mathematics in order to produce
    uniform (not necessarily random) distributions. This is an application of the law of large numbers: if the ballot is uniform on the same computer, it will be uniform globally." (using quotes because I'm quoting myself).

    In conclusion, we should not care if the distribution is not "random" but whether it is uniform (i.e. all possible permutations of 5 browsers appear with equal frequencies).
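The counter idea above needs some way to turn N modulo 5! into a concrete ordering. One possible mapping uses the factorial number system, sketched below. The function name `orderingForCounter` is hypothetical, `n` is assumed to be a non-negative integer, and nothing here reflects how any real installation counter works.

```javascript
// Map a counter n to one of the k! orderings of `items` using the factorial
// number system: successive digits of (n mod k!) pick which remaining item
// goes next. Consecutive counters cycle through every ordering exactly once.
// Assumes n is a non-negative integer.
function orderingForCounter(items, n) {
  const pool = items.slice();
  let total = 1;
  for (let i = 2; i <= pool.length; i++) total *= i; // k!
  let code = n % total;
  const result = [];
  for (let remaining = pool.length; remaining > 0; remaining--) {
    total /= remaining; // now (remaining - 1)!
    const index = Math.floor(code / total);
    code %= total;
    result.push(pool.splice(index, 1)[0]);
  }
  return result;
}

const choices = ["IE", "Firefox", "Opera", "Chrome", "Safari"];
for (let n = 0; n < 3; n++) console.log(n, orderingForCounter(choices, n));
```

Over any 120 consecutive counter values, each browser occupies each position exactly 24 times; between multiples of 120 the shares are only approximately equal.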

  • by zill (1690130) on Sunday February 28, 2010 @08:57PM (#31311046)
    This error can be easily traced back to the first google result [google.ca] (Actually it should be the first bing result [www.bing.ca] in this case).

    To be perfectly honest, this is exactly what I would have done too.
    It takes 5 minutes to dig out my copy of Knuth.
    It takes 1 minute to pirate Knuth and search through the pdf.
    But it only takes 10 seconds to copy and paste this one-liner from the first google hit.

    That probably explains my lack of success in the job market for the past decade...
  • by LtGordon (1421725) on Monday March 01, 2010 @12:20AM (#31312174)

    I bet if we gave this same problem to 100 freshmen computer science majors, at least 1 of them would make the same mistake.

    Well, that seems an awfully high standard for Microsoft to hold itself to.

    Microsoft: Ranked 1st percentile with Freshmen CS Majors

