News for nerds, stuff that matters

Google Suggest Dissected

Posted by michael on Sat Dec 18, 2004 04:05 AM
from the rated-r-for-responsive dept.
sammykrupa writes "Google suggest Javascript code dissected and rewritten for all of you web developers out there. Cool piece of web reverse-engineering!" Joel Spolsky astutely notes that this will raise the bar in terms of how people expect the "internets" to work.
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • by BillsPetMonkey (654200) on Saturday December 18 2004, @04:17AM (#11123603)
    Let's think of the way people search for stuff.

    1. Try something specific
    2. Try something less specific

    Number 1. brings up no results on Google Suggest, number 2. brings up 523,334 results. Impressive, but how has this helped us search for 1.?

    Let's try an example: let's look for "C# structs".

    1. Enter "C# structs" - no suggestions.
    2. Enter "structs" - 425,000 results.

    Grrreat.
    • I don't agree (Score:5, Insightful)

      by mansoft (371174) <zouaveNO@SPAMtelefonica.net> on Saturday December 18 2004, @05:24AM (#11123729) Homepage
      What you say might be true for us geeks, but have you ever seen how standard users do web searches? They begin with one-word searches, and if and only if the results don't satisfy them do they refine their search.

      • by Phexro (9814) on Saturday December 18 2004, @05:03AM (#11123693)
        Yup, this is one of the terribly irritating things about Google... there is simply no way to search for an exact string which contains non-alphanumeric characters. It strips out most punctuation.

        e.g. search for 'tmp/foo/bar' or 'tmp/foo/bar#baz'. You'll see results for '/tmp/foo.bar', '/tmp/foo/bar', and so forth. What if I'm looking for that exact string? This can be very frustrating when searching for posts about a specific error message, since a page with 'condition foo: bar' will be just as likely to show up as 'foo: bar condition', but they aren't necessarily the same.
  • by Vladan (829136) on Saturday December 18 2004, @04:18AM (#11123605)
    Here's what he was talking about:

    Google with Auto Complete on [google.com] Just start typing in the search field.

    It's a beta feature.
  • by hobo2k (626482) on Saturday December 18 2004, @04:21AM (#11123615) Journal
    I don't know how happy google is about this, but there is already a FF extension to put suggest in the toolbar. Great plugin and also amazing how fast somebody implemented it! [mozillazine.org]
  • by Segosa (838329) on Saturday December 18 2004, @04:22AM (#11123621)
    Unfortunately Google Suggest has really no use. If you know what you want to search for, you search for it. Suggesting search terms isn't really going to do anything apart from distract you. Hopefully this technology will be used for other things where it actually IS useful.
  • by Gopal.V (532678) on Saturday December 18 2004, @04:25AM (#11123627) Homepage Journal
    Even though it's an M$-spawned horror, it has brought a new revolution to JavaScript. Now it can load data from the server without having to refresh the screen. Flash has an XmlSocket, but I haven't seen anyone use it so far (pointers please).

    Even though Google Suggest looks great, I'd vote for CGI::IRC as the biggest killer HTML/JavaScript browser app.

    Clientside Javascript is powerful, we never realized how much :)
    • by jasoncart (573937) on Saturday December 18 2004, @04:34AM (#11123645) Homepage

      There are quite a few Flash RSS readers.

      Also, (seeing the link in your sig) parts of the BBC site use it - News for timelines (example [bbc.co.uk]) and CBBC used XML to pass data around flash games/apps

      The best one I've seen yet is the US Election tracker [bbc.co.uk]

      • by Gopal.V (532678) on Saturday December 18 2004, @06:09AM (#11123817) Homepage Journal
        Read History of XMLHttpRequest [apple.com].

        Microsoft implemented it as an ActiveX object you could invoke from JavaScript; Mozilla implemented it as a native JavaScript object. Microsoft calls it "Msxml2.XMLHTTP" or "Microsoft.XMLHTTP" depending on which version of IE you are running. Mozilla has the cleaner "XMLHttpRequest" naming (soon to be in the standards, I guess).

        So on IE you need ActiveX enabled to use it. The Mozilla version is therefore much safer to use and easier to program with :)

        Visit simple example [apple.com] for a quick and dirty example :)

  • Censored!!! (Score:4, Funny)

    by britneys 9th husband (741556) on Saturday December 18 2004, @04:39AM (#11123656) Homepage Journal
    Try typing "porn" or "sex" or "cock" into Google Suggest. It doesn't come up with anything. I started to get suspicious when I typed the letter x to see what would come up, and got 4 or 5 variations of "xbox" but not a single "xxx" or "xxx porn" or anything.

    Interestingly enough, they DIDN'T censor the racial slurs. "gay nigger" happily suggests "gay niggers from outer space" among other things. Also, type "tub" and one of the suggestions is "tubgirl".
    • by Jadrano (641713) on Saturday December 18 2004, @11:09AM (#11124508)
      Indeed, there are no suggestions for English search terms that could lead to pornographic sites. It seems that this is only based on a word list and much less sophisticated than "safe search".
      Therefore, at present, this works only for English; with other languages it can happen that it suggests porn-prone search terms for the refinement of terms that have, as such, nothing to do with pornography. Some examples:
      • the first suggestion for 'fille' (French for 'girl') is 'nue' (naked)
      • the 5th suggestion for 'dzieci' (Polish for 'children') is 'nago' (naked)
      • suggestions for 'mund' (German for 'mouth') contain 'mund auf sperma rein' (open mouth, introduce sperms), 'mund ficken' (fuck in the mouth), "mund arsch" (mouth ass)
      • devochki (with Cyrillic letters: Russian for "little girls") gives the suggestions "devochki porno"
      • the first suggestion for 'smot...' with Cyrillic letters (smotret': Russian for 'watch'/'look at') is "smotret' porno"
      I think this is probably quite problematic - someone enters a search term that has nothing to do with pornography, and Google suggests something pornographic for 'refinement'. Of course, this is not due to Google's intent, but due to the distribution of the things people search for and of contents on the Internet. I suppose this is one of the problems Google will want to address before offering Suggest as an option on the main page.
  • Beware (Score:5, Interesting)

    by kuzb (724081) on Saturday December 18 2004, @04:43AM (#11123662)
    Google suggest is a neat idea, but a potentially destructive one.

    Small sites should *not* try to do this kind of thing on a live site. The amount of pressure this could put on a bad database structure (or even a well formed one) is considerable. Think about how many database hits a user could perform in a very short space of time: (user enters something, (database hit) backspace (database hit) types another letter (database hit)), then multiply it by a hundred or more people if your site gets a moderate amount of traffic.

    Google can get away with this because they have considerable bandwidth, and large server farms. We've been seeing people trying to copy google suggest for the last couple of weeks in #javascript/freenode and in #php/freenode. The people trying to copy it generally do not understand how potentially bad this can be for a single server.

    Anyhow, my advice is, don't do it unless you have the resources to scale your site. The cost of such an insignificant feature (let's face it, all it does is save the user one or two clicks) seems to outweigh the gain. If you do decide to do it, and your site gets popular, and you're on some kind of shared host, your sysadmin is going to hate you, and the other site admins will probably meet you at your house, torches in hand.
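
One common way to soften the per-keystroke load described above is to wait for a short pause in typing before querying at all (often called debouncing). The sketch below only simulates that idea over a list of keystroke timestamps; the function name `lookups_fired`, the 250 ms quiet period, and the sample timings are all assumptions for illustration, not anything Google Suggest is known to do.

```python
def lookups_fired(keystroke_times_ms, quiet_period_ms):
    """Return the times at which a debounced client would query the server.

    A lookup fires only once the user has paused for quiet_period_ms;
    a new keystroke inside that window cancels the pending lookup.
    """
    fired = []
    for i, t in enumerate(keystroke_times_ms):
        is_last = i == len(keystroke_times_ms) - 1
        if is_last or keystroke_times_ms[i + 1] - t >= quiet_period_ms:
            fired.append(t + quiet_period_ms)
    return fired


# Hypothetical typing burst: six keystrokes with one pause in the middle.
keystrokes = [0, 80, 170, 900, 960, 1050]
naive_hits = len(keystrokes)  # one database hit per keystroke
debounced = lookups_fired(keystrokes, quiet_period_ms=250)
print(naive_hits, len(debounced), debounced)
```

With these made-up timings the naive approach hits the database six times and the debounced one twice. The trade-off is that suggestions appear slightly later, so the quiet period is a tuning knob rather than a free win.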
    • Re:Beware (Score:5, Informative)

      by broothal (186066) <christian@fabel.dk> on Saturday December 18 2004, @05:02AM (#11123692) Homepage Journal
      Actually, it's not a new lookup in the Google main database for each keypress. It's a lookup in a pre-generated table of results.

      It's pretty easy to spot, as the number of results shown in the preview doesn't match the number of results when you hit enter.

      This makes perfect sense, since a "real" lookup would generate way too much heat. But it's also dangerous, because people are led to believe that what they're typing wouldn't yield a result. This is wrong. A simple proof of concept: type sex. It says 0 results. But if you hit enter, you get a godzillion.
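
A lookup in a pre-generated table, as described above, can be as cheap as a binary search over a sorted list of known queries. This is only a sketch of that idea, assuming the table is a sorted list of (query, estimated count) rows; the table contents are made up apart from the "structs" figure quoted earlier in the thread, and nothing here reflects how Google actually stores its data.

```python
import bisect

# Hypothetical pre-generated table of (query, estimated result count) rows,
# kept sorted by query so prefix matches sit next to each other.
SUGGESTIONS = sorted([
    ("struct", 1200000),
    ("structs", 425000),
    ("structure", 9800000),
    ("strudel", 310000),
    ("xbox", 5100000),
])
QUERIES = [query for query, _ in SUGGESTIONS]


def suggest(prefix, limit=10):
    """Return up to `limit` rows whose query starts with `prefix`."""
    if not prefix:
        return []
    start = bisect.bisect_left(QUERIES, prefix)  # first row >= prefix
    matches = []
    for query, count in SUGGESTIONS[start:]:
        if not query.startswith(prefix):
            break  # rows are sorted, so no later row can match
        matches.append((query, count))
        if len(matches) == limit:
            break
    return matches


print(suggest("struct"))
print(suggest("c# structs"))
```

A query that is not in the table simply returns an empty list, which is consistent with the "no suggestions" and "0 results" behaviour people are reporting here: the preview says nothing about what the real index would return.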
        • Re:Beware (Score:5, Informative)

          by XaXXon (202882) * <(xaxxon) (at) (gmail.com)> on Saturday December 18 2004, @06:13AM (#11123824) Homepage
          I think you missed the point. The "massive array" lives on the server, and when the client requests suggestions for a particular string, it is looked up in this array. Only the portion of the array that has been grabbed from prior strings is cached on the client.

          In a naive, client-side caching system, if you DID manage to request all the suggestion strings in the client, eventually you would have the entire array client side, but you'd probably start throwing away the old data at some point.
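
The client-side behaviour described above (keep what has already been fetched, throw away old data at some point) is roughly a least-recently-used cache. Here is a minimal sketch under that assumption; the class name `SuggestionCache`, the `fake_server` stand-in, and the tiny cache size are all hypothetical, and the real page script may well evict differently or not at all.

```python
from collections import OrderedDict


class SuggestionCache:
    """Client-side cache of server responses, evicting the least recently used."""

    def __init__(self, fetch, max_entries=100):
        self._fetch = fetch              # callable: prefix -> list of suggestions
        self._max_entries = max_entries
        self._entries = OrderedDict()
        self.server_requests = 0

    def get(self, prefix):
        if prefix in self._entries:
            self._entries.move_to_end(prefix)  # mark as recently used
            return self._entries[prefix]
        self.server_requests += 1
        result = self._fetch(prefix)
        self._entries[prefix] = result
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # drop the oldest entry
        return result


def fake_server(prefix):
    # Stand-in for the real round trip; returns made-up suggestions.
    return [prefix + "a", prefix + "b"]


cache = SuggestionCache(fake_server, max_entries=2)
for typed in ["g", "go", "g", "goo", "go"]:
    cache.get(typed)
print(cache.server_requests)
```

Backspacing to a prefix that is still cached costs no request, which is the case that matters most for the "type, backspace, type" pattern raised earlier in the thread.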

  • by chregu (70525) on Saturday December 18 2004, @04:54AM (#11123678) Homepage
    LiveSearch [bitflux.ch] does something very similar, is Open Source and has existed since April ;)

    If you look for more XMLHTTPRequest examples, which tightly integrate JS and PHP (other server side languages would be possible), see JPSpan [sf.net].

    I don't quite understand all the hype about Google Suggest. The technique for doing it has existed for at least 2 years on Mozilla (and even longer on IE). So doing something like that has been possible for a long time, but maybe everyone was just scared of using JS for "serious" stuff.

  • by eraserewind (446891) on Saturday December 18 2004, @05:11AM (#11123707)
    Not to dismiss the neat reverse engineering he did, but is the actual discovery that big a deal? It's just a keypress handler, and some server communication. No big deal on any graphical user interface other than a web page.

    Google have good UIs because they hire smart people. Other people don't because they don't hire smart people, or hire the wrong type of smarts (graphic designer instead of sw engineer for the coding part of a website, and vice versa).
  • XMLHTTP (Score:4, Interesting)

    by marcjps (66742) on Saturday December 18 2004, @05:12AM (#11123708)
    I've looked at using the XMLHTTP object a couple of times in the past, and noted that this is partly how Google Suggest works.

    XMLHTTP is a COM object included with recent versions of Internet Explorer. You can call it from client side JavaScript in a web page. The object will make a request to the URL you specify, and return the result into either a string variable, or an MSXML DOM object. You can then have the javascript output the results to an object (eg, a div tag) on the page without doing a full page reload.

    I wrote a small tech demo that implemented a virtual tree - so when you expand a branch in the tree the client only retrieved the data it needed. This was borrowed from the approach the MSDN web site uses. The advantages to it are that it doesn't download the same data over and over like when you expand a branch in a server side tree. You also don't have to do any work at all to remember the state of the tree since there's no full page refreshes involved.

    Google Suggest is similar in that it is a virtual list rather than a virtual tree. A virtual list allows you to list lots of items and jump around in the list without needing to download the entire data set when the page is loaded.

    Another use for this would be dynamic forms - forms that alter the state of controls based on selections the user made in previous controls.

    The biggest surprise to me was that Google have implemented this on a site live to the public. In using XMLHTTP I found it a little bit prone to locking up the browser when waiting for responses to requests. Additionally it's Windows only, so could never have been implemented on an external web site.

    I'll be looking with interest at the Mozilla side of Google's implementation, since I didn't think an equivalent existed until now. Two different implementations of the same functionality is still going to put a damper on the technology though.. different code for different browsers is usually more trouble than it's worth.
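
The virtual list idea from this comment can be sketched independently of the browser: expose the whole list by index, but only fetch the page of items that an access actually touches. Everything named here (`VirtualList`, `fake_fetch`, the page size of 20) is an assumption for illustration; it is not the MSDN or Google code.

```python
class VirtualList:
    """Expose a large remote list by index while fetching only the pages touched."""

    def __init__(self, fetch_page, total, page_size=20):
        self._fetch_page = fetch_page  # callable: (offset, size) -> list of items
        self._total = total
        self._page_size = page_size
        self._pages = {}               # page number -> items already downloaded

    def __len__(self):
        return self._total

    def __getitem__(self, index):
        if not 0 <= index < self._total:
            raise IndexError(index)
        page, offset = divmod(index, self._page_size)
        if page not in self._pages:
            self._pages[page] = self._fetch_page(page * self._page_size, self._page_size)
        return self._pages[page][offset]

    @property
    def pages_fetched(self):
        return len(self._pages)


def fake_fetch(offset, size):
    # Stand-in for a server request; items are just labelled by position.
    return ["item %d" % i for i in range(offset, offset + size)]


items = VirtualList(fake_fetch, total=10000)
print(items[0], items[5], items[9999], items.pages_fetched)
```

Jumping from the first item to the last one costs two page requests rather than a download of all 10,000 rows, which is the property that makes the approach attractive for trees and lists alike.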

  • by rich42 (633659) on Saturday December 18 2004, @05:26AM (#11123732) Homepage
    interesting part is that either:

    1. Google performs several possible searches for each key you press

    2. Google already knows the estimated number of results for millions of queries

    Both of these suggest a heck of a lot of computing power. This type of thing might not scale up for general use in the near future - but still...

    we're talking massive computational power and one of the largest databases ever created.

    I'm a bit worried the Googleplex is going to wake up one day and declare to all us 'organics':

    "yo bitches - you work for me now"

  • by mrn121 (673604) on Saturday December 18 2004, @08:26AM (#11124052) Homepage
    If you just type in one letter, you get the result beginning with that letter that is most searched for.
    This makes for an interesting way to sum up the internet into 26 words/phrases.

    Check it out:

    A - Amazon
    B - Best Buy
    C - CNN
    D - Dictionary
    E - eBay
    F - FireFox
    G - Games
    H - Hotmail
    I - Ikea
    J - Jokes
    K - Kazaa
    L - Lyrics
    M - Mapquest
    N - News
    O - Online Dictionary
    P - Paris Hilton
    Q - Quotes
    R - Recipes
    S - Spybot
    T - Tara Reid
    U - UPS
    V - Verizon
    W - Weather
    X - XBox
    Y - Yahoo
    Z - Zip Codes

    If I had to sum up the internet in 26 words/phrases, I don't think I could have done it better than Google. Of course, that is keeping in mind that Google Suggest has some pretty serious filters in place, so instead of P being "Porn" it is "Paris Hilton." Not too far off, if you think about it.

        • by TomV (138637) on Saturday December 18 2004, @05:05AM (#11123695)
          People know when they're sitting behind copious bandwidth. And you could well grow accustomed to an all-text page weighing the better part of a megabyte, due to a heinous amount of information parked in hidden JavaScript data structures, giving you that near-whiplash inducing responsiveness.

          In fairness, Google Suggest, like Gmail, works very nicely for me on a 56k dialup. Gmail takes a few seconds for its initial load, true, but then it's like lightning. Suggest doesn't even have the slow initial load, since webhp.htm comes in at only 3.6kB. I'm very impressed.

          Now I've no doubt that the bandwagon will bring us massive slow bloat as everyone gets his dog to code up vaguely similar functionality, but Google haven't done that.
    • by Anonymous Coward on Saturday December 18 2004, @05:57AM (#11123789)
      I fear that might be the case. I learned to code HTML and to put a decent webpage, designed the way I wanted it, online with relative ease, at the age of 14. It took time to learn it, but it was fairly straightforward - I wanted a large header in Verdana, I put in "FONT FACE" and "H1" tags, I wanted a table with a specific background color, I put in a "BGCOLOR" etc.

      Today, we have two languages (XHTML and CSS) instead of one (HTML), and while it certainly does a lot to improve interoperability and platform independence, it is two languages to learn, not one. Throw in stuff like JavaScript, and you have even more.

      Of course one can choose not to use XHTML and CSS, but that's not the way we want it, right? We want people to use the standards, to write code which won't crash Firefox, and not to use proprietary solutions. Doing this is taking more and more effort. We have the skills and time to do and learn this, but not everyone has.

      If we want a wide adoption of standards, and an Internet for everyone, where everyone has equal opportunities, the only way is to make the standards easy to use, so people will use them of their own free will.

      Otherwise, in 10 years we'll be designing our fancy webpages, while the Joe Users who don't have the time or skills to learn the 13 languages required have no choice but to hire a professional, or use a crappy proprietary solution which won't allow them to take their ideas to their full potential, and this is a great loss for everyone.

      Saying "You must do *complicated thing* because it's the specified standard!" will only work with people like us.