
Google Suggest Dissected, Part II

Posted by michael
from the kudos-for-correct-use-of-complement dept.
Bert690 writes "To complement the recent dissection of Google Suggest's innovative front end, I investigated [Coral Link & mirror] the back end of the system in an effort to determine just how it generates suggestions. Along with some preliminary findings, you'll find a pointer to a program for enumerating all possible suggestions from a given starting point. I found the number of possible suggestions to be surprisingly small considering the immense scope of the web."
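The summary does not say how the enumeration program works, so the following is only a minimal sketch of one plausible approach. It assumes a hypothetical `suggest(prefix)` callable that returns at most `MAX_SUGGESTIONS` completions for a prefix (the page size of 10 is an assumption, as are all the names here). Whenever a prefix comes back with a full page, more completions may be hidden behind it, so the prefix is extended by one character and queried again.

```python
from typing import Callable, Iterable, List, Set

MAX_SUGGESTIONS = 10  # assumed page size of the suggestion list
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 "  # assumed character set


def enumerate_suggestions(
    prefix: str,
    suggest: Callable[[str], List[str]],
    alphabet: str = ALPHABET,
    limit: int = MAX_SUGGESTIONS,
) -> Set[str]:
    """Collect every suggestion reachable from `prefix`."""
    found: Set[str] = set()
    stack = [prefix]
    while stack:
        current = stack.pop()
        results = suggest(current)
        found.update(results)
        # A full page may be hiding further completions, so refine the prefix.
        if len(results) >= limit:
            stack.extend(current + ch for ch in alphabet)
    return found


def make_fake_backend(
    corpus: Iterable[str], limit: int = MAX_SUGGESTIONS
) -> Callable[[str], List[str]]:
    """Hypothetical stand-in for the real service: prefix match over a fixed list."""
    ordered = sorted(corpus)

    def suggest(prefix: str) -> List[str]:
        return [q for q in ordered if q.startswith(prefix)][:limit]

    return suggest
```

With a stub built by `make_fake_backend`, calling `enumerate_suggestions("1ze", backend)` walks only the branches whose result pages are full, which is why the total number of requests stays small when the suggestion set itself is small.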

  • by Quasar1999 (520073) on Friday December 24, 2004 @10:45AM (#11176506) Journal
    It's not the amount of data that a program references to create a result, it's the precision of its result that matters... if it can do it with relatively little data, then it was designed/implemented by someone who knows what they're doing...
    • On a similar note.

      Instead of showing the possible results and their scores as you type, I would rather it return the single most probable exact match.

      Anyway, when I tried typing some of the terms I have searched for before, they did not appear on the list.

      So, it will be interesting to see how slow it might get if google is to index every single term out there.
    • by Bioanarchism (550560) <michael.feng@frro.net> on Friday December 24, 2004 @10:59AM (#11176586) Homepage Journal
      apparently google has better programmers and engineers than most tech companies. it is not only the interesting concepts that they publish, but the courage to invest in and experiment with things that others dare not try or, rather, think of as too time-consuming.

      how can i, personally, not think of a flash clip which portrayed the merger of google and amazon, to create googlezon, based on google's extensive grid engine. let's hope that won't be an accurate prediction, coz i dun wanna live in a world that has the rest of the world's information at their fingertips.

      and it seems, google is on that path to 'immortality'.
      • A world where 'everyone' has 'instant' access to 'all' the world's information may be a world where those who have used our collective ignorance to enslave us will no longer be able to do so. It also would not require 'everyone' to be involved at the same time. Opting out temporarily... or even permanently (a very small number) would not degrade the system.
    • by Anonymous Coward
      It's not the amount of data that a program references to create a result, it's the precision of its result that matters... if it can do it with relatively little data, then it was designed/implemented by someone who knows what they're doing...

      True for very popular searches, but if you're searching for something more obscure, size most certainly does matter.

  • Basically this is a hacky method of accessing fields. The code to do it is burdensome to say the least.

    Is there any work on a toolkit or API that allows relatively easy access to this technique?

  • Funny (Score:5, Funny)

    by Anonymous Coward on Friday December 24, 2004 @10:53AM (#11176541)
    Press "p" and the first thing "google suggests" is "Paris Hilton", hmm. Although on a cooler note, when you press "f" the first suggestion is firefox!
    • Re:Funny (Score:1, Insightful)

      by Performaman (735106)
      And when you enter "y," the first thing you get is "yahoo."
    • This is because, like Zeitgeist, Google Suggest is based on things people have actually searched for. OK, so that's a wild guess, but it doesn't make sense any other way.
      • Re:Funny (Score:3, Informative)

        by tdvaughan (582870)
        Are you suggesting that no-one searches for 'porn' on Google? It's more likely that the results are passed through a sanitiser beforehand so that you don't have Google suggesting you look at adult content.
        • Are you suggesting that no-one searches for 'porn' on Google? It's more likely that the results are passed through a sanitiser beforehand so that you don't have Google suggesting you look at adult content.

          As if there are people searching for "Paris Hilton" in a non-adult-content way?
        • Metacrawler has a page (Metaspy) that shows what people are using Metacrawler for at the time. A while ago, I wrote a screensaver [uts.edu.au] for OSX that scraped this page and used its contents.

          I don't think I've ever seen anyone search for "porn" - they tend to be quite a good deal more specific. The oddball, useless, generic search that keeps popping up over and over and over again is, strangely "food". Guess there's a lot of hungry folks out there in Corporate America. The other weird one is people searchin

      • Going by that, entering 'B' would bring up Britney Spears, while in reality, it brings up Best Buy...
    • First Suggestions (Score:2, Informative)

      First suggestion for each letter/number:

      amazon
      best buy
      cnn
      dictionary
      ebay
      firefox
      games
      hotmail
      ikea
      jokes
      kazaa
      lyrics
      mapquest
      news
      online dictionary
      paris hilton
      quotes
      recipes
      spybot
      tara reid
      ups
      verizon
      weather
      xbox
      yahoo
      zip codes

      1
      2004 election
      3m
      411
      50 cent
      60 minutes
      7th heaven
      89.com
      911
      02
  • SEO (Score:5, Interesting)

    by FiReaNGeL (312636) <fireang3l AT hotmail DOT com> on Friday December 24, 2004 @11:04AM (#11176603) Homepage
    If you're interested in Search Engine Optimization, the tool can be used like the Overture Keyword Selector Tool [overture.com]. Similar results are obtained with both, which is interesting in itself. A guy built an interface [hooznet.com] similar to Overture to use with Google Suggest.

    Other than that I can't think of a real use... I usually know what I want to search for on Google. It could help optimize queries I guess (see the "number" of results before hitting submit, but not the quality...)

    Happy Holidays to all Slashdotters, by the way :)
    • I like that one better, P gives u p diddy first hes much cooler than paris hilton =0
    • by Corrado (64013)
      Wow!! A couple of days ago I was doing some Googling for cable modem help (new subscriber and it's not working yet :( ). Anyway, I typed ca into the www.hooznet.com/suggest site and it popped up my exact search phrase!

      cable modem problem "return path" pending

      This is too creepy...

      Then again, maybe it was just looking in my browser cache or something. Can someone out there try it and let me know what you get?

  • by LSA (764123)
    http://eric.blognews.com/blog/archives/2004/12/10/202467.html
    • Re:Eric Rice (Score:1, Informative)

      by Anonymous Coward
      That link doesn't work, add an underscore before the "archives"
  • by Anonymous Coward on Friday December 24, 2004 @11:07AM (#11176626)
    Google needs to remember the last x queries that we submitted and the time we submitted them to better guess what we're looking for. If I hit 'p' I get Paris Hilton even though previous searches were for perl, parrot and pascal.

    When will they work out that there are different classes of users out there that look for different things at different times?
    • Oh, I love this privacy concept. Then we'll see google ads adapt to our latest queries and you'll just be happy if they restrict that to Paris Hilton for p, too.

      I'm fine with Google acquiring huge amounts of data, but with the wealth of possible info, I think I at least should be able to see a "clean" web, too. Sometimes I don't want to see only the search hits I've been likely to click on in the past. Giving me at least an option to see "unbiased" hits would be nice.
    • If they did that people would not stop complaining about their search habits being tracked. Personally I'd love for them to do that, but after the whole gmail episode they are probably being careful. Regardless, your browser probably already keeps track and will "autosuggest" searches for you.
      Regards,
      Steve
    • That sounds like a good application for one of their other google labs projects: google sets [google.com]. It would be cool if it could take your last few searches, try them with google sets, and see if they create a coherent category. If so, use the other results as the suggestions.
    • that would be very easy to implement with cookies. i guess google will do it eventually. btw, is there any way to suggest a feature for google suggest?!
    • Just use google classic. Your browser should suggest those three.
      Google suggest is another thing. Probably it's just not right for you.
  • by MicroBerto (91055) on Friday December 24, 2004 @11:08AM (#11176629)
    As big as the web is, it's just the same boring drivel over and over... it shouldn't be too hard to make Google Suggest! :)
  • Unexpected Ways (Score:5, Interesting)

    by RmanB17499 (829438) on Friday December 24, 2004 @11:12AM (#11176653)
    I like trying to use Google Suggest in unexpected ways: Try typing in 1ZE and see all the UPS tracking numbers that come up. Pick one and track it. Or try typing an area code with a large population (201, 212, 213, 818, etc) and maybe add a digit or two and see what telephone numbers people have been searching for lately.
    • Re:Unexpected Ways (Score:1, Interesting)

      by Anonymous Coward
      Heh... good idea. I just used the program to enumerate all suggestions starting with 1ze. Not that many of them it seems.

      Suggestion: 1zea54660331985982
      Suggestion: 1ze20a324260463891
      Suggestion: 1ze278020330000933
      Suggestion: 1zea54610384411386
      Suggestion: 1ze17a584283834117
      Suggestion: 1ze2e8630216613599
      Suggestion: 1ze6w3110315135840
      Suggestion: 1ze6w3114214877030
      Suggestion: 1ze13a834220148077
      Suggestion: 1ze208290391650789
      Suggestion: 1ze17a584265752490
      Suggestion: 1ze1024v0342273265
      Suggestion: 1ze077r50304406359
    • Interesting. I tried "5424" but all I got is "5424000000000000". I wonder if there are any valid credit card numbers on there. Can you say "potential privacy violation"?
    • Re:Unexpected Ways (Score:3, Interesting)

      by Quixote (154172)
      Here's a web page where someone wrote a script to see what package numbers are coughed up by Google: enjoy! [buffalo.edu].
    • I'm personally a fan of typing in 0x to see what Windows error messages are the most common.

      Of course, one of the results is someone using the Google calculator, which just goes to show that lots of interesting stuff can be seen by using Google Suggest.
  • Quite amusingly, a number of words seem to be censored... If you type, say, sex, then you get no more suggestions... Even if you type it within a word...
    • by Anonymous Coward
      True... I tried searching for legit place names in the UK (which happen to contain expletives) and it stopped suggesting.

      Try:

      Essex
      Cockfosters
      Scunthorpe ...etc...
    • It's easy to find whatever you want with Suggest. Overly broad terms don't make it into the list. Why should they? Each term shows how many results would be retrieved. Searching for "sex" or "porn" will return more digits than can fit.
      • But words like "Essex", "Sussex", and "Scunthorpe" are also missing.

        The first two are English counties, and the other is an English town. All of these have been mistakenly binned by overzealous filters in the past.

        Apparently North Lincolnshire Council blocked all emails with Scunthorpe in them at one point. As Scunthorpe is in North Lincolnshire, quite a lot of emails were wrongly blocked.
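How Google actually filters suggestions is not known from this thread, but the behaviour described above is what a naive substring blocklist would produce. The sketch below is purely illustrative: `BLOCKED`, `blocked_by_substring` and `blocked_by_whole_word` are hypothetical names, and the one-word blocklist is taken from the example given earlier in the discussion.

```python
import re

BLOCKED = {"sex"}  # illustrative blocklist, not Google's actual list


def blocked_by_substring(query: str, blocked=BLOCKED) -> bool:
    """Naive filter: reject if any blocked term appears anywhere in the query."""
    lowered = query.lower()
    return any(term in lowered for term in blocked)


def blocked_by_whole_word(query: str, blocked=BLOCKED) -> bool:
    """Stricter match: reject only if a blocked term appears as a whole word."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    return any(word in blocked for word in words)


for place in ("Essex", "Sussex"):
    print(place, blocked_by_substring(place), blocked_by_whole_word(place))
```

The substring version rejects both county names, while the whole-word version lets them through and still rejects the bare term, which is the usual first step in avoiding this class of false positive.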
  • Next Step (Score:2, Insightful)

    by mahesh_gharat (633793)
    After some period Google will not only suggest but will also take decisions for you!

    wait....

    Doesn't the "I'm Feeling Lucky" option already take a decision for you?
  • by tommertron (640180) * on Friday December 24, 2004 @11:14AM (#11176659) Homepage Journal
    ... it doesn't include dirty words. I know, I may be a little immature, but it's almost always the first thing I try on anything like this. There's not even a way of turning 'safe suggest' on or off or anything. Even innocuous (and popular!) words like 'nude' aren't suggested. What if you're searching for nude models for your art class, or the great nudes? It's just interesting... Google is becoming very corporate in terms of filtering out content these days.
  • by stevejsmith (614145) on Friday December 24, 2004 @11:23AM (#11176696) Homepage
    a: amazon
    b: best buy
    c: cnn

    WHO THE FUCK SEARCHES FOR THOSE THINGS?? It amazes me how stupid people are - rather than type in amazon.com, bestbuy.com, or cnn.com, they actually search for them on Google.
  • It will show "penthouse", but not "playboy".
  • Mined Query Logs (Score:1, Informative)

    by Anonymous Coward
    Contrary to what the author suggests, I suspect that the suggested searches are derived from query logs, not from the documents themselves.

    As others have noted, the top suggestion for p is paris hilton with 6.7M results, but the next 5 suggestions have far more results -- more than 20M, in fact.

    I doubt there is much of an attempt at precision. For example, the first suggestion for "new york" is "new york times"; the second is "new york."
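To make the query-log hypothesis concrete, here is a rough sketch of how a suggestion table could be built from logged queries alone, ranked by how often each query was issued rather than by how many documents it matches. Everything here is an assumption for illustration: the function name `build_suggestions`, the cutoff `k`, and the `max_prefix_len` limit are hypothetical and not taken from Google.

```python
from collections import Counter
from typing import Dict, Iterable, List


def build_suggestions(
    query_log: Iterable[str], k: int = 10, max_prefix_len: int = 5
) -> Dict[str, List[str]]:
    """Map each short prefix to its k most frequently logged queries."""
    counts = Counter(q.strip().lower() for q in query_log if q.strip())
    table: Dict[str, List[str]] = {}
    # most_common() yields queries in descending order of frequency,
    # so each bucket fills up with the most popular queries first.
    for query, _count in counts.most_common():
        for n in range(1, min(len(query), max_prefix_len) + 1):
            bucket = table.setdefault(query[:n], [])
            if len(bucket) < k:
                bucket.append(query)
    return table


log = (
    ["paris hilton"] * 3
    + ["perl"] * 2
    + ["pascal"]
    + ["new york times"] * 2
    + ["new york"]
)
table = build_suggestions(log)
print(table["p"])
print(table["new y"])
```

Under this scheme a longer query can outrank its own prefix, as with "new york times" appearing above "new york", simply because it was searched for more often in the log.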

  • It sounds like an extended version of the "I'm Feeling Lucky" feature.
  • It would be sweet to have this as an extension to the search bar in firefox. Other than that, I don't think I'd ever use it - too likely to forget it exists in the future...
  • Is that the internet is no longer *just* the geeks/nerds/calculator-watch crowd. There are increasing numbers of grandmothers and soccer-moms gaining access everyday. What was once a haven for the slide-rule crowd will soon become just like everything else, an asylum commercialized for the lowest common denominator - the general public. Once that milestone is reached, sites like /. will become fewer and fewer as we see more recipedot and howtogetmudoutofchildrensclothesdot popping up. It's not longer a
  • in the early days of the internet, people were posting all sorts of websites on all sorts of topics. as the web became more commercialized, most geeks were (rightfully) worried that major commercial hubs would be created that would attract the majority of attention and dilute the importance of the more peripheral areas of the web. this trend is already underway, and tools such as google suggest will hasten the decline. users will be directed to the areas that most people are already going, thereby increa
    • But if people cannot find data on the web in an organized manner, it is all just noise. I remember how hard it was to find anything on the net. Then webcrawler came out and I thought it was a godsend. Now Google makes life much easier.

      If data cannot be organized and indexed it quickly becomes useless. If there is a smaller site that has the data that is being looked for and enough people find it, then would Google suggest not place that site at the top of the heap above the commercial sites?
      • i am not arguing that google is bad for the web (on the contrary), but that google suggest is bad for the web. of course google allows users to find what they are looking for, but google suggest tends to direct them to the POPULAR destinations (which appear at the top of the results of popular searches) which may or may not be the most RELEVANT destinations. here's what i mean... suppose a group of three people were interested in some information on a widget, but when they search for it, they might search
  • To answer your questions about how the suggestions are generated - from looking through your enumeration lists, they are obviously compiled from words/phrases that people have actually searched for.
    • by Anonymous Coward
      No, not entirely obvious when you have suggestions like "privacy alert when was the last time you cleaned your pc
      clean out your computer and make it run smoother along with protecting
      your privacy today try it free with the test now option for secure
      privacy clean your computer at least once a week i want to... gtgt...."
  • better address? (Score:2, Insightful)

    by earthstar (748263)
    When will Google suggest get a better address than this one?
    http://www.google.com/webhp?complete=1&hl=en

    that long address wont help anyone.

    Even if it is in beta.

  • found one use (Score:2, Informative)

    by earthstar (748263)
    Today, finally found a use for google suggest. Let's say some weird news [or rather a 'hushed' news] has broken out. Then when you type only some word about it, you get to see under which combination of words the maximum number of results have been obtained.

    The suggested words by themselves may not be all that useful, but when combined with the number of results shown for each keyword, I think it can be useful.

    Google suggest may not immediately be of use to everyone like Google.com, but will rather

  • here we go then:
    in Soviet Russia, you suggest to Google what it should search! /ducks
  • About six months ago I wrote a little spellchecker plugin for an internal app and used a very similar principle. The completion lists worked like Open Office, though: rather than presenting a dropdown menu, the remainder of the word would be highlighted and you could up-arrow or down-arrow to cycle through the list. It's fun to see how quickly you can narrow down a given result set.
  • Hi guys, I just finished implementing Google suggest for a dictionary database at http://www.objectgraph.com/dictionary (the code is clean and you can see it by using "View Source"). The dictionary database is on an SQL server (total of 18000+ words) with an index on the word column.
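The server side of that dictionary lookup is not shown in the post, so what follows is only a sketch of how an indexed word column can answer prefix queries. It uses an in-memory SQLite database as a stand-in for the poster's SQL server, and the table, index and function names are all hypothetical. The prefix match is written as a range (`word >= prefix AND word < next_prefix`) so the index on `word` can be used; it assumes plain ASCII words.

```python
import sqlite3
from typing import Iterable, List


def open_dictionary(words: Iterable[str]) -> sqlite3.Connection:
    """Create an in-memory word table with an index on the word column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE words (word TEXT NOT NULL)")
    conn.execute("CREATE INDEX idx_words_word ON words (word)")
    conn.executemany("INSERT INTO words (word) VALUES (?)", [(w,) for w in words])
    return conn


def complete(conn: sqlite3.Connection, prefix: str, limit: int = 10) -> List[str]:
    """Return up to `limit` words that start with `prefix`, in sorted order."""
    if not prefix:
        return []
    # Smallest string that sorts after every word starting with `prefix`
    # (assumes the last character is not the highest possible code point).
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    rows = conn.execute(
        "SELECT word FROM words WHERE word >= ? AND word < ? "
        "ORDER BY word LIMIT ?",
        (prefix, upper, limit),
    ).fetchall()
    return [row[0] for row in rows]


conn = open_dictionary(["apple", "apply", "apt", "banana", "app"])
print(complete(conn, "app"))
```

Each keystroke in the browser would then map to one `complete` call, and the `LIMIT` keeps the response small no matter how many words share the prefix.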
