Google Programming Contest Winner

Posted by michael on Friday May 31, 2002 @09:27AM from the check-out-the-big-brain-on-brad dept.

asqui writes "The First Annual Google Programming Contest, announced about 4 months ago, has ended. The winner is Daniel Egnor, a former Microsoft employee. His project converted addresses found in documents to latitude-longitude coordinates and built a two-dimensional index of these coordinates, thereby allowing you to limit your query to a certain radius from a geographical location. Good for difficult questions like 'Where is the nearest all-night pizza place that will deliver at this hour?' Unfortunately there is no mention of whether this technology is on its way to Google Labs yet. There are also details of 5 other excellent project submissions that didn't quite make it."

if i'd only known (Score:2, Interesting)

by oogoody ( 302342 ) writes:

they wanted such boring submissions.
The winning idea was cool, but the rest looks
like free development for google rather
than something novel.
- Re:if i'd only known (Score:4, Informative)
  
  by Indras ( 515472 ) writes: on Friday May 31, 2002 @09:52AM (#3616742)
  
  like free development for google
  
  Let me quote from the homepage of the annual contest:
  
  "Grand Prize
  
  $10,000 in cash
  VIP visit to Google Inc. in Mountain View, California
  Potentially run your prize-winning code on Google's multi-billion document repository (circumstances permitting)"
  
  - Re:if i'd only known (Score:2)
    
    by Peyna ( 14792 ) writes:
    
    IIRC they also mentioned something about the potential offer of employment, then again, maybe I was just imagining that.
    - Re: (Score:1)
      
      by account_deleted ( 4530225 ) writes:
      
      Comment removed based on user account deletion
      - Re:if i'd only known (Score:2)
        
        by Peyna ( 14792 ) writes:
        
        If my submission won, I would apply for a job =]
        
        Re:if i'd only known (Score:3, Funny)
        
        by Indras ( 515472 ) writes:
        
        I would ask for royalties! A buck for every time someone viewed a page containing my code :o).
As previously designed by me (Score:1)

by oliverthered ( 187439 ) writes:

Here's one [emapsite.co.uk] i wrote earlier.

Doesn't do the document lookup thing, but we were using it for finding the nearest pizza on a now defunct e-commerce website.
- Re:As previously designed by me (Score:1)
  
  by corian ( 34925 ) writes:
  
  click here - Afghanistan data half price.
  
  I suppose those maps are a bit out-of-date now, aren't they
  - There always out of date (Score:1)
    
    by oliverthered ( 187439 ) writes:
    
    It would be tempting to put some sarcastic rebuttal in here.
    Get a book on basic quantum mechanics and it will tell you that your observations are always out-of-date.
    What's wrong with Afghanistan anyhow, they seemed a nice bunch of people, with a strong religious following before the US regime ousted their lovely government.
    - Re:There always out of date (Score:1)
      
      by morgajel ( 568462 ) writes:
      
      well, a topographical map WOULD sorta be outta date, with the mountain buster missiles and giant-ass potholes now covering the country...
      
      like it or not, the us military did some MAJOR terraforming over there.
      - pot-holes (Score:1)
        
        by oliverthered ( 187439 ) writes:
        
        Pot-holes? They should search in there, I hear the Afghans like to smoke a bit of pot now and then!
- - Lets join up (Score:1)
    
    by oliverthered ( 187439 ) writes:
    
    We could undermine all the original ideas in the world.
    Who the hell was judging that thing anyhow, there's geo searches all over the place, and i've done plenty of address parsing code in my time!!!
I see one being implemented soon (Score:5, Interesting)

by Masem ( 1171 ) writes: on Friday May 31, 2002 @09:34AM (#3616628)

From the hon. mentions:

Laird Breyer, for his project, Markovian Page Ranking Distributions: Some Theory and Simulations. This project examined various properties of the Markovian process behind Google's PageRank algorithm, and suggested some modifications to take into account the "age" of each link to reduce Pagerank's tendency to bias against newly-created pages.

This may help to defeat "Googlebombing": the current practice of skewing the PageRank results for a given keyword toward a particular page by having many people link to that page with link text containing that keyword. While I do think that the winner is a very interesting and useful project, this latter one will probably be implemented ASAP.
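
For readers wondering what the "Markovian process behind PageRank" looks like, here is a minimal sketch of the textbook random-surfer model computed by power iteration. It is not Breyer's age-aware modification (the announcement quoted above gives no details of it) and it is not Google's production code. The toy graph, the `pagerank` function name and the 0.85 damping value are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to; every link
    target must itself be a key of links.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets the "random jump" share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical toy web: three established pages and one new page
# that nobody links to yet.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "new": ["a"],
}
ranks = pagerank(toy_web)
```

In this toy graph the page `new` ends up with the lowest score of all, because it has no inbound links yet. That is the bias against newly created pages that the quoted summary mentions.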

- Re:I see one being implemented soon (Score:2)
  
  by 3.5 stripes ( 578410 ) writes:
  
  Yeah, it's a pity things like this are less exciting, it's a much needed addition, just not as "innovative".
- Re:I see one being implemented soon (Score:1)
  
  by Webz ( 210489 ) writes:
  
  What does Markovian mean? Of course I tried searching on Google but I found more of its use than its simple "what is" definition.
  - Re:I see one being implemented soon (Score:2, Informative)
    
    by Anonymous Coward writes:
    
    Try the Google Glossary [google.com] to find definitions of words or phrases.
    
    Markovian Dependence [google.com]- The condition where observations in a time series are dependent on previous observations in the near term. Markovian dependence dies quickly, while long-memory effects, like Hurst dependence, decay over very long time periods.
  - Markov processes (Score:3, Informative)
    
    by dukethug ( 319009 ) writes:
    
    A Markov process is basically a series of random variables where the value of random variable X^(i+1) only depends on X^i. The idea is that if you want to predict the value of X^(i+1), all of the information you could possibly use is in the value of X^i.
    
    Lots of processes are Markovian- for instance, a random walk. If you're at point x at time t, then you know that there's a fifty-fifty chance you will be at x-1 or x+1 at time t+1. Knowing all of the previous points along the random walk won't help you predict the next point any better than that.
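
The random walk described above can be sketched in a few lines. This is only an illustration of the Markov property; the function name `random_walk` and the seed value are arbitrary choices.

```python
import random


def random_walk(steps, start=0, seed=None):
    """Simulate a simple one-dimensional random walk."""
    rng = random.Random(seed)
    position = start
    path = [position]
    for _ in range(steps):
        # The next position depends only on the current position:
        # a fifty-fifty step to x-1 or x+1. The earlier history in
        # `path` is never consulted.
        position += rng.choice((-1, 1))
        path.append(position)
    return path


path = random_walk(1000, seed=42)
```

Note that the loop only ever reads `position`, never the stored `path`, which is the whole point: the history adds nothing to the prediction.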
- Re:I see one being implemented soon (Score:1)
  
  by camelrider ( 46141 ) writes:
  
  I'm surprised that something like the Hon. Mention system for giving higher ranking to pages that contain "all of the keywords" hasn't already been implemented. It seems that Freshmeat already has this feature and it is quite useful.
I sent something into the contest. (Score:4, Funny)

by thedanceman ( 582570 ) writes: on Friday May 31, 2002 @09:35AM (#3616633) Homepage

But I guess they thought there was no need for -thedanceman- [watchmedance.com] on the google site.

- Mod parent up! (Score:1)
  
  by Migrant Programmer ( 19727 ) writes:
  
  He's not offtopic, he's getting his google on!
- Re:I sent something into the contest. (Score:1)
  
  by ttexx ( 126330 ) writes:
  
  HAHAHAHAH..... *sides hurt*
- Re:I sent something into the contest. (Score:1)
  
  by mprindle ( 198799 ) writes:
  
  roflol.... Just when you think there is nothing on the net to entertain you. :)
- Re:I sent something into the contest. (Score:2)
  
  by xtremex ( 130532 ) writes:
  
  Grrrrr! I can't play WMV files in Linux! Any way anyone can convert them to MPEGs or something?
What a great idea (Score:4, Funny)

by TedCheshireAcad ( 311748 ) writes: <ted&fc,rit,edu> on Friday May 31, 2002 @09:36AM (#3616636) Homepage

If only more pizza restaurants in my area had web sites. Soon enough, I won't even have to pick up the phone to make my food come to me! I wonder if the delivery guy will bring the pizza up to me at my computer. Hmm...

- Re:What a great idea (Score:1)
  
  by axneck ( 573097 ) writes:
  
  Papa Johns - web ordering. remembers your last order, so getting pizza and breadsticks is roughly 3 clicks and a half hour away.
- Re:What a great idea (Score:1, Offtopic)
  
  by generic-man ( 33649 ) writes:
  
  Or you could go to Yahoo! Yellow Pages [yahoo.com] and search for pizza places sorted by proximity to your house. I do it all the time with all sorts of locations.
- Re:What a great idea (Score:1, Offtopic)
  
  by Rob.Mathers ( 527086 ) writes:
  
  Don't know where you live, but Pizza Pizza allows you to go to their site and order pizza online (after you register) for some regions, most notably Toronto. Unfortunately their app is some POS Oracle thing that takes forever to do even the slightest thing like adding a topping (a good 10-30 seconds between clicking a button and seeing the order frame change, even with my cable connection and decent comp (733 w/ 384 megs)), so it can take 10 minutes or longer for a complex order, plus delivery.
  
  BTW, this isn't O/T in the context of this thread.
Google Search (Score:4, Funny)

by iramkumar ( 199433 ) writes: on Friday May 31, 2002 @09:38AM (#3616654)

Search => Osama Bin Laden
Latitude/Longitude => 37/180, Pak
Capture ...

If this would have come out before we could have saved a country ...

- Search = Osama Bin Laden (Score:1)
  
  by oliverthered ( 187439 ) writes:
  
  Good job I have all those web pages saying that Bush is Osama Bin Laden, could make a nice killing on this one
- - Re:saved a country (Score:2)
    
    by jeti ( 105266 ) writes:
    
    It could have saved the US.
    US gov has been trying to
    capture Bin Laden before 9/11.
    
    At least some changes would
    not have occurred that fast.
Idea for a Google Query..... (Score:5, Funny)

by ReelOddeeo ( 115880 ) writes: on Friday May 31, 2002 @09:39AM (#3616658)
Where is the nearest server in my jurisdiction where I can download....
- MP3's
- Warez
- Pr0n
- Explosives making instructions
And worst of all....
- DeCSS
We've got to stop all of the terrorists in the categories mentioned above!
"Google Sets" (Score:2)

by ralian ( 127441 ) writes:

It really seems to me like the "Google Sets" feature recently made available at Google Labs [google.com] is an implementation of Zhenlei Cai's submission (although the details are extremely sketchy in the Google announcement). If this is true, I wonder why they couldn't implement the winning idea too?
more details (Score:5, Informative)

by Alien54 ( 180860 ) writes: on Friday May 31, 2002 @09:40AM (#3616665) Journal

Daniel's project adds the ability to search for web pages within a particular geographic locale to traditional keyword searching. To accomplish this, Daniel converted street addresses found within a large corpus of documents to latitude-longitude-based coordinates using the freely available TIGER [census.gov] and FIPS [nist.gov] data sources, and built a two-dimensional index of these coordinates. Daniel's system provides an interface that allows the user to augment a keyword search with the ability to restrict matches to within a certain radius of a specified address (useful for queries that are difficult to answer using just keyword searching, such as "find me all bookstores near my house"). We selected Daniel's project because it combined an interesting and useful idea with a clean and robust implementation.
This is an impressive bit of database manipulation. Somehow I didn't think that all of the datatypes, etc. would be so easily parsed.
Although I do recall telephone directories that used to give you results for a specified radius for certain types of businesses.
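
To make the "two-dimensional index" idea concrete, here is a rough sketch of a radius query over already-geocoded documents. It is not Daniel Egnor's implementation, which the announcement does not describe in detail. It assumes the address-to-coordinate step (the TIGER lookup) has already happened, uses a plain fixed-size grid as the index, and ignores the poles and the 180-degree meridian. The class name `GridIndex`, the cell size and the sample documents are all made up for illustration.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0
CELL_DEG = 0.1  # grid cell size in degrees (arbitrary choice)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def cell_of(lat, lon):
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))


class GridIndex:
    """Buckets (doc_id, lat, lon) entries by grid cell."""

    def __init__(self):
        self.cells = defaultdict(list)

    def add(self, doc_id, lat, lon):
        self.cells[cell_of(lat, lon)].append((doc_id, lat, lon))

    def within(self, lat, lon, radius_km):
        # Conservative bounding box: ~111 km per degree of latitude,
        # fewer km per degree of longitude away from the equator.
        dlat = radius_km / 111.0
        widest = min(abs(lat) + dlat, 89.0)
        dlon = radius_km / (111.0 * math.cos(math.radians(widest)))
        lo_row, lo_col = cell_of(lat - dlat, lon - dlon)
        hi_row, hi_col = cell_of(lat + dlat, lon + dlon)
        hits = []
        for row in range(lo_row, hi_row + 1):
            for col in range(lo_col, hi_col + 1):
                for doc_id, plat, plon in self.cells.get((row, col), ()):
                    dist = haversine_km(lat, lon, plat, plon)
                    if dist <= radius_km:
                        hits.append((dist, doc_id))
        # Nearest documents first.
        return [doc_id for _, doc_id in sorted(hits)]


# Hypothetical documents with approximate city-centre coordinates.
index = GridIndex()
index.add("mountain-view-page", 37.39, -122.08)
index.add("palo-alto-page", 37.44, -122.14)
index.add("san-francisco-page", 37.77, -122.42)
index.add("new-york-page", 40.71, -74.01)

nearby = index.within(37.39, -122.08, 20)
wider = index.within(37.39, -122.08, 100)
```

Only the grid cells overlapping the bounding box are scanned, so the New York entry is never even looked at for a Bay Area query. A keyword search would then be intersected with the returned document IDs.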

- Re:more details (Score:3, Insightful)
  
  by Matey-O ( 518004 ) writes:
  
  There's code available now that does this for zipcodes. see http://www.zipmath.com/ (And using Mapquest as the black box, street addresses too.) Tying it into google is a nifty bit o kit tho'.
- Re:more details (Score:5, Insightful)
  
  by Lars T. ( 470328 ) writes: <Lars.Traeger@goo ... il.com minus bsd> on Friday May 31, 2002 @10:44AM (#3617103) Journal
  
  Sounds like this improvement isn't much use outside the US.
  
  - Re:more details (Score:2)
    
    by nathanm ( 12287 ) writes:
    
    Sounds like this improvement isn't much use outside the US.
    
    His current implementation wouldn't be of much use outside the US, but the code could be used with non-US data elsewhere. The TIGER & FIPS data is just geographical & address information commonly used in GIS. I know the UK has similar data available, other countries probably do too.
    
    Google chose his project because his code was clean and robust. It shouldn't be too difficult to get it to work with other data.
- Re:more details (Score:4, Informative)
  
  by Chester K ( 145560 ) writes: on Friday May 31, 2002 @09:49PM (#3620992) Homepage
  
  This is an impressive bit of database manipulation. Somehow I didn't think that all of the datatypes, etc. would be so easily parsed.
  
  Although I do recall telephone directories that used to give you results for a specified radius for certain types of businesses
  
  That's just a standard spatial query. It's easy to implement an R-Tree to be able to do (relatively) quick "give me points within x meters of this one" type of searches on a database. There's nothing extremely revolutionary about Daniel's project, anyone with some basic geometry knowledge and the patience to download the 33GB of TIGER data could have done it within the course of a few weeks. (Ironically enough I've been doing the same thing with 1.2 million addresses against TIGER data for the past month.)
  
  But that's the true genius and beauty of it. Now that it's been said, it's such a mindbogglingly obvious and useful application of web search and spatial search technology that it's hard to believe nobody thought of it before.
  
  I'd be honestly surprised if Google doesn't run with the ball and fold it into their main search engine. The only thing standing in the way is the storage space and CPU time to do it.
  
  - Re:more details (Score:2)
    
    by xant ( 99438 ) writes:
    
    I'd be honestly surprised if Google doesn't run with the ball and fold it into their main search engine. The only thing standing in the way is the storage space and CPU time to do it.
    
    . . . Two things Google should have no trouble finding. [slashdot.org]
404 Page Not Found ? (Score:5, Interesting)

by bigmouth_strikes ( 224629 ) writes: on Friday May 31, 2002 @09:40AM (#3616667) Journal

I'm surprised that there are so many 404 Page Not Found errors in Google's search results, even on the top hits.

Shouldn't Google automatically check results that a user follows and flag those that cannot be displayed ?

- Re:404 Page Not Found ? (Score:3, Insightful)
  
  by DrSkwid ( 118965 ) writes:
  
  Thomas Phelps and Robert Wilensky, for their project, Robust Hyperlinks. Traditional hyperlinks are very brittle, in that they are useless if the page later moves to a different URL. This project improves upon traditional hyperlinks by creating a signature of the target page, selecting a set of very rare words that uniquely identify the page, and relying on a search engine query for those rare words to find the page in the future. For example, the Google programming contest can be found using this link.
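
The rare-word signature idea can be sketched as follows. The quoted summary does not say how Phelps and Wilensky actually pick the words, so this sketch simply assumes "rare" means "lowest document frequency in some corpus". The helper names, the signature size of 3 and the example.com pages are hypothetical.

```python
import re
from collections import Counter


def words(text):
    """Lower-cased set of alphanumeric words in a page."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def document_frequency(corpus):
    """Count, for each word, how many pages contain it."""
    df = Counter()
    for text in corpus.values():
        df.update(words(text))
    return df


def signature(text, df, size=3):
    # Rarest words first; ties broken alphabetically for determinism.
    return sorted(words(text), key=lambda w: (df[w], w))[:size]


def find_by_signature(sig, corpus):
    """Stand-in for a search engine query requiring all signature words."""
    return [url for url, text in corpus.items() if set(sig) <= words(text)]


# Hypothetical mini-web at the time the link is created.
corpus = {
    "http://example.com/contest": "google programming contest winner geocoding pizza radius",
    "http://example.com/news": "google news programming jobs",
    "http://example.com/food": "pizza delivery contest winner",
}
sig = signature(corpus["http://example.com/contest"], document_frequency(corpus))

# Later the page moves to a new URL; the old link would now 404.
moved = dict(corpus)
moved["http://example.com/moved/contest"] = moved.pop("http://example.com/contest")
found = find_by_signature(sig, moved)
```

The signature is stored alongside the link when it is created, so the lookup still works after the original URL has gone away, as long as the page content itself has not changed too much.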
- Re:404 Page Not Found ? (Score:3, Informative)
  
  by rmohr02 ( 208447 ) writes:
  
  Yea, I've realized that, but then you realize that Google caches most of the web and nearly all of the links produced in search results. So if you get a 404 error you go back and click on the cache link.
- Re:404 Page Not Found ? (Score:3, Interesting)
  
  by PhilHibbs ( 4537 ) writes:
  
  Google's links are not redirected via their server, and a lot of people would object to them "gathering data on their users' browsing activities". However, automatically checking the top link after each search (or scheduling it for checking) should be possible.
  
  What should they do if a page is unavailable, though? What if it's only down for a few seconds?
  - Re:404 Page Not Found ? (Score:1)
    
    by Ashran ( 107876 ) writes:
    
    Every once in a while your click thrus are going thru a google server (no dns, just an ip)
- Re:404 Page Not Found ? (Score:5, Insightful)
  
  by LinuxHam ( 52232 ) writes: on Friday May 31, 2002 @10:12AM (#3616869) Homepage Journal
  
  Shouldn't Google automatically check results
  
  I would much prefer to see them improve the ease of browsing their cache. Specifically, if a cached site is 404, then present a cached version of the site where all clicks within the site simply link to the cached version, unlike today where all clicks are native (and therefore lead to more 404's). Granted that wouldn't be of any use for links to dynamic pages, but anything is better than what they have today.
  
  - Re:404 Page Not Found ? (Score:1)
    
    by heliocentric ( 74613 ) writes:
    
    Get the google toolbar [google.com], it allows you to right click a link and choose to load a cached snapshot of the page, as well as adding nice easy search features.
    - Re:404 Page Not Found ? (Score:3, Informative)
      
      by General Wesc ( 59919 ) writes:
      
      And of course there's the Mozilla Google Toolbar [mozdev.org] for people who don't use IE.
  - Re:404 Page Not Found ? (Score:2)
    
    by srvivn21 ( 410280 ) writes:
    
    Interestingly enough, the google translation page [google.com] does this. Translate a page, and any links on that page lead to translated pages. Very convenient.
winner is... (Score:1, Troll)

by room101 ( 236520 ) writes:

The winner is... a former Microsoft employee.
Let me guess: he "rebranded" a piece of software that was under a BSD license?

All kidding aside, sounds pretty neat.
- Re:winner is... (Score:2)
  
  by redhatbox ( 569534 ) writes:
  
  n0 u 533, h3 0wnz0r3d g00g13 w14h h15 31337 m1cr0$0f4 h4x0r1ng 5ki11z!!!
  
  God, help us all... I *knew* we couldn't go an entire story without someone freaking out about the whole "used to work for Microsoft bit."
  
  Ho hum, back to my OBSD boxen...
Geographical Approximation (Score:1)

by z_gringo ( 452163 ) writes:

I don't think the pizza delivery analogy will pan out. They can pin it down to a country (usually), and maybe even a region of a country (rarely), but beyond that, it won't be possible to get very close. For example, at the company I work for, the whole class B address range shows up as being in the UK, but it is broken up across every European country, so even if you are in Spain or France, it looks like a UK address. I'm sure many companies do the same. Also, the IP addresses from a cable ISP will cover a wide area of several cities or sometimes a whole country.
- Re:Geographical Approximation (Score:3, Interesting)
  
  by Peyna ( 14792 ) writes:
  
  Sounds like it wasn't doing IP addresses or hostnames, but addresses found in text on pages. Using enough rules, and a funky algorithm, you could probably get pretty accurate for a number of pages, enough to produce good results on searches at least.
- Re:Geographical Approximation (Score:1)
  
  by TheGreatGraySkwid ( 553871 ) writes:
  
  You need to spend more time in the Big Blue Room, man. This guy's project uses PHYSICAL addresses, not IP addresses. So if your address is on your webpage as 1337 Lamar Dr., San Dimas CA, he would use publicly available geographical data to determine your physical coordinates and the places near you.
  
  Dig it?
Not earth shattering, but useful (Score:2, Informative)

by f00zbll ( 526151 ) writes:

Credit to the guy for thinking of it. It could save a person the hassle of looking up all the addresses in MapQuest. I've never had the need to do such a search on Google, since it's easier to just do a yellow pages search. Most yellow pages sites like superpages and switchboard already provide that kind of functionality. Google's directory search doesn't have search by distance yet, but I'm guessing it will be added in the future. They kinda have to, considering the other directory sites have those features.
Service already exists (Score:2)

by khendron ( 225184 ) writes:

There is already a service like this at www.lasoo.com [lasoo.com]. This service lets you enter an address and a business type, and will find all instances of that business within a certain radius of the address.
Last time I used Lasoo was on Mother's Day, to find the closest florist to my mom's house.
- Re:Service already exists (Score:3, Insightful)
  
  by MullerMn ( 526350 ) writes:
  
  The difference is that the service you're thinking of probably works from a pre-specified list of locations for the businesses it covers.
  
  The cool thing about the winning google entry is that it actually deduces the location of the search result by finding and parsing any address information that appears on the site!
  
  I think that's pretty clever. - Does anyone know if it's limited to the US?
  
  --
  Andy
Nice (Score:3, Interesting)

by Mr_Silver ( 213637 ) writes: on Friday May 31, 2002 @09:52AM (#3616746)

Whilst I'm very impressed with the winner, the entry "Robust Hyperlinks" is something that I do like a lot.
What would be cool would be the option to right click on the hyperlink and have the option "Find alternative location".
Or even cooler, have IE (or your favourite browser), on putting up the 404 message, include a hyperlink which does the same. Hell, easy enough to do with Apache.

- Re:Nice (Score:1)
  
  by wizman ( 116087 ) writes:
  
  On the Apache thing - I don't think most content providers would like to provide a "way out". They'd rather provide links within, to keep visitors on their site.
  
  Would be a nice addition to the "google toolbar" though.
- Re:Nice (Score:1)
  
  by sgtsanity ( 568914 ) writes:
  
  It would also work quite well for google whacking.
- Re: (Score:2)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
Funny. (Score:1)

by Byteme ( 6617 ) writes:

I was looking for a similar tool yesterday when I was doing research for low power FM transmitter placement. I have a street address for the tower and was looking for an easy (or more accurate, as I am not an adept map reader) tool rather than looking at the USGS maps.
I knew I should have patented it... (Score:2)

by Usquebaugh ( 230216 ) writes:

I was thinking about doing exactly the same thing, a common thought?

But the idea of using it just to find businesses within a certain radius is very limited thinking.

Mobile phones will soon be broadcasting their position. You want interactive guided tours of a city? How about playing full size monopoly? Driving directions? Any sign you currently see could be removed and replaced with a virtual sign? Any number of VR worlds played out in meat space? etc etc

I think that the ability to automatically tell someone where you are will prove to be a boon.

Kudos to the developer for carrying through, rather than my lazy ass postulating :-)
runner up (Score:2)

by gnugnugnu ( 178215 ) writes:

the runner up entry

Zhenlei Cai, for his project, Discovery and Grouping of Semantic Concepts from Web Pages with Applications. This effort processed a corpus of documents and found words and phrases that tend to co-occur within the same document, producing a list of pairs of terms that seem to be closely related (such as "federal law" and "supreme court", or "Bay Area" and "San Francisco").

sounds a lot like Google sets [google.com]

Robust Hyperlinks has to be my favourite.
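
The co-occurrence idea in the runner-up description can be sketched with plain pair counting. This is a guess at the general technique, not Zhenlei Cai's actual method: it assumes phrases have already been extracted from each document, and it uses raw counts, whereas a real system would presumably normalise for how common each term is. The function name `cooccurring_pairs` and the sample documents are made up.

```python
from collections import Counter
from itertools import combinations


def cooccurring_pairs(documents, min_count=2):
    """Return (pair, count) for term pairs sharing at least min_count documents."""
    pair_counts = Counter()
    for terms in documents:
        # Count each unordered pair once per document.
        for pair in combinations(sorted(set(terms)), 2):
            pair_counts[pair] += 1
    return [(pair, n) for pair, n in pair_counts.most_common() if n >= min_count]


# Hypothetical documents, each reduced to a list of extracted terms.
docs = [
    ["federal law", "supreme court", "congress"],
    ["supreme court", "federal law"],
    ["bay area", "san francisco"],
    ["san francisco", "bay area", "federal law"],
]
related = dict(cooccurring_pairs(docs))
```

With this toy input the only pairs that survive the threshold are the two from the announcement's own example: "federal law" with "supreme court", and "bay area" with "san francisco".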
NetGeo (Score:5, Informative)

by *xpenguin* ( 306001 ) writes: on Friday May 31, 2002 @10:01AM (#3616800)

There's a public database called NetGeo [caida.org] which will convert IP addresses to latitude and longitude locations. I created a script called IP-Atlas [xpenguin.com] to get a visual location of the lat and lon coords.

- Re:NetGeo (Score:2)
  
  by Rayonic ( 462789 ) writes:
  
  Nice site, but unrelated to the winner's entry. The IP address will only tell you where the server is, and not the location of, say, a remote business whose site is hosted on that server. In fact, an IP-address-derived location would only get in the way of his project.
Cached Longitude and Latitude (Score:1, Troll)

by shawnmelliott ( 515892 ) writes:

Does this mean that when the time comes to leave this planet and move to Mars that we can still visit our favorite places via Google's Cache?
More uses (Score:5, Funny)

by parad0x01 ( 549533 ) writes: on Friday May 31, 2002 @10:19AM (#3616902)

Search => Nearest horny drunk college girl
Results: Apt 2D

- Re:More uses (Score:2, Funny)
  
  by Telecommando ( 513768 ) writes:
  
  Neat idea.
  
  But how many milliseconds would it take before she's slashdotted?
- Re:More uses (Score:2, Funny)
  
  by Nighttime ( 231023 ) writes:
  
  Search => Natalie Portman
  Error: This server is too busy to deal with your request
locality of businesses and services (Score:1)

by mattbland ( 260913 ) writes:

just had an idea for how to implement this using meta tags. I should probably submit it to the w3c if no-one else has already...

have a new meta tag class for "location" which would include the GPS (or long/lat, etc.) co-ordinates for the business, etc. along with a human-readable country code, such as UK, FR, NL, US, etc. and region.

then have another tag for how far the service provider/company does business for with rules for excluding and including areas.

so if a company operates, i.e. delivers, within a 3 mile radius then a search engine will know. if they operate in their local state/county it will know. if it operates regionally, such as Europe-wide, or globally, it would be easy to locate a provider of goods and services which will be able to help. The exclusions could cover things like states in which the product is deemed illegal or that the merchant doesn't wish to do business with.

Of course you'd need to let the search tools know whereabouts you are in order to refine the results. But this needn't be the source of any privacy issues, as the engine could return a complete set of results to be filtered at the client end dynamically if the user is paranoid.

what do you think?

Mb
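
A rough sketch of how a crawler or client might use such tags. The tag names `location` and `service-radius-km`, and the `lat;lon;country` content format, are invented here purely to illustrate the proposal. Only the simple radius rule is covered, not the include/exclude rules.

```python
import math
from html.parser import HTMLParser


class LocationMetaParser(HTMLParser):
    """Collects name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attr = dict(attrs)
            if "name" in attr and "content" in attr:
                self.meta[attr["name"]] = attr["content"]


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def serves(page_html, user_lat, user_lon):
    """True if the user is inside the radius the page declares."""
    parser = LocationMetaParser()
    parser.feed(page_html)
    lat, lon, _country = parser.meta["location"].split(";")
    radius_km = float(parser.meta["service-radius-km"])
    return distance_km(float(lat), float(lon), user_lat, user_lon) <= radius_km


# Hypothetical pizza shop in central London delivering within 5 km
# (roughly the 3 mile radius mentioned above).
PAGE = """
<html><head>
<meta name="location" content="51.5074;-0.1278;UK">
<meta name="service-radius-km" content="5">
</head><body>Pizza!</body></html>
"""

nearby_ok = serves(PAGE, 51.5155, -0.0922)   # a few km away
paris_ok = serves(PAGE, 48.8566, 2.3522)     # hundreds of km away
```

Because `serves` only needs the page and the user's own coordinates, the check could run entirely on the client, which fits the privacy point above.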
- - Re:Quick! Patent that idea! (Score:1)
    
    by mattbland ( 260913 ) writes:
    
    good idea!
    
    but by submitting my idea on this public forum I think I've lost any chance of getting a patent, at least in the UK. Our patent office might be slow, but they're a lot better than the US one. Part of getting a patent over here is that you aren't allowed to disclose anything about the idea until the patent is granted. If that was true in the US you would have a lot less problems with the patent office and dumb patents such as "1-Click Shopping", etc.
    
    Plus I don't think companies could patent the "discovery" of stuff that we already have in our bodies, etc. such as genes, unless they are artificially produced or altered ones that don't exist in nature.
    
    Also, software shouldn't be patentable imho. Copyright takes enough away from us already without a dumb patent saying that I can't write a program which manipulates some data in a certain way without paying company X some money.
    
    Our kids and grandkids are going to grow up in a world where the only public domain material is the same as what we have... i.e. all from the 19th century and earlier. Star Wars, Mickey Mouse and such like will remain in copyright *forever*, as long as a company can milk them for more money. Just think about that, in 2902 you will probably get locked up in an iso-tube for 55 years for copying Star Wars (6 film collectors edition - now in 3d) using your fingernail media duplicator.
watch out, google. (Score:2)

by Sarin ( 112173 ) writes:

"Potentially run your prize-winning code on Google's multi-billion document repository (circumstances permitting)"

so as a former microsoft employee he's going to run his code on the google servers.
Nah, this one's too easy!
A project that really could be good for Google... (Score:5, Informative)

by chrysalis ( 50680 ) writes: on Friday May 31, 2002 @10:34AM (#3617024) Homepage

is something that prevents cheating.

So you think that Google's results are fair? You're wrong. The best ranked results are from sites that heavily cheat.

Since Google has aggressively removed fake generated sites linking to each other, new ways of cheating have been immediately adopted.

Apart from cloaking (what the Google crawler sees is different from what users see), generated sites now include fake generated English-like sentences in order to make Google think the text is real. Spam indexing is now distributed across multiple IPs. Content is dynamic, it changes every day (random links and texts are generated). Temporary sites are hosted on external (not yet blacklisted) cheap colocated servers. Invisible frames are added, etc.

I'm not innocently talking about that because the company I'm working for is actively doing it. And it works. And they say "Spam? Uh? Who's talking about Spam? It bring us money, so it's not spam, it's our business".

There are ways to prevent cheating on Google. It's probably very complex, but it's realisable. If any human looks at our 'spam site', he will immediately discover that it's not a real site. It's a mess, just there for keywords and links.

If such a project had been made for the Google content, it would have been wonderful.

Google is still the best search engine out there. Their technology rocks, and they are always looking for innovation. But what could make a huge difference between Google and other search engines is fair results. Same wheel of fortune for everybody.

Yet this is not the case. Trust me, all well ranked web sites for common keywords belong to a few companies that are actively cheating.

- Not all well-ranked sites for common words cheat (Score:3, Interesting)
  
  by JoeBuck ( 7947 ) writes:
  
  chrysalis writes: Yet this is not the case. Trust me, all well ranked web sites for common keywords belong to a few companies that are actively cheating.
  This statement is easily refuted. Type "linux". The 10 sites you see all belong there, and I can guarantee you that most of them are not engaging in cheating.
  But since you admit that you work for a company that engages in this practice, perhaps it helps you sleep at night to believe that "everybody does it".
  - Re:Not all well-ranked sites for common words chea (Score:2)
    
    by chrysalis ( 50680 ) writes:
    
    The company I'm working for produces, hosts and promotes adult web sites. Keywords we are cheating with are related to sex. All our competitors are doing the same thing.
    
    Linux is not a common keyword. Linux is not something that brings money. Therefore people don't need to cheat for Linux-related sites.
A good source of innovation... (Score:1)

by purpledinoz ( 573045 ) writes:

This is actually a nifty way of getting real good ideas from people. $10000 seems a bit cheap for an idea that can make Google a helluva lot more than that.
Geographically limited browsing (Score:1)

by Jumperalex ( 185007 ) writes:

So does this now make it easier for governments to limit their citizens ability to get information than ever before?

As well as any other of the many geography based rules, laws, taxes, restrictions, etc that we have seen talked about on /. before???
Daniel Egnor's "Iocaine Powder" (Score:4, Interesting)

by po8 ( 187055 ) writes: on Friday May 31, 2002 @12:13PM (#3617760)

In a weird coincidence, I just spent a half-hour last night lecturing about Daniel Egnor's Iocaine Powder [ofb.net], winner of the First International RoShamBo Programming Competition [ualberta.ca]. Credit this guy with two award-winning pieces of extreme programming cleverness!

More Information About the Winner (Score:5, Informative)

by td ( 46763 ) writes: on Friday May 31, 2002 @12:39PM (#3617931) Homepage

I've met Dan Egnor, and this isn't the only cool thing he's done. He's the author of Iocaine Powder [ofb.net], the world champion rock-paper-scissors program. He's also the proprietor of sweetcode [sweetcode.org], a web log devoted to innovative open source projects (i.e. projects that don't just clone or tweak existing software). But his best hack (not described online, as far as I know) is a version of Pac Man that runs on a PDA and uses a GPS for a user interface: if you run around an open field carrying the GPS+PDA, the pacman correspondingly runs around the maze chasing Blinky, Stinky and Dinky (or whatever their names are).

Why I never would have written that program. (Score:2)

by blair1q ( 305137 ) writes:

How many hosts implement their coordinates in their info any more?

5%? 10%?

80% omit it because admins are lazy, and 10% omit it for security reasons.

So Google just gave an award to a tool with half the batting average of a bad baseball player.

--Blair
- Re:Why I never would have written that program. (Score:2)
  
  by radish ( 98371 ) writes:
  
  RTFA - it's nothing to do with the hosts record. It parses the addresses from the pages themselves.
Smooth Move Google (Score:3, Insightful)

by Uttles ( 324447 ) writes: <<uttles> <at> <gmail.com>> on Friday May 31, 2002 @01:40PM (#3618336) Homepage Journal

Did you all read the honorable mentions? Google stands to make some good money off of the ideas and implementations these folks have come up with. I'm assuming that all entries now are owned by Google, and man they might have some really cool new features after seeing the projects that were submitted. I only hope that they give at least some royalties to the developers.

I'm a little disappointed. (Score:2)

by guttentag ( 313541 ) writes:

Daniel received a bachelor's degree in Computer Science from Caltech in 1996. He has worked for Microsoft Corporation and XYZFind Corporation, and currently resides in New York City working for a large investment bank.

I was hoping the contest would create new opportunities for some young unknown, like: "Bob is a high school sophomore and currently resides in his parents' barn in Fargo and earns his keep stocking shelves at Toys 'R Us." Oh well, maybe next year.
I wonder what this person worked on at Microsoft.. (Score:2)

by zeno_2 ( 518291 ) writes:

They have a product called Streets and Trips. You can enter your address and find out what is within, let's say, a 5-mile radius. Sounds pretty much like what this guy did.
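
For anyone curious how a "what is within 5 miles" query might work once addresses have been turned into latitude/longitude pairs, here is a minimal sketch. It is an illustration under assumptions, not how Streets and Trips or the contest entry actually does it: the function names, the sample coordinates, and the brute-force scan (instead of a real two-dimensional index) are all made up for the example.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(center, places, radius_miles):
    """Return the names of places no farther than radius_miles from center.

    center is a (lat, lon) tuple; places maps a name to a (lat, lon) tuple.
    This checks every place in turn; a real system would use a spatial index.
    """
    lat0, lon0 = center
    return [name for name, (lat, lon) in places.items()
            if haversine_miles(lat0, lon0, lat, lon) <= radius_miles]


# Hypothetical coordinates, chosen only to show one hit and one miss.
places = {"near": (40.01, -74.0), "far": (41.0, -74.0)}
nearby = within_radius((40.0, -74.0), places, 5)
```

The linear scan is fine for a handful of points; the two-dimensional index mentioned in the story is what would keep this fast over a large document set.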
Speech Recognition (Score:2)

by ZigMonty ( 524212 ) writes:

Zhenlei Cai, for his project, Discovery and Grouping of Semantic Concepts from Web Pages with Applications. This effort processed a corpus of documents and found words and phrases that tend to co-occur within the same document, producing a list of pairs of terms that seem to be closely related (such as "federal law" and "supreme court", or "Bay Area" and "San Francisco").

Am I the only one who thinks this would be useful for speech recognition? If you just detected a "federal" and you have two possibilities for the next word, "law" and "paw" say, the software would know it's more likely to be "law". Federal paw is probably fairly uncommon and yet this is exactly the mistake that current software makes.
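
A rough sketch of the idea, assuming you already have document-level co-occurrence counts: count how often two terms show up in the same document, then let the recognizer prefer the candidate that co-occurs more often with the previous word. The function names and the tiny corpus are hypothetical, and this is not Zhenlei Cai's actual method, just the simplest version of the counting step.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count, for each unordered pair of terms, how many documents contain both."""
    counts = Counter()
    for doc in documents:
        terms = set(doc.lower().split())
        # Sort so each pair is always stored under the same key.
        for a, b in combinations(sorted(terms), 2):
            counts[(a, b)] += 1
    return counts


def pick_next_word(previous, candidates, counts):
    """Choose the candidate that co-occurs most often with the previous word."""
    def score(word):
        return counts[tuple(sorted((previous, word)))]
    return max(candidates, key=score)


# Toy corpus, made up for illustration.
docs = ["federal law applies", "the federal law changed", "a cat paw"]
counts = cooccurrence_counts(docs)
best = pick_next_word("federal", ["paw", "law"], counts)
```

A real recognizer would combine a score like this with its acoustic score rather than use it on its own, and would need far more text than three sentences.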
Wired article (Score:2)

by scubacuda ( 411898 ) writes:

Here is a Wired article [wired.com] on it.
- Re:good news for open source? (Score:1)
  
  by ignatzMouse ( 447031 ) writes:
  
  How is google open source?
- Re:good news for open source? (Score:2)
  
  by rmohr02 ( 208447 ) writes:
  
  It is great to see open source software such as google
  
  Google's not open source. They support the open source community, but they don't release the code to their indexer. This is all the information that they give out about their code:
  
  PageRank Explained
  
  PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page's value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important."
  
  Important, high-quality sites receive a higher PageRank, which Google remembers each time it conducts a search. Of course, important pages mean nothing to you if they don't match your query. So, Google combines PageRank with sophisticated text-matching techniques to find pages that are both important and relevant to your search. Google goes far beyond the number of times a term appears on a page and examines all aspects of the page's content (and the content of the pages linking to it) to determine if it's a good match for your query.
  - - Re:good news for open source? (Score:2)
      
      by rmohr02 ( 208447 ) writes:
      
      It was at 2 when I replied.
- Re:About the "Former Microsoft Employee" bit.. (Score:2, Insightful)
  
  by GT_Alias ( 551463 ) writes:
  
  Christ, give it a break. I know there's an anti-anti-Microsoft backlash here, but for fuck's sake all he did was mention the previous employer with absolutely NO bias or connotations. If the guy had been employed at XYZ University, I'm sure it would have still shown up.
  - Re:About the "Former Microsoft Employee" bit.. (Score:5, Funny)
    
    by PrimeEnd ( 87747 ) writes: on Friday May 31, 2002 @10:02AM (#3616810)
    
    If the guy had been employed at XYZ University, I'm sure it would have still shown up.
    Actually he was employed by XYZFind Corp. Literally. And it didn't show up.
    
    - Re:About the "Former Microsoft Employee" bit.. (Score:2, Informative)
      
      by asqui ( 61770 ) writes:
      
      The reason I included Microsoft Corp. as a former employer and not XYZFind Corp. is because I wanted to point out that despite what most of you like to think, intelligent people do work at Microsoft.
      
      Yes really, it's not a large room full of monkeys!
  - Re:About the "Former Microsoft Employee" bit.. (Score:1)
    
    by Hunter1776 ( 581599 ) writes:
    
    I agree. Listing his previous employer probably wasn't intended as an attack on MS. I don't see why anyone would think it was.
- Re:About the "Former Microsoft Employee" bit.. (Score:1, Informative)
  
  by Anonymous Coward writes:
  
  Gates didn't write [halcyon.com] DOS.
- Re:About the "Former Microsoft Employee" bit.. (Score:3, Funny)
  
  by Mononoke ( 88668 ) writes:
  
  And we certainly don't believe he got better at coding while he worked at MS.
  You've never learned from the mistakes of others?
  Being a former M$ employee tells me he learned quite a bit.
  - Re:About the "Former Microsoft Employee" bit.. (Score:2)
    
    by jfedor ( 27894 ) writes:
    
    FYI, Michael Abrash [google.com] once worked at Microsoft, then went to id Software, and then left id and went back to MS.
    
    So I think there are some programmers at Microsoft that you could learn from (not by seeing their mistakes).
    
    -jfedor
- Re:About the "Former Microsoft Employee" bit.. (Score:2)
  
  by liquidsin ( 398151 ) writes:
  
  the part about him being a former MS employee was directly quoted from the submitter, not inserted there by michael. I know it's slashdot reader policy to not read the story, but at least read the fucking summary on the front page before flaming away...
- Re:About the "Former Microsoft Employee" bit.. (Score:1)
  
  by Ilgaz ( 86384 ) writes:
  
  Former MS employee, submits his project to Google, the ultimate linux site.
  
  Come on...
- Re:About the "Former Microsoft Employee" bit.. (Score:1)
  
  by scd ( 541350 ) writes:
  
  Sigh...
  In submitted articles, italics designate the submitted article, while normal text indicates Michael's or CmdrTaco's, etc., additions.
  Note that this one was all italics, meaning the 'former Microsoft bit' was included by the person submitting the article.
- Re:Some might think.. (Score:2)
  
  by GuyMannDude ( 574364 ) writes:
  
  Though their operating systems may be riddled with bugs and security flaws of all sorts, look at their applications. They tend to be the epitome of quality software.
  
  Yeah, right. That one dancing PaperclipDude was the "epitome of quality software".
  
  Me: (starts writing a letter in Word)
  
  PaperclipDude: "Hi there! It looks like you're writing a letter!"
  
  No shit, Sherlock. What gave it away? The "Dear Sirs" opening line? Sheesh.
  
  GMD
- Re:Anyone here who got the CDROM with data mailed? (Score:2)
  
  by polymath69 ( 94161 ) writes:
  
  Grrr.
  I requested the data CDs on February 6th, and got an acknowledgement email from Google the same day. But I never received the CDs, either, so I more or less forgot about the contest until now. And I'm in the US, so they're not discriminating against Germans.
  Some fine way to run a contest!
  So I'll second the question... did they follow up with anybody?
