Yahoo! Vs. Google: Algorithm Standoff
An anonymous reader writes "There's a new report out from the guys who brought us the Google keyword density analysis. As they put it, "the goal of this analysis is to compare the keyword density elements of Yahoo's new algorithm with Google's algorithm." They compared 2000 low traffic, non-competitive keywords in the hopes of seeing the algorithms more clearly, without any possible search engine tweakings related to high-traffic keywords. Their findings are interesting. Should you go and rebuild your site based on these findings? Maybe not. It's worth a look though."
Search Engine Optimization Professional (Score:5, Interesting)
Re:Search Engine Optimization Professional (Score:5, Informative)
Search engines need help with frames (if anyone can still find a good reason to use them). If you use Flash based navigation, you better make sure that you have a prominent document which links to all pages as well or search engines won't index them. It's also a good idea to use descriptive titles and put what's important at the top of the page. In other words, most good search engine optimization is exactly what you would do to make a site screen-reader or text-browser friendly.
Then there's link-bombing, show-something-different-to-Google, white-on-white text, redirections, etc.
It's quickly becoming so that you can't tell someone to optimize a site for inclusion in search indexes or they'll fall into the hands of this kind of scum. It's a little like the word "Hackers". Can't use that anymore without having to explain that you're not illegally breaking into other people's computers.
Re:Search Engine Optimization Professional (Score:5, Funny)
That's another set of people that need a whack with a clue stick.
Re:Search Engine Optimization Professional (Score:2)
to keep away the non Windows users
go for it
use java menus while you're at it
Re:Search Engine Optimization Professional (Score:4, Funny)
Wait, I've got another clue stick here somewhere.
Re:Search Engine Optimization Professional (Score:5, Insightful)
It excludes blind users with screen readers and people who don't or can't install superfluous plug-ins. Flash is great for entertainment but it should never be required for getting information.
Re:Search Engine Optimization Professional (Score:3, Informative)
Try it. Start up Microsoft Narrator (Windows Key + U on Windows 2000/XP) and head to macromedia.com.
Re:Search Engine Optimization Professional (Score:3, Insightful)
As pcs get faster and faster and more and more people get broadband, you will se
Re:Search Engine Optimization Professional (Score:5, Interesting)
Let's see [macromedia.com]
Mozilla on FreeBSD (that's me)
We are unable to locate a single Web player that best matches your platform and operating system
Mothra on plan9 (also me)
We are unable to locate a single Web player that best matches your platform and operating system
The acceptable list [macromedia.com] is
Windows 98/ME/2000/XP - Internet Explorer/AOL/Netscape/Mozilla/Opera/CompuServe - Flash 7
Mac OSX / OS9 - Internet Explorer/Safari/Netscape/Mozilla/Opera - Flash 7
Other Operating Systems
Linux x86 Flash Player 6 for Mozilla 1.1 - (Not officially supported by Macromedia.)
Pocket PC Flash Player 6 for Pocket PC 2003 (color devices supported only)
OS/2 Flash Player 4 for Netscape
Sun Solaris (Sparc/Intel) Flash Player 6 for Netscape
HP-UX Flash Player 6 for Netscape
SGI IRIX Flash Player 4 for Netscape
On my 500,000 page impression web site, using Flash would have excluded the otherwise successful visitors running the following OS
CPM
Windows 3.xx
WebTV
OSF Unix
Aix
NetBSD
I will admit that the actual numbers are low but being excluded/ignored is how us non Windows users are treated day in day out. Seems you can't fight the pigopolists.
Re:Search Engine Optimization Professional (Score:3, Interesting)
Umm. By definition, if a site is loaded with Flash, it's media-rich.
I just can't really see Flash being a benefit. Folks thought that it was useful back when it was novel -- ("Look, the web page makes sounds when I click!"). We've gone through this same "novel" phase so many times on the web that it's depressing. When music came out, everyone had to put music on their personal pages, and at first it was kind of cute. Then it got really anno
Re:Search Engine Optimization Professional (Score:3, Informative)
This [boohbah.com] is the only good use of Flash I've seen. My nephew likes it, anyway.
Re:Search Engine Optimization Professional (Score:5, Informative)
I'm getting really tired of sites that present one thing to search engines and something totally different to me.
Then complain about it. That practice is known as cloaking, and you can get sites blacklisted for it.
Re:Search Engine Optimization Professional (Score:5, Informative)
Re:Search Engine Optimization Professional (Score:5, Interesting)
I think it's fair to say there are white hat SEOs as well as black hat hijack^H^H^H^H^H^H SEOs.
Re:Search Engine Optimization Professional (Score:3, Interesting)
Bayesian filters learn to recognize spam and are personalized to the user. They are at least as effective as rules-based mail filters, but very effectively halt the rules race (where the filter writer writes a rule to filter by, and the spammer figures out a way around the rule, rinse, repeat).
We need something like that for web pages and web searching. It's not just about keywor
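For anyone who hasn't seen one, here is a minimal sketch of the kind of Bayesian filter described above, written in Python. It is a plain naive Bayes classifier with add-one smoothing, not the code of any particular mail filter, and the labels "spam" and "ham", the class name and the training phrases are all made up for illustration. It needs at least one training example per label before it can score anything.

```python
import math
from collections import Counter


def tokenize(text):
    """Split text into lowercase whitespace-separated tokens."""
    return text.lower().split()


class NaiveBayes:
    """Toy two-class naive Bayes text classifier (illustrative only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Add one labelled example; label must be 'spam' or 'ham'."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        """Return the estimated probability that text is spam."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Prior: fraction of training documents carrying this label.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                # Add-one smoothing so unseen words never give log(0).
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocab) + 1))
            log_scores[label] = score
        # Normalise the two log scores into a probability.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])
```

The point about the rules race is that nothing here is hand-written: the user retrains with new examples and the word weights shift on their own. Whether that carries over to web pages, where the "user" is a search engine and the labels are much harder to get, is an open question.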
Re:Search Engine Optimization Professional (Score:3, Interesting)
Re:Search Engine Optimization Professional (Score:5, Interesting)
Re:Search Engine Optimization Professional (Score:5, Insightful)
Re:Search Engine Optimization Professional (Score:4, Interesting)
Re:Search Engine Optimization Professional (Score:3, Interesting)
There are legitimate reasons for hiding text. For example, putting help text into a page, and only showing it when the user clicks a help button (far more friendly than popups).
Re:Search Engine Optimization Professional (Score:2, Funny)
Wouldn't that get a bit messy? Personally, I would lower the ranking, but that's just my opinion. To each their own.
Re:Search Engine Optimization Professional (Score:3, Interesting)
Run the bitmap image through an OCR program to extract the real text seen by the user
Wouldn't it be smarter to just render both versions and compare bitmaps? No need to OCR then...
Re:Search Engine Optimization Professional (Score:2, Informative)
Re:Search Engine Optimization Professional (Score:2)
Re:Search Engine Optimization Professional (Score:5, Interesting)
and, this way you are giving a lower ranking to pages which use text in images. it is not good practice to have all the text embedded in images, but it is often necessary for style purposes; an example being the logo of a site (ok, alt= should handle this). hell, i even do it! it's cleaner than hoping the person on the other side can render the same fonts as me (which would be impossible cuz i filtered them through GIMP to add some effects).
a lot of sites auto detect robots based on what you are saying, and either block them or launch a seek-and-destroy attack against you. to get around this, the file /robots.txt (which every large site should have) WILL be read by the google/yahoo prowler no matter what, and abided by. it plays the prominent role in what the search engines read... not the server reading the browser tag.
that's without even going into the algorithms of matching the read OCR text up against the text from the source.
SEO - SEM (Score:5, Informative)
Search Engine Optimization - doing all things possible to tell a search engine what your page is about while being balanced for humans to read as well. Ethical. Sometimes considered spam when really the search engine returns poor results; usually due to the page you are looking for not being easy to understand for spiders.
Search Engine Manipulation - trying to do things to get search engines to return your page in results when the page may not otherwise be something the engine considers relevant or high quality. Showing something different for the search engine falls under this category, is commonly referred to as cloaking, and is against many search engines' "rules" for designing pages. Not ethical, aka spam.
-Pete
Re:SEO - SEM (Score:5, Insightful)
Here we go into the slippery slope that leads to situations like the tragedy of the commons (where people tend to use up a resource because it isn't theirs), the hiring of lawyers (statistically, if one side hires a lawyer, they get better results, but if both sides hire lawyers they get the same settlement, only smaller because of lawyers' fees), etc. It's the prisoner's dilemma - defect (ie, optimize) to improve my position, at the risk of everybody else defecting and earning worse returns than not defecting in the first place (ie, everybody stops using Google because the rankings are screwed up and are no longer trustworthy.)
Put simply, the moment any site tries to game the system, even just a little bit, they ruin the usefulness of Google. As it stands, I'm getting better results with Metacrawler now than with Google - something I wouldn't have said just a year ago. Don't even get me started on websites with javascript-redirect gateway pages, or the ones that scrape search-engine/newsgroup/eBay pages for text in order to boost hit counts, and then link back to similar pages in order to get higher link relevancy, OR the ones that take over abandoned domains in order to exploit the ranking generated by pre-existing links that point to the domain name...
Re:SEO - SEM (Score:3, Interesting)
But like the "SEO v. SEM" argument above, search engine optimization done right will also give better results to the end user.
Think about it: if I'm looking for the specs on Widget A and the best
That's right, mod me down. (Score:3, Insightful)
If Yahoo wants my vote... (Score:4, Interesting)
Re:If Yahoo wants my vote... (Score:5, Informative)
Re:If Yahoo wants my vote... (Score:2, Informative)
Re:If Yahoo wants my vote... (Score:5, Interesting)
Well that's all well and good, but how many people would know to type that in?
Has anyone looked at altavista lately? They've certainly taken the Google route, and their home page looks a lot like Google now, as does search.yahoo.com. However, in search.yahoo.com _and_ altavista, I noticed that "sponsored results" show up before the real ones, but they appear in the list just the same. That could confuse newbies, and I prefer the approach Google has taken to advertising (shoving the ads to a separate entity on the right, and keeping them text-based).
Ads good at filtering out crap (Score:5, Informative)
Re:If Yahoo wants my vote... (Score:5, Informative)
Re:If Yahoo wants my vote... (Score:5, Funny)
It failed? If a market cap of 28 BILLION dollars is failure, what do I have to do wrong to get there?
Re:If Yahoo wants my vote... (Score:2, Informative)
Re:If Yahoo wants my vote... (Score:5, Informative)
And a User Friendly game to go along! (Score:5, Interesting)
Re:And a User Friendly game to go along! (Score:5, Interesting)
Touch the dots !
It's written in REBOL [rebol.com]
Re:And a User Friendly game to go along! (Score:3, Informative)
# robots.txt for http://imdb.com/
User-agent: Mediapartners-Google*
Disallow:
Disallow:
Disallow:
It also includes "User-agent: *" about halfway through, but the list was different at some time..
You can always check the previous versions of the robots.txt on the wayback machine [archive.org]
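If you want to check what a given robots.txt actually permits, Python's standard library has a parser for it. The sketch below feeds it a made-up file (not IMDb's real one); the bot names ExampleBot and OtherBot and the example.com URLs are placeholders. An empty "Disallow:" line means that agent may fetch everything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is allowed everywhere,
# everybody else is kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow:

User-agent: *
Disallow: /private/
"""


def build_parser(robots_text):
    """Parse robots.txt content that is already in memory."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
for agent in ("ExampleBot", "OtherBot"):
    for url in ("http://example.com/private/page.html",
                "http://example.com/index.html"):
        print(agent, url, parser.can_fetch(agent, url))
```

To test a live site you would call set_url() and read() on the parser instead of parse(), which fetches the file over the network.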
I think (Score:5, Insightful)
Google Super Computer? (Score:4, Interesting)
Re:Google Super Computer? (Score:2, Informative)
For those of you who don't know, a cluster is (as far as my understanding takes me) when you take several ordinary computers and link them together, providing a cheaper way to get a "fake" supercomputer.
Re:Google Super Computer? (Score:5, Informative)
A layman's view (Score:4, Insightful)
"Man, Goggle SUCKS now!, I'll try yahoo."
"DAMN! Yahoo sucks even more!"
I have to admit that I used to think google was incredible just after it came out, but nowadays I'm used to wading through 10-15 pages of results before finding something relevant to what I need.
Re:A layman's view (Score:5, Interesting)
Guys, if I wanted to go to Amazon I would just type "www.amazon.co.uk" into my browser.. If I'm searching on Google it's because I've either already looked at Amazon and didn't find what I want, or because Amazon is really not relevant..
I've started adding "-amazon -kelkoo -dooyoo -pricewatch" and others to my Google searches recently which helps cut down the chaff a little, but doesn't seem to cut out all the Amazon ripoffs.
Q.
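If you get tired of typing the exclusions by hand, the trick is easy to script. A rough sketch in Python, assuming the engine takes the query in a "q" parameter and treats a leading minus as "exclude this word"; the function names and the base URL are only examples.

```python
from urllib.parse import urlencode


def build_query(terms, excluded=()):
    """Join search terms and append each excluded word with a leading '-'."""
    parts = list(terms) + ["-" + word for word in excluded]
    return " ".join(parts)


def search_url(base_url, terms, excluded=()):
    """Build a search URL, assuming the query goes in a 'q' parameter."""
    return base_url + "?" + urlencode({"q": build_query(terms, excluded)})


print(search_url("https://www.google.com/search",
                 ["dvd", "player"],
                 ["amazon", "kelkoo", "dooyoo", "pricewatch"]))
```

As noted above, this only filters pages that mention the excluded names, so the ripoff sites that avoid the word still get through.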
Re:A layman's view (Score:4, Informative)
If you're "wading through 10-15 pages of results" (Score:3, Informative)
if you have already done this and you're still wading through that many pages of results you suck at specifying what you want to search for
Re:If you're "wading through 10-15 pages of result (Score:3, Interesting)
Now many of my web searches tend to turn up tons of mailing list archives. If I want to search those I'd use google groups (I get about the same results for my search terms in google groups).
I'm actually not that surprised - when I first heard they were using Page Rank some years back, I wondered how long that would keep working. It's easy to manipulate, plus it's kind of circular.
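To make the "circular" part concrete: a page's rank depends on the rank of the pages linking to it, which depends on the pages linking to them, and so on. The textbook way out is to start with equal ranks and iterate until the numbers settle. Below is a simplified Python sketch of that textbook iteration; the damping factor of 0.85 is the commonly quoted value, the tiny link graph is invented, and none of this claims to be what Google actually runs.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated iteration.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> rank, with the ranks summing to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Invented three-page web: a and b both link to c, c links back to a.
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
for page, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(value, 3))
```

The manipulation angle falls straight out of the loop: anyone who can create extra pages that link to a target adds to that target's share, which is exactly what link farms do.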
Re:A layman's view (Score:2, Interesting)
I have to admit that I used to think google was incredible just after it came out, but nowadays I'm used to wading through 10-15 pages of results before finding something relevant to what I need.
Yep. I agree. I search for something as simple as "Philips DVD driver" for a Philips DVDRom drive and I get at least five ads selling Philips CD/DVDRom drives before I find a "SINGLE" reference to Philips themselves. Is this what Google has become? Maybe I should have put an 's' on driver.
Codifex Maxi
Re:A layman's view (Score:3, Informative)
Then I walk over and find it within 2 minutes.
People still don't really know how to use search engines. They don't use enough keywords or the right ones.
I won't use Yahoo for Search. I think they are hella shady with their privacy policies (they switched my preferences when "acquiring" new services from 3rd parties which I was a member of).
Their games and fantasy sports stuff is fun though. It's all about the value they give me
Re:A layman's view (Score:5, Interesting)
prozac suicide
Prozac prozac suicide. prozac nation nude Viagra prozac hair loss Paxil
prozac dogs Yasmin ssri prozac Propecia prozac ocd.
Prozac Suicide - Shopping and Discounts - PROZAC SUICIDE
Prozac Suicide Prozac Suicide. Are you looking for Prozac Suicide? We've searched
the internet for the best Prozac Suicide and we hope you enjoy what you find!
Prozac Suicide
Real Pharm - Lowest Prices & Fantastic Service - Prozac Suicide,
Suicide Prozac Suicide. Prozac(R) is a selective serotonin
Pattern Recognition (Score:5, Interesting)
Information is essentially the inverse of entropy. Entropy can be calculated, and you can use Bayes probability theory to get a hold on the information content of a given word within a set of words.
What is difficult to do, and what search engines are trying to do, is measure the mutual information inherent between the set of pages that the word appears in, and the word itself, then apply that to all the words in the searched-for phrase; this is commonly called 'context'. This is plainly impossible to do for every given phrase, for every word combination, for every page indexed. The best you can do is use a statistical approach (and Bayes is your friend again) to come up with "good" matches.
The problem with the statistical approach is the class unbiasing, since once you have wildly different statistical populations, your choice of context gets harder and harder - the "easy" standard models don't cope very well. You don't have the computational resources to do a good analysis, so you're essentially stuck between a rock and a hard place.
This is why the Google idea of strengthening the importance of a word depending on linked pages was such a good one - it "did" the hard work by relying on the entire planet to do it for them, by creating links. Of course, what one man can do, another can undo, and Google has got progressively worse over time. It's still by far the best though, and my search engine of choice. When I look at the queries coming from search sites, I get 100x as many from Google as from Yahoo (next nearest)....
People think searching is easy, and it is. What's really really hard is searching *well*.
Simon
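For readers who want the numbers behind "entropy can be calculated", here is a small Python sketch using the standard Shannon definitions: the information carried by a word with probability p is -log2(p) bits, and the entropy of a set of words is the average of that over the distribution. The function names are mine, and estimating probabilities from raw word counts is a simplifying assumption, not a description of what any search engine does.

```python
import math
from collections import Counter


def word_probabilities(words):
    """Estimate each word's probability from its relative frequency."""
    counts = Counter(words)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def self_information(p):
    """Information in bits carried by an event of probability p."""
    return -math.log2(p)


def entropy(probabilities):
    """Shannon entropy in bits of a word -> probability mapping."""
    return sum(p * self_information(p)
               for p in probabilities.values() if p > 0)


probs = word_probabilities("prozac prozac dosage facts".split())
for word, p in probs.items():
    print(word, p, self_information(p))
print("entropy:", entropy(probs))
```

Rare words carry more bits than common ones, which is the intuition behind weighting an unusual search term more heavily than a word that appears on every page. The mutual-information step described above is much harder, because it needs joint statistics over pages and words rather than a single word count.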
Re:Pattern Recognition (Score:5, Interesting)
Google doesn't only have to make sense of a great big mess.
It has to make sense of a great big mess where a significant part of the pages are made *specifically* to confuse Google, and where a part of those same pages gets tuned regularly in dedicated attempts to further confuse whichever algorithm Google uses.
Most of the cases where Google returns poor results these days, it's obvious to a human observer that the bad results on top are *purposely* made to confuse Google. I've even seen pages that return one set of content if your user-agent is "Googlebot", and another, totally different content (dialer, etc) if your user-agent is anything else.
Re:Pattern Recognition (Score:3, Informative)
This is probably Google's biggest problem. What they need to do is make a second pass at specific pages in a site which has recently been crawled with a more typical USER-AGENT to see if there are significant differences. They would have to hit every page. The second crawler could also check to see what is "visible" to the user
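The comparison step of that second pass could look something like the Python sketch below. It assumes the two copies of the page (one fetched as the bot, one fetched with an ordinary browser user-agent) are already in hand as HTML strings; the function names and the 0.5 threshold are arbitrary choices for illustration, and the tag stripping is a crude regex rather than a real HTML parser.

```python
import difflib
import re


def visible_words(html):
    """Crudely strip scripts, styles and tags, returning lowercase words."""
    text = re.sub(r"(?is)<(script|style).*?>.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", text)
    return text.lower().split()


def similarity(html_a, html_b):
    """Ratio between 0 and 1 of how much the two word sequences share."""
    matcher = difflib.SequenceMatcher(
        None, visible_words(html_a), visible_words(html_b))
    return matcher.ratio()


def looks_cloaked(bot_html, browser_html, threshold=0.5):
    """Flag the page when the two fetched versions differ too much."""
    return similarity(bot_html, browser_html) < threshold
```

This only catches pages that send different text to different user-agents. It knows nothing about CSS, so white-on-white text or anything else that is present in the markup but invisible on screen would sail through; that would need the rendering approach discussed elsewhere in the thread. Legitimately dynamic pages (rotating ads, timestamps) would also need a more forgiving threshold.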
Re:Pattern Recognition (Score:2, Insightful)
Statistically we should have some inf
Keyword density?! (Score:5, Interesting)
Such pages don't usually mindlessly repeat the keyword I'm searching for over and over again.
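For reference, "keyword density" is usually taken to mean how often the keyword occurs relative to the total number of words on the page. The thread doesn't give the report's exact formula, so the Python sketch below is an assumption along those lines, and it only handles single-word keywords.

```python
import re


def keyword_density(text, keyword):
    """Fraction of the words in text that equal keyword (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word == keyword.lower())
    return hits / len(words)


print(keyword_density("Prozac prozac suicide. prozac nation", "prozac"))
```

A page written for humans tends to score low on this, while a stuffed page scores high, which is why a ranking that rewards density alone is so easy to game.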
My little test.. (Score:4, Interesting)
Re:My little test.. (Score:2)
I hope more people start using Yahoo.
It's All Magic... (Score:5, Insightful)
Re:It's All Magic... (Score:4, Insightful)
What I miss is looking in the card catalogue under the general subject and being able to pull out all sorts of related material I hadn't thought of. Same for browsing the stacks. Grab the general Dewey number and go surf the titles.
Wetware fuzzy logic at its best.
Teoma vs Google (Score:3, Funny)
1) Slashdot
2) Slash's Snakepit
Put the same "slash" keyword and search with Teoma [teoma.com]:
1) Slash's Snakepit
2) Slashdot
Personally for this keyword search I feel Slash's Snakepit is more relevant and belongs at the top of the heap.
So that's what happened! (Score:5, Interesting)
I'll be watching this very closely. Inktomi (sp?) sucked, which is what this is based on. I think it's too early to tell right now if the results are any good. Along the same lines, it will probably take about 6 months for marketers to learn to effectively spam the results, which is something Google has historically been very good at keeping at bay.
This will be interesting to watch over the next few months.
-Pete
Re:So that's what happened! (Score:4, Informative)
That's interesting. I've noticed the reverse with mine. Slurp (Yahoo!'s bot) has been coming to my site almost hourly getting different pages for the past 2 weeks or so. I've also noticed a HUGE increase of referrers from search.yahoo.com. Usually all the referrers from search engines were from Google. Now, Yahoo! is much more frequent.
Once Yahoo changed over to Inktomi's search, I did several different searches for keywords or terms that I want to be listed for. Surprisingly, I am ranked much higher on Yahoo than Google right now for some things. I haven't changed anything in my code, it's just interesting to see how the different search engines interpret the same thing.
Warning: You are being watched! (Score:5, Interesting)
Re:Warning: You are being watched! (Score:2)
I agree that it is advisable to click on the direct link or, if clicking on the original link, at least to wear a tin foil hat [c2.com] to prevent falling under the mind control of the new world order.
Re: (Score:2, Informative)
Re:Warning: You are being watched! (Score:3, Insightful)
Clarification! (Score:5, Informative)
This is VERY sneaky (akin to putting an Amazon referral link in a book review).
Do NOT click on the link. If the submitter had actually bothered to use a logged in slashdot account, I would be more trusting.
[gorank.com]
Copy Link location, open new browser window, paste.
W3 compliance? (Score:4, Interesting)
Re:W3 compliance? (Score:3, Interesting)
Over-generalising here, it means you get a lot of professional sites rather than little Timmy's Frontpage creation, however, being a large corporation doesn't guarantee you a decently constructed site, and is no guarantee of it being W3C compliant.
But then, Google probably sees this as a possible 80:20 rule - with the majority of W3C compliant sites probably offering something useful to index
Re:W3 compliance? (Score:3, Insightful)
When it comes to web design issues, Google does not punish naive mistakes. If somebody's HTML is so weird that it must be an attempt at manipulation (like making an e
Missing the google point? (Score:5, Interesting)
Hence, it's an interesting read, and maybe you could draw your own preferences from what the weighting turns out to be in the listed cases, but it's not a very fair representation of how google works. *NB* I've no clue how Yahoo/Inktomi works, so I couldn't comment.
Sale sites. (Score:2, Insightful)
They are different (Score:4, Funny)
I guess Yahoo really doesn't love me after all.
Re:They are different (Score:2)
I did a search for myself in Yahoo and Google and I came up 9th on Yahoo and 19th on Google. Yahoo was more accurate for me, even if only by a little. Interestingly, though, the page that Yahoo found was a much more obscure page for a project I worked on 3 years ago. The same page that Google found at 19, Yahoo found at 20.
Re:They are different (Score:3, Interesting)
Also, it was interesting to see that I seem to be the only person on the Internet with my name. A search for my name in quotes, first and last, with either the long form or short form of my first
nr 4 loves you. (Score:2)
make up your mind...do you want 1 friend (yahoo) or 46?
The Problem with Search Algorithm Monocultures (Score:5, Insightful)
The first big challenge in search is in disambiguating what the searcher really wants without requiring a long string of inputs. A multi-algorithmic approach would let a search engine serve up hits gathered in multiple ways (e.g., hit #1 was top ranked using method 1, hit #2 was top ranked using method 2, etc.). The search company could then see which algorithm provides the best hits for a given search (i.e., by watching which hits the searcher clicks on).
The second big challenge is all the nasty spammers and SEOs (Search Engine Optimizers) who will try to use knowledge of any search algorithm to game the system and artificially raise their page rank for commercial purposes. This is probably one reason why Google cannot maintain dominance - any dominant search engine attracts the concerted efforts of SEOs, thus ruining its search quality, thus ruining its dominance.
Yet a multi-algorithmic search engine could create a moving target that frustrates SEOs. By rotating the algorithms and even using negative weights on some algorithm results, a multi-algorithmic search company could cause high-ranked pages to plummet in rank over time. One week, a heavily keyworded site (e.g., one listing every possible keyword in metadata) might be at the top of the list, the next week it is at the bottom of the list. This raises the cost to sites trying to game the system. (The search company might even reward or penalize sites that change structure too often, to either find the freshest sites or penalize the efforts of SEOs).
There never can be one right way to do search.
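A rough Python sketch of the interleaving idea, to show it isn't hard to prototype. Each algorithm hands in its own ordered list of hits, the lists are merged round-robin (top hit of method 1, top hit of method 2, and so on), and each merged hit remembers which algorithm supplied it so clicks can be credited. The algorithm names, URLs and function names are all invented; rotating or negatively weighting algorithms over time is left out.

```python
from collections import Counter
from itertools import zip_longest


def interleave(rankings):
    """Merge ranked lists round-robin, skipping duplicate hits.

    rankings maps an algorithm name to its ordered list of hits.
    Returns a list of (hit, algorithm_that_supplied_it) pairs.
    """
    merged = []
    seen = set()
    names = list(rankings)
    for row in zip_longest(*(rankings[name] for name in names)):
        for name, hit in zip(names, row):
            if hit is not None and hit not in seen:
                seen.add(hit)
                merged.append((hit, name))
    return merged


def record_click(clicks, merged, clicked_hit):
    """Credit the algorithm that supplied the clicked hit, if any."""
    for hit, name in merged:
        if hit == clicked_hit:
            clicks[name] += 1
            return name
    return None


results = interleave({
    "keyword": ["a.example", "b.example", "c.example"],
    "links": ["b.example", "d.example"],
})
clicks = Counter()
record_click(clicks, results, "d.example")
print(results)
print(clicks)
```

One catch with the click-counting part: a hit that two algorithms both rank highly is only credited to whichever one got it into the merged list first, so the tallies favour algorithms listed earlier unless the order is shuffled per query.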
Re:The Problem with Search Algorithm Monocultures (Score:3)
You sell widgets, so you want good placement for "widget store." Who's to say that your web site is or isn't the best place to buy widgets? Optimize it, so that when somebody searches for "widget store" they find you. The searcher is happy because he got his widget, and you're happy because you sold it to
Re:The Problem with Search Algorithm Monocultures (Score:3, Interesting)
It's clearly in search engine spammers' benefit to do so (much like email spammers).
It also clearly disadvantages users, since PageRank is a pretty good metric (outside of people trying to game the system) of usefulness.
You clearly have some interest in discussing SEO. The parent has some interest in discussing thwarting SEO. I'd say that the second subject has at least as much merit (as in, it benefits a large group of people a good deal), and is certainly equally interesting.
N
They are search engine spammers (Score:3, Insightful)
yeah trying to figure out how to get to the top of search engines by analysing keyword density so you can then construct copy text with fake entry pages or as the se.spammers call them "gateway" pages with 302 redirects via the useragent or constructing urls/with/the/keywords using ModRewrite
we know what they are up to, spamming search engines peddling shite with their referrer links
fuckers, these people are the reason 90% of search engines suck and who are rapidly poisoning google so in 5 years no-one can find shit without being taken for circlejerks and wading through shitty websites peddling porn, viagra and whatever shit is flavour of the month, if that's what the internet i see is gonna turn into then why the fuck do i bother
and we link em here at slashdot
i wouldnt give these people the time of day
A>S
All I know... (Score:2, Funny)
Is that I'm pissed off for suddenly losing my ranking a month ago. I used to be in almost every spot for the top 30 results for the keyword "QQQ", but now I am below 100. =(
Cocks. (Score:5, Interesting)
yeah, I know what you're thinking.
You typically get a couple things from this search:
Porn (duh)
Chicken related things
and the band "The Revolting Cocks"
By looking at which ones come up first, you can infer some interesting and useful things about how an engine works. What those things are I will let you decide.
Mostly because it's funnier.
But seriously, folks, try [google.com] it [yahoo.com] out [teoma.com].
more isnt always better (Score:3, Insightful)
Compare the Algorithms manually (Score:4, Informative)
Check out the algorithms yourself by comparing google and yahoo search results side by side [googleguy.de].
The search engines just need moderation (Score:4, Insightful)
My advice: work hard on content (Score:3, Interesting)
I have never played any games whatsoever to get there. What I do however is try very hard to place interesting and useful content on my site (mostly 'free web books').
I don't think that it matters so much what you do in life so long as you love doing it. I have been programming computers since the early 1960s, and I still love it!
-Mark
$25/hour? (Score:3, Interesting)
Yahoo uses more than keyword density (Score:5, Informative)
Re:Yahoo uses more than keyword density (Score:3, Informative)
I was reading an SEO discussion on a programming site earlier today and everyone was complaining about how buying keyword ads on Google didn't help their ranking for those keywords in the search results (of course not; it just buys you ad-space).
Missing Domain Name Data Points (Score:3, Interesting)
Search Engines definitely give rank to domains which contain your keyword in them. Tons of sites out there seem to have figured this out to make searches useless. There are tons of "keyword.useless-site.com" dictionary pages out there.
I would really like to see the search engines be able to figure out that certain pages make no sense. They read like something from the old SNL subliminal man skits. Or sites that bounce you somewhere else as soon as you arrive.
Look Out For Yahoo! Lawyers... (Score:3, Interesting)
Re:Yahoo? (Score:4, Informative)
So you are way out of touch, I'm afraid
Re:Yahoo? (Score:4, Informative)
Re:Yahoo? (Score:4, Informative)
Re:Yahoo? (Score:5, Interesting)
Personally, I find the differences in how the two engines handle bold text to be most interesting. If only for that, I'd stick to Google.
Most pages that have 17 occurrences of your search text in bold are only going to be porn sites (unrelated to your search) or spam sites (unrelated to your search).
Re:Yahoo? (Score:3, Informative)
I use Teoma [teoma.com] a lot these days, it's very much like Google was about 6 years ago. Fresh, relevant and speedy. Plus their twist on pagerank is a pretty sweet idea that's worth a look.
Re:Yahoo? (Score:2, Informative)