Alexa Web Search Platform Released
Philipp Lenssen writes "Amazon's Alexa is releasing their search index (the same that powers the Wayback Machine) to developers via their new Alexa Web Search Platform. The Alexa framework is not for the weak of heart -- expect to learn how to use their C API, and expect to pay micro-amounts for requests and CPU cycles used -- but it also seems to be more powerful than the rival APIs from Yahoo and Google."
Pay? (Score:3, Funny)
Re:Pay? (Score:5, Informative)
Re:Pay? (Score:2, Offtopic)
Re:Pay? (Score:4, Funny)
i'll give you a friendly piece of advice.
You are under no circumstances to read TFA before making at least one post. It's fine to read it, but you must make at least one wildass guess, and pretend to know what it's talking about.
Second, even if TFA did answer your question, you should again, under no circumstances be apologetic.
Finally, welcome to /.
Re:Pay? (Score:2)
How embarrassing.
Re:Pay? (Score:1)
(it's like when you're good friends with a guy named Fleischer. it's OK to call him Fleisch)
Re:Pay? (Score:1)
Re:Pay? (Score:2)
Alexa? Nope. (Score:4, Informative)
Google's APIs [google.com] are better.
It's lookware. (Score:2)
What is the definition of spyware (Score:4, Insightful)
Re:What is the definition of spyware (Score:3, Informative)
- Tricks users into installing its software, or installs itself without permission
- Actively tries to stop users from uninstalling it, forcing people to use a third-party app to remove it (Ad-Aware, etc.)
- Tracks users
The first two make it scumware, the last makes it spyware. Google toolbar does track users, but warns them before doing so and only installs when users want it installed.
Re:What is the definition of spyware (Score:3, Insightful)
Re:What is the definition of spyware (Score:2)
Re:Alexa? Nope. (Score:1)
Oh, don't be so harsh, we are shuddering.
"Micro-amount" (Score:1)
Re:"Micro-amount" (Score:2)
Re:"Micro-amount" (Score:1)
There is something about that word that just bothers me... maybe it is all the porn sites out there who advertise their "micro" monthly cost.
Yeah, I hate those sites.
I mean, based on what I've heard.
Price (Score:4, Informative)
$1 per CPU hour ($.50 for unused hours)
$1 per GB/year
$1 per 50GB processed
$1 per GB downloaded
and $1 for every 4000 user requests.
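To get a feel for how those rates add up, here is a rough back-of-the-envelope sketch. The rates are the ones quoted above; the dictionary keys, the function name and the sample workload are made up for illustration, and reading "$.50 for unused hours" as "reserved but unused CPU hours" is an assumption.

```python
# Rates in USD as quoted in the comment above. The key names are illustrative only.
RATES_USD = {
    "cpu_hours": 1.00,            # $1 per CPU hour used
    "unused_cpu_hours": 0.50,     # assumed to mean reserved-but-unused hours
    "storage_gb_years": 1.00,     # $1 per GB stored per year
    "processed_gb": 1.00 / 50,    # $1 per 50 GB processed
    "downloaded_gb": 1.00,        # $1 per GB downloaded
    "user_requests": 1.00 / 4000, # $1 per 4000 user requests
}


def estimate_cost(usage):
    """Return the estimated bill in USD for a dict of {rate name: quantity}."""
    unknown = set(usage) - set(RATES_USD)
    if unknown:
        raise KeyError(f"no quoted rate for: {sorted(unknown)}")
    return sum(RATES_USD[name] * quantity for name, quantity in usage.items())


# Hypothetical job: 10 CPU hours, 500 GB processed, 2 GB downloaded, 40,000 requests.
example_usage = {
    "cpu_hours": 10,
    "processed_gb": 500,
    "downloaded_gb": 2,
    "user_requests": 40_000,
}
print(f"estimated cost: ${estimate_cost(example_usage):.2f}")
```

For that made-up workload the parts are $10 of CPU, $10 of processing, $2 of download and $10 of requests.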
This is just for search service, right?
And how do these prices relate to similar services?
Re:Price (Score:4, Informative)
Google & Yahoo API's are Results not Data (Score:2)
Not strictly true, you have to do *ranking* on your own. Reading the docs it does let you reduce the document set, just not rank the finished result set. So you can filter the result set down to the matching documents, but which is the most important? Your algo decides.
Google and Yahoo give you finished results, but only ones ranked by their own algorithms, and then only the first 1000 results. Even then it's only 5000 queries max for Yahoo and 1000 max for Google.
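To make the "your algo decides" part concrete, here is a toy sketch of ranking a set of documents that has already been filtered down to matches. It has nothing to do with Alexa's actual API: the function name, the dict-of-strings input and the term-count scoring are all assumptions chosen to keep the example small.

```python
from collections import Counter


def rank_by_term_count(documents, query):
    """Order document ids by how often the query terms occur in each text.

    `documents` maps a document id to its plain text and is assumed to be
    the already-filtered match set. Ties are broken by document id.
    """
    terms = query.lower().split()
    scored = []
    for doc_id, text in documents.items():
        counts = Counter(text.lower().split())
        score = sum(counts[term] for term in terms)
        scored.append((score, doc_id))
    # Highest score first, then alphabetical id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


# Hypothetical match set for the query "alexa search".
matches = {
    "page-b": "alexa toolbar",
    "page-a": "alexa search platform with alexa search index",
}
print(rank_by_term_count(matches, "alexa search"))
```

A real ranking function would weigh links, freshness and so on; the point is only that this step is yours to write.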
Re:Price (Score:2)
It does provide searching:
Alexa Web Platform User Guide > Search > Criteria > Overview [alexa.com]
Also, the crawl data includes pictures and movies, and the search engine metadata provides this useful field:
CRITERIA,SEARCH FIELD = Adult content, Porn
Coul
Re:Price (Score:2)
Re:Price (Score:2)
Re:Price (Score:2)
Re:Price (Score:1)
Does anyone have any reasons that Alexa's API is better than, say, Google's?
Re:Price (Score:2)
Spyware? (Score:2)
Re:Spyware? (Score:4, Interesting)
If you mean collecting data, then yes Alexa does it.
If you mean collecting personal data, I don't think the toolbar does it.
Then what about Google? With AdSense running (almost) everywhere + your unique eternal Google ID, they surely collect a lot of data too. And with Google Analytics, they also have a lot of info.
So the question becomes: Is Google AdSense spyware?
Re:Spyware? (Score:5, Insightful)
Nothing they try to hide deep down in some obscure EULA or anything. Sure, it's about collecting data, but there's a difference between collecting data, and collecting data by spying. The former is about doing it visibly, the other trying to hide it.
Besides, technically speaking, I'm not sure one should call a business model or an online service "spyware" anyway, as it's usually a term used for client-side software often piggybacking on another tool, that secretly phones home by using an internet connection.
Re:Spyware? (Score:2)
Re:Spyware? (Score:2)
To summarize: they can do whatever they want. But you're right: it is not hidden.
there's a difference between collecting data, and collecting data by spying. The former is about doing it visibly, the other trying to hide it.
Is displaying an ad something visible? You know they record every click (since they wi
Re:Spyware? (Score:2)
God yes. However, slashdot loves google, so you will hear people explaining why spyware's actually a good thing in this case.
Re:Spyware? (Score:1)
Search history is great - I can see what I was searching for a month ago and vaguely remember what I was doing that day, what I was thinking about etc.
C not required (kinda) (Score:5, Informative)
The Data Retrieval API is written in C, so it may be natural for users to develop C applications against this API. However, the Platform features a utility named awsp_cat. This utility reads CIDs from stdin and writes the raw content to stdout. Users may develop applications in arbitrary programming languages to process the awsp_cat output.
Perl developers would be able to wrap this into their existing codebase in no time, assuming they want to pay the fees.
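The same trick works from any language that can spawn a process. A minimal sketch in Python follows; the only things taken from the quoted docs are the utility name awsp_cat and the fact that it reads CIDs from stdin and writes raw content to stdout. The one-CID-per-line input format, the absence of command-line flags and the function name are assumptions.

```python
import subprocess


def fetch_raw(cids, command=("awsp_cat",)):
    """Pipe CIDs to `command` on stdin and return whatever it writes to stdout.

    Assumes the tool accepts one CID per line and needs no arguments;
    neither detail is confirmed by the quoted documentation.
    """
    payload = "".join(f"{cid}\n" for cid in cids).encode("ascii")
    result = subprocess.run(
        list(command),
        input=payload,           # sent to the child's stdin
        stdout=subprocess.PIPE,  # raw bytes, since crawled content may be binary
        check=True,              # raise CalledProcessError on a non-zero exit
    )
    return result.stdout


# On the Alexa cluster this might be called as:
#     raw = fetch_raw(["some-cid", "another-cid"])
# The CID values here are placeholders, not real identifiers.
```

Passing a different command (for instance a plain echo program) makes it possible to try the wrapper without an account.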
Amazon as a serious concurrent to Google? (Score:1, Funny)
Re:Amazon as a serious concurrent to Google? (Score:1)
That certainly would give the Google Books project a different twist... It might go away entirely, given that Google would be in essence competing with itself then, giving users a reason not to buy books.
Not that any of this would happen, of course.
Re:Amazon as a serious concurrent to Google? (Score:1)
Re:Black Gold, Texas Tea. (Score:2)
Data Value (Score:4, Informative)
What's your opinion about Alexa ranks? Reliable? IMHO, there are too few users of the Alexa toolbar. It is also quite biased (IE, Windows). So except maybe for the top 30,000 websites, I'm not sure about the reliability of the stats.
Re:Data Value (Score:2)
Anyway the question remains: how good is the evaluation function of the search engine?
Re:Data Value (Score:2)
No, only the bias applies. For the top 30,000 websites, I think the daily sample is big enough to have at least a bit of meaning.
Do you really believe that MSN is more popular than Google?
Among the people that use the Alexa toolbar? yes. But of course, Alexa users are not representative of the internet population.
Shell access? Arbitrary C code? (Score:5, Interesting)
That seems a little dangerous, doesn't it?
Who is responsible for a security breach? (Score:5, Informative)
Man, I would hate to see who or what is held responsible.
Google API vs Alexa API (Score:5, Insightful)
Someone can download billions of pages for several thousand dollars, then use them to build their own search engine. Another use could be to mine the web for content such as email addresses (which would be bad). Alexa's announcement is a big shift and was bound to happen. Instead of getting crumbs from Yahoo & Google, they're giving up huge chunks of juicy data.
Our favorite flawed rankings (Score:4, Insightful)
Re:Our favorite flawed rankings (Score:2)
Their rankings aren't flawed. They just don't represent what you want them to.
It's a raw count of GETs/POSTs, which includes pop-up advertising and such. It's not a ranking based on 'popularity'.
More than just an index (Score:5, Insightful)
It seems some people (especially the author of the cited article) missed some very important points:
1. You have access to more than just the index - you have access to the crawled data, which is about 300 terabytes. So, if you want to do something with the pages, you don't have to download them, and you don't have to rely on them still being there - you can use the crawled data to do whatever you want.
2. The processing does not take place on your machine, but on the provided infrastructure. There is a web interface, so you can administer your account, your jobs etc. You do not download any software from Alexa. You get an account on their Linux cluster and there you can compile and run your own arbitrary applications. You are able to provide these results in the form of Amazon Web Services.
So, this is much more than Google, MSN or Yahoo offer, it's hard even to compare those services. Alexa is a completely different beast, and it's a huge beast.
Alexa's index does not drive the Internet Archive (Score:2)