Can rev="canonical" Replace URL-Shortening Services?
Chris Shiflett writes "There's a new proposal ('URL shortening that doesn't hurt the Internet') floating around for using rev="canonical" to help put a stop to the URL-shortening madness. In order to avoid the great linkrot apocalypse, we can opt to specify short URLs for our own pages, so that compliant services (adoption is still low, because the idea is pretty fresh) will use our short URLs instead of TinyURL.com (or some other third-party alternative) replacements."
Re:"Great link apocalypse" WAT? (Score:2, Interesting)
About what I was thinking. It sounds like someone pissed their panties about not counting click origin and in some way not making money. If the batshit paranoiac morons can't put up a shortened URL to START with then they need to gag on their own spittle.
Re:sorry but I dont get... (Score:3, Interesting)
It's like typing in software serial numbers compared to cheat codes in old-school video games. The serial numbers are abstract, so the letters in them are simply letters, whereas a cheat code may spell part of some word; if someone frequently misspells that word (or the code is itself a misspelling of a word), it may be harder to enter.
URL shortened, of course (Score:5, Interesting)
On the Twitter /. feed, this of course shows as:
slashdot [twitter.com] Can rev="canonical" Replace URL-Shortening Services? http://tinyurl.com/c3j4n8 [tinyurl.com]
P.S. Now if you want a really short URL, try http://tinyarro.ws/ [tinyarro.ws] (no affiliation; just impressed by the idea)
It's a phone problem (Score:3, Interesting)
This is a phone-related problem. The basic problem is that URLs are being sent to devices that don't cut, paste, and bookmark. This is only an issue if you have to type the URL manually.
Maybe what's needed are smarter Twitter clients.
Re:I have an easier solution: (Score:5, Interesting)
How about Twitter just stops arbitrarily limiting characters. Go by word count, perhaps?
I know some avid Twitter users, and the majority of them apparently use the idiotic SMS message system to 'tweet' each other all throughout the day on their phones. Twitter can't abandon the 140-character limit for this reason.
For the record, I am against anything that keeps the SMS system relevant in this day and age. It should have been abandoned long ago in favor of standard data packets on the internet, rather than control packets on a proprietary wireless system. There's no good reason to keep this system alive when it either forces you to pay $X per month for it, or pay $.15 per 140 characters when one of your idiot friends 'texts' you. There's no way (that I know of) to force incoming SMS to route through GPRS, so you are hit with SMS fees even when you already pay for unlimited data. It also invites spam that you actually DO pay for, quite literally, and from which the wireless carrier profits as well. It should be illegal for the carrier to charge you for incoming SMS messages. Anyone who agrees with me should call their congressperson to protest this policy and call their wireless carrier to block all SMS messages.
Re:I have an easier solution: (Score:3, Interesting)
idontthinkthatwillworkverywell.
Re:"rel," not "rev" (Score:2, Interesting)
Re:but will they be cute? (Score:0, Interesting)
http://www.socuteurl.com/foo
Absolute foo...
Re:Alternative Solution: Implement it Right? (Score:3, Interesting)
URL mapping is the answer (Score:5, Interesting)
Unfortunately, it's not yet an integral part of web frameworks that I have seen. So I am adding it in a new web site I'm building. It means I have to add the feature to the web server.
It works like this. Every part of the web site code that builds URLs for the same site passes them first through the mapping logic. This basically builds an SHA1 checksum of the canonicalized URL string. Then it looks up the string in a fast database (I'll be using Berkeley DB for this). If it's already there, and is the same URL, it generates a new URL that references the checksum. If it was a different URL, it notifies me that it found an SHA1 collision. If not already there, it adds it. The original URL is thus replaced with the mapping URL.
Code added to the web server will be designed to detect checksum URLs. If it looks like one, it looks it up in the database to get the original URL, and proceeds with the request using that URL. Original URLs would still be processed as usual, in case they leak out, or are intentionally made to bypass the mapping for special purposes. Basically it's like a tiny URL service, but integrated without the need to do a redirect.
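For anyone who wants to poke at the idea, here is a minimal sketch of the mapping and lookup described above. It is not the poster's actual code: a plain dict stands in for Berkeley DB, the `/m/` prefix and the `UrlMapper` name are made up for illustration, and the URL is assumed to be canonicalized before it is passed in.

```python
import hashlib


class UrlMapper:
    """Maps site URLs to SHA-1 based URLs and back.

    Assumption: a dict stands in for the fast database (e.g. Berkeley DB).
    """

    PREFIX = "/m/"  # hypothetical marker for checksum URLs

    def __init__(self):
        self.store = {}  # hex digest -> original URL

    def shorten(self, url):
        """Return the mapping URL for an already-canonicalized URL."""
        key = hashlib.sha1(url.encode("utf-8")).hexdigest()
        existing = self.store.get(key)
        if existing is None:
            # Not seen before: add it.
            self.store[key] = url
        elif existing != url:
            # Same checksum, different URL: report the collision.
            raise RuntimeError("SHA-1 collision: %r vs %r" % (existing, url))
        return self.PREFIX + key

    def resolve(self, path):
        """Translate a checksum URL back; pass other URLs through unchanged."""
        if path.startswith(self.PREFIX):
            return self.store.get(path[len(self.PREFIX):], path)
        return path
```

The server-side hook would call `resolve()` on every incoming path, so original URLs keep working and no redirect is needed.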
One thing I am looking at doing is shortening even these URLs, even though they should be short enough already. But this raises the chance for a collision to the point I'll need to add logic to deal with it. How I would do that is similar to a hash data structure collision, but by expanding on the SHA1 checksum by adding back digits that were removed to shorten it.
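A rough sketch of the "add back digits" collision handling, under my own assumptions: `short_key`, `min_len`, and the dict-based store are hypothetical names, not anything from the post. The key starts as a short prefix of the SHA-1 hex digest and grows one digit at a time until it is either free or already points at the same URL.

```python
import hashlib


def short_key(url, store, min_len=6):
    """Return a shortened SHA-1 key for url, recording it in store.

    store maps key -> URL (a dict here; assumed to be a database in practice).
    On a prefix collision with a different URL, digits that were removed
    are added back until the key is unique.
    """
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    for length in range(min_len, len(digest) + 1):
        key = digest[:length]
        existing = store.get(key)
        if existing is None:
            store[key] = url  # free slot: claim it
            return key
        if existing == url:
            return key  # already mapped to this URL
        # Otherwise another URL owns this prefix; try one more digit.
    raise RuntimeError("full SHA-1 collision for %r" % url)
```

Because a longer key always contains the shorter prefix, asking for the same URL twice walks the same path and lands on the same key.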
External URLs to other sites can be done the same way, though this does add the extra redirection. I could limit the use of this to long external links only; this being a web interface, it should handle long external links OK anyway. It could be an option.
Re:Alternative Solution: Implement it Right? (Score:3, Interesting)
I've actually been thinking about switching to longer URLs for my own blog. I'm currently using numerical filenames, because it seemed simpler at the time, but the number is basically meaningless to any human looking at the URL. Links within my site always have title tags, but every once in a while I'll send somebody the URL to one of my blog entries, and it would be nice to see at a glance which entry it is (in case you've read it already).
To hell with Twitter. :-P
A Few Responses (Score:5, Interesting)
A couple of good questions I have seen, and my best attempt to answer them:
1. Don't you mean rel? No, I mean rev. It indicates a reverse link.
2. Why not make your URLs short in the first place? I happen to like my URLs and have made them as short as I want them. They're only too long in some very specific use cases, like Twitter. I could just complain about Twitter, or I could support an idea that makes URL shortening suck less. I chose the latter.
Thanks for reading, and please do feel free to criticize whatever you think is wrong with this idea. I'd like a way to indicate a preferred short URL for my own stuff, and this seems like a pretty good way to do it that makes sense semantically and is easy to implement. For an ongoing discussion about adding an HTTP header to do the same thing (so that only a HEAD request is required), read here:
http://shiflett.org/blog/2009/apr/a-rev-canonical-http-header [shiflett.org]
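To show how little a compliant service would need to do, here is a hedged sketch of picking the preferred short URL out of a page's markup. It covers only the link-element form of the idea, not the HTTP header under discussion; the `RevCanonicalFinder` and `find_short_url` names and the example.com URLs are made up for illustration.

```python
from html.parser import HTMLParser


class RevCanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rev="canonical"> it sees."""

    def __init__(self):
        super().__init__()
        self.short_url = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.short_url is not None:
            return
        attr = dict(attrs)
        # rev may hold several space-separated link types.
        rev_tokens = (attr.get("rev") or "").lower().split()
        if "canonical" in rev_tokens and attr.get("href"):
            self.short_url = attr["href"]


def find_short_url(html):
    """Return the page's preferred short URL, or None if it declares none."""
    finder = RevCanonicalFinder()
    finder.feed(html)
    return finder.short_url
```

A service would fall back to its own shortener only when `find_short_url()` returns None. Note that a page using rel="canonical" is saying something different and is deliberately ignored here.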
Re:Alternative Solution: Implement it Right? (Score:3, Interesting)
The biggest problem with long URLs would be in e-mails, as they usually get word-wrapped. So clicking on one, or cutting and pasting it, may not get the full URL into the default browser.
Ever try cutting and pasting this LONG URL from e-mail to the browser if you're using a small monitor, e.g. a laptop?
http://maps.live.com/default.aspx?v=2&FORM=LMLTCP&cp=37.827041~-122.422875&style=h&lvl=18&tilt=-90&dir=0&alt=-1000&phx=0&phy=0&phscl=1&encType=1 [live.com]
Re:Alternative Solution: Implement it Right? (Score:2, Interesting)
By "some search engines" I hope you mean "the search engines the overwhelmingly vast majority of people use".
As soon as Google / Yahoo stop bumping up pages if the user's search keywords match keywords in the URL, sure, people might stop using the long URLs; until then, though, expect everyone to keep at it to try to stay on top of the search result pile.
Re:a better idea (Score:3, Interesting)
how about we just kill all twitter users instead?
Funny? No, you deserve +5 Interesting at least.
My wife signed up for that crap and at age 37 I've got to cope with her phone going off multiple times during Easter dinner and her sharing with my family that Kevin Smith (of Clerks fame) can't decide if he should dry-hump his wife's leg or just rub one out because it's 3am, she's asleep, and he's stoned and horny.
Re:"rel," not "rev" (Score:2, Interesting)
No, rev was in previous versions of HTML, but was apparently dropped in HTML 5, probably because people didn't understand the difference between rev and rel.
rel="canonical" and rev="canonical" are different things
Re:Alternative Solution: Implement it Right? (Score:3, Interesting)
I've seen a solution in a few places that I think deserves to be picked up more widely. You've pointed out the two main styles, which are http://example.com/123 and http://example.com/super-long-title. The best solution seems to be a compromise between the two: the first link works, AND it ignores anything after the ID. You could give someone a link to any of the following:
http://example.com/123/super-long-title
http://example.com/123/long-title
http://example.com/123/title
http://example.com/123
http://example.com/123/puppies
And they'd all redirect to http://example.com/123/super-long-title. Everybody wins.
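A small sketch of that routing rule, with assumptions spelled out: the `POSTS` table, the `route` function, and the (status, location) return shape are all hypothetical, and a real site would wire this into whatever framework it uses.

```python
import re

# Hypothetical lookup table: post ID -> slug.
POSTS = {123: "super-long-title"}


def route(path):
    """Return (status, canonical_path) for a request path.

    Only the leading numeric ID matters; anything after it is ignored
    and the client is redirected to the canonical ID-plus-slug URL.
    """
    match = re.match(r"^/(\d+)(?:/.*)?$", path)
    if not match:
        return (404, None)
    post_id = int(match.group(1))
    slug = POSTS.get(post_id)
    if slug is None:
        return (404, None)
    canonical = "/%d/%s" % (post_id, slug)
    if path == canonical:
        return (200, canonical)  # already canonical: serve the page
    return (301, canonical)  # short, stale, or wrong slug: redirect
```

So `/123`, `/123/title`, and `/123/puppies` all come back as a 301 to `/123/super-long-title`, and only that last URL is served directly.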