
Can rev="canonical" Replace URL-Shortening Services? 354

Posted by Soulskill
from the snds-lk-a-pln dept.
Chris Shiflett writes "There's a new proposal ('URL shortening that doesn't hurt the Internet') floating around for using rev="canonical" to help put a stop to the URL-shortening madness. In order to avoid the great linkrot apocalypse, we can opt to specify short URLs for our own pages, so that compliant services (adoption is still low, because the idea is pretty fresh) will use our short URLs instead of TinyURL.com (or some other third-party alternative) replacements."

  • by mrmeval (662166) <mrmeval AT gmail DOT com> on Sunday April 12, 2009 @03:46PM (#27550163) Journal

    About what I was thinking. It sounds like someone pissed their panties about not counting click origin and in some way not making money. If the batshit paranoiac morons can't put up a shortened URL to START with then they need to gag on their own spittle.

  • by Darkness404 (1287218) on Sunday April 12, 2009 @03:56PM (#27550225)
    For anything that isn't electronic, a shortened URL leads to fewer typing mistakes. For example, example.com/typeskjd583 is going to be typed more accurately than somesite.org/wiki/index/cool_tips/code/perl/hello_world.php . A lot of people, when they see a site in print, mentally change it around, so somesite.org/wiki/index/cool_tips/code/perl/hello_world.php might become somesite.com/wiki/index/cool_tips/code/perl/hello_world.php . The shortened URL protects against this because people aren't trying to convert it to words and then type it. For example, something written as "Gray" may be mentally changed by someone to "Grey", because when they say the word "Gray" in their heads they see it written as "Grey".

    It's like typing in software serial numbers compared to cheat codes in old-school video games. The serial numbers are abstract, so the letters in them are simply letters, whereas a cheat code may spell part of some word; if someone frequently misspells that word (or the code is itself a misspelling of a word), it may be harder to enter.
  • by Toe, The (545098) on Sunday April 12, 2009 @04:04PM (#27550291)

    On the Twitter /. feed, this of course shows as:
    slashdot [twitter.com] Can rev="canonical" Replace URL-Shortening Services? http://tinyurl.com/c3j4n8 [tinyurl.com]

    P.S. Now if you want a really short URL, try http://tinyarro.ws/ [tinyarro.ws] (no affiliation; just impressed by the idea)

  • It's a phone problem (Score:3, Interesting)

    by Animats (122034) on Sunday April 12, 2009 @04:14PM (#27550367) Homepage

    This is a phone-related problem. The basic problem is that URLs are being sent to devices that don't cut, paste, and bookmark. This is only an issue if you have to type the URL manually.

    Maybe what's needed are smarter Twitter clients.

  • by Christophotron (812632) on Sunday April 12, 2009 @04:15PM (#27550373)

    How about Twitter just stops arbitrarily limiting characters. Go by word count, perhaps?

    I know some avid twitter users, and the majority of them apparently use the idiotic SMS message system to 'tweet' each other all throughout the day on their phones. Twitter can't abandon the 140-character limit for this reason.

    For the record, I am against anything that keeps the SMS system relevant in this day and age. It should have been abandoned long ago in favor of standard data packets on the internet, rather than control packets on a proprietary wireless system. There's no good reason to keep this system alive when it either forces you to pay $X per month for it, or pay $.15 per 140 characters when one of your idiot friends 'texts' you. There's no way (that I know of) to force incoming SMS to route through GPRS, so you are hit with SMS fees even when you already pay for unlimited data. It also invites spam that you actually DO pay for, quite literally, and from which the wireless carrier profits as well. It should be illegal for the carrier to charge you for incoming SMS messages. Anyone who agrees with me should call their congressperson to protest this policy and call their wireless carrier to block all SMS messages.

  • by khendron (225184) on Sunday April 12, 2009 @04:18PM (#27550391) Homepage

    idontthinkthatwillworkverywell.

  • Re:"rel," not "rev" (Score:2, Interesting)

    by Opyros (1153335) on Sunday April 12, 2009 @04:27PM (#27550423) Journal
    Direct link [appspot.com] to the revcanonical website. It really is "rev" rather than "rel"; evidently this attribute is an HTML 5 proposal which hasn't been accepted, or so it says at http://benramsey.com/archives/a-revcanonical-rebuttal/ [benramsey.com]
  • by Anonymous Coward on Sunday April 12, 2009 @04:43PM (#27550523)

    http://www.socuteurl.com/foo

    Absolute foo...

  • by noidentity (188756) on Sunday April 12, 2009 @04:43PM (#27550527)
    First off, why do long URLs even matter? Is this link [shiflett.org] too long? Ahhh, you don't even care, because it's a normal link! But let's say the length is a problem. On the linked page, the author suggests that he could have his site also provide an alternate shorter URL for the same page, and have the HTML href tag encode both the long and short versions. Here's what I don't grasp: why not just use the short URL to begin with, and never even post the long one?!? No new HTML features are needed.
  • by Skapare (16644) on Sunday April 12, 2009 @04:47PM (#27550545) Homepage

    Unfortunately, it's not yet an integral part of web frameworks that I have seen. So I am adding it in a new web site I'm building. It means I have to add the feature to the web server.

    It works like this. Every part of the web site code that builds URLs for the same site passes them first through the mapping logic. This basically builds an SHA1 checksum of the canonicalized URL string. Then it looks up the string in a fast database (I'll be using Berkeley DB for this). If it's already there, and is the same URL, it generates a new URL that references the checksum. If it was a different URL, it notifies me that it found an SHA1 collision. If not already there, it adds it. The original URL is thus replaced with the mapping URL.

    Code added to the web server will be designed to detect checksum URLs. If it looks like one, it looks it up in the database to get the original URL, and proceeds with the request using that URL. Original URLs would still be processed as usual, in case they leak out, or are intentionally made to bypass the mapping for special purposes. Basically it's like a tiny URL service, but integrated without the need to do a redirect.

    One thing I am looking at doing is shortening even these URLs, even though they should be short enough already. But this raises the chance for a collision to the point I'll need to add logic to deal with it. How I would do that is similar to a hash data structure collision, but by expanding on the SHA1 checksum by adding back digits that were removed to shorten it.

    External URLs to other sites can be done the same way. This does add the extra redirection. I could limit the use of this only to long external links, since this being a web interface, should handle long external links OK. It could be an option.
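
    A minimal sketch of the mapping logic described above, in Python. It is an illustration under stated assumptions, not the poster's implementation: a pair of in-memory dicts stands in for the Berkeley DB store, the URL is assumed to be canonicalized by the caller, and the class name `ShortUrlMap` and the `min_length` parameter are hypothetical. A collision on a shortened checksum is handled the way the comment suggests, by adding back digits of the SHA-1 digest until the key is unique.

```python
import hashlib


class ShortUrlMap:
    """In-memory stand-in for the fast key/value database (assumption)."""

    def __init__(self, min_length=8):
        self.min_length = min_length  # shortest key we try first (hypothetical default)
        self.key_to_url = {}          # short key -> original URL
        self.url_to_key = {}          # original URL -> short key

    def shorten(self, url):
        """Return a key for url, lengthening the key whenever a prefix collides."""
        if url in self.url_to_key:
            return self.url_to_key[url]
        digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
        # Try the shortest prefix first, then add back digits on a collision.
        for length in range(self.min_length, len(digest) + 1):
            key = digest[:length]
            if key not in self.key_to_url:
                self.key_to_url[key] = url
                self.url_to_key[url] = key
                return key
        # Every prefix, including the full digest, belongs to another URL.
        raise RuntimeError("full SHA-1 collision for %r" % url)

    def resolve(self, key):
        """Return the original URL for a key, or None if the key is unknown."""
        return self.key_to_url.get(key)
```

    The web server side would call `resolve()` on anything that looks like a checksum and fall through to normal handling when it returns None, so original URLs keep working.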

  • by Phroggy (441) <slashdot3.phroggy@com> on Sunday April 12, 2009 @04:52PM (#27550569) Homepage

    I've actually been thinking about switching to longer URLs for my own blog. I'm currently using numerical filenames, because it seemed simpler at the time, but the number is basically meaningless to any human looking at the URL. Links within my site always have title tags, but every once in a while I'll send somebody the URL to one of my blog entries, and it would be nice to see at a glance which entry it is (in case you've read it already).

    To hell with Twitter. :-P

  • A Few Responses (Score:5, Interesting)

    by shiflett (151538) on Sunday April 12, 2009 @04:56PM (#27550595) Homepage

    A couple of good questions I have seen, and my best attempt to answer them:

    1. Don't you mean rel? No, I mean rev. It indicates a reverse link.

    2. Why not make your URLs short in the first place? I happen to like my URLs and have made them as short as I want them. They're only too long in some very specific use cases, like Twitter. I could just complain about Twitter, or I could support an idea that makes URL shortening suck less. I chose the latter.

    Thanks for reading, and please do feel free to criticize whatever you think is wrong with this idea. I'd like a way to indicate a preferred short URL for my own stuff, and this seems like a pretty good way to do it that makes sense semantically and is easy to implement. For an ongoing discussion about adding an HTTP header to do the same thing (so that only a HEAD request is required), read here:

    http://shiflett.org/blog/2009/apr/a-rev-canonical-http-header [shiflett.org]
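
    For anyone wondering what a compliant service would have to do with the HTML version of this, here is a rough Python sketch using only the standard library. It assumes the page advertises its short URL with a `<link rev="canonical" href="...">` element; the names `RevCanonicalParser` and `find_short_url` are made up for illustration. It deliberately ignores rel="canonical", since that is a different relationship.

```python
from html.parser import HTMLParser


class RevCanonicalParser(HTMLParser):
    """Records the href of the first <link> whose rev attribute lists "canonical"."""

    def __init__(self):
        super().__init__()
        self.short_url = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.short_url is not None:
            return
        attributes = dict(attrs)
        # rev may hold several space-separated link types; a missing value is None.
        rev_values = (attributes.get("rev") or "").lower().split()
        if "canonical" in rev_values and attributes.get("href"):
            self.short_url = attributes["href"]


def find_short_url(html_text):
    """Return the advertised short URL, or None if the page does not offer one."""
    parser = RevCanonicalParser()
    parser.feed(html_text)
    return parser.short_url
```

    A service that gets None back would fall back to whatever third-party shortener it already uses.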

  • by Darkk (1296127) on Sunday April 12, 2009 @05:01PM (#27550609)

    The biggest problem with long URLs would be in e-mails, as they usually get word wrapped. So when someone clicks on one, the full URL may not make it into the default browser, and cutting and pasting it isn't much better.

    Ever try cutting and pasting this LONG URL from an e-mail into the browser if you're using a small monitor, i.e. a laptop?

    http://maps.live.com/default.aspx?v=2&FORM=LMLTCP&cp=37.827041~-122.422875&style=h&lvl=18&tilt=-90&dir=0&alt=-1000&phx=0&phy=0&phscl=1&encType=1 [live.com]

  • by theTerribleRobbo (661592) on Sunday April 12, 2009 @09:13PM (#27552119) Homepage

    By "some search engines" I hope you mean "the search engines the overwhelmingly vast majority of people use".

    As soon as Google / Yahoo stop bumping up pages if the user's search keywords match keywords in the URL, sure, people might stop using the long URLs; until then, though, expect everyone to keep at it to try to stay on top of the search result pile.

  • Re:a better idea (Score:3, Interesting)

    by PsychoSlashDot (207849) on Sunday April 12, 2009 @09:22PM (#27552161)

    how about we just kill all twitter users instead?

    Funny? No, you deserve +5 Interesting at least.

    My wife signed up for that crap and at age 37 I've got to cope with her phone going off multiple times during Easter dinner and her sharing with my family that Kevin Smith (of Clerks fame) can't decide if he should dry-hump his wife's leg or just rub one out because it's 3am, she's asleep, and he's stoned and horny.

  • Re:"rel," not "rev" (Score:2, Interesting)

    by uhoreg (583723) on Sunday April 12, 2009 @11:20PM (#27552847) Homepage

    No, rev was in previous versions of HTML, but was apparently dropped in HTML 5, probably because people didn't understand the difference between rev and rel.

    rel="canonical" and rev="canonical" are different things

  • by shannara256 (262093) on Monday April 13, 2009 @06:02AM (#27554335) Homepage

    I've seen a solution in a few places that I think deserves to be picked up more widely. You've pointed out the two main styles, which are http://example.com/123 and http://example.com/super-long-title. The best solution seems to be a compromise between the two: the first link works, AND it ignores anything after the ID. You could give someone a link to any of the following:
    http://example.com/123/super-long-title
    http://example.com/123/long-title
    http://example.com/123/title
    http://example.com/123
    http://example.com/123/puppies
    And they'd all redirect to http://example.com/123/super-long-title. Everybody wins.
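
    The routing rule above can be sketched in a few lines of Python. This is only an illustration: the `ARTICLES` table, the `resolve` function, and the choice of a 301 redirect are assumptions, not taken from any particular site. Only the numeric ID is used for the lookup; whatever follows it is ignored and replaced by the canonical title.

```python
import re

# Hypothetical article table: numeric ID -> canonical title slug.
ARTICLES = {123: "super-long-title"}

# A leading numeric ID, optionally followed by a slash and anything at all.
PATH_PATTERN = re.compile(r"^/(\d+)(?:/.*)?$")


def resolve(path):
    """Return (status, location) for a request path."""
    match = PATH_PATTERN.match(path)
    if match is None:
        return 404, None
    article_id = int(match.group(1))
    slug = ARTICLES.get(article_id)
    if slug is None:
        return 404, None
    canonical = "/%d/%s" % (article_id, slug)
    if path == canonical:
        return 200, canonical  # already the canonical form, serve the page
    return 301, canonical      # any other suffix redirects to the canonical URL
```

    With this rule the short form /123 is what you hand to Twitter, and the long form is what people see in the address bar.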
