
Can rev="canonical" Replace URL-Shortening Services?

Chris Shiflett writes "There's a new proposal ('URL shortening that doesn't hurt the Internet') floating around for using rev="canonical" to help put a stop to the URL-shortening madness. In order to avoid the great linkrot apocalypse, we can opt to specify short URLs for our own pages, so that compliant services (adoption is still low, because the idea is pretty fresh) will use our short URLs instead of TinyURL.com (or some other third-party alternative) replacements."
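For readers who haven't seen the proposal, here is a rough sketch of what a "compliant service" might do: fetch the page, look for a link element carrying rev="canonical", and use its href as the short URL. This is only an illustration under assumptions. The sample markup, the example.org short URL, and the names RevCanonicalFinder and find_short_url are all made up here and are not taken from the proposal.

```python
from html.parser import HTMLParser


class RevCanonicalFinder(HTMLParser):
    """Collects href values of <link> elements whose rev attribute includes "canonical"."""

    def __init__(self):
        super().__init__()
        self.short_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rev, like rel, holds a space-separated list of link types.
        rev_values = (attr_map.get("rev") or "").lower().split()
        href = attr_map.get("href")
        if "canonical" in rev_values and href:
            self.short_urls.append(href)


def find_short_url(html_text):
    """Return the first publisher-declared short URL, or None if the page declares none."""
    finder = RevCanonicalFinder()
    finder.feed(html_text)
    return finder.short_urls[0] if finder.short_urls else None


# Hypothetical page markup; the short URL is invented for illustration.
SAMPLE_PAGE = """
<html><head>
<title>Example post</title>
<link rev="canonical" href="http://example.org/s/abc">
</head><body>...</body></html>
"""

if __name__ == "__main__":
    print(find_short_url(SAMPLE_PAGE))
```

A service that finds nothing would presumably fall back to whatever third-party shortener it already uses.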
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward

    I read the first link, sounds like complete and total batshit paranoia. I can't be alone in this opinion. Really, tinyurl has been around the entire 11+ years I've been on the internet, and somehow the internet's survived just fine.

    tag:slownewsday anyone?

    • Re: (Score:3, Funny)

      by whopub ( 1100981 )
      Please, more comments, or I'll be forced to read the actual article. I don't want to be kicked off slashdot for RtFA...
      • Re: (Score:2, Informative)

        by ultrabot ( 200914 )

        Please, more comments, or I'll be forced to read the actual article. I don't want to be kicked off slashdot for RtFA...

        Try to avoid reading the article, because it's pretty nonsensical. It may be the beer I was drinking, but I didn't really get what they are talking about.

        • by Feyr ( 449684 ) on Sunday April 12, 2009 @04:12PM (#27550351) Journal

          short summary: everyone should adopt this NewTechnology(tm) because it will make twitter work better

          1. If everyone uses it
          2. if twitter implements support for it

          of course it's pretty much useless for everyone else

          • by ushering05401 ( 1086795 ) on Sunday April 12, 2009 @04:18PM (#27550387) Journal

            This story should be tagged Twitter.

            This guy seems to be focusing on the meaningful identifier aspect of URL shortening for use in a space limited context - without actually confining his suggestion to use in that sort of environment.

            He puts forth other reasons for using this method such as control over the persistence of the shortened URL, but that doesn't make a whole lot of sense to me... and then he goes back to mentioning Twitter.

    • Re: (Score:2, Interesting)

      by mrmeval ( 662166 )

      About what I was thinking. It sounds like someone pissed their panties about not counting click origin and in some way not making money. If the batshit paranoiac morons can't put up a shortened URL to START with then they need to gag on their own spittle.

    • by CarpetShark ( 865376 ) on Sunday April 12, 2009 @04:01PM (#27550265)

      Yes, TinyURL hasn't killed anyone. BUT... any attempt to fix this is entirely missing the point anyway. From the article:

      I happen to think this URL is beautiful. :-) Unfortunately, it is sure to get mangled into some garbage URL if you try to talk about it on Twitter, because it's not very short. I really hate when that happens. What can I do?

      If rev="canonical" gains momentum...

      If they fix twitter to support links with proper labels or tag contents --- Oh, I don't know, like HTML has supported from the very beginning --- then there wouldn't be a problem.

      Don't work around the bugs, fix the bugs. Links are designed for machines, the higher-level marked up text is for people.

      • by rusl ( 1255318 ) on Sunday April 12, 2009 @04:28PM (#27550431)

        But then you're going to have the problem solved instead of opening up a new can of worms with lots of jobs and neverending problems to solve. Intelligence is bad for the economy.

      • Re: (Score:3, Insightful)

        by Jurily ( 900488 )

        If they fix twitter to support links with proper labels or tag contents --- Oh, I don't know, like HTML has supported from the very beginning --- then there wouldn't be a problem.

        So you're proposing we don't fix the entire internet so a pointless little social service doesn't have to bugfix? Blasphemy!

    • by Sebilrazen ( 870600 ) <blahsebilrazen@blah.com> on Sunday April 12, 2009 @04:12PM (#27550355)
      Oh great, mysterious and anonymous time traveler, what year did you start using the internet so that we may know what year you are posting from and get lottery numbers, World Series and Superbowl winners from you?

      From tinyurl:

      Copyright © 2002-2009 Gilby Productions. All rights reserved.

      (2009 - 2002) < 11+

      • Re: (Score:2, Informative)

        Even better:

        me@myhost:~$ whois tinyurl.com

        Whois Server Version 2.0
        [snip]
          Registrar of Record: TUCOWS, INC.
          Record last updated on 27-Jun-2008.
          Record expires on 27-Jan-2018.
          Record created on 27-Jan-2002.

        Here we have the exact date of creation for TinyURL.com!

        So, you're right. TinyURL celebrated its 7th birthday in January.

  • by SethJohnson ( 112166 ) on Sunday April 12, 2009 @03:46PM (#27550161) Homepage Journal


    What value are these new URLs if they aren't cute?!? [socuteurl.com]

    Seth
  • WTF? (Score:4, Insightful)

    by Anonymous Coward on Sunday April 12, 2009 @03:47PM (#27550169)

    I didn't understand a single word of the submission, and I used to teach Web design. Is it too much to ask submitters to define terms they use?

    • Re:WTF? (Score:5, Informative)

      by Renderer of Evil ( 604742 ) on Sunday April 12, 2009 @04:55PM (#27550589) Homepage

      This whole url shortening shit started to pick up steam a few days ago when Digg introduced Diggbar - a hybrid of frame and url-shortening that framed other sites and did not display the proper site address. John Gruber went nuts and modified his blog to redirect users to a special page [digg.com]. Then he blogged for 2 days non-stop about how to make diggbar go away. Since he's widely read around the web everyone started chiming in with their opinions on the general idea of url shortening services and how it hurts or helps the web.

      Nerd bullshit. And not the good kind.

      • by SuperKendall ( 25149 ) on Sunday April 12, 2009 @10:35PM (#27552619)

        It wasn't even the Digg Bar exactly. Gruber didn't like it because of the obvious reasons (breaks bookmarks, history, hides the site, etc) but mainly because the DiggBar was turned on by default for all users. Other sites have things like the Diggbar, but no-one really complained about them because users had to turn them on themselves.

        If he alone had not liked it you would not have seen the rush to block it from all quarters. I as a user despised it myself, and am happy to see all framing mechanisms die a horrible death.

        Shortening services that use a redirect, he and others have no issue with.

    • Re: (Score:2, Informative)

      by spydabyte ( 1032538 )
      Yes it would be too much to ask. As a reader of slashdot it is your duty to understand terms or google it [justfuckinggoogleit.com] . If you didn't, a submitter would have to define every word entered, making submissions 100x as large, more complicated , and annoying to read. So please, for the sake of all that is good and holy, justfuckinggoogleit. Thanks.
  • what exactly is the point in URL shortening ?

    the only argument I can see is publications and twitter

    publications - there is no way that I am going to be able to remember example.com/typeskjd583 better than a regular URL; this has been tried and frankly failed

    twitter char limit - well actually twitter should solve this by offering their own service and key into what people are looking at thus having that knowledge inside twitter and being able to monetize it...

    apart from those two reasons (which are false for I believe the rea

    • Re: (Score:3, Informative)

      Comment removed based on user account deletion
    • Re: (Score:3, Interesting)

      For anything that isn't electronic, a shortened URL leads to fewer mistakes. For example: example.com/typeskjd583 is going to be more accurately typed than somesite.org/wiki/index/cool_tips/code/perl/hello_world.php . A lot of people when they see a site in print can easily mentally change it around, so somesite.org/wiki/index/cool_tips/code/perl/hello_world.php might become somesite.com/wiki/index/cool_tips/code/perl/hello_world.php , the shortened URL protects from this because people aren't trying to
    • Have you ever tried to post a link in chat which was anything longer than the domain name? It's quite easy for that to cover many lines of chat and get people annoyed.

      It's not perfect, but it's far better than some of the alternatives.

    • Have you tried pasting in an IM window a Google maps URL? I'm guessing not or URL shortening would be painfully obvious to you.
  • by Anonymous Coward on Sunday April 12, 2009 @03:47PM (#27550181)

    how about we just kill all twitter users instead?

  • How many recall such threats to the internet as the massive ascii storm caused by Cantor and Siegel and the like, or the sudden tsunami of traffic due to graphics being constantly broadcast by the world wide webby thingy?

    Those and many other phenomena usually resulted in people running around with their hair on fire, flapping their arms and screaming DEATH OF THE INTERNET!

    The majority of bandwidth is taken up by email spam and botnet traffic. Next to those URL relay traffic isn't even noticeable.

    Film at 11.

  • by Toe, The ( 545098 ) on Sunday April 12, 2009 @04:04PM (#27550291)

    On the Twitter /. feed, this of course shows as:
    slashdot [twitter.com] Can rev="canonical" Replace URL-Shortening Services? http://tinyurl.com/c3j4n8 [tinyurl.com]

    P.S. Now if you want a really short URL, try http://tinyarro.ws/ [tinyarro.ws] (no affiliation; just impressed by the idea)

  • I guess I'm stupid. Is there any reason not to have a nice neat hierarchy on the server, that is mirrored with a collapsed symlink farm, with the symlinks exposed in the web pages? Yes, this means one has to map the long names to the short names when generating pages, but that can be an authoring-time issue or dynamic page generation issue. Heck, output-rewriting of the page can do this.
    • by gmuslera ( 3436 )

      Was about to suggest that too. My biggest concern is that the "solution" doesn't solve one of the biggest problems: it takes 2 requests to get to that URL. I must access the short url, wherever it is, parse/interpret headers, and then go to the real page.

      With a simple solution that could be a symlink (or server configuration, or catch-all index.php that serves all the content directly) the client needs only one connection to get the real content of the page.

      Of course, there is the option that your server/cms/whatever

  • It's "rel," not "rev."

  • It's a phone problem (Score:3, Interesting)

    by Animats ( 122034 ) on Sunday April 12, 2009 @04:14PM (#27550367) Homepage

    This is a phone-related problem. The basic problem is that URLs are being sent to devices that don't cut, paste, and bookmark. This is only an issue if you have to type the URL manually.

    Maybe what's needed are smarter Twitter clients.

  • Instead of using a plethora of different URL shortening services, any of which might disappear at some point in the future, Twitter should implement its own URL shortening service (using, say, the domain http://tw.it/ [tw.it] or similar) and thereby shorten any URLs that Twitter users post. Assuming the Twitter team can manage this (given their track record with things like message queues, however...) then there would be no possibility of linkrot.

    Unless you're using shortened URLs somewhere besides Twitter, of c

  • by Kupo ( 573763 ) on Sunday April 12, 2009 @04:24PM (#27550409)

    There's all this talk of URL shortening services - whether third-party, or in-house implementation.

    The question here is this: Why are the URLs so long to begin with?

    Why does it have to be:
    http://shiflett.org/blog/2009/apr/save-the-internet-with-rev-canonical

    A full title in the URL is, IMHO, a very inefficient idea. The excuses I've heard are:

    Search Engine Optimizations (better performance when keywords are in the URL)
    Okay, I can't argue that some search engines do stuff like that. But shouldn't the TITLE or META tags have more bearing on this than how ridiculously long the URL is?

    "The URL has meaning, so you know what you're clicking", Context, etc.
    I suppose that when I see a URL like
    http://shiflett.org/blog/2009/apr/save-the-internet-with-rev-canonical
    as opposed to something like
    http://example.org/blog/526
    I would have a slightly better idea of the article's content before clicking on it. But then again, I can't really say that I've decided against clicking on a link just because of the link URL. I would, instead, decide whether I'd want to visit the link by its link text/description.

    So <a href="http://example.org/blog/526">blog on link shortening</a> would still have the same effect on me as a long URL IMO. If it were bookmarked, the same rules would apply.

    Hell, if I were handed an obfuscated shortened URL without context, I'd know even less of what I was getting myself into.

    I think the proper solution is to just stop making ridiculously long URLs to begin with, so we don't have to rely on obfuscation/hashing/shortening to accommodate services that have character limit restrictions. And we'd save bandwidth too [slashdot.org], apparently. Win-win?

    • Re: (Score:3, Interesting)

      by noidentity ( 188756 )
      First off, why do long URLs even matter? Is this link [shiflett.org] too long? Ahhh, you don't even care, because it's a normal link! But let's say the length is a problem. On the linked page, the author suggests that he could have his site also provide an alternate shorter URL for the same page, and have the HTML href tag encode both the long and short versions. Here's what I don't grasp: why not just use the short URL to begin with, and never even post the long one?!? No new HTML features are needed.
    • Re: (Score:3, Interesting)

      by Phroggy ( 441 )

      I've actually been thinking about switching to longer URLs for my own blog. I'm currently using numerical filenames, because it seemed simpler at the time, but the number is basically meaningless to any human looking at the URL. Links within my site always have title tags, but every once in a while I'll send somebody the URL to one of my blog entries, and it would be nice to see at a glance which entry it is (in case you've read it already).

      To hell with Twitter. :-P

    • Why does it have to be: http://shiflett.org/blog/2009/apr/save-the-internet-with-rev-canonical

      Though the final part of the url "save-the-internet-with-rev-canonical" could easily be shortened, everything else about the url makes sense and has a purpose (and that's 34 characters right there). Keeping a directory structure as opposed to having simply "http://shiflett.org/###" makes sense. You could argue that you could construct your pages simply as "/###" and hold directory structure either by redirecting to the longer URL or by linking all relevant information to the directory structure, but that

    • Re: (Score:3, Interesting)

      by shannara256 ( 262093 )

      I've seen a solution in a few places that I think deserves to be picked up more widely. You've pointed out the two main styles, which are http://example.com/123 and http://example.com/super-long-title. The best solution seems to me to be a compromise between the two: the first link works, AND it ignores anything after the ID. You could give someone a link to any of the following:
      http://example.com/123/super-long-title
      http://example.com/123/long-title
      http://example.com/123/title
      http://example.com/123
      http:
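The "ID first, ignore the rest" scheme described above can be sketched as a small path parser. This is a minimal sketch, assuming the ID is purely numeric and sits in the first path segment; the name post_id_from_path is hypothetical and the comment does not say how any particular site implements this.

```python
import re

# A leading numeric ID, followed by an optional slug that is ignored entirely.
_ID_PATH = re.compile(r"^/(\d+)(?:/.*)?$")


def post_id_from_path(path):
    """Return the numeric post ID from paths like /123 or /123/any-title, else None."""
    match = _ID_PATH.match(path)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    for candidate in ("/123/super-long-title", "/123/title", "/123", "/about"):
        print(candidate, post_id_from_path(candidate))
```

Because the slug never reaches the lookup, retitling a post would not break old links, and the bare /123 form stays available for space-limited contexts.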

  • by Knowbuddy ( 21314 ) on Sunday April 12, 2009 @04:27PM (#27550427) Homepage Journal

    Here's the thing: it's not just the path that is the problem, it's also the domain name. You can shorten "/blog/2009/apr/save-the-internet-with-rev-canonical" to "/abc123", but if your domain name is something plus-sized like "rickosborne.org" or worse ... how much have you really gained?

    It's a little helpful, but not really. What you've done is remove the little bit of semantic meaning from the link, all in the name of being able to ego surf easier. Huzzah.

  • by LittleBigScript ( 618162 ) on Sunday April 12, 2009 @04:30PM (#27550449) Homepage Journal

    "Because bigger is better, right?" http://www.hugeurl.com/ [hugeurl.com]

  • by athlon02 ( 201713 ) on Sunday April 12, 2009 @04:37PM (#27550485)

    All this short URL stuff sounds like some phishing scam if you ask me. Short cryptic URLs obviously exist to make me transpose a couple of letters or numbers and end up at some fake bank site. No, give me large detailed URLs so I can see those dead giveaways like pid=poor_sucker&sid=steal_credit_card_info !

    Short URLs indeed... no thank you Nigerian scammers... I won't be transferring any large sums today!

    On a serious note, why is this news exactly?

  • I thought the real purpose of shortened URLs was to help all the memory-challenged people who go to Google to search for the website instead of typing a long URL into the address bar. :P
  • by Skapare ( 16644 ) on Sunday April 12, 2009 @04:47PM (#27550545) Homepage

    Unfortunately, it's not yet an integral part of web frameworks that I have seen. So I am adding it in a new web site I'm building. It means I have to add the feature to the web server.

    It works like this. Every part of the web site code that builds URLs for the same site passes them first through the mapping logic. This basically builds an SHA1 checksum of the canonicalized URL string. Then it looks up the string in a fast database (I'll be using Berkeley DB for this). If it's already there, and is the same URL, it generates a new URL that references the checksum. If it was a different URL, it notifies me that it found an SHA1 collision. If not already there, it adds it. The original URL is thus replaced with the mapping URL.

    Code added to the web server will be designed to detect checksum URLs. If it looks like one, it looks it up in the database to get the original URL, and proceeds with the request using that URL. Original URLs would still be processed as usual, in case they leak out, or are intentionally made to bypass the mapping for special purposes. Basically it's like a tiny URL service, but integrated without the need to do a redirect.

    One thing I am looking at doing is shortening even these URLs, even though they should be short enough already. But this raises the chance for a collision to the point I'll need to add logic to deal with it. How I would do that is similar to a hash data structure collision, but by expanding on the SHA1 checksum by adding back digits that were removed to shorten it.

    External URLs to other sites can be done the same way. This does add the extra redirection. I could limit the use of this only to long external links, since this being a web interface, should handle long external links OK. It could be an option.
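The mapping described above, including the idea of shortening the checksum and adding digits back on a collision, could look roughly like this. It is only a sketch under assumptions: a plain dict stands in for the Berkeley DB store, URL canonicalization is skipped, and ShortKeyMap, min_length, shorten and resolve are made-up names rather than anything from the poster's actual code.

```python
import hashlib


class ShortKeyMap:
    """Maps URLs to shortened SHA-1 prefixes, lengthening the prefix on collision."""

    def __init__(self, min_length=6):
        self.min_length = min_length
        self.key_to_url = {}  # in-memory stand-in for the on-disk database

    def shorten(self, url):
        """Return a short key for url, registering it if it is new."""
        digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
        # Try the shortest prefix first; add digits back only when the
        # prefix is already taken by a different URL.
        for length in range(self.min_length, len(digest) + 1):
            key = digest[:length]
            existing = self.key_to_url.get(key)
            if existing is None:
                self.key_to_url[key] = url
                return key
            if existing == url:
                return key
        raise RuntimeError("full SHA-1 collision for " + url)

    def resolve(self, key):
        """Return the original URL for key, or None if the key is unknown."""
        return self.key_to_url.get(key)


if __name__ == "__main__":
    mapping = ShortKeyMap()
    key = mapping.shorten("/blog/2009/apr/save-the-internet-with-rev-canonical")
    print(key, mapping.resolve(key))
```

Since shorten walks the same prefixes in the same order every time, a URL that was pushed to a longer key by a collision gets that same key back on later calls, as long as entries are never deleted.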

  • A Few Responses (Score:5, Interesting)

    by shiflett ( 151538 ) on Sunday April 12, 2009 @04:56PM (#27550595) Homepage

    A couple of good questions I have seen, and my best attempt to answer them:

    1. Don't you mean rel? No, I mean rev. It indicates a reverse link.

    2. Why not make your URLs short in the first place? I happen to like my URLs and have made them as short as I want them. They're only too long in some very specific use cases, like Twitter. I could just complain about Twitter, or I could support an idea that makes URL shortening suck less. I chose the latter.

    Thanks for reading, and please do feel free to criticize whatever you think is wrong with this idea. I'd like a way to indicate a preferred short URL for my own stuff, and this seems like a pretty good way to do it that makes sense semantically and is easy to implement. For an ongoing discussion about adding an HTTP header to do the same thing (so that only a HEAD request is required), read here:

    http://shiflett.org/blog/2009/apr/a-rev-canonical-http-header [shiflett.org]

    • Re: (Score:3, Insightful)

      by Cerium ( 948827 )

      It's a very mildly useful feature, but it's unnecessary bloat.

      First and foremost: It's extra strain on (my) servers. Let's say this becomes an accepted standard and we start having every blogging/forum/comment system doing these lookups to find a smaller url. This means that any time a document on one of my servers is linked to, there's going to be at least one request sent for it so your system can check if a shorter url has been specified. So, now I'm serving up extra data for a feature I won't likely use

  • DNS Overload ? (Score:3, Insightful)

    by Tensor ( 102132 ) on Sunday April 12, 2009 @05:25PM (#27550739)
    Surely the author of that rant knows about DNS caching ... your PC will only consult the nameserver for tinyurl, etc. once per day, if at all, depending on how many of those you click on.

    And if you click on them rarely the delay would be negligible, because you only use them rarely ...

    Plus this, interesting as it may be, still does not solve how to get a long url into a Tweet... it does not matter if Twitter can go look up the small URL on its own ... you still would have the 140 char limit.
  • Reasonable URLs ! (Score:3, Insightful)

    by redelm ( 54142 ) on Sunday April 12, 2009 @06:17PM (#27551065) Homepage
    While I understand linkrot is a danger, the cure isn't some new layer of indirection but fundamentally more permanent archive structure. That really is entirely the site's choice and responsibility.

    Why do so many URLs look like RDBMS queries? Has someone been sold a bill-of-goods?

    As for shorter URLs, they become much shorter minus the DB cruft. And then all it takes is a modicum of logic to form some durable system.

    Some people cannot avoid flavor-of-the-month. Those people should not be making decisions with any sort of permanence or continuity.

  • i blame (Score:3, Insightful)

    by ionix5891 ( 1228718 ) on Sunday April 12, 2009 @06:42PM (#27551217)

    digg
