The Internet

User-Centered URL Design 41

Adaptive Path has this interesting essay by Jesse James Garrett on user-friendly URL design. When websites were just static files, they were often named in a friendly way, simply to make the designer's life easier. But today, many dynamic websites and CMSes are built around extremely long and complicated URLs that are difficult to work with (ever try to read one to someone over the phone?). This essay explores the way some websites use redirects and smart naming schemes to keep URLs easy and friendly.
This discussion has been archived. No new comments can be posted.

User-Centered URL Design

Comments Filter:
  • Tim Berners-Lee (Score:2, Interesting)

    by queh ( 538682 )
    A good basic guide from the founder of the web is at http://www.w3.org/DesignIssues/Axioms.html [w3.org]. (Note that URI doesn't follow its own advice, heh)
    • "A good basic guide from the founder of the web is at http://www.w3.org/DesignIssues/Axioms.html (Note that URI doesn't follow its own advice, heh)"

      What's more, his 'good basic guide' is written by a PhD who's completely lost touch with the 99.999% of web-authors who don't have PhDs in compsci or physics or anything. The W3C has become a big irrelevant exercise in intellectual masturbation, and responsible web-authors ought to secede from its pseudo-AI [robotwisdom.com] lunacy.

  • by cpeterso ( 19082 ) on Wednesday October 02, 2002 @06:57PM (#4377354) Homepage
    Brent Simmons' Law of CMS URLs [inessential.com]:

    The more expensive the Content Management System, the crappier the URLs. Compare, for instance, StoryServer's weird comma-delimited numeric URLs [vignette.com] to Radio UserLand's human-readable (and guessable) URLs. Then compare the prices--orders of magnitudes of difference. So, at least in this respect, there's an inverse relationship between price and quality.
    • An example of a low-cost CMS ($800 US per site, unlimited users, with a stripped-down version for $200 US) that offers a nice URL scheme can be found by following the link in my .sig (it's called Sitellite). Every page is /index/PAGENAME, the printable version of that page is /index/PAGENAME/printable, and any other variables can be passed in the same manner; i.e., /index/news/story.544 translates to /index?page=news&mode=html&story=544 in traditional URL-speak. It also offers a lot of features found in the higher-end apps (not quite Vignette-sized), like locking/versioning, and has a really solid *Object-Oriented* PHP application framework underneath it.

      Sorry for the /vertisement, couldn't help it. We're releasing version 3.0 on Halloween and always looking for good beta testers!
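Read literally, that scheme is just a positional encoding of ordinary query parameters. A minimal sketch of the mapping (the function name and the html default mode are my assumptions, not Sitellite's actual code):

```python
# Sketch of the path-style parameter scheme described above:
# /index/news/story.544 maps to page=news, mode=html, story=544.
# Names and defaults are illustrative, not Sitellite's implementation.
def path_to_params(path):
    """Translate /index/PAGENAME[/key.value|/mode...] into a parameter dict."""
    parts = [p for p in path.split("/") if p]
    if len(parts) < 2 or parts[0] != "index":
        return {}
    params = {"page": parts[1], "mode": "html"}  # html assumed as default mode
    for extra in parts[2:]:
        if "." in extra:
            key, _, value = extra.partition(".")
            params[key] = value
        else:
            params["mode"] = extra  # e.g. /index/PAGENAME/printable
    return params

print(path_to_params("/index/news/story.544"))
# {'page': 'news', 'mode': 'html', 'story': '544'}
```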
    • Interesting! To compare, my site [pdrap.org] has a free content management system I wrote myself [pdrap.org] in Python. The URLs are not too bad, except for the ones that my digital camera made up in the photo album.
  • by RobotWisdom ( 25776 ) on Wednesday October 02, 2002 @07:26PM (#4377548) Homepage
    My site is 100% static HTML, but my rules of thumb for URLs include:

    - never more than 80 chars, so they can be emailed without wrapping

    - no uppercase, ever (otherwise you'll forget where the caps were)

    - never more than two directories deep (I sometimes break this due to bad planning)

    - if a new page seems likely to grow into many pages, it should be created as foo/index.html instead of foo.html (Someone emailed me this brilliant tip, I forget who though.)

    But the bottom line is to arrange directories and files (and their names) so that you can remember them without having to double-check.
    • I take this even one step further: I make every page into a directory. Thus, you would never see "/foo.html" or "/foo/index.html", but just simply "/foo/". The final slash is optional, but what happens when a user doesn't include a final slash is that the browser asks for "/foo" and the server responds with a redirect to "/foo/" and the browser then proceeds to ask for and download "/foo/" (so always specifying a final slash never hurts).

      The advantage is that this makes it easier to move between content management systems without breaking links or maintaining loads of redirects. Eg, if my site goes from php to zope, I don't need to change any links or create any redirects for outside people that linked to us or users that bookmarked a page. This also means that if you adopt this scheme for your 100% static html pages and then decide to go to php (or whatever) you don't need to make page redirects, symlinks, permanent redirects, etc.

      I also only specify absolute URLs that don't include the server name (not sure of the correct terminology, but I mean "/foo/bar/" instead of "../bar/" or "http://server/foo/bar/"). This makes it easy to mirror a set of pages on a different server (for instance, if you have a database app that you MUST run on a specific server but you want to make it look like it's running on your main front-end server). I've tried working with non-absolute URLs ("../foo/"), and this made it very easy to move entire sets of pages to different directory hierarchies or different servers, but it's quite a PITA to actually write pages like that (and it also creates a redirect, like not including the final slash). Anyway, the point is that you should always be aware of the issue when writing pages as consistency saves time in the long run.

      Anyway, these two things have come in quite handy in a number of situations.

      Final pet peeve: never use client-side redirects, always use server-side permanent redirects. Client-side redirects break the "back" button. Client-side redirects are a sure sign of someone who can't grok .htaccess.
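The trailing-slash and permanent-redirect behaviour described above can be sketched as a tiny request handler; the directory set and function name are illustrative, not any particular server's internals:

```python
# Sketch of the server-side behaviour described above: a request for "/foo"
# (no trailing slash) gets a 301 permanent redirect to "/foo/", which keeps
# bookmarks and the back button working. KNOWN_DIRS is illustrative data.
KNOWN_DIRS = {"/foo/", "/foo/bar/"}

def respond(path):
    """Return (status, location) roughly the way Apache handles directories."""
    if path in KNOWN_DIRS:
        return (200, path)          # serve the directory index
    if path + "/" in KNOWN_DIRS:
        return (301, path + "/")    # permanent, server-side redirect
    return (404, None)

print(respond("/foo"))    # (301, '/foo/')
print(respond("/foo/"))   # (200, '/foo/')
```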

      • Client-side redirects are the only things that can theoretically send a 'moved' message to the client so it can update its bookmarks. I don't know which browsers actually do this, though, but it is in a spec somewhere.
        • I'm not certain about that. I thought it was actually the other way around. Quoting RFC 2616:
          10.3.2 301 Moved Permanently

          The requested resource has been assigned a new permanent URI and any future references to this resource SHOULD use one of the returned URIs. Clients with link editing capabilities ought to automatically re-link references to the Request-URI to one or more of the new references returned by the server, where possible. This response is cacheable unless indicated otherwise.

          (301 is the permanent redirect I was talking about and is what apache generates on "redirect permanent" statements.)

          I recall doing some tests with a few browsers a couple years ago, and I actually noted that none of them update their bookmarks no matter what you do. Situation may have improved in the meantime.

          When you do this:

          <META HTTP-EQUIV="refresh" CONTENT="N;URL=http://www.foo.com/foo.html">

          (which is what I mean by "client-side redirect") I would say it's a very bad idea for a user agent to update a bookmark since that is used for a number of other things (eg, auto-refreshing pages like a webcam, etc.).

          • I'm not certain about that. I thought it was actually the other way around.

            Yes; there are several different redirect schemes in HTTP (permanent, temporary, see other, etc), and a couple of other response codes that can edit bookmarks.

            410 Gone, for instance, should be used instead of 404 if a resource used to be there but has been removed. While 404's might disappear later, 410's are explicit "this is gone, it's not coming back" notices.
      • and it also creates a redirect, like not including the final slash

        Uhm, no it doesn't.

        The UA does not request "../foo/" from the server; it looks at the current URL and strips off the last directory and appends /foo/ to it, resulting in exactly the same URL you would have got using absolute links.

        It costs a few extra string operations, not an extra HTTP request :)
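That resolution step can be observed with Python's standard library; urljoin here just mirrors what a conforming user agent does before sending the request (example URLs are illustrative):

```python
from urllib.parse import urljoin

# A conforming client resolves the relative reference itself, so "../bar/"
# never reaches the server; the already-normalised URL is requested instead.
base = "http://example.com/foo/page/"
print(urljoin(base, "../bar/"))   # http://example.com/foo/bar/
print(urljoin(base, "/bar/"))     # http://example.com/bar/
```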
        • Actually, I just tested this, and we're both wrong. I haven't found a browser that does that string manipulation, and I also haven't found a server that does a redirect.

          Type http://server.com/somedir/../ into Netscape and the browser sends a request for exactly that, and Apache and thttpd respond with exactly that directory (e.g., equivalent to the web root, assuming "somedir" exists). Test it with a sniffer if you don't believe me (I'm using Netscape 4.x for Unix (please, no comments) - perhaps other browsers actually do the string manipulation, but if Netscape 4.x sends that out, I'm sure all servers deal with it). I'm certain Apache must be doing some sort of string manipulation internally, since it has to figure out whether or not you're requesting something in its web space (e.g., to prevent fetching /../../../etc/passwd). I'm not sure how well this works, because such string manipulation becomes more difficult with UTF-8 and URI-encoding (where there's more than one representation for a character).

          Actually, now that I think of it, this is probably what these things are trying to exploit:

          GET /scripts/..%255c%255c../winnt/system32/cmd.exe?/c+dir HTTP/0.9

          Looking through one page of a server's logs, I see three different variants of the above request, from three different IPs, all within an hour :)

          • Actually, I just tested this, and we're both wrong. I haven't found a browser that does that string manipulation, and I also haven't found a server that does a redirect.

            You can't have looked very hard. Opera, IE6, and Mozilla all normalise the URL before sending; even wget does. curl was the only client I use which doesn't.

            If NS4 doesn't, fine, more fool it. Several months ago my copy started coredumping every time I tried to run it, so it no longer gets tested :)

            No need to packet sniff, anyway; server logs work just as well.
            I'm not sure how well this works, because such string manipulation would become more difficult with UTF-8 and URI-encoding (where there's more than one representation for a character).

            Um, why would it?

            The rules are pretty simple; when you see a .., back up to the next / and try again. How would UTF or URL encoding make this more difficult?
            Actually, now that I think of it, this is probably what these things are trying to exploit:


            GET /scripts/..%255c%255c../winnt/system32/cmd.exe?/c+dir HTTP/0.9

            Thousands of years in the future, archeologists and server admins will boggle at these mysterious requests that continue to hit their webservers ;)
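As an aside on those mysterious requests: the double percent-encoding is exactly why naive ".."-stripping is harder than the "back up to the next /" rule suggests. The path separator only appears after a *second* decode, so a filter that normalises before fully decoding sees nothing to collapse. A quick check with the standard library:

```python
import posixpath
from urllib.parse import unquote

# "%255c" decodes once to "%5c" and a second time to "\". Normalising ".."
# on the still-encoded path finds no literal ".." segment to remove.
raw = "/scripts/..%255c../winnt/system32/cmd.exe"
print(unquote(raw))            # /scripts/..%5c../winnt/system32/cmd.exe
print(unquote(unquote(raw)))   # /scripts/..\../winnt/system32/cmd.exe
print(posixpath.normpath(raw)) # unchanged by dot-segment removal
```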
      • I also only specify absolute URLs that don't include the server name (not sure of the correct terminology, but I mean "/foo/bar/" instead of "../bar/" or "http://server/foo/bar/").

        What if you don't control the namespace below "http://server/foo/"? For instance, what if your URL is http://server/~username/ as it is on fortunecity, geocities, or your university's web space?

        never use client-side redirects

        Unless your hosting service doesn't support server-side redirects.

        Client-side redirects are a sure sign of someone who can't grok .htaccess.

        "Someone" not always being the person responsible for the content.

  • by NickV ( 30252 ) on Wednesday October 02, 2002 @07:50PM (#4377693)
    Anyone else find it a bit ironic that an article that is offering suggestions for cleaner URLs and undoing the damage of CMS naming conventions is named "000058.php" ?

  • by Louis_Wu ( 137951 ) <chris.cantrall@gmail.com> on Wednesday October 02, 2002 @07:56PM (#4377730) Journal
    Well, this was also written up [alistapart.com] at A List Apart, which is directed by Jeffrey Zeldman [zeldman.com], who did an interview [slashdot.org] for Slashdot in May of 2000, and was a recent subject of controversy here [slashdot.org] and elsewhere, as he chronicled on his website [zeldman.com].

    Help me, I've gone link-mad! (But those are all good reads.)

  • Users do not care about URLs. They care about the following:

    How to get to the site

    How to navigate to the place they want - and get out as quick as possible

    Bookmark their fav spot - to get in/out quick

    Short URLs should be provided to redirect users to the intended result site, such as in advertising.

    • The users care about:

      • How to get to the site
      Since writing the URL is one way to get to the site, the users care about URLs.
      • How to navigate to the place they want - and get out as quick as possible
      Since being able to guess a URL lets the user navigate faster, the users care about URLs.
      • Bookmark their fav spot - to get in/out quick
      True.
      • Short URLs should be provided to redirect users to the intended result site. Such as advertising.
      Therefore users care about URLs.

      Conclusion: URLs should be for people, not computers. People care what the URLs are, computers don't

    • Damn right. Have whatever URLs you want, just make sure that *the same URL gives the same page* (without weird hidden POST parameters or other crap). Then users can bookmark the pages or link to them. Maybe some small customization with cookies is okay, as long as the page content remains the same.
  • I agree with the general premise of the article, but unfortunately it was very light on any useful tips or code.

    For instance, this thread in /. is http://developers.slashdot.org/article.pl?sid=02/10/02/1946224&mode=flat&tid=95 -- why couldn't it just be http://developers.slashdot.org/thread/02-10-02/1946224.html ? Or even http://slashdot.org/developers/02-10-02/User_Centered_URL_Design.html ?

    I'd love to have some tips from various folk on how to use Perl and PHP with Apache in fancy ways to simplify and avoid these clunky URLs.

    -- Herder Of Cats
    • You can use Apache's mod_rewrite module, or aliases. The PHP website uses a smart PHP 404 script that redirects http://php.net/somefunctionname to the correct manual page. Neat idea. They have details on their website: here [php.net]
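The php.net trick can be sketched like this (in Python for brevity, though the real site uses PHP; the lookup table and function name are illustrative):

```python
# Sketch of a "smart 404" handler: an unknown path like /strlen is treated
# as a lookup key and answered with a redirect to the matching manual page.
# MANUAL is illustrative data, not php.net's real mapping.
MANUAL = {
    "strlen": "/manual/en/function.strlen.php",
    "explode": "/manual/en/function.explode.php",
}

def handle_404(path):
    """Return a redirect target for /somefunctionname, or None if unknown."""
    return MANUAL.get(path.strip("/").lower())

print(handle_404("/strlen"))   # /manual/en/function.strlen.php
```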
    • For instance, this thread in /. is http://developers.slashdot.org/article.pl?sid=02/10/02/1946224&mode=flat&tid=95 -- why couldn't it just be http://developers.slashdot.org/thread/02-10-02/1946224.html ? Or even http://slashdot.org/developers/02-10-02/User_Centered_URL_Design.html ?

      Well, they're not great URL's either. Better would be something like:
      http://developers.slashdot.org/article/2002/10/02/1946224
      With this, you can trim ../ off the rightmost side of the URL and get listings for the day/month/year/forever, and you can guess it'll be the same for /poll/, etc. The ID number might be frowned upon somewhat; you could shorten it to just the article number for the day, or categorise further by time. You might try something like user_centered_url_design, but that takes a bit more work to generate and map to a database.

      I'd probably try to think where the best spot for developer is too; in front of the article number at the end? After /article? Before? :)
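That "trim the URL to browse upward" property is just standard relative-reference resolution, nothing Slashdot-specific, and can be demonstrated directly (the article URL is the hypothetical one proposed above):

```python
from urllib.parse import urljoin

# With a date hierarchy, relative references walk up the archive:
article = "http://developers.slashdot.org/article/2002/10/02/1946224"
day = urljoin(article, "./")     # listings for the day
month = urljoin(day, "../")      # listings for the month
print(day)     # http://developers.slashdot.org/article/2002/10/02/
print(month)   # http://developers.slashdot.org/article/2002/10/
```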
      I'd love to have some tips from various folk on how to use Perl and PHP with Apache in fancy ways to simplify and avoid these clunky URLs.

      I just use mod_rewrite to pass the entire path to my scripts, ala:
      RewriteEngine On
      # only rewrite requests that don't match a real file or directory
      RewriteCond %{REQUEST_FILENAME} !-f
      RewriteCond %{REQUEST_FILENAME} !-d
      RewriteRule ^(.*)$ /index.php?p=$1 [L]
      Although it's a bit of an insult to mod_rewrite's capabilities, this approach is quite effective.
    • For instance, this thread in /. is http://developers.slashdot.org/article.pl?sid=02/10/02/1946224&mode=flat&tid=95 -- why couldn't it just be http://developers.slashdot.org/thread/02-10-02/1946224.html ? Or even http://slashdot.org/developers/02-10-02/User_Centered_URL_Design.html ?

      I'm thinking, why should it be? It really doesn't matter. You would never type a Slashdot URL, or any URL that refers to a specific page under a site. That's what bookmarks are meant for.

      Using page names like ".../User_Centered_URL_Design.html" only creates more work for the admin. If you claim the name tells you what the page contains, try checking the titlebar instead.

      • Using page names like ".../User_Centered_URL_Design.html" only causes more work for the admin. If you say that you'll know then what it contains, try checking the titlebar.

        The subject of this discussion is making life easier for the user, not the admin. If the admin has to write some code to let the user have (almost) the same URL as the title, then so be it.

        It's kind of useless to dictate the titlebar over the phone to someone if you want them to get to the same page.

        • The subject of this discussion is making the life easier for the user, not the admin.

          How will those "easier" urls benefit users?

          It's kind of useless to dictate the titlebar over the phone to someone if you want them to get to the same page.

          Is it really useless to dictate the titlebar? You can always search using the title. Do you usually dictate URLs for dynamic pages (anything other than the main page) over the phone?

  • An earlier article (Score:2, Insightful)

    by skware ( 78429 )
    I read a better article quite some time ago by Tim Berners-Lee, entitled Cool URIs don't change [w3.org], in which he discusses designing URLs properly, i.e. what to leave out of a URL (things like .cgi, cgiexec, access details - members / non ...). The things he suggests are easily implementable in .htaccess for Apache using mod_rewrite for PHP / CGI things.
  • This place [makeashorterlink.com] makes those hideous 4- or 5-line URLs that you want to post into e-mails or whatever into much, much shorter URLs. It's nice, it's quick, it's easy, it's good.
  • TinyURL (Score:3, Interesting)

    by Bishop923 ( 109840 ) on Thursday October 03, 2002 @12:56AM (#4378944)
    <shameless_plug>
    TinyURL [tinyurl.com]
    Really nifty utility for dealing with sites that choose the long obfuscated URL approach...
    </shameless_plug>
    • Really nifty utility for dealing with sites that choose the long obfuscated URL approach...

      This is cool, I admit. However, how long will these stored tiny URLs exist? For something like this I could email one out to clients and tell them that it would be valid for the next 90 days. This assumes that tinyurl.com would not store my hundred-line URL until the end of time.

  • Users bookmark things in unusual ways. Sometimes they will type in the URL, go to the page, and then a couple of days later, when they want to tell their coworker, the only URL they have is two miles long.

    Then what happens when your user bookmarks the result of the redirect and you want to change the server technology without your users noticing? I.e., say the original URL is:

    http://myserver.xyz/myproduct

    which redirects the user to:

    http://myserver.xyz/stuff?things=blahblah&product=myproduct

    which the user bookmarks. Then you want to change your content management system. Do you support all the old links and all the redirects?

    Forget about bookmarks, how about when google indexes your site? I think something like this happened to apple's support site a couple of months back. Suddenly all the hits to knowledge base articles started coming up with 404s.
