What makes a "friendly URL"?

I've read a great deal of discussion recently (both on this site and elsewhere) about "friendly URLs" but I'm not sure what exactly makes a URL "friendly" and why we really even care (up to a certain point). Illustration:

The following is an example of a URL that would be held up by the majority of current web developers as "friendly":

www.myblog.com/posts/123/this-is-the-name-of-my-blog-post

Whereas this would be considered "unfriendly" (i.e. bad, Neanderthal, ignorant, stupid):

www.myblog.com/posts.aspx?id=123

My questions:

  • Doesn't the "friendly" URL contain duplicate identifying information about the blog post in question? In other words, once you have the id (123) of the post, why do you need the title? Wouldn't this be a violation of the "don't repeat yourself" mantra?
  • What difference does the form of a URL make as far as users are concerned? Do users ever actually type full URLs by hand (other than the TLD, of course)? Do users ever look to the URL of a page to determine what the page is about? Why do we need the title of the blog post in the URL? Isn't that what the page's <title> tag and content are for?
  • I often hear SEO as a reason why the "friendly" URL form is preferred. Why does a search engine spider care about the URL? Aren't they just automated pieces of software that crawl pages (and the links to other pages that are contained within them)? If search engines were written like other software components (e.g. database access components), the URL would just be a meaningless identifier (similar to a rowguid in a relational database) to them. If I were designing a database schema with something like the "friendly" URL above as a table's primary key, I would (quite correctly) get chewed out.

I said earlier "up to a point" because obviously, URLs can get out of hand. Here is an actual URL from Amazon.com that I don't think anyone in their right mind would consider "friendly":

http://www.amazon.com/Bissell-Kitchen-Housewares/b/ref=amb_link_5001972_17?ie=UTF8&node=694500&pf_rd_m=ATVPDKIKX0DER&pf_rd_s=gp-center-5&pf_rd_r=1ZXNJFE0CCFFDH4B9HGH&pf_rd_t=101&pf_rd_p=405478901&pf_rd_i=510080


Solution 1:

Tim Berners-Lee (the architect of the WWW) wrote a great article about this subject about 10 years ago.

  • Your example is a bad URL -- but not just because it has both an id and a "slug" (the abbreviated, hyphenated form of the page title). Putting the page title into your URL is problematic in the long term. Content will change over time. If you ever change the title of that blog post, you'll be forced to choose between keeping the old URL, or changing the URL to match the new title. Changing the URL will break any previous links to that page; and not changing it means you'll have a URL that doesn't match the page. Neither is good for the user. Better to just go with www.myblog.com/posts/123.

  • Users often do need to type a URL, but more importantly, sometimes they'll also edit existing URLs to find other pages in your site. Thus, it's often good to have discoverable URLs. For example, if I want to see post #124, I could easily look at the current URL and figure that the URL for the page I want to see is www.myblog.com/posts/124. That's a level of user-friendliness that can be a big help to people trying to find what they're looking for. Including other information (like the subject of the post) can make this impossible -- so it reduces my exploration options.

  • Forget about SEO. Search engine technology has been reducing the effectiveness of SEO hacks for some time. Good content is still king -- and in the long run, you won't be able to game the system.

Solution 2:

To me, friendly URL means there's been some attempt to include semantic information in the URL to make it more fit for human consumption. It's an interesting example of a computer-computer interface being augmented and built upon to make a better human-computer interface.

So, in your two examples:

  • www.myblog.com/posts/123/this-is-the-name-of-my-blog-post is friendly, because you've included the title in the URL - it tells you something about the page.
  • www.myblog.com/posts.aspx?id=123 is unfriendly because it's cryptic and obscure: it makes perfect sense to a database, but none to you or me.

Friendly URLs are fantastic in some situations and useless in others. Basically, if a user is ever going to be exposed to it, I'd make friendly URL creation a priority, and it's not just a matter of aesthetics. It makes it much easier to get back to URLs from the address bar if you can quickly see and understand what the various options are, plus it makes it more obvious where you're about to go if you're following a link from a web page.

Combine all that with the awesome bar in Firefox 3+ (surely coming in other browsers too), and auto-complete in the address bar becomes incredibly powerful when you're dealing with friendly URLs.

Solution 3:

There seems to be a lot of conflicting information about precisely what effect querystring have on crawlers, but the consensus is that having more than a couple parameters harms your SEO because a long querystring variable indicates dynamic content, and so most search engines will be a lot less aggressive indexing your page.

Adding a slug to your url, such as this-is-the-name-of-my-blog-post from your example, also makes your links more different from one another than a simple id number, and adds more significant words into the url. These are all things that search engines look for.

Personally I find such urls much easier parse visually because there are fewer punctuation characters used, and the name-value pairs in the querystring can be very verbose and hard to remember.

Solution 4:

It's a good point about how your putting unnecessary information in the URL.

http://stackoverflow.com/questions/522466/what-makes-a-friendly-url

Once the unique id 522466 is known - the rest is useless, so it purely serves to make the URL look "nice" and provide the user with an idea as to what the page links to. But this creates another problem. Most sites do not "verify" that part of the URL, so you could put --

http://stackoverflow.com/questions/522466/omg-goatse-bought-by-bill-gates

Yet it will still link to this post. You can see how this may cause more problems than they are worth because they could be used maliciously.

I feel Digg have taken the right approach to this. They do not use IDs in their URLs. Behind the scenes they get the ID from their database purely from the title given.

http://digg.com/linux_unix/I_Like_Linux_so_my_aunt_sends_me_this_for_Christmas

This, for me, is the perfect url. It gives me all the information I need to feel secure in clicking the link.

In fact, titles play such a huge role that, in the world of digg, people "blind digg" purely based on the fact that they like the title, or are interested in it. If your url looks interesting, you may very well be getting more traffic to your site. At the same time you will be making it more user friendly, prettier, and search engines will thank you. As far as I can see, friendly urls are win win for everyone.

Solution 5:

My thoughts on your three bullets:

  • I'd say that's not an optimal URL. I have no idea why one would show both the post identifier and title. I don't ever include post IDs in my URLs at all, only titles and (sometimes) dates
  • For users, shorter is better.
  • Search engines look at the url. Whether it makes sense or not, they do. Having keywords in the URL will offer some SEO benefit.