Using Good URL Design to Create a Breadcrumb Navigation System In PHP

One fundamental aspect of any good navigation system is a mechanism to let users know where they are in relation to the rest of the content. On the Web, one of the most popular and thoroughly studied ways to reflect this information is in a so-called breadcrumb navigation trail or bar. This navigation bar lists the categories and subcategories (or sections and subsections) of the page which the user is browsing.

(Sharp-eyed readers will be quick to point out that the term breadcrumb navigation trail, which originates from the Hansel and Gretel story, is not entirely accurate. In the story, Hansel and Gretel left a trail of breadcrumbs as they wandered through the forest so they could retrace their steps back home. On a web site, the breadcrumb navigation trail reflects the current page’s position in the site hierarchy, not the path the user has taken through the site.)

Defining the Requirements of Breadcrumb Navigation

When I started looking for ways to easily implement a breadcrumb navigation trail on my web site, I immediately saw the potential complexity of the project. The hierarchical navigation system needed to:

  • Accurately reflect a page’s position in the hierarchical information architecture of the site and not the physical location of the page on my web server. (This is important for me because I’m constantly moving things around.)
  • Present a nicely formatted label for each level of depth.
  • Automatically adjust itself for new pages, so I would never have to deal with it when adding new sections to my site.
  • Not require any “heavy” database or large application to use and scale.

Additionally, the module should:

  • Produce semantic, valid HTML marked up as a list.
  • Link all elements in the list to their appropriate page level.
  • Not link the last level, which should display the current location of the user in the site. (We don’t need a link back to the same page we’re already on.)

Searching for Inspiration

Like any good programmer, the first thing I did was search for existing code. Unfortunately, there is an appallingly small amount of information out there on implementing breadcrumb navigation. Sure, they’ll tell you what it is and what it’s good for, even convince you why you should have it, but as for actually implementing it…sorry, you’re on your own.

Every tutorial I found on the matter was decidedly sub-par or completely failed to meet my requirements, listed above. Most sites stated the need for databases to create truly scalable and customizable breadcrumb trails. Of course, nothing can compete with a fully integrated RDBMS solution, but with intelligent information architecture you can come pretty darn close.

Using the URL to Our Advantage

Thankfully, there is already a system of hierarchical navigation readily available to us. In fact, there are two!

  1. Directory trees on your web server. The folders in which your files are stored are probably already sorted somewhat logically and categorically.
  2. The URL of the requested page. The address bar in your browser drills down through these categories to get to the final page.

Both of these systems provide a basis we can draw upon to create a simple, self-organizing breadcrumb navigation system on a web site of almost any size. However, there are a few problems with the first method. (If you want to implement that method anyway, here’s a tutorial based around using a directory tree.)

  • It doesn’t scale. If you’ve got a huge web site, you’re not going to keep all that data on static files on your web server. It would also be a full-time job to categorize and sort that content on said server! Oi, vey!
  • It’s dependent on your organization system. If you’re anything like me, you don’t necessarily have folders within folders within folders just to store files, and if you do, you probably change them around pretty often to suit your needs. Unfortunately, basing breadcrumb navigation on your view of your information as opposed to your visitor’s view of the same information is a recipe for disaster in all but the smallest web sites.

Thus, the only remaining solution was to press the URL of the page into service as the basis for a breadcrumb navigation system. Even so, there is one huge obstacle to overcome with this method: the URL itself must not be tied to the physical filesystem on the web server.

Otherwise, we’re basically just using the filesystem hierarchy, which we’ve already seen is not adequate for our needs. So we need a way to abstract, or detach, the URL from the filesystem itself. Enter mod_rewrite, stage left.

Abstracting the URL from the Filesystem

Abstracting the URL from the filesystem is the critical step to enabling truly scalable, self-organizing breadcrumb navigation using such a simple foundation. This is essential because the URL of your page is the thing that will always be live; if the URL changes, so has your page! This change will then be reflected automatically in the breadcrumb navigation trail.

There are a number of other benefits to abstracting the URL of a page away from the physical filesystem it represents. I’ve even talked about some of them before when I blogged about playing with Apache and when I suggested tips for improving search engine rankings of CMS-generated content.

The basic prerequisite for a good URL schema, however, is solid information architecture. This means having a well-defined and clear structure to your site so information is easily findable and properly labelled, categorized, and placed within the system you use. Only after you create a sound URL design schema, free of any unsightly cruft, which succinctly encodes the hierarchy of your site will you be able to leverage your URLs to their fullest potential. (Phew, that was a mouthful!)

So, since this is not a post about URI design (see these resources for articles on that), let’s assume you’ve done all that and are ready to implement your schema. The tools you’ll need are:

  • The Apache web server.
  • Its mod_rewrite module.
  • Access to the server configuration file or the ability to use .htaccess files.
  • Later on, you’ll also need PHP, for the coding thing.

To make things simple, I’ll use my site as an example.

For my own site, I have several sections (“About,” “Services,” “News & Weblog,” etc.) and several top level pages that stand on their own (“Home,” “Contact”). Additionally, each section has an index page to ensure that no hierarchical level of my site is without its corresponding landing page.

This was going to be represented with straightforward URLs that looked like this:

  • The “Home” page would have this URL: http://www.maymay.net/maymaymedia/
  • So-called “top-level” pages would look like this:
    • For the “Contact” page: http://www.maymay.net/maymaymedia/contact/
    • For the “About Us” page: http://www.maymay.net/maymaymedia/about/
    • For the “Services” page: http://www.maymay.net/maymaymedia/services/
  • Second-level pages would have URLs such as these:
    • For each service:
      • http://www.maymay.net/maymaymedia/services/web-programming/
      • http://www.maymay.net/maymaymedia/services/web-accessibility/
      • http://www.maymay.net/maymaymedia/services/web-site-optimization/
    • For each “About” page, for instance:
      • http://www.maymay.net/maymaymedia/about/accessibility/
      • http://www.maymay.net/maymaymedia/about/philosophy/
      • http://www.maymay.net/maymaymedia/about/technology/

My files are stored pretty simply on my web server. Some pages, like the “Contact” page, are stored in the maymaymedia directory but not in their own sub-directory. Other pages, like the “Services” and “About” pages, are stored in the services and about directories, respectively.

This created filenames like maymaymedia/contact.php and maymaymedia/about/index.php. That .php stuff had to go!

Writing Your .htaccess File

Using the magic of mod_rewrite, a few relatively simple lines added to your .htaccess file are all it takes to turn these filenames into beautiful, cruft-free URLs.

Creating the sections was easy enough. Some of the work was already done by keeping the pages in their own directory (such as “Services,” for me) so I didn’t even have to do anything. Of course, to be safe, I could future-proof my site to abstract these directory trees using additional lines in my .htaccess file, but for the sake of simplicity we’ll leave it as is.

Here are a few relevant snippets from my .htaccess file:

RewriteEngine On
RewriteBase /maymaymedia/
[…]
RewriteRule ^about/technology/?$ /maymaymedia/about/tech.php
RewriteRule ^about/philosophy/?$ /maymaymedia/about/philosophy.php
RewriteRule ^about/accessibility/?$ /maymaymedia/about/access.php
[…]
RewriteRule ^services?/web-site-optimization/?$ /maymaymedia/services/wso.php
RewriteRule ^services?/web-programming/?$ /maymaymedia/services/webprogramming.php
RewriteRule ^services?/web-accessibility/?$ /maymaymedia/services/accessconsult.php
[…]
RewriteRule ^contact/?$ /maymaymedia/contact.php

These merely map simple URLs onto the appropriate files for the specific page on my server, creating the nice breadcrumb-like URI structure we can now use to automate the process of generating our hierarchical navigation bar.

There are two main points to keep in mind when creating such URL schemes.

  • Use a standardized syntax.

    Don’t mix and match symbols. Choose one syntax and stick with it. I used a hyphen (-) to simulate spaces, since spaces are not allowed in URIs. You could also use the underscore (_) if you wanted, but hyphens don’t require users to press the shift key.

  • Make the URL descriptive.

    Since we’re going to be using these URLs as our breadcrumb navigation labels, they need to be as descriptive as possible. Keep them short enough to easily type but long enough to provide context. For instance, I could have used wso instead of web-site-optimization but how many people know that that’s what WSO stands for?

    Besides, including the whole phrase boosts our search engine rankings by embedding our keywords into the URL itself! A nice bonus which also enhances usability.

    If you’re really worried about optimizations (for instance, because you’ll have to write all your links like href="/services/web-site-optimization/") then you can use a redirection script and point your links at something like href="/r/wso/" which would then automatically redirect to the expanded URL.
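As a rough illustration of that redirection idea, here is a minimal sketch of the lookup such a script might perform. Everything in it is an assumption made for the example: the `$short_links` map, the `wp` short code, and the `resolve_short_link()` function are hypothetical names, not code from this site.

```php
<?php
// Hypothetical map of short codes to expanded paths (illustrative only).
$short_links = array(
    'wso' => '/services/web-site-optimization/',
    'wp'  => '/services/web-programming/',
);

// Returns the expanded path for a request like "/r/wso/", or null if unknown.
function resolve_short_link($request_uri, $short_links)
{
    // Look only at the path, ignoring any query string.
    $path = parse_url($request_uri, PHP_URL_PATH);
    if (!is_string($path) || !preg_match('#^/r/([a-z0-9-]+)/?$#', $path, $m)) {
        return null;
    }
    return isset($short_links[$m[1]]) ? $short_links[$m[1]] : null;
}

// Possible usage inside the (assumed) script that handles /r/ requests:
// $target = resolve_short_link($_SERVER['REQUEST_URI'], $short_links);
// if ($target !== null) { header('Location: ' . $target, true, 301); exit; }
```

Keeping the lookup in a plain function separates it from the `header()` call, so the mapping can be checked without sending a real redirect.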

Coding the PHP Breadcrumb Navigation Bar

Now we’re ready to leverage the inherent advantages of good URL design to create our scalable PHP breadcrumb navigation script. I use a special PHP script called navbar.php to dynamically generate all of my navigation, so that’s where the code will go. navbar.php can then be SSI’ed or include()’ed on our page template so it will be present in all the pages we create from here on out.

The breadcrumb navigation script needs to do a few things. The code flow looks like this:

  • Grab the requested URI and break it into its path components.
  • Initialize (or remove) the first (“top”) element.
  • Count the remaining elements.
  • Check whether the last element is left over from a trailing slash, and remove it if necessary (such as if it’s empty or it’s just a query string).
  • Loop through all the elements, placing them within <li></li> HTML elements as well as <a></a> if we are not at the last level.
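The steps above can be sketched as a single function that returns the trail as data instead of printing it. This is only an illustration under a few assumptions: `breadcrumb_trail()` is a hypothetical name, the site is assumed to live at the web root (so there is no extra top-level directory to skip), and a last element is treated as a query string whenever it starts with `?`.

```php
<?php
// Hypothetical helper: turns a request URI into a list of breadcrumb entries.
function breadcrumb_trail($request_uri)
{
    // Break the URI into its path components.
    $parts = explode('/', $request_uri);
    // Remove the first element, which is empty because of the leading slash.
    array_shift($parts);
    // Drop a last element that is empty or only a query string.
    $last = end($parts); // false when the array is empty
    if ($last === '' || ($last !== false && $last[0] === '?')) {
        array_pop($parts);
    }
    // Count the remaining elements so we can spot the last level.
    $count = count($parts);
    $trail = array();
    foreach ($parts as $i => $part) {
        $trail[] = array(
            'label' => ucwords(str_replace('-', ' ', $part)),
            // Every level links to its own path, except the last one.
            'href'  => ($i < $count - 1)
                ? '/' . implode('/', array_slice($parts, 0, $i + 1)) . '/'
                : null,
        );
    }
    return $trail;
}
```

Returning an array rather than echoing markup keeps the sketch easy to check by hand; the `<li>` and `<a>` wrapping can then be a simple loop over the result.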

Stepping Through the Code

First, we start our HTML by providing a header (which we can later set to display:none; in our CSS) and starting the list element. Due to the specific nature of my site structure (the site starts in a sub-directory of my main site), I also chose to always print the main “Maymay Media Home” link. You don’t have to do this.

<h3>Hierarchical Navigation</h3>
<ul id="hier-nav">
<li id="hier-homeLnk"><a href="/maymaymedia/" title="Maymay Media Home">Maymay Media</a></li>

Next, we start the PHP magic, grab the URI the browser requested and break it on the slashes with explode().

<?php
$url_parts = explode('/', $_SERVER['REQUEST_URI']);

Now the $url_parts variable contains an array, each element of which is a segment of the path from the requested URI. Since a URI request always starts with a leading /, the first element of our $url_parts variable is an empty string. That’s somewhat useless for us, so we’ll get rid of it.

array_shift($url_parts); // first item always empty in URLs
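To make that concrete, here is a small illustration of what explode() and array_shift() do to one of the URLs listed earlier. The literal string stands in for `$_SERVER['REQUEST_URI']`, and `$example_parts` is a name used only for this example.

```php
<?php
// Stand-in for $_SERVER['REQUEST_URI'] (assumed example request).
$example_parts = explode('/', '/maymaymedia/about/philosophy/');
// $example_parts is now array('', 'maymaymedia', 'about', 'philosophy', '')

array_shift($example_parts); // drops the leading empty string
// $example_parts is now array('maymaymedia', 'about', 'philosophy', '')
```

Note the empty string at the end as well: it comes from the trailing slash.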

In my particular case, the second element will always correspond to the maymaymedia sub-directory I have my site in. This element is also useless to me, since I chose to always print this “top level” link before we even started the conditional code. So I get rid of the second element, too. (You may not want to do this in your script!)

array_shift($url_parts); // second item unneeded since always contains "maymaymedia" due to my site structure

Now we need to count how many elements are left in our array, which will tell us how many list items we need to make and where to find the end of our array so we can identify the last element. Remember, we’re going to need to check if the last element is empty or not because we don’t know if the visitor entered a trailing slash in the address bar.

If the last element is empty like the first one then we don’t need it. We also don’t need it if all it contains is a query string, so in either case we can pop it off the end of our array and decrement the variable we use to store the size of our array.

$num_parts = count($url_parts);
// remove last element if empty or if a query string
// (strict comparison, because empty() would also treat a segment named "0" as empty)
if ($num_parts > 0 && ($url_parts[$num_parts-1] === '' || $url_parts[$num_parts-1] == '?'.$_SERVER['QUERY_STRING']))
{
    array_pop($url_parts);
    $num_parts--; // decrement to keep track
}
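One case this check does not cover is a query string attached directly to the last segment with no slash before it (for example /about/philosophy?print=1), because then the last element is philosophy?print=1. A possible alternative, sketched here as an assumption rather than as part of the script, is to strip the query string with parse_url() before splitting. The function name `path_segments()` is hypothetical.

```php
<?php
// Hypothetical helper: splits only the path portion of a request URI.
function path_segments($request_uri)
{
    // parse_url() with PHP_URL_PATH discards any "?query" part.
    $path  = (string) parse_url($request_uri, PHP_URL_PATH);
    // trim() removes the leading and trailing slashes before splitting.
    $parts = explode('/', trim($path, '/'));
    // For "/" itself, explode() yields a single empty string; treat that as no segments.
    return ($parts === array('')) ? array() : $parts;
}
```

With this approach the leading empty element and the trailing-slash check both disappear, at the cost of departing from the step-by-step flow shown here.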

Now we have all the data we need to construct our breadcrumb navigation list. The first step in each pass through the loop is to start a new list item and insert a separator (I chose a “>” character), and then to format our URL strings into a more human-readable format. Basically, that just means replacing all of the “-” characters with spaces, and then capitalizing the first character of all the resulting words.

for ($i=0; $i<$num_parts; $i++)
{
    echo '<li> &gt; ';
    $p = htmlspecialchars(ucwords(str_replace('-', ' ', $url_parts[$i]))); // format 'things-like-this' to 'Things Like This', escaped because it comes from the requested URI

(Of course, advanced CSS coders will note that inserting a physical character in the HTML markup is unnecessary since we can use a CSS rule such as #hier-nav li:before { content: ">"; } to create this presentational separator instead. Unfortunately, Internet Explorer does not support this level of CSS yet, so we must resign ourselves to use a physical character in the markup for now.)

Then we make a quick check to ensure we’re not at the end of the trail. If indeed we’re not, then we need to turn our text into a link and point it to go to the appropriate level. Otherwise, since we’re at the end of the trail, we just print the text and end our list item.

Creating the link looks complicated, but it’s really not. Our array stores all the path components for us, so all we need to do is slice off the end of it so that all that remains are the elements we’ve looped through so far.

Once we’ve done that, we merely reassemble our URL by reversing the formatting process we used before. Namely, we implode() our array to turn it into a single string connected with the slashes, replace all spaces with dashes (-), and then turn the whole thing lowercase. We can do all of it on one line.

Also note that due to my particular site structure I’ve hard-coded the links to point to the root directory of this site. You’ll probably want to remove that in your own script.
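To see what that slicing produces, this small sketch builds the link target for each level of one example path. The `$segments` array and `link_for_level()` are hypothetical names used only here, and the hard-coded /maymaymedia/ prefix is left out.

```php
<?php
// Hypothetical helper: path for the breadcrumb at position $i (0-based).
function link_for_level(array $segments, $i)
{
    // Keep only the first $i + 1 segments, then glue them back together.
    return '/' . implode('/', array_slice($segments, 0, $i + 1)) . '/';
}

$segments = array('services', 'web-site-optimization');
// link_for_level($segments, 0) gives "/services/"
// link_for_level($segments, 1) gives "/services/web-site-optimization/"
```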

    // only link if not last time through loop
    if ($i != ($num_parts-1) )
    {
        echo "<a href=\"/maymaymedia/".htmlspecialchars(strtolower(str_replace(' ', '-', implode('/', array_slice($url_parts, 0, $i+1)))))."/\" title=\"Go up to $p\">$p</a></li>";
    }
    else // last time through loop, so don't do the link
    {
        echo "$p</li>";
    }
}
?>
</ul>

That’s it! After closing our list element, we’re done, and we now have scalable breadcrumb navigation based on an intelligent URI schema on every single page. Was it good for you too?

Additional URI Design Resources

(Some segments shamelessly swiped for my own easy reference from Pixelcharmer.)

A list of resources that argue for and suggest best practices in URI construction.

By the way, here’s the difference between the two: What is a URL? What is a URI?
