How to easily make friendly URLs in Apache - Part 1


Jim
May 09, 2006


http://quad341.com/

This is part one of a two-part tutorial on mod_rewrite. Part one applies directly to any language, while part two will cover a PHP-specific implementation.

The idea behind this tutorial is to change long URLs that pass variables or have messy names into something more readable (with examples!). For example, we want to take URLs that look like

http://example.com/news.php?month=04&day=03&title=hi+mom&kitchen-sink=true

and make them look nicer like

http://example.com/news/04/03/hi+mom/true

or something of the sort. There are many more uses for this than just that.

There are two general ways to do this with PHP (though the first applies to anything served by Apache, such as Perl). The first is well documented but requires some knowledge of regular expressions. The second will only be presented for PHP, but could be adapted to other languages; it lets PHP handle the processing instead of Apache. All variables are assumed to be submitted through the GET interface (the $_GET superglobal array).

1. In the above example, I present 4 variables: month, day, title, and kitchen-sink. These are two integers, a string, and a boolean. We could set up our .htaccess file (which is what controls everything about the redirection) to merely accept what is given to it and forward that, but we can also use this method to do some input filtering and only accept correctly formed data.

The .htaccess file sits in the root folder of where you want to redirect. This means that if you want to control from every slash (/) after the domain name, it must be in the root directory, but if you only want to control within the /rewrite-test/ directory, it can sit in the /rewrite-test/ directory itself. The .htaccess file is a set of rules that is presented to Apache to modify default values. We are actually working with the Rewrite Engine here. NOTE: this assumes that mod_rewrite is installed for Apache and you have enough permission to create and use the Rewrite Engine through a .htaccess file in your site's directory. If this does not work, consult your server administrator.

In order to use the rewrite engine, our .htaccess file must have the line

RewriteEngine On

above any rewrite rules. This enables rewriting. If you want to test the rewrite engine (and we will use this for our first example), add the next line to the file immediately below:

RewriteRule .* /index.php

Make sure you save and upload. This line can be explained as follows: as a rule, whatever the requested path is, send the request to /index.php. You can test this by going to any URL on your server (preferably a nonexistent file) such as http://www.example.com/nonexistingfolderthatshouldproducea404/
After going there (with your domain in place of example.com), you should see your index.php. This means it is working. If you get a 500 Internal Server Error, the file may contain some invalid characters. I found that adding a blank line to the top of the file helped with this on annoying servers.
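
Another possible cause of a 500 error is that mod_rewrite is not loaded at all, in which case Apache does not recognise the RewriteEngine directive. One common defensive pattern, shown here only as a sketch and not something this tutorial depends on, is to wrap the directives in an IfModule block so that they are skipped when the module is missing. Keep in mind that the rewrite then silently does nothing on such a server.

```apache
# Only apply the rewrite directives when mod_rewrite is actually loaded.
# Without the module, Apache skips this block instead of failing on an
# unknown directive.
<IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteRule .* /index.php
</IfModule>
```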

That isn't very useful, now is it? Anyway, that's a static rule. There are no conditions. This is where RewriteCond comes into play. These are the rewrite conditions; essentially, they are the if statements for redirection. Let's set this up so that the rule only applies if the requested filename (the requested resource/path) is not an existing file or folder. Our file will now look like the following:

RewriteEngine On

RewriteCond %{REQUEST_FILENAME} !-f

RewriteCond %{REQUEST_FILENAME} !-d

RewriteRule .* /index.php [L]

We know what the first line is, but the next three are weird! Let's inspect them one at a time.
The first RewriteCond line asks whether the requested filename is not a file. This line uses a few special characters. The second parameter specifies what is being checked. I have specified REQUEST_FILENAME, which is the filesystem path that the request maps to rather than the raw URL. Note that these variables begin with a %, are in all caps, and are enclosed in braces {}. After that comes what the selected variable is compared to. This comparison is special: the -f parameter tests for an existing file, and the ! means not, so together this means "not a file".
The second RewriteCond line does the same thing as the first, except that it tests for a directory (-d). Both conditions must hold for the rule to apply.
The third line looks similar to the rule from our first test, but it has a fourth parameter: [L]. What's that? This flag states that this is the Last rule in this section of redirects. This means that if you specify different rules below, they will not be looked at in this pass, and hence you won't redirect again. This is useful for not having to worry about the order of the redirects. Other flags can be specified and are separated by commas; see the mod_rewrite documentation for the full list.
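
As a small illustration of combining flags with commas, here is a sketch using two other standard mod_rewrite flags, R (external redirect) and NC (case-insensitive match). The paths old-page, new-page, about and about.php are hypothetical and only there to show the syntax; the file is assumed to sit in the site's root directory.

```apache
RewriteEngine On

# Hypothetical paths: send /old-page to /new-page with a permanent (301)
# redirect, then stop processing further rules in this pass.
RewriteRule ^old-page$ /new-page [R=301,L]

# NC makes the pattern match regardless of case (about, About, ABOUT);
# L again ends this pass of rules.
RewriteRule ^about$ /about.php [NC,L]
```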

We have now looked through the basics of mod_rewrite. Let's get back to our practical example. If you are confused by the regular expression that is the third parameter of the RewriteCond lines or the second parameter of the RewriteRule lines, I suggest you look up regular expression documentation or use Regex Buddy ( http://www.regexbuddy.com/ ). I use Regex Buddy to check complicated regular expressions even with knowledge of regular expressions.

In our example, we want to check that the first parameter is a number, the second is a number, the third is a string (note, URL encoded), and the fourth is a boolean. Our .htaccess file would look something like the following:

RewriteEngine On

RewriteCond %{REQUEST_URI} ^/news/([0-9]+)/([0-9]+)/([a-zA-Z0-9%_+-]+)/(true|false)

RewriteRule .* /news.php?month=%1&day=%2&title=%3&kitchen-sink=%4 [L]

The second line looks intimidating, but it really isn't. It is one long regex that searches for exactly what I specified above. Again, consult regex documentation for help with regex. Note that all of the parts that we want to select are in parentheses. This makes them available for reference in the rule. If you leave off the parentheses, the regex is still checked, but you cannot reference that part. The third line is the more interesting one. It rewrites the URL to what it originally was using those "back references". Note that all of the back references from the condition are referenced using the %[number] notation. If you have more than one RewriteCond, %[number] refers to the groups of the last condition that matched; the numbering does not carry on across conditions.

That takes care of this example, but here's an extra note because we won't get to it here: you CAN capture parts of the URL in the RewriteRule pattern itself and reference them. These back references are prefixed by a $ rather than a %, that is all. Now isn't that nice? What about using PHP to parse it? See part 2, coming soon.
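
To make the $ notation concrete, here is a minimal sketch of the same news example written with the captures in the RewriteRule pattern instead of a RewriteCond. It assumes the .htaccess file sits in the site's root directory. The + inside the character class and the trailing $ anchor are my own additions, so that a title like hi+mom matches and nothing may follow the boolean.

```apache
RewriteEngine On

# Same mapping as the RewriteCond version, but capturing in the rule itself.
# In a .htaccess file the pattern is matched against the path without its
# leading slash, so it starts at "news/" rather than "/news/".
# $1..$4 refer to the four parenthesised groups, in order.
RewriteRule ^news/([0-9]+)/([0-9]+)/([a-zA-Z0-9%_+-]+)/(true|false)$ /news.php?month=$1&day=$2&title=$3&kitchen-sink=$4 [L]
```

With this rule, a request for /news/04/03/hi+mom/true should be handed to news.php with month=04, day=03, title=hi+mom and kitchen-sink=true in the query string.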

 

--

This tutorial is licensed under Creative Commons by-nc-nd


Title: IIS Also August 2, 2006
Comment by Jim

There is a free mod_rewrite plugin for IIS also if you want to use it. It doesn't use .htaccess files and instead uses an INI file, but the syntax is identical.

Title: Revision August 2, 2006
Comment by Jim

The last RewriteCond line should read RewriteCond %{REQUEST_URI} ^/news/([0-9]+)/([0-9]+)/([a-zA-Z0-9%_+-]+)/(true|false) Note the closing parenthesis.
