Data Koncepts

Apache's mod_rewrite

Last updated: January 28, 2021.

Apache's low cost and powerful set of features make it the server of choice around the world. One of its real treasures is the mod_rewrite module, whose purpose is to redirect a visitor's request in the manner specified by a set of rules.

This article will lead you through the Why, Installation and Test, Regex, RewriteCond(itions), Flags, Comments, Linking, Introduced Problems, Examples and will Summarize with the best references I've discovered.

Why Redirect a URL?

The simple answer is to make them human-readable (commonly called "user friendly" or "Search Engine Optimized"). URLs with query strings (the URL's text after a question mark) confuse most visitors and are difficult for them to type correctly. By changing the URL, you can make your site more "user-friendly."

For example:

http://www.example.com/display.php?country=USA&state=California&city=San_Diego

could be changed to

http://www.example.com/USA/California/San_Diego

Critical Note: As the webmaster, YOU must create your links in the "new format" then create the mod_rewrite code to redirect that link to the file you wish to serve.

Other possible reasons might include:

  • Updating your website (new directory structure, file names or file extensions)
  • Ensuring Secure Server access to ecommerce scripts
  • Preventing hotlinking of your images
  • Blocking access to sensitive files and
  • Lots more (see the Examples section)

mod_rewrite has other uses, too, but let's get on to the basics first.

Server Setup

Some hosts do not have mod_rewrite enabled (it is, by default, not enabled). You can find out if your server has mod_rewrite enabled by using a script with the simple PHP code:

<?php phpinfo(); ?>

Look in the Apache2Handler section and, if mod_rewrite is not listed, you will have to ask your host to enable it - or find a "good host" (most hosts will have it enabled).

The following will describe how to enable and test mod_rewrite on your test server.

First, you will need to change the default Apache configuration (this is in Apache's httpd.conf file) by removing the "#" at the beginning of the line

# LoadModule rewrite_module modules/mod_rewrite.so

(Apache 1.3 also needed a matching AddModule mod_rewrite.c line; Apache 2.x dropped AddModule and uses LoadModule alone.)

While you're in the httpd.conf file, be sure that you have

<Directory />
    Options FollowSymLinks
    AllowOverride All
</Directory>
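
If opening up the whole filesystem with <Directory /> is more than you want on a test server, a narrower block should also work. This is only a sketch: the path /var/www/html is an assumed DocumentRoot, so substitute your own.

```apache
# Assumed DocumentRoot - replace /var/www/html with your own path
<Directory "/var/www/html">
    # mod_rewrite in .htaccess requires symlink following to be allowed
    Options FollowSymLinks
    # FileInfo is the override class that permits RewriteEngine,
    # RewriteCond and RewriteRule in .htaccess files
    AllowOverride FileInfo
</Directory>
```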

You will need to RESTART Apache for these changes to take effect.

Apache will now be running with mod_rewrite as you will see with another look at the phpinfo() output.

Test

To be sure that you have mod_rewrite installed and working properly, here is a simple test for you: Create three files, test.html, test.php and .htaccess.

test.html:

<h2>This is the HTML file.</h2>

and ...

test.php:

<h2>This is the PHP file.</h2>

Create the third file, .htaccess, with the following:

RewriteEngine on
RewriteRule ^test\.html$ test.php [L]

If you are using Notepad, you may have to save it as htaccess.txt, upload and change the name to .htaccess on the server.

Upload all three files (in ASCII mode) to your server and then type:

http://www.example.com/test.html

in the location box - using your domain, of course! If the page shows "This is the HTML file." then mod_rewrite is not working and you have got to start over. If it shows "This is the PHP file." it is working properly! Note, please, that the test.html URL has remained in the browser's location box.
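
To watch the difference between an internal rewrite and an external redirection, you could temporarily add the Redirect flag to the test. This is a sketch which assumes the three test files sit in your DocumentRoot (hence the leading / on the target):

```apache
RewriteEngine on
# R=302 sends a temporary redirect to the browser,
# so the location box changes to /test.php
RewriteRule ^test\.html$ /test.php [R=302,L]
```

Use 302 while testing; browsers cache a 301 and may keep redirecting after you have removed the rule.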

Specificity (your specification)

Whether you're changing from one URI to another or creating a whole new file structure (e.g., renaming all files from .html to .php or eliminating the file extension), you must create a specification for what your redirection will accomplish (and what it must NOT accomplish). To amplify, your specification needs to tell you in an unambiguous manner exactly what you want to change so mod_rewrite can match ONLY that URI and the redirection must NOT create a loop.

Matching: Do you want to match/redirect EVERYTHING (or NOTHING)? If not, eliminate (.*) NOW! If you want to remain at the same depth in your directory structure (highly recommended), eliminate /'s from your regex's character set. Uppercase letters? If not, you're pretty much left to lowercase letters, digits and the dot, dash and underscore (as allowed characters - ref: Uniform Resource Identifiers (URI): Generic Syntax by Tim Berners-Lee et al).

Redirection: Will your redirection loop? That's the primary problem with (.*) - although it will also pass unexpected garbage (or nothing at all). ALWAYS check that the redirection cannot be matched by the regex and, if it can, specify an exclusion. (WordPress users will know that WP redirects EVERYTHING to index.php with the exclusion that it will not redirect existing directories or files.)
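
As a sketch of the difference (page is a hypothetical query key and index.php a hypothetical handler script), compare a catch-all rule with a specific one:

```apache
RewriteEngine on

# TOO BROAD (left commented out): (.*) also matches the target index.php,
# so the rule is applied again to its own result
# RewriteRule ^(.*)$ index.php?page=$1 [L]

# SPECIFIC: one or more lowercase letters, no slash and no dot,
# so neither subdirectories nor index.php itself can match
RewriteRule ^([a-z]+)$ index.php?page=$1 [L]
```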

mod_rewrite Regex

Remember that you create the "new format" URIs then the mod_rewrite code to convert that into a file request for the script you want to provide.

Now we can begin with rewriting your URIs!

If you are not familiar with regular expressions (regex), there are many sites which provide excellent tutorials. At the end of this article, I have listed the best pages I have found: A tutorial, a "cheat sheet," a very nice text editor with regex capabilities and a test tool for your regex. If you are not able to follow my explanations, review the first two of those links.

Problem: Display city information based on the country, state and city requested.

To change

http://www.example.com/USA/California/San_Diego

to

http://www.example.com/display.php?country=USA&state=California&city=San_Diego

so your display script can read and parse the query string, you will need to use regex to tell mod_rewrite what to attempt to match.

Too many people just use the (.*) to select (NOTHING OR) EVERYTHING in an "atom" (an Apache variable you can create and use within mod_rewrite) and try to pass that along to the redirection string. In this case, you would need three of these atoms separated by the subdirectory slashes ("/") so the regex would become:

(.*)/(.*)/(.*)

Note #1: (.*) combines two metacharacters, the dot character (which means ANY character) and the * character (which specifies ZERO or MORE of the preceding character) within an atom (Apache variable created by mod_rewrite). Thus, (.*) matches EVERYTHING in the {REQUEST_URI} string ({REQUEST_URI} is that part of the URL which follows the domain up to but not including the ? of a query string and is the ONLY Apache variable that a RewriteRule can attempt to match). With the above regex, the regex engine will progress to learn that you have required two slashes (anywhere) in the string. For our purposes, though, we need to capture the three values in the {REQUEST_URI} so I've used the slashes to separate them.

To tell mod_rewrite that the URI should begin and end with this string, we add the start anchor (^) and end anchor ($) so the regex becomes:

^(.*)/(.*)/(.*)$

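Note #2: Whether the leading slash is present depends on where your mod_rewrite code lives, not on the regex engine: in an .htaccess file Apache strips the directory path (including the leading slash) before the RewriteRule attempts its match, so a regex starting with ^/ can never match there, while in httpd.conf (server or virtual host context) the string to be matched begins with the slash. I satisfy both by making the leading slash optional, i.e., ^/? (? is the metacharacter for zero or one of the preceding character) but I'll use the .htaccess version, the ^.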
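
The optional leading slash from Note #2 can be sketched with the earlier test rule. The absolute /test.php target is an assumption that the test files are in the DocumentRoot:

```apache
RewriteEngine on
# ^/? accepts the string with or without a leading slash
RewriteRule ^/?test\.html$ /test.php [L]
```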

This allows TOO MUCH to be sent to your query string – often a security hazard – and, when used inappropriately, WILL cause mod_rewrite to loop! To avoid unnecessary problems, I'll change the EVERYTHING atoms to specify exactly the characters I will allow. Thus, the first atom (USA) can be matched by ([A-Z]+) which ONLY allows one or more uppercase letters (the "+" metacharacter specifies one or more of the preceding character while the "*" metacharacter specifies zero or more – I want to ensure at least one character in the range from A to Z). California contains both uppercase and lowercase letters so this atom becomes ([a-zA-Z]+). San_Diego also contains an underscore (replacing the space which would display as the "ugly" %20 in the URI) so this atom becomes ([a-zA-Z_]+) and, with the slashes separating the atoms, we have:

^([A-Z]+)/([a-zA-Z]+)/([a-zA-Z_]+)$

All that would be well and good if the only country was USA but we'll need to expand the regex for other countries and allow an underline to replace the spaces in the "North," "South," "West" and "New" states so the regex would expand once again to:

^([a-zA-Z_]+)/([a-zA-Z_]+)/([a-zA-Z_]+)$

Note #3: If you have a short list of allowable countries, it would be best to avoid database problems by specifying the acceptable values with regex:

^(USA|Canada|Mexico)/([a-zA-Z_]+)/([a-zA-Z_]+)$

Note #4: If you are concerned about people typing in CAPS when your database is strictly lowercase, have regex ignore the case by adding the No Case flag ("[NC]") after the redirection. Just don't forget to convert to lowercase in your script after obtaining the $_GET array! More on flags later.
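
Applied to the Note #3 regex, the No Case flag might look like the sketch below (the whole RewriteRule goes on one line). The display.php target and its keys are the ones from the example at the top of this article:

```apache
RewriteEngine on
# [NC] lets usa/california/san_diego match as well as USA/California/San_Diego
RewriteRule ^(USA|Canada|Mexico)/([a-zA-Z_]+)/([a-zA-Z_]+)$ display.php?country=$1&state=$2&city=$3 [NC,L]
```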

Note #5: Since URLs can't have spaces (except as %20), use underlines or hyphens to replace them. If you ABSOLUTELY have to use spaces (%20) in your URIs, you can include them in your regex within a range definition as \{space}, i.e., ([a-zA-Z\ ]+). However, this is NOT advised.

Note #6: If you are converting to/from a database field which does contain spaces, you should convert the spaces to some other character. Using PHP, you can use

$state = str_replace ( ' ', '_', $state );

before placing $state in the link and reverse the process with

$state = str_replace ( '_', ' ', $state );

before matching $state to the database field. Using _'s is better than -'s because text can often include the hyphen character which would be converted to a space by this code and is better than %20 in the URI as spaces require special treatment in the regex and redirection.

With the regex in hand, you can now map the atoms to the query string:

display.php?country=$1&state=$2&city=$3

where display.php is the name of the script, $1 is the first (country) atom, $2 is the second (state) atom and $3 is third (city) atom. Note that there can only be nine atoms created, $1 … $9 (the tenth, $0, is the entire target string, the {REQUEST_URI}).

Almost there! Open a New document with EditPad (or your text editor) and type:

RewriteEngine on
RewriteRule ^([a-zA-Z_]+)/([a-zA-Z_]+)/([a-zA-Z_]+)$ display.php?country=$1&state=$2&city=$3 [L]

Note #7: The RewriteRule must go on ONE line with one space between the RewriteRule, the regex and the redirection (and before any optional flags). NotePad indiscriminately inserts line returns in long lines so you're far better off using a good text editor (see references at the end).

Note #8: If you won't always have the city or the state and city, then you can easily make them optional replacing the above with:

RewriteEngine on
RewriteRule ^([a-z]+)(/([a-z]+)(/([a-z]+))?)?$ display.php?country=$1&state=$3&city=$5 [L]

where

  • the first atom is still the country,
  • the second atom is (/([a-z]+)(/([a-z]+))?)
            which is made optional with the trailing ?
  • the third atom is now the state
            which is NOT optional if the second atom is present
  • the fourth atom is (/([a-z]+))
            which is made optional with its trailing ? and
  • the city is the fifth atom
            which cannot be present without both the country and the state

If the optional atoms confused you, use three separate statements. Optional atoms are NOT mandatory, just an easy way to combine several statements into one.
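
The three separate statements could look like the following sketch, which uses the same lowercase-only atoms as Note #8. Each rule handles one depth, so the atoms are numbered simply $1, $2 and $3:

```apache
RewriteEngine on
# country only
RewriteRule ^([a-z]+)$ display.php?country=$1 [L]
# country and state
RewriteRule ^([a-z]+)/([a-z]+)$ display.php?country=$1&state=$2 [L]
# country, state and city
RewriteRule ^([a-z]+)/([a-z]+)/([a-z]+)$ display.php?country=$1&state=$2&city=$3 [L]
```

Because the atoms allow neither a dot nor a slash, the target display.php cannot be matched by any of the three rules.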

Save this as .htaccess in the directory where display.php resides.

If you want to use digits (0, 1, ... 9) for, say, Congressional Districts, then you'll need to change an atom's specification from ([a-zA-Z_]+) to ([0-9]) to signify a single digit, ([0-9]{1,2}) for one or two digits (0 through 99) or ([0-9]+) for one or more digits (0 through ...; useful for database id's).
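
A minimal sketch of a digits-only atom, assuming a hypothetical district/ prefix in the link and a hypothetical district key for display.php:

```apache
RewriteEngine on
# one or two digits only: district/7 or district/42 match, district/123 does not
RewriteRule ^district/([0-9]{1,2})$ display.php?district=$1 [L]
```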

The RewriteCond(ition) Statement

Now that you have learned to match mod_rewrite's basic RewriteRule(s) with the {REQUEST_URI} string, it's time to learn to use conditionals to access other variables with the RewriteCond(ition).

RewriteCond is similar in format to the RewriteRule in that you have the command name, RewriteCond, a variable to be matched, the regex and flags (the logical OR flag is a useful flag to keep in mind as RewriteConds are ANDed by default).

The best list of Server Variables I've found is located here.

For an example, let me assume that you want to force the www in your domain name (and you don't have subdomains to be concerned with). To do this, you will need to test the Apache {HTTP_HOST} variable to see if the www. is already there and, if not, redirect.

RewriteEngine on
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule .? http://www.example.com%{REQUEST_URI} [R=301,L]

Here, to denote that {HTTP_HOST} is an Apache variable, we must prepend a %. Then, the regex says to match the logical negation of (i.e., NOT) (start anchor to match the start of the {HTTP_HOST} string) www, an escaped dot (meaning that it ONLY matches the dot character), the domain name example, another escaped dot, and com (end anchor to match the end of the {HTTP_HOST} string). The No Case flag ([NC]) is necessary because a domain name is not case sensitive. AND …

The RewriteRule says to match zero or one of anything then redirect to http://www.example.com with the original {REQUEST_URI}. The R=301 tells the browser (and search engines) that this is a permanent redirection and the Last flag tells mod_rewrite that you've completed your redirection.

RewriteCond statements can also create atoms via their regex but these are denoted by %1 … %9 the same way that RewriteRule atoms are $1 … $9. You'll see these in operation in the Examples.
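
A minimal sketch of a RewriteCond atom, ahead of the Examples. The names article.php, id and /articles/ are all hypothetical, chosen only for illustration:

```apache
RewriteEngine on
# capture the digits of id=123 from the query string as %1
RewriteCond %{QUERY_STRING} ^id=([0-9]+)$
# %1 comes from the RewriteCond; the trailing ? discards the old query string
RewriteRule ^article\.php$ /articles/%1? [R=301,L]
```

You would still need a second rule (or real files) to serve /articles/123.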

Flags

mod_rewrite uses "flags" to give your mod_rewrite code additional power. I've used the Last, Redirect and No Case flags above but the main ones you'll need to be familiar with are:

last|L
The Last flag ([L]) tells Apache to stop processing the remaining RewriteRules upon a match and perform the redirection.
nocase|NC
The No Case flag ([NC]) tells Apache to ignore the case of the string in the regex and is best used when applied in a RewriteCond statement examining the {HTTP_HOST} variable for the domain name (which is NOT case sensitive).
redirect|R
The Redirect flag ([R]) is most often used with special notation telling Apache to send a Permanent Redirect code to the browser so it's used as [R=301]. This is when you need to see the redirection in testing or you want SE's and visitors to know that the redirection is permanent as the default is 302, temporary redirection.
qsappend|QSA
The Query String Appended flag ([QSA]) is used to "pass-through" existing query strings. You can define your own query string to which the old string will be appended so be careful not to replicate key names. Failure to use the QSA flag will cause the creation of a query string in a redirection to destroy an existing query string.
forbidden|F
The Forbidden flag ([F]) is used to tell Apache when NOT to provide a page in response to a request. This is used to protect a file against viewing by unauthorized visitors, bandwidth leeches, etc.
or|OR
The OR flag ([OR]) is useful when combining mod_rewrite statements and you do not want them to be logically ANDed.
next|N
The Next flag ([N]) tells mod_rewrite that you do not want it to go through the remaining mod_rewrite statements but to start over at that point. This is useful in replacing characters in the {REQUEST_URI} when you don't know how many there will be.
skip|S=n
The Skip flag ([S=n]) tells mod_rewrite that you want to skip the next n RewriteRules. This is useful as it acts like an if...then...else OR GOTO in the mod_rewrite code.

RewriteEngine on
RewriteCond %{HTTP_HOST} example\.com$ [NC]
RewriteRule .? - [S=2]

RewriteRule ^subdir1/(.*)$ subdir2/$1 [L]
RewriteCond %{QUERY_STRING} ^$
RewriteRule ^index\.php$ index.php?marker=on [L]

# the first RewriteCond/RewriteRule act as the if and then
# skipping the else of the next two RewriteRules
# (and their associated RewriteCond statements)
# These can be followed by other mod_rewrite statements.

There are other flags but you can get their definitions from Apache.org's mod_rewrite documentation.
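
As a small sketch of the Query String Append flag in action (sort is a hypothetical extra key a visitor might add), compare what display.php receives with and without it:

```apache
RewriteEngine on
# Request: /USA?sort=name
# With QSA:    display.php?country=USA&sort=name
# Without QSA: display.php?country=USA   (sort=name is lost)
RewriteRule ^([A-Z]+)$ display.php?country=$1 [QSA,L]
```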

mod_rewrite Comments

While the RewriteEngine on statement tells Apache to "start your engines," it is also the master switch for your mod_rewrite code. As a good programmer, you know how important comments are in your code. Apache allows comments after a # at the beginning of a line (NOT the // or /* … */ of PHP, which Apache will reject with a "500" error), so you can comment out an entire block of mod_rewrite code by putting a # in front of each line:

RewriteEngine on
# the following code will not be parsed by Apache
# RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
# RewriteRule .? http://www.example.com%{REQUEST_URI} [R=301,L]
# any following mod_rewrite code without a leading # will be parsed

RewriteEngine off can be very helpful when developing new mod_rewrite code, but it is a switch for ALL the mod_rewrite code in that .htaccess file: Apache keeps only the last RewriteEngine setting it finds there, so a RewriteEngine off / RewriteEngine on pair cannot wrap a block the way /* … */ wraps PHP comments.

WARNING: Do not use RewriteEngine statements to hide your mod_rewrite code if you don't have mod_rewrite enabled as you will get the same "500" error as if you used the "foo directive" (merely placing foo on a line in your .htaccess file). RewriteEngine is itself a mod_rewrite directive.

Note: You only need ONE RewriteEngine on statement per .htaccess file.

mod_rewrite Links*

As a webmaster, it is for YOU to determine how your pages will be identified to visitors as well as how to rewrite those URIs so Apache can serve the appropriate content.

Since nobody yet knows that you have made your links "user-friendly" (nor how you have formatted them), YOU have to create the links in your site's pages. You can use an editor (like Dreamweaver) which will perform multiple find and replace actions across your website (because you did not know about user-friendly URLs when you built it).

In the example in the section above, I used countries, states and cities – items that would be unique in a database. As I build websites for clients to update themselves, it is not reasonable for me to insist that they provide unique names for all their articles so database articles are typically identified by an auto-incremented ID. That's all that's required to pick a single article out of that database! So long as you can use a unique key, you will be able to use any key in your query string.

* There have been many questions about how to use a database to redirect from a title (or other field) to an ID. Unless you have access to your httpd.conf (in order to create a RewriteMap application), forget about using a database for your redirections. Instead, make the field of choice unique and use that field to create your links. The only thing to remember is that spaces appear as %20 in URLs so convert them before creating the link and back after obtaining the string in the $_GET array – the str_replace() code I offered above is perfect for this.

WARNING: There are other characters which are "reserved," "unreserved" or must be "escaped." There is a rather technical article which identifies them, Uniform Resource Identifiers (URI): Generic Syntax. Obviously, you'll need to remove or escape these characters as appropriate.

Relative Links Are Missing!

Sorry, you are not ready yet, though, because, when you test your user friendly URLs, they work the same as the original links except that all your CSS, javascript files and images have disappeared! You can blame mod_rewrite if you like but it is your fault as you have used URLs that tell Apache that the script is in another directory (in my example, you are considered to be in the USA/California/ subdirectory – San_Diego would be the script's name) which is two subdirectories deeper into the website than display.php!

To get around this seeming "bad feature" of mod_rewrite, you can use absolute links throughout your site instead of relative links OR use HTML's <base> tag to identify the real location:

<head>
<!-- ... other head tags withOUT relative links ...-->
<base href="http://www.example.com/display.php" />
<!-- ... other head tags with relative links ... -->
</head>

Note that an absolute link (with either a leading / to denote DocumentRoot OR the full URL) is required as you are trying to "fix" the problem with relative links.

Problems With WordPress mod_rewrite

WordPress (WP) installs their mod_rewrite code in the .htaccess in your DocumentRoot but there are problems which you need to know about. First, their latest code:

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /{path/}
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /{path/}index.php [L]
</IfModule>
# END WordPress

Okay, what's wrong with this?

  1. First (and foremost), every webmaster should know NOT to use tests (here, the <IfModule mod_rewrite.c> wrapper) within an .htaccess file. Why not (the novice webmaster asks)? Because the .htaccess file must be read and parsed MULTIPLE times for every file request. Because Apache must perform the test for each instance, it's very wasteful of Apache resources. A webmaster would include any test in the server or domain's conf file (where it's only read once) OR test once and then, assuming no 500 error, leave the test commented out. Please note that I'm not condemning WP for this because they're coding for the bottom 5% of webmasters, specifically the ones who would whine that "WP broke my website" when they should have known that their host did not enable mod_rewrite on their servers.
  2. Next, the RewriteBase statement is used to offset the WP installation from the DocumentRoot. If your WP installation is in the DocumentRoot, then RewriteBase / accomplishes NOTHING, thus, it should be removed.
  3. The first RewriteRule, the one testing for index.php, is redundant (assuming that index.php exists - and it had better for WP to work at all!).

If you're going to use third party code, KNOW what it's doing and be sure it fits your website!
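
Applying the three points above to a WP installation in the DocumentRoot leaves something like the following sketch. It is one reading of the critique, not WordPress's own code, and WordPress may regenerate the block between its BEGIN and END markers when permalink settings are saved:

```apache
# Assumes mod_rewrite is known to be enabled and WP is installed in the DocumentRoot
RewriteEngine On
# leave requests for real files and real directories alone
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# send everything else to WP's front controller
RewriteRule . /index.php [L]
```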

Examples

Let's get on to examples which combine these basic structures to do some useful work!

Replace A Character

Once you've discovered that the hyphens (dashes) in your URLs are causing problems (with your regex as well as converting to and from your database fields), you'll want to change them to underscores (the underline character). The problem is that you don't know how many hyphens you have in your URLs so you'll use regex to repetitively replace the hyphen:

RewriteEngine on
RewriteRule ^(.*)-(.*)$ $1_$2 [N,L]

The Next flag tells Apache to restart the mod_rewrite rules (upon successful match and redirection). Unfortunately, you'll need to do further processing to be able to use an R=301 on the resultant redirection so that others will know you've changed your URL format so do this first.

Unlimited key/value pairs

If you followed the Regex section above to its conclusion, you might guess that there is a limit to the number of key/value pairs. There is: As already explained, the number of Apache variables that can be created is nine. If you need more, however, don't despair!

Using the Next flag, I've just demonstrated how to change an unlimited number of -'s to _'s. We'll now extend that to unlimited key/value pairs.

RewriteEngine on
RewriteRule ^([a-z]+)/([a-zA-Z0-9_]+)/((.+)/)*redirect\.php$ $3redirect.php?$1=$2 [QSA,N,L]

This will capture a new key/value pair with the first two atoms ($1 and $2), anything "leftover" with $3 (which includes the trailing /) and redirect to the "leftover" with the redirect.php script remaining as the target with the key/value pair ADDED to any existing query string by the Query String Append flag before the process is restarted by the Next flag - the Last flag ensures that the mod_rewrite statement is terminated (not ANDed with any following statements).

If you don't want to show the redirect script in the URL, you'll need to account for the final redirection another way.

RewriteEngine on
RewriteRule ^([a-z]+)/([a-zA-Z0-9_]+)(/(.*))*$ $4/?$1=$2 [QSA,N,L]

RewriteCond %{QUERY_STRING} =
RewriteRule ^$ redirect.php [L]

Here, I've captured the key and value pairs with the first two atoms again and used the third to capture anything else. Assuming that the atoms are properly paired, the result will be a query string in the DocumentRoot. Assuming that the DirectoryIndex (normally index.php or index.html) is not the target of your redirection (and does not receive a query string), the existence of a query string (as denoted by finding an = within the query string) is used as a marker to effect a redirection to the script which will handle the redirect.

WARNING: Do NOT exceed 255 characters in your URI. (I recall 255 as the limit but I can't find the source to confirm.)

Force www for a Domain

[repeated from above]

If you want to force a browser to use the full domain with the www. prefix, you will need to test the Apache {HTTP_HOST} variable to see if it already exists and, if not, redirect.

RewriteEngine on
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule .? http://www.example.com%{REQUEST_URI} [R=301,L]

If you have subdomains, however, preserve the subdomain like this:

RewriteEngine on
RewriteCond %{HTTP_HOST} ^([a-z.]+\.)?example\.com$ [NC]
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule .? http://www.%1example.com%{REQUEST_URI} [R=301,L]

Capture the optional subdomain and, if it does not start with www., redirect with www. prepended to the subdomain and domain with the original {REQUEST_URI}.

Eliminate www for a Domain

Going the other way (getting rid of the www prefix)?

RewriteEngine on
RewriteCond %{HTTP_HOST} !^example\.com$ [NC]
RewriteRule .? http://example.com%{REQUEST_URI} [R=301,L]

Get rid of the www but preserve a subdomain with:

RewriteEngine on
RewriteCond %{HTTP_HOST} ^www\.(([a-z0-9_]+\.)?example\.com)$ [NC]
RewriteRule .? http://%1%{REQUEST_URI} [R=301,L]

Here, the subdomain is captured in %2 (the inner atom) but, since it's optional and already captured in the %1 Apache variable, all you need is the %1 for the subdomain and domain without the leading www.

Prevent Image Hotlinking

If some unscrupulous webmaster is stealing your bandwidth (leeching) by linking to images on your site to post on his:

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?example\.com/ [NC]
RewriteRule \.(gif|jpg)$ - [F]

This example uses the optional list to select just GIF and JPG images – do not allow a space in that list and remember, example.com is your site!

If you are upset enough at these pirates, you could change the image and feed something to let his visitors know he's hotlinking. Just don't forget to exclude the hotlinked image:

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?example\.com/.*$ [NC]
RewriteCond %{REQUEST_URI} !^/hotlinked\.gif$
RewriteRule \.(gif|jpg)$ http://www.example.com/hotlinked.gif [R=301,L]

Of course, these both require the visitor to have his HTTP_REFERER enabled (most browsers do by default).
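
A variation you may want, sketched under the assumption that your site is also served over https and uses PNG and .jpeg files:

```apache
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
# https? accepts both http:// and https:// referrers from your own site
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# jpe?g matches .jpg and .jpeg; NC also catches .JPG, .PNG, etc.
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]
```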

Block specific hotlinkers with:

RewriteEngine on
RewriteCond %{HTTP_REFERER} ^http://(www\.)?leech_site\.com/ [NC]
RewriteRule \.(gif|jpg)$ - [F,L]

This blocks visitors coming from the leecher's site from viewing GIF and JPG files.

Rather allow (or forbid) visitors from specific IP Addresses? Use {REMOTE_ADDR} instead like:

RewriteEngine on
RewriteCond %{REMOTE_ADDR} !^192\.168\.1\.1$
# NOT your LAN address
RewriteCond %{REMOTE_ADDR} !^127\.0\.0\.1$
# NOT your localhost address
RewriteRule \.(gif|jpg)$ - [F]
# Fail to serve GIF or JPG images to non-specified IP Addresses

Redirect to a 404 Page

If your host doesn't provide for a "file not found" redirection, create it yourself!

# you SHOULD be using
# ErrorDocument 404 /error.php
# instead of mod_rewrite

RewriteEngine on
# not a file
RewriteCond %{REQUEST_FILENAME} !-f
# not a directory
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .? /404.php [L]

This script checks to see that the requested filename does not exist and then that it is not really a directory before it redirects to the DocumentRoot's 404.php script. Extend this just a bit by including the URI in a query string by adding ?url=$1 immediately after the /404.php:

RewriteEngine on
# not a file
RewriteCond %{REQUEST_FILENAME} !-f
# not a directory
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /404.php?url=$1 [L]
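
For comparison, the ErrorDocument approach recommended in the comment at the top of the first code block needs no mod_rewrite at all. A sketch, assuming your 404.php lives in the DocumentRoot:

```apache
# Apache serves /404.php for any request it cannot find
# and sends the 404 status code itself
ErrorDocument 404 /404.php
```

With this approach Apache should make the originally requested path available to the script in the REDIRECT_URL environment variable, so the ?url=$1 addition is not needed.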

Rename Your Directories

You've shifted files around on your site changing directory name(s):

# mod_alias can do this faster without the regex engine
RewriteEngine on
RewriteRule ^old_directory/([a-z/.]+)$ new_directory/$1 [R=301,L]

Note that I've included the dot character (not the "any character" metacharacter) inside the range to allow file extensions but the a-z will accept only lowercase characters. If you need uppercase, you know from above how to modify this code.
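
The mod_alias alternative mentioned in the comment above might be sketched like this, using the same hypothetical directory names:

```apache
# mod_alias: no RewriteEngine needed. The regex is matched against the
# full URL path, so (unlike in .htaccess RewriteRules) it starts with a slash
RedirectMatch 301 ^/old_directory/([a-z/.]+)$ /new_directory/$1
```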

Convert .html Links to .php Links

Updating your website but need to be sure that bookmarked links will still work?

RewriteEngine on
RewriteRule ^([a-z/]+)\.html$ $1.php [L]

This is an internal rewrite, not an external redirection, so it will be invisible to your visitors. To make it a permanent (and visible) redirection, change the flag to [R=301,L]. Obviously, this will also work for changing any file extension from one to another by changing the html and php above.

Extensionless Links

Need to make your links easier to remember or just want to hide your file types?

Typically you're only using either .html or .php files so:

RewriteEngine on
RewriteRule ^([a-z]+)$ $1.php [L]

Someone has asked about using extensionless URIs for both .html and .php files. Requiring that both php and html extensions be considered requires that you use RewriteCond statements to check whether the filename with either extension exists as a file:

RewriteEngine on
# Test php first - gives preference in case you have both
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^([a-zA-Z0-9]+)$ $1.php [L]

# then test for the .html extension
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^([a-zA-Z0-9]+)$ $1.html [L]

As in the 404 example, the -f checks for the existence of a file.

Redirect TO New Format

I have fielded questions where someone wanted to redirect their real URIs to extensionless URIs so search engines would update to their new, extensionless format. Okay, Apache can do that but it cannot serve scripts in the new format (they have to be redirected back to the real link!). Have I got your head spinning?

I do NOT recommend this (unless you're on a dedicated server with low volume) as it requires additional processing by Apache.

The key to this is the {IS_SUBREQ} test (the RewriteCond counterpart of the No Subrequest flag), which is meant to prevent redirection if a request has already been redirected.

# Assumes "usable link" is index.php?id=alpha
# and alpha is the extensionless link
# Redirect to NEW format
RewriteCond %{IS_SUBREQ} false
RewriteCond %{QUERY_STRING} id=([a-zA-Z]+)
RewriteRule ^index\.php$ %1? [R=301,L]

# Redirect back to "usable link"
RewriteRule ^([a-zA-Z]+)$ index.php?id=$1 [L]

 

Here, the original http://www.example.com/index.php?id=something has not been redirected, so it is redirected to http://www.example.com/something. Then the second RewriteRule finds the something and rewrites it back to index.php?id=something, which the {IS_SUBREQ} check is intended to keep from being redirected again.
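Whether the {IS_SUBREQ} test stops the second pass can depend on how your server handles the internal rewrite. If you run into a redirect loop, an alternative that is often used is to test {THE_REQUEST}, which holds the original request line sent by the browser (e.g., GET /index.php?id=something HTTP/1.1) and does not change after an internal rewrite. A sketch, assuming the same index.php?id=alpha layout as above and an .htaccess file in the DocumentRoot:

```apache
RewriteEngine on
# Redirect only when the browser itself asked for index.php?id=...
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/index\.php\?id=([a-zA-Z]+)\s
RewriteRule ^index\.php$ /%1? [R=301,L]

# Rewrite the extensionless link back to the "usable link"
RewriteRule ^([a-zA-Z]+)$ index.php?id=$1 [L]
```

After the internal rewrite, {THE_REQUEST} still shows the extensionless request, so the first rule's condition fails and no second redirect is issued.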

RewriteMap

A major problem arises for a webmaster who has redesigned a website without preserving the old links. The problem is that the old links will return 404s unless a mapping from the old link to the new can be implemented. This is where the RewriteMap shines!

Defined in the server or virtual host configuration files, the syntax is:

RewriteMap MapName MapType:MapSource

where:

MapName is the name you assign,
MapType is one of the following: txt, rnd, dbm, int, prg or dbd/fastdbd

txt - A plain text file containing space-separated key-value pairs, one per line.
rnd - Randomly selects an entry from a plain text file.
dbm - Looks up an entry in a dbm file containing name, value pairs.
int - One of the four available internal functions provided by RewriteMap: toupper, tolower, escape or unescape.
prg - Calls an external program or script to process the rewriting.
dbd or fastdbd - A SQL SELECT statement to be performed to look up the rewrite target.

Depending upon the number of mappings required, it's probable that a simple text file will suffice, i.e., the MapSource will be the absolute path to a text file like:

Link1 NewLink1
Link2 NewLink2
Link3 NewLink3
…

and would be called after checking for the existence of the file or directory by:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-z]+\.html)$ ${MapName:$1|NOTFOUND} [L]

where NOTFOUND is the default value that is returned when the requested file is not found in the text map.
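Putting the pieces together, a minimal sketch for the virtual host configuration might look like the following. The map name oldlinks and the path /path/to/oldlinks.txt are hypothetical, the map values are assumed to be paths relative to the site root, and I've used a visible 301 redirect so that search engines pick up the new links:

```apache
# Server or virtual host configuration (RewriteMap cannot be defined in .htaccess)
RewriteEngine on
RewriteMap oldlinks txt:/path/to/oldlinks.txt

# Look the requested name up in the map; continue only if an entry was found
# ($1 is the first group of the RewriteRule pattern below, %1 is the map value)
RewriteCond ${oldlinks:$1} ^(.+)$
RewriteRule ^/([a-z]+\.html)$ /%1 [R=301,L]
```

Here the RewriteCond takes the place of the NOTFOUND default: when the key is missing, the lookup returns an empty string, the condition fails, and the request falls through to the normal 404 handling. Note the leading slash in the pattern, which is needed in the server or virtual host context.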

For our purposes, neither random file selection nor internal functions are appropriate.

The dbm (binary database of key-value pairs) and dbd/fastdbd types are useful as they are virtually the same as the txt file but require the mod_auth_dbm or mod_dbd module to access.

That leaves the prg (program) as an alternative to txt. In this case, the MapSource should lead to an application file (PHP or Perl) to access a database table and return the link (or a tailored 404 file) to mod_rewrite.

Unfortunately, none of the above is useful to the average webmaster as they will not have access to the server or virtual host configuration file.

Stop and think a moment. If an old link no longer exists, mod_rewrite can redirect to a 404 handler script which can look into a text list or database to determine whether the request was to a replaced file and select the replacement. If a replacement is found, the handler file could use a header("Location:{redirection}"); to redirect to the replacement file! I call this a "Poor Man's RewriteMap" but, in reality, it's just a smart 404 handler.

A note from php.net: "Note: The HTTP status header line will always be the first sent to the client, regardless of the actual header() call being the first or not. The status may be overridden by calling header() with a new status line at any time unless the HTTP headers have already been sent."

Translated, that means that the "Poor Man's RewriteMap" needs to send:

header("Location: {redirection}", true, 301); // 301 = permanent redirection
exit; // stop here so that nothing else is output

AND this must be done before any output is made from the 404 handler.

Check for Key in Query String

If you need to have a specific key's value in your query string, you can check for its existence with RewriteCond:

RewriteCond %{QUERY_STRING} !uniquekey=
RewriteRule ^some_script_that_requires_uniquekey\.php$ some_script.php [QSA,L]

... will check the {QUERY_STRING} variable for the lack of the key "uniquekey" and, if the {REQUEST_URI} is the script_that_requires_uniquekey, it will rewrite the request to some_script.php. If you are looking for a unique value instead, use the value without the = in the RewriteCond statement:

RewriteCond %{QUERY_STRING} !uniquevalue
RewriteRule ^some_script_that_requires_uniquevalue\.php$ some_script.php [QSA,L]
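One caveat with the unanchored test is that !uniquekey= is also satisfied by a longer key that merely ends in uniquekey (e.g., notuniquekey=1). A slightly stricter sketch, using the same placeholder script names, anchors the key to the start of the query string or to a preceding &:

```apache
RewriteEngine on
# (^|&) anchors the key at the start of the query string or after an &
RewriteCond %{QUERY_STRING} !(^|&)uniquekey=
RewriteRule ^some_script_that_requires_uniquekey\.php$ some_script.php [QSA,L]
```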

Delete the Query String

Apache's mod_rewrite automatically passes through a query string UNLESS you

  1. Assign a new query string
    (you can keep the original query string by adding a QSA flag, e.g., [QSA,L])

    OR

  2. Add a ? after a filename, e.g., index.php?. The ? will NOT be shown in the browser's location box.
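The cases above can be sketched as follows. The link and script names (products, widgets, gadgets, contact) are hypothetical and only there to show what happens to the original query string:

```apache
RewriteEngine on

# Default: the original query string is passed through unchanged
RewriteRule ^products$ products.php [L]

# 1. A new query string replaces the original ...
RewriteRule ^widgets$ products.php?type=widgets [L]

#    ... unless QSA appends the original to the new one
RewriteRule ^gadgets$ products.php?type=gadgets [QSA,L]

# 2. A trailing ? discards the original query string
RewriteRule ^contact$ contact.php? [L]
```

If you're on Apache 2.4 or later, the QSD (Query String Discard) flag does the same job as the trailing ?.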

Enforce Secure Server

Apache can determine whether you're using a secure server in two ways: using the {HTTPS} variable or the {SERVER_PORT} variable (which is 443 for a secure server). So, either of these two bits will redirect to a secure server if you're not already there:

RewriteEngine on
RewriteCond %{REQUEST_URI} ^/secure_page\.php$
RewriteCond %{HTTPS} !on
RewriteRule ^(secure_page\.php)$ https://www.example.com/$1 [R=301,L]

OR

RewriteEngine on
RewriteCond %{REQUEST_URI} ^/secure_page\.php$
RewriteCond %{SERVER_PORT} !^443$
RewriteRule ^(secure_page\.php)$ https://www.example.com/$1 [R=301,L]

Since the {HTTPS} variable is null when you've not requested a secure server, I use the {SERVER_PORT} option.
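If you want the whole site rather than a single page on the secure server, the same idea can be written without naming any page. This is a sketch that assumes the secure and unsecure domains share the same host name and DocumentRoot; !=on is a plain string comparison, so it succeeds whenever {HTTPS} holds anything other than on:

```apache
RewriteEngine on
# Anything not already on the secure server is redirected to it
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
```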

Selective Enforce Secure Server

(where the secure and unsecure domains share the DocumentRoot)

This requires a RewriteCond statement to check whether the secure server port is being used and, if not AND the requested script is one in the list requiring a secure server, redirect.

RewriteEngine on
RewriteCond %{SERVER_PORT} !^443$
RewriteRule ^(page1|page2|page3|page4|page5)$ https://www.example.com/$1 [R=301,L]

And, to redirect pages not requiring a secure server,

RewriteEngine on
RewriteCond %{SERVER_PORT} ^443$
RewriteRule !^(page1|page2|page3|page4|page5)$ http://www.example.com%{REQUEST_URI} [R=301,L]

will force the http ({SERVER_PORT}=80) mode.

WARNING: Mixing these two (force HTTPS and HTTP at the same time) will force non-script files to be served in HTTP protocol, i.e., not encrypted. The result WILL BE a warning that some content has NOT been authenticated.

To avoid the "mixing" problem when forcing the non-secure pages, target only the scripts rather than all files:

RewriteEngine on
RewriteCond %{SERVER_PORT} ^443$
# where php or html is your scripts' extension
RewriteCond %{REQUEST_URI} \.(php|html)
RewriteRule !^(page1|page2|page3|page4|page5)$ http://www.example.com%{REQUEST_URI} [R=301,L]

Another method utilizes a regex "trick":

RewriteEngine on
RewriteCond %{HTTP_HOST}/s%{HTTPS} ^www\.([^/]+)/((s)on|s.*)$ [NC]
RewriteRule . http%3://%1%{REQUEST_URI} [R=301,L]
# That changes to domain name without the www but retained http/https
# See below for explanation as it's a bit complicated
RewriteCond %{HTTPS} on [NC]
RewriteRule !^(page1|page2|page3|page4|page5)\.php$ http://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
# That redirects all BUT pages 1-5 to the http server
# The same WARNING applies as above.

Explanation: I needed an explanation so here it is:

  • /s%{HTTPS} will create the string /son or /s , depending on the value of HTTPS.
  • /((s)on|s.*) In this REGEX, %2 will always be matched and set, because son matches (s)on, and other cases match s.*.
  • When (s)on is matched, %3 will be set to 's'.

In short, this is a trick to replace on with s and similar techniques can be used in other situations, too.

Since the first version is far simpler to understand, I recommend that one.
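If all you need from the "trick" is to drop the www while keeping http or https as requested, and you're on Apache 2.4 or later, the {REQUEST_SCHEME} variable offers a simpler route. A sketch under that version assumption:

```apache
RewriteEngine on
# %1 is the host name without the leading www.
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^ %{REQUEST_SCHEME}://%1%{REQUEST_URI} [R=301,L]
```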

Summary

mod_rewrite is primarily used to allow "Search Engine Optimization" / "User-Friendly" URLs but it is an extremely flexible webmaster tool for other important redirection tasks.

Reference Links

  • Great tutorial: http://gnosis.cx/publish/programming/regular_expressions.html
  • "mod_rewrite Cheat sheet," from Dave Childs is a handy reference: http://www.addedbytes.com/apache/mod_rewrite-cheat-sheet/
     
    Notes:
     
    1. This is regex tailored to mod_rewrite and includes some Apache variables & flags.
     
    2. The No Case flag should RARELY be used in a RewriteRule - it's to match case INsensitive characters such as those in the {HTTP_HOST}.
     
    3. Remember the Last flag ... and the start anchor difference in the two versions of Apache (^/ for Apache 1.x, ^ for Apache 2.x, ^/? for both).
     
  • "Regex Cheat sheet," a handy reference: http://regexlib.com/CheatSheet.aspx
  • A regex-capable text editor: http://www.editpadpro.com
  • Regex Coach: http://weitz.de/regex-coach/

 
  This site designed, created, maintained and copyright © 1995 - 2025 by Data Koncepts.