mod_rewrite

From DreamHost
Revision as of 08:14, 13 April 2012 by Korova admin


mod_rewrite is a URL rewriting engine for the Apache web server. The feature is fully supported on all DreamHost plans.


Creating Search Engine Friendly URLs using Apache's mod_rewrite

Many dynamic websites implement navigation through the query string. For example, the address used to edit this page is:

http://wiki.dreamhost.com/index.php?title=Mod_rewrite&action=edit

URLs like this are difficult for humans to read, and search engines can have trouble with them too. True, engines are not as bad at this as they used to be, and Google, for example, claims to read query strings correctly. Part of the problem is that query string items can be placed in a different order: the page can still read and process the variables, but the actual URL has changed.

One option is to place a .htaccess file in the root of the website and send any request that does not match an actual file or directory to a script that will process it. Below is a .htaccess file that hands all such requests to index.php. An added bonus is that you no longer have to worry about pesky 404 messages: your script handles the navigation, so you can output your own error message.

.htaccess document:

Options +FollowSymLinks 
RewriteEngine On 
RewriteCond %{REQUEST_FILENAME} !-f 
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]

More .htaccess mod_rewrite examples
OK, now that the .htaccess file is in operation, you can use an address like:

http://wiki.dreamhost.com/Mod_rewrite/edit/1/

It looks like a folder structure, is much easier to read, and hides how your page operates by not exposing the query string variable names.

In the PHP page you could handle this navigation like so:

<?php
$navString = $_SERVER['REQUEST_URI']; // Returns "/Mod_rewrite/edit/1/"
$parts = explode('/', $navString); // Break into an array
// Let's look at the array of items we have:
print_r($parts);
?>

This page will output something like:

Array
(
    [0] =>
    [1] => Mod_rewrite
    [2] => edit
    [3] => 1
    [4] =>
)

You can use your imagination from there to create something that will navigate off the information you've extracted from the URI.
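
As one possible starting point, the sketch below turns the request URI into clean path segments and dispatches on them. The function names (pathSegments, dispatch), the default page name and the title/action meaning of the segments are assumptions made for illustration; they are not part of any DreamHost or MediaWiki API. The query string is stripped first, because REQUEST_URI includes it, and the empty strings that explode produces for the leading and trailing slashes are dropped.

```php
<?php
// Hypothetical helper: split a request URI into non-empty path segments.
function pathSegments(string $requestUri): array
{
    // REQUEST_URI may carry a query string; keep only the path part.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return [];
    }
    // Drop the empty strings produced by leading, trailing or doubled slashes.
    $segments = array_filter(explode('/', $path), function ($s) {
        return $s !== '';
    });
    return array_values(array_map('rawurldecode', $segments));
}

// Hypothetical dispatcher: map segments to [HTTP status, body text].
function dispatch(array $segments): array
{
    $title  = $segments[0] ?? 'Main_Page'; // assumed default page
    $action = $segments[1] ?? 'view';      // assumed default action
    $knownActions = ['view', 'edit'];      // assumed set of actions
    if (!in_array($action, $knownActions, true)) {
        // Our own error message instead of Apache's 404 page.
        return [404, "Unknown action: $action"];
    }
    return [200, "$action $title"];
}

[$status, $body] = dispatch(pathSegments($_SERVER['REQUEST_URI'] ?? '/'));
http_response_code($status);
echo htmlspecialchars($body);
```

For the address /Mod_rewrite/edit/1/, pathSegments returns the three items Mod_rewrite, edit and 1, without the empty first and last elements seen in the raw explode output.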

Rewrite rules in a .htaccess file behave differently from those in the server config. If you are used to writing rewrite rules at the server level, your habits will get you into trouble in a .htaccess file. In particular, a pattern that begins with a slash will not match there, because Apache strips the directory prefix (including its leading slash) from the path before comparing it with the pattern. A rule like

RewriteRule ^/anything http://other.server.com/somewhere

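will never match, because in directory context the path being matched never has a leading slash. Write the pattern as ^anything instead, or as ^/?anything if the rule must work in both contexts. This is explained in detail in the Apache documentation; see the resource links.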
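
The difference can be imitated with PHP's preg_match, which uses the same PCRE family of regular expressions as mod_rewrite. This is only a simulation, under the assumption that the .htaccess file sits in the document root, so a request for /anything/page is matched as anything/page. The helper name matchesInHtaccess is made up for illustration.

```php
<?php
// Hypothetical simulation: in per-directory (.htaccess) context, Apache strips
// the directory prefix, including its leading slash, before matching the pattern.
function matchesInHtaccess(string $pattern, string $requestPath): bool
{
    // Assumes the .htaccess file lives in the document root.
    $stripped = ltrim($requestPath, '/');
    return preg_match('#' . $pattern . '#', $stripped) === 1;
}

var_dump(matchesInHtaccess('^/anything', '/anything/page'));  // bool(false)
var_dump(matchesInHtaccess('^anything', '/anything/page'));   // bool(true)
var_dump(matchesInHtaccess('^/?anything', '/anything/page')); // bool(true)
```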

Troubleshooting

Pre-built "magic" error pages

Some pre-built "magic" error page settings can cause problems when you use rewrite rules. The simplest solution is to turn them off:

ErrorDocument 401 default
ErrorDocument 403 default
ErrorDocument 404 default
ErrorDocument 500 default

No input file specified

When using URLs of the form http://site.com/query1/query2/ that rewrite to http://site.com/index.php/query1/query2/ with mod_rewrite and PHP, you might get a "No input file specified" error. A simple workaround is to add a question mark in the .htaccess file right after the name of the script that receives the request, like so:

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?/$1 [L]

That forces Apache to consider everything after index.php as a query string. This is known to fix broken pretty URLs for MediaWiki, CodeIgniter, ExpressionEngine and a few other major scripts.
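
The scripts named above read the rewritten path themselves. For a custom index.php, a minimal sketch of recovering the path from the query string might look like the following. The helper name segmentsFromQueryString is an assumption made for illustration, and the code assumes the rule shown above, which delivers a query string such as /query1/query2/.

```php
<?php
// Hypothetical helper: recover path segments from a query string such as
// "/query1/query2/", which is what the "index.php?/$1" rule produces.
function segmentsFromQueryString(string $queryString): array
{
    // Ignore anything after a "&" in case extra parameters were appended.
    $path = explode('&', $queryString, 2)[0];
    $parts = array_filter(explode('/', $path), function ($s) {
        return $s !== '';
    });
    return array_values(array_map('rawurldecode', $parts));
}

print_r(segmentsFromQueryString($_SERVER['QUERY_STRING'] ?? ''));
```

A request for /query1/query2/ would then yield the two segments query1 and query2.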

If this still doesn't work, you might be using FastCGI. Go into Domains > Manage Domains in your panel. Click the Edit wrench under Web Hosting for the domain you're working with. Change "FastCGI" to just "CGI". Save changes and it should work.

The FastCGI version, when paired with Apache 2.2, doesn't seem to like a RewriteRule target of index.php/$1. It prefers index.php?$1 instead, but some CMSs don't like that.

/stats no longer works

Some third-party applications, such as WordPress and Drupal, take over your directory structure to give you nice clean URLs, but in the process they break your /stats pages.

The Wiki page on how to deal with this is Making stats accessible with htaccess.

# Quick Fix:
RewriteCond %{REQUEST_URI} ^/(stats/|missing\.html|failed_auth\.html) [NC]
RewriteRule . - [L]

# Rewrite URLs of the form 'index.php?q=x':
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !^/(stats/|missing\.html|failed_auth\.html) [NC]
RewriteRule ^(.*)$ /index.php?q=$1 [L,QSA]

# Rewrite URLs of the form 'index.php' (Mambo doesn't take the query string as a parameter):
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !^/(stats/|missing\.html|failed_auth\.html) [NC]
RewriteRule . /index.php [L]

# In newer versions of WordPress the .htaccess file is recreated automatically.
# You should place these lines outside the (begin|end) WordPress block
# or they may get removed.
RewriteCond %{REQUEST_URI} ^/(stats/|missing\.html|failed_auth\.html) [NC]
RewriteRule . - [L]
# Original WordPress rules. Leave these alone.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]

# Rules for an application served through dispatch.fcgi:
RewriteRule ^xml/([a-z]+)$ /xml/$1/feed.xml [R=301]
RewriteRule ^$ index.html [QSA]
RewriteRule ^([^.]+)$ $1.html [QSA]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !^/(stats/|missing\.html|failed_auth\.html) [NC]
RewriteRule ^(.*)$ dispatch.fcgi [QSA,L]

# Rules for Trac (trac.fcgi):
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !^/(stats/|missing\.html|failed_auth\.html) [NC]
RewriteRule ^(.*)$ /trac.fcgi/$1 [L,QSA]
RewriteRule ^$ trac.fcgi [L]

# Rewrite all URLs to index.php?kohana_uri=URL:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond $1 !^(index\.php) [NC]
RewriteRule ^(.*)$ /index.php?kohana_uri=$1 [PT,L]

For MediaWiki, whether /stats breaks depends on how you install it and whether you use mod_rewrite to get clean URLs. If you install it correctly, /stats will not be broken. See the alternate simpler method of getting clean URLs for install instructions that do not break /stats.

For DokuWiki, this is my own modification. It is surely not ideal, but it works.
.htaccess file:

RewriteEngine On
RewriteBase /dokuwiki

RewriteRule ^/?_media/(.*) lib/exe/fetch.php?media=$1 [QSA,L]
RewriteRule ^/?_detail/(.*) lib/exe/detail.php?media=$1 [QSA,L]
RewriteRule ^/?_export/([^/]+)/(.*) doku.php?do=export_$1&id=$2 [QSA,L]
RewriteRule ^$ doku.php  [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !^/(stats/|missing\.html|failed_auth\.html) [NC]
RewriteRule (.*) doku.php?id=$1 [QSA,L]
RewriteRule ^/?index\.php$ doku.php


The only change to the standard .htaccess file is this added condition:

RewriteCond %{REQUEST_URI} !^/(stats/|missing\.html|failed_auth\.html) [NC]

How to enable URL Rewriting in DokuWiki.

External Links