Rewriting an arbitrary number of path segments to query parameters

I have this .htaccess rule:

RewriteRule viewshoplatest/(.*)/(.*)/(.*)/(.*)/(.*)/(.*)/(.*)/(.*)/(.*)/(.*)/(.*)/(.*)/(.*)/(.*)/(.*)/(.*)/$ /viewshoplatest.php?$1=$2&$3=$4&$5=$6&$7=$8&$9=$10&$11=$12&$13=$14&$15=$16

It should map a URL like this:

http://www.veepiz.com/viewshoplatest/start/10/end/10/filter/0/ownerid/0/sortby/date/sortdir/DESC/cat/0/scat/0/

to this:

http://www.veepiz.com/viewshoplatest.php?start=10&end=10&filter=0&ownerid=0&sortby=date&sortdir=DESC&cat=0&scat=0

When I click the link and print the $_GET variables, I get this:

Array ( [start] => 10 [end] => 10 [filter] => 0 [ownerid] => 0 [sortby] => start0 [start1] => start2 [start3] => start4 [start5] => start6 )

Any ideas as to why it’s behaving badly?


OK, I have fixed this by rewriting the rule to:

RewriteRule viewshoplatest/start/(.*)/end/(.*)/filter/(.*)/ownerid/(.*)/sortby/(.*)/sortdir/(.*)/cat/(.*)/scat/(.*)/$ /viewshoplatest.php?start=$1&end=$2&filter=$3&ownerid=$4&sortby=$5&sortdir=$6&cat=$7&scat=$8

Solution 1:

First of all, you shouldn’t use .* when you can be more specific, as with [^/]+ in this case, because multiple .* groups can cause immense backtracking. So:

RewriteRule ^viewshoplatest/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/$ /viewshoplatest.php?$1=$2&$3=$4&$5=$6&$7=$8&$9=$10&$11=$12&$13=$14&$15=$16

You can use a tool like RegexBuddy to see the difference in how these regular expressions are processed.

But since mod_rewrite only allows references to the first nine groups (see Tim’s answer), you could use an iterative approach and process one parameter at a time:

RewriteRule ^viewshoplatest/([^/]+)/([^/]+)/([^/]+/[^/]+/.*)$ /viewshoplatest/$3?$1=$2 [QSA,N]
RewriteRule ^viewshoplatest/([^/]+)/([^/]+)/([^/]*)/?$ /viewshoplatest.php?$1=$2&$3 [QSA,L]

The first rule processes one parameter pair at a time (except the last pair) by appending it to the already existing ones (see the QSA flag) and then restarting the rewriting process without incrementing the internal recursion counter (see the N flag). The second rule then rewrites the last parameter pair (or just the name) and ends the iteration.
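
To make the iteration easier to follow, here is a rough PHP simulation of the two rules. It is only a sketch: the function name simulateIterativeRewrite is made up for illustration, it applies the same patterns with preg_match, and it ignores how Apache strips and re-adds the per-directory prefix. It assumes QSA places the newly generated pair in front of the existing query string.

```php
<?php
// Sketch only: imitates the two RewriteRules outside of Apache.
// simulateIterativeRewrite is a hypothetical helper, not part of mod_rewrite.
function simulateIterativeRewrite(string $path): string
{
    $query = '';

    // Rule 1 ([QSA,N]): peel off the first name/value pair while at least
    // one more pair follows, then start over with the shortened path.
    $rule1 = '#^viewshoplatest/([^/]+)/([^/]+)/([^/]+/[^/]+/.*)$#';
    while (preg_match($rule1, $path, $m)) {
        // QSA: new pair first, previously collected pairs after it.
        $query = "$m[1]=$m[2]" . ($query === '' ? '' : "&$query");
        $path = "viewshoplatest/$m[3]";
    }

    // Rule 2 ([QSA,L]): handle the last pair and stop.
    $rule2 = '#^viewshoplatest/([^/]+)/([^/]+)/([^/]*)/?$#';
    if (preg_match($rule2, $path, $m)) {
        $query = "$m[1]=$m[2]&$m[3]" . ($query === '' ? '' : "&$query");
        $path = 'viewshoplatest.php';
    }

    return "/$path?$query";
}

$url = simulateIterativeRewrite(
    'viewshoplatest/start/10/end/10/filter/0/ownerid/0/sortby/date/sortdir/DESC/cat/0/scat/0/'
);
parse_str((string) parse_url($url, PHP_URL_QUERY), $params);
print_r($params); // all eight pairs, last pair of the URL first
```

Each pass through the loop corresponds to one restart triggered by the N flag, so a URL with eight pairs goes through rule 1 seven times before rule 2 ends the process.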

But since the N flag can be dangerous (it can cause an infinite loop), you could also use PHP to parse the requested path:

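// Path part of the request URI, without the query string
$_SERVER['REQUEST_URI_PATH'] = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
// Split the path into its segments (explode, not implode)
$segments = explode('/', trim($_SERVER['REQUEST_URI_PATH'], '/'));
array_shift($segments); // remove path prefix "viewshoplatest"
// Read the segments as name/value pairs; a trailing name without a value becomes null
for ($i = 0, $n = count($segments); $i < $n; $i += 2) {
    $name = rawurldecode($segments[$i]);
    $_GET[$name] = ($i + 1 < $n) ? rawurldecode($segments[$i + 1]) : null;
}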

Now you just need this rule to pass the request through:

RewriteRule ^viewshoplatest(/|$) /viewshoplatest.php [L]
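
For checking the path parsing outside of a web request, the same idea can be wrapped in a function. This is a minimal sketch: pathToParams and its $prefix parameter are hypothetical names, not part of the answer above, and empty segments (for example from a double slash) are simply dropped here, which is an assumption about the desired behaviour.

```php
<?php
// Hypothetical helper: turns "/prefix/name/value/name/value/" into an array.
function pathToParams(string $requestUri, string $prefix = 'viewshoplatest'): array
{
    // Keep only the path; a query string such as "?page=2" is ignored here.
    $path = (string) parse_url($requestUri, PHP_URL_PATH);

    // Split on "/" and drop empty segments (leading/trailing slashes).
    $segments = preg_split('#/#', $path, -1, PREG_SPLIT_NO_EMPTY);

    if (($segments[0] ?? null) === $prefix) {
        array_shift($segments); // remove the script prefix
    }

    $params = [];
    for ($i = 0, $n = count($segments); $i < $n; $i += 2) {
        $name = rawurldecode($segments[$i]);
        // A trailing name without a value is stored as null.
        $params[$name] = ($i + 1 < $n) ? rawurldecode($segments[$i + 1]) : null;
    }
    return $params;
}

print_r(pathToParams(
    '/viewshoplatest/start/10/end/10/filter/0/ownerid/0/sortby/date/sortdir/DESC/cat/0/scat/0/'
));
```

Returning an array instead of writing to $_GET directly keeps the function free of side effects; the caller can still merge the result into $_GET if the rest of the script expects it there.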

Solution 2:

Just to expand on what you found out: you can only reference nine groups as backreferences, which is why it's generally a better idea to rewrite to a script without a query string and have the script examine REQUEST_URI in cases where you have a lot of data to parse out.

From the documentation:

Back-references are identifiers of the form $N (N=0..9), which will be replaced by the contents of the Nth group of the matched Pattern

$0 is the entire matched pattern, giving you the remaining nine numbers to work with. Any higher number is treated as a backreference followed by some literal numeric characters.
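
The single-digit behaviour explains the output in the question. The following PHP sketch imitates it: expandBackrefs is a hypothetical helper that reads exactly one digit after each $, which is the behaviour described above, not code taken from mod_rewrite. The pattern uses [^/]+ groups, which capture the same 16 segments for this URL as the original (.*) groups.

```php
<?php
// Hypothetical helper: expands $N using only ONE digit after "$",
// so "$10" becomes group 1 followed by a literal "0".
function expandBackrefs(string $substitution, array $groups): string
{
    return preg_replace_callback(
        '/\$(\d)/',
        function (array $m) use ($groups): string {
            return $groups[(int) $m[1]] ?? '';
        },
        $substitution
    );
}

$path = 'viewshoplatest/start/10/end/10/filter/0/ownerid/0/sortby/date/sortdir/DESC/cat/0/scat/0/';
$pattern = '#viewshoplatest/' . str_repeat('([^/]+)/', 16) . '$#';
preg_match($pattern, $path, $groups); // $groups[1] = "start", ..., $groups[9] = "sortby"

$substitution = '$1=$2&$3=$4&$5=$6&$7=$8&$9=$10&$11=$12&$13=$14&$15=$16';
$query = expandBackrefs($substitution, $groups);

echo $query, "\n";
// start=10&end=10&filter=0&ownerid=0&sortby=start0&start1=start2&start3=start4&start5=start6
```

Since $1 holds "start", $10 through $16 expand to "start0" through "start6", which matches the keys and values printed in the question.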