Apache mod_rewrite/mod_redirect: convert a URL with query parameters into an SEO-friendly URL, ignoring some query parameters
As above.
This might be best explained with an example.
Original URL (the order of the parameters is NOT consistent):
- www.myhost.com/my-page?p1=v1&p2=v2&p3=v3
- OR www.myhost.com/my-page?p2=v2&p1=v1&p3=v3
- OR www.myhost.com/my-page?p3=v3&p1=v1&p2=v2
If I visit any of the 3 sample URLs above, I want to be redirected to www.myhost.com/my-page/v1/v2
- p3 is purposely ignored because it is no longer needed.
- In my real-world URL, I also have parameters p4 to p6, which are ignored as well. I only wrote up to p3 for simplicity.
I've searched the net but found nothing that comes close to what I'm trying to do. Thank you.
Solution 1:
You can do it like the following using mod_rewrite:
RewriteEngine On
RewriteCond %{QUERY_STRING} (?:^|&)p1=([^&]+)
RewriteCond %1@%{QUERY_STRING} ^([^@]+)@(?:|.*&)p2=([^&]*)
RewriteRule ^/?(my-page)$ /$1/%1/%2 [QSD,R=302,L]
The first condition captures the value of p1 and passes this to the second condition, which re-captures it and also captures the value of p2. Because backreferences in the substitution refer to the last matched condition, these are then available as %1 and %2 respectively.
The URL parameters can occur in any order.
@ (in the second condition) is just an arbitrary character that does not occur in v1 and so is used as a delimiter between v1 and the query string when searching for p2.
All other URL parameters (i.e. p3..pN) are ignored (and discarded).
UPDATE: p1 must exist with a non-empty value (otherwise the resulting redirect would be ambiguous). p2 must also exist, but its value can be empty. (This replaces an earlier statement that both p1 and p2 needed non-empty values.) I've also updated the regex so that it allows an arguably malformed (but still valid) query string where the URL parameter delimiter (&) occurs before p2 at the very start of the query string, e.g. /my-page?&p2=&p1=v1 would still be redirected to /my-page/v1/.
The QSD flag discards the original query string from the request.
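To see how the captures flow from one condition to the next, here is a minimal Python sketch that emulates the two conditions and the substitution above. It is only an illustration: Python's re module stands in for Apache's regex engine (these particular patterns should behave the same in both), and the function name redirect_target is made up for this example.

```python
import re

# Patterns copied from the two RewriteCond lines above.
COND_P1 = re.compile(r"(?:^|&)p1=([^&]+)")
COND_P2 = re.compile(r"^([^@]+)@(?:|.*&)p2=([^&]*)")
# Pattern copied from the RewriteRule.
RULE = re.compile(r"^/?(my-page)$")


def redirect_target(path, query_string):
    """Return the redirect path, or None when the rule would not fire.

    Hypothetical helper: it mimics the directives, it is not Apache itself.
    """
    rule = RULE.search(path)
    if rule is None:
        return None

    # 1st condition: capture the (non-empty) value of p1.
    first = COND_P1.search(query_string)
    if first is None:
        return None

    # 2nd condition: the TestString is "%1@%{QUERY_STRING}", so v1 is
    # re-captured as group 1 and the value of p2 becomes group 2.
    second = COND_P2.search(first.group(1) + "@" + query_string)
    if second is None:
        return None

    # Substitution "/$1/%1/%2"; the query string is dropped (QSD).
    return "/{}/{}/{}".format(rule.group(1), second.group(1), second.group(2))


if __name__ == "__main__":
    for qs in ("p1=v1&p2=v2&p3=v3", "p2=v2&p1=v1&p3=v3", "p3=v3&p1=v1&p2=v2"):
        print(qs, "->", redirect_target("/my-page", qs))
```

All three sample query strings from the question give the same target, and a query string without p1 (or with an empty p1) gives None, i.e. no redirect.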
Depending on where these directives are being used (eg. directory or server context) and how the URL is ultimately being routed, you may need to add an initial condition to ensure that only direct requests (as opposed to rewritten requests) are processed.
For example, add the following as the first condition if required:
RewriteCond %{ENV:REDIRECT_STATUS} ^$
UPDATE:
p1 and p2 are mandatory. p3 is optional.
You could change the above as follows:
RewriteCond %{QUERY_STRING} (?:^|&)p1=([^&]+)
RewriteCond %1@%{QUERY_STRING} ^([^@]+)@(?:|.*&)p2=([^&]+)
# Either p3 is present and has a non-empty value
RewriteCond %1/%2/@%{QUERY_STRING} ^([^@]+)@(?:|.*&)p3=([^&]+) [OR]
# OR p3 is omitted or does not have a value (this condition always matches)
RewriteCond %1/%2 (.+)
RewriteRule ^/?(my-page)$ /$1/%1%2 [QSD,R,L]
The 2nd condition now ensures there is a value for the p2 parameter (i.e. v2); it cannot be empty.
Then either the 3rd condition checks for a non-empty p3 parameter, OR the 4th condition simply captures the values when the p3 parameter is either empty or omitted altogether.
The resulting redirect omits the trailing slash when p3 is not present. For example, the resulting redirect is either /my-page/v1/v2 or /my-page/v1/v2/v3. (The trailing slash that might otherwise occur on /my-page/v1/v2/ is avoided.)
In the final substitution string, the values of the %1 and %2 backreferences are a little different from before. They no longer contain one value each (i.e. v1 and v2). Instead, when the 3rd condition matches, %1 contains v1/v2/ (including the trailing slash) and %2 contains v3. When the 4th condition matches instead (p3 is empty or omitted), %1 contains v1/v2 and %2 is empty.
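The updated rule can be traced the same way. The following is a rough Python sketch of the four conditions, assuming (as above) that Python's re module matches these patterns the same way Apache does. The function name redirect_target_optional_p3 is hypothetical, and the [OR] flag is modelled as "try the 3rd condition, otherwise fall back to the 4th".

```python
import re

# Patterns copied from the four RewriteCond lines and the RewriteRule above.
COND_P1 = re.compile(r"(?:^|&)p1=([^&]+)")
COND_P2 = re.compile(r"^([^@]+)@(?:|.*&)p2=([^&]+)")
COND_P3 = re.compile(r"^([^@]+)@(?:|.*&)p3=([^&]+)")
COND_FALLBACK = re.compile(r"(.+)")
RULE = re.compile(r"^/?(my-page)$")


def redirect_target_optional_p3(path, query_string):
    """Return the redirect path, or None when the rule would not fire."""
    rule = RULE.search(path)
    if rule is None:
        return None

    first = COND_P1.search(query_string)
    if first is None:
        return None

    # p2 must now be non-empty as well.
    second = COND_P2.search(first.group(1) + "@" + query_string)
    if second is None:
        return None
    joined = second.group(1) + "/" + second.group(2)  # "%1/%2", e.g. "v1/v2"

    # 3rd condition: TestString is "%1/%2/@%{QUERY_STRING}".
    third = COND_P3.search(joined + "/@" + query_string)
    if third is not None:
        pct1, pct2 = third.group(1), third.group(2)  # e.g. "v1/v2/" and "v3"
    else:
        # 4th condition (reached via [OR]): TestString is "%1/%2".
        fourth = COND_FALLBACK.search(joined)
        if fourth is None:
            return None
        pct1, pct2 = fourth.group(1), ""  # %2 has no group, so it is empty

    # Substitution "/$1/%1%2".
    return "/" + rule.group(1) + "/" + pct1 + pct2


if __name__ == "__main__":
    for qs in ("p3=v3&p1=v1&p2=v2", "p2=v2&p1=v1", "p1=v1&p2=v2&p3="):
        print(qs, "->", redirect_target_optional_p3("/my-page", qs))
```

With p3 present the target ends in /v3; with p3 empty or missing the target stops at /v1/v2 with no trailing slash.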