Apache RewriteMap doesn't work with URLs containing spaces
I am currently using a RewriteMap directive inside my vhost to redirect a list of 800 URLs. It works quite well:
RewriteEngine On
RewriteMap redirects dbm=db:/data/apps/project/current/configuration/etc/httpd/conf/redirects.db
RewriteCond ${redirects:$1} !=""
RewriteRule ^(.*)$ ${redirects:$1} [redirect=permanent,last]
I use a redirects.txt file containing the mapping, which is then converted to a DBM file:
httxt2dbm -f db -i /data/apps/project/current/configuration/etc/httpd/conf/redirects.txt -o /data/apps/project/current/configuration/etc/httpd/conf/redirects.db
For example, an entry like this works fine:
/associations/old_index.php /
But when the URL contains spaces, it doesn't work (I suppose the same applies to other special characters):
/Universités%20direct /
How can I handle this case?
A workaround might be to internally rewrite each space in the URL to a hyphen and include the hyphenated URL in your rewrite map instead.
If your URLs contain at most a single space, you could use something like the following directive before your existing directives:
RewriteRule ^(.+)\s(.+)$ $1-$2
And then use the following in your rewrite map:
/Universités-direct /
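Put together with the directives from the question, the ordering might look like the sketch below. It is untested and simply reuses the map path from the question; the only new line is the space-to-hyphen rule, which has to come before the map lookup so that the lookup sees the hyphenated path.

```apache
RewriteEngine On
RewriteMap redirects dbm=db:/data/apps/project/current/configuration/etc/httpd/conf/redirects.db

# Internal rewrite (no redirect): replace a single space with a hyphen.
RewriteRule ^(.+)\s(.+)$ $1-$2

# The lookup now sees the hyphenated path, e.g. /Universités-direct.
RewriteCond ${redirects:$1} !=""
RewriteRule ^(.*)$ ${redirects:$1} [redirect=permanent,last]
```

Remember to regenerate redirects.db with httxt2dbm after changing the keys in redirects.txt.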
UPDATE: If you have some URLs that contain two spaces (e.g. /the force awakens) and some with one space, then you could add an additional rule:
RewriteRule ^(.+)\s(.+)\s(.+)$ $1-$2-$3
RewriteRule ^(.+)\s(.+)$ $1-$2
These rules assume that no URL ends with a space and that no URL contains more than one contiguous space.
If some URLs contain three spaces, add another rule before the ones above:
RewriteRule ^(.+)\s(.+)\s(.+)\s(.+)$ $1-$2-$3-$4
I would tend to do this with multiple (simple) rules, rather than a generic "convert everything in a single rule" approach, unless you specifically need that. A generic rule will run recursively, reducing multiple spaces to a single character. You will also likely need additional flags (i.e. DPI) to prevent a known rewrite bug in Apache.
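For comparison, a generic rule of the kind described here might look like the following. Treat it as an untested sketch: the pattern and the use of N alongside DPI are my assumptions, not something verified against the setup in the question.

```apache
# Replace the first run of whitespace with a single hyphen, then restart
# rule processing (N) so any remaining whitespace is handled on the next pass.
# DPI discards any path-info so that it is not re-appended to the result.
RewriteRule ^(\S*)\s+(.*)$ $1-$2 [N,DPI]
```

Because \s+ matches a whole run of whitespace, a URL with two contiguous spaces ends up with one hyphen, so the map keys would need to be written accordingly. The rule would also have to sit before the map lookup.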
You can use a second rewrite map that calls the internal function escape, which turns spaces into %20:
RewriteMap ec int:escape
RewriteMap redirects dbm=db:/data/apps/project/current/configuration/etc/httpd/conf/redirects.db
RewriteCond ${redirects:${ec:$1}} !=""
RewriteRule ^(.*)$ ${redirects:${ec:$1}} [redirect=permanent,last]
Then in your own rewrite map db you can have:
/Universités%20direct /
This should then match.
You can solve this by extracting the encoded URI from the %{THE_REQUEST} variable and using that to do the lookup. You will, of course, need to put the encoded URIs in the map. Something like the following:
RewriteEngine On
RewriteMap redirects dbm=db:/data/apps/project/current/configuration/etc/httpd/conf/redirects.db
RewriteCond %{THE_REQUEST} "\w+ ([^ ]+)"
RewriteRule ^ - [E=MYVAR:%1]
RewriteCond ${redirects:%{ENV:MYVAR}} !=""
RewriteRule ^ ${redirects:%{ENV:MYVAR}} [B,redirect=permanent,last]
I've only tested it with a text-based map rather than the DBM one, though. This will probably need modification if you have to deal with URLs with query strings.
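One possible modification for query strings, as an untested sketch: stop the capture at the first "?" as well as at the first space, so that only the encoded path ends up in MYVAR (the variable name is the same illustrative one used in the rules for this approach).

```apache
# Capture the encoded path from the request line, e.g. from
# "GET /Universit%C3%A9s%20direct?x=1 HTTP/1.1", excluding "?x=1".
RewriteCond %{THE_REQUEST} "^\w+ ([^ ?]+)"
RewriteRule ^ - [E=MYVAR:%1]
```

The map keys then only need to contain the encoded path. By default mod_rewrite passes the original query string through to the redirect target; if that is not wanted, the QSD flag (Apache 2.4 and later) discards it.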