javascript url-safe filename-safe string
Well, here's one that replaces anything that's not a letter or a number, and makes it all lower case, like your example.
var s = "John Smith's Cool Page";
var filename = s.replace(/[^a-z0-9]/gi, '_').toLowerCase();
Explanation:
The regular expression is /[^a-z0-9]/gi. Well, actually the gi at the end is just a set of options that are applied when the expression is used.
- i means "ignore upper/lower case differences"
- g means "global", which really means that every match should be replaced, not just the first one.
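To see what each option contributes, here is a small sketch that runs the same replacement with no flags, with g only, and with both g and i. The variable names are only for illustration.

```javascript
var s = "John Smith's Cool Page";

// No flags: matching is case-sensitive and only the first match is replaced.
// The uppercase "J" is the first character outside a-z0-9.
var noFlags = s.replace(/[^a-z0-9]/, '_');
// "_ohn Smith's Cool Page"

// "g" only: every match is replaced, but uppercase letters still count
// as "not in a-z", so they are replaced too.
var globalOnly = s.replace(/[^a-z0-9]/g, '_');
// "_ohn__mith_s__ool__age"

// "g" and "i": uppercase letters are kept, everything else is replaced.
var globalIgnoreCase = s.replace(/[^a-z0-9]/gi, '_');
// "John_Smith_s_Cool_Page"
```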
So what we're looking at is really just [^a-z0-9]. Let's read it step-by-step:
- The [ and ] define a "character class", which is a list of single characters. If you wrote [one], then that would match either 'o' or 'n' or 'e'.
- However, there's a ^ at the start of the list of characters. That means it should match only characters not in the list.
- Finally, the list of characters is a-z0-9. Read this as "a through z and 0 through 9". It's a short way of writing abcdefghijklmnopqrstuvwxyz0123456789.
So basically, what the regular expression says is: "Find every character that is not between 'a' and 'z' or between '0' and '9'" (ignoring case, because of the i option).
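As a quick check of that reading, here is the same expression wrapped in a small helper and run on a few inputs. The name toSafeName is made up for this example. Note that dots and accented letters are also outside a-z0-9, so they get replaced as well.

```javascript
// Hypothetical helper name; the body is the one-liner from above.
function toSafeName(s) {
  return s.replace(/[^a-z0-9]/gi, '_').toLowerCase();
}

var a = toSafeName("John Smith's Cool Page");       // "john_smith_s_cool_page"
var b = toSafeName("report.final.pdf");             // "report_final_pdf" (dots are replaced too)
var c = toSafeName("Cr\u00e8me br\u00fbl\u00e9e");  // "cr_me_br_l_e" (accented letters are not in a-z)
```

If you need to keep a file extension or non-English letters, this simple class is too strict and you would have to widen it.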
I know the original poster asked for a simple regular expression; however, there is more involved in sanitizing filenames, including filename length, reserved filenames, and, of course, reserved characters.
Take a look at the code in node-sanitize-filename for a more robust solution.
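To give a rough idea of what those extra concerns look like in code, here is a minimal sketch. It is not the node-sanitize-filename implementation; the function name, the exact character list, and the 255 limit are assumptions made for illustration.

```javascript
// Hypothetical sketch, NOT the node-sanitize-filename code.
// Reserved characters: ones rejected on Windows and/or Unix, plus control characters.
var RESERVED_CHARS = /[<>:"\/\\|?*\x00-\x1f]/g;
// Reserved filenames on Windows, with or without an extension.
var WINDOWS_RESERVED = /^(con|prn|aux|nul|com[1-9]|lpt[1-9])(\..*)?$/i;
// Assumed limit. Many filesystems limit names to 255 bytes; this sketch
// counts UTF-16 code units instead, which is a simplification.
var MAX_LENGTH = 255;

function sanitizeFilenameSketch(name, replacement) {
  replacement = replacement === undefined ? '_' : replacement;
  var out = name.replace(RESERVED_CHARS, replacement);
  // Names made only of dots ("." and "..") refer to directories.
  if (/^\.+$/.test(out)) out = replacement;
  // Prefix reserved device names so they no longer match.
  if (WINDOWS_RESERVED.test(out)) out = replacement + out;
  // Windows drops trailing dots and spaces, so strip them up front.
  out = out.replace(/[. ]+$/, '');
  return out.slice(0, MAX_LENGTH);
}

sanitizeFilenameSketch('a/b:c?.txt'); // "a_b_c_.txt"
sanitizeFilenameSketch('CON.txt');    // "_CON.txt"
sanitizeFilenameSketch('..');         // "_"
```

The result can still be an empty string (for example, when the input is only spaces), so a caller would need to handle that case.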
For more flexible and robust handling of Unicode characters etc., you could use slugify in conjunction with a regex to remove unsafe URL characters:
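// The characters are wrapped in [...] so that each one is matched on its own, not as one long sequence.
const urlSafeFilename = slugify(filename, { remove: /["<>#%{}|\\^~\[\]`;?:@=&]/g });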
This produces nice kebab-case filenames in your URL and allows for more characters outside the a-z0-9 range.