Regular expression for all printable characters in JavaScript
Looking for a regular expression that validates all printable characters. The regex needs to be used in JavaScript only. I have gone through this post, but it mostly talks about .NET, Java and C, not JavaScript.
Only these printable characters should be allowed:
a-z, A-Z, 0-9, and the thirty-two symbols: !"#$%&'()*+,-./:;<=>?@[] ^_`{|}~ and space
Need a JavaScript regex that validates that each input character is one of the above and discards the rest.
Solution 1:
If you want to match all printable characters in the UTF-8 set (as indicated by your comment on Aug 21), you're going to have a hard time doing this yourself. JavaScript's native regexes have abysmal Unicode support. But you can use XRegExp with the regex ^\P{C}*$.
If you only want to match those few ASCII letters you mentioned in the edit to your post from Aug 22, then the regex is trivial:
/^[a-z0-9!"#$%&'()*+,.\/:;<=>?@\[\] ^_`{|}~-]*$/i
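The question asks both to validate the input and to discard everything else, so here is a minimal sketch of how the ASCII pattern above might be used for each. The helper names isAllowed and stripDisallowed are made up for illustration. The second regex is an assumption of mine: the same character class, negated and made global, so that replace() can remove offending characters.

```javascript
// Character class from the answer above: letters, digits, the listed symbols and space.
// The trailing "-" is a literal hyphen because it is the last character in the class.
const ALLOWED = /^[a-z0-9!"#$%&'()*+,.\/:;<=>?@\[\] ^_`{|}~-]*$/i;

// Assumed companion pattern: the same class, negated, with the global flag
// so that replace() removes every disallowed character, not just the first.
const DISALLOWED = /[^a-z0-9!"#$%&'()*+,.\/:;<=>?@\[\] ^_`{|}~-]/gi;

// Hypothetical helper: true when every character is in the allowed set.
function isAllowed(input) {
  return ALLOWED.test(input);
}

// Hypothetical helper: drops any character outside the allowed set.
function stripDisallowed(input) {
  return input.replace(DISALLOWED, "");
}

console.log(isAllowed("Hello, World! #42"));   // true
console.log(isAllowed("tab\there"));           // false, tab is a control character
console.log(stripDisallowed("caf\u00e9\n42")); // "caf42"
```

Because the pattern uses the * quantifier, an empty string also passes; swap in + if at least one character should be required. Note that the class as written does not include the backslash character.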
Solution 2:
To validate that a string consists only of printable ASCII characters, use a simple regex like
/^[ -~]+$/
It matches
- ^ - the start of string anchor
- [ -~]+ - one or more (due to the + quantifier) characters that are within the range from space to tilde in the ASCII table
- $ - the end of string anchor
For Unicode printable chars, use the \PC Unicode category (matching any char but a control char) from XRegExp, as has already been mentioned:
^\PC+$
See regex demos:
// ASCII only
var ascii_print_rx = /^[ -~]+$/;
console.log(ascii_print_rx.test("It's all right.")); // true
console.log(ascii_print_rx.test('\f ')); // false, \f is an ASCII form feed char
console.log(ascii_print_rx.test("demásiado tarde")); // false, no Unicode printable char support
// Unicode support
console.log(XRegExp.test('demásiado tarde', XRegExp("^\\PC+$"))); // true
console.log(XRegExp.test('\u200C', XRegExp("^\\PC+$"))); // false, \u200C is a Unicode zero-width non-joiner
console.log(XRegExp.test('\f ', XRegExp("^\\PC+$"))); // false, \f is an ASCII form feed char
<script src="http://cdnjs.cloudflare.com/ajax/libs/xregexp/3.1.1/xregexp-all.min.js"></script>
Solution 3:
For non-Unicode input, use the regex pattern ^[^\x00-\x1F\x80-\x9F]+$
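As a rough illustration of what that pattern accepts and rejects, here is a small sketch. The constant name NO_CONTROLS is made up; the pattern itself is copied unchanged from the line above.

```javascript
// Pattern from the answer: rejects the C0 (0x00-0x1F) and C1 (0x80-0x9F) control ranges.
const NO_CONTROLS = /^[^\x00-\x1F\x80-\x9F]+$/;

console.log(NO_CONTROLS.test("plain text"));   // true
console.log(NO_CONTROLS.test("line1\nline2")); // false, \n is 0x0A
console.log(NO_CONTROLS.test(""));             // false, "+" requires at least one character
console.log(NO_CONTROLS.test("del\x7F"));      // true, 0x7F (DEL) is in neither excluded range
```

Since this is a negated class, anything outside the two excluded ranges passes, including non-ASCII characters. If DEL should be rejected too, \x7F would have to be added to the class.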
If you want to work with unicode, first read Javascript + Unicode regexes.
I would then suggest the regex pattern ^[^\p{Cc}\p{Cf}\p{Zl}\p{Zp}]*$
- \p{Cc} or \p{Control}: an ASCII 0x00..0x1F or Latin-1 0x80..0x9F control character.
- \p{Cf} or \p{Format}: invisible formatting indicator.
- \p{Zl} or \p{Line_Separator}: line separator character U+2028.
- \p{Zp} or \p{Paragraph_Separator}: paragraph separator character U+2029.
For more information see http://www.regular-expressions.info/unicode.html
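The pattern above is written in generic \p{...} notation. As a sketch only, assuming a JavaScript engine that implements ES2018 Unicode property escapes, it can be tried as a native regex literal with the u flag, without an extra library. The constant name PRINTABLE is made up for illustration.

```javascript
// Assumes ES2018 Unicode property escapes; the "u" flag is required for \p{...}.
const PRINTABLE = /^[^\p{Cc}\p{Cf}\p{Zl}\p{Zp}]*$/u;

console.log(PRINTABLE.test("dem\u00e1siado tarde")); // true
console.log(PRINTABLE.test("a\u200Cb")); // false, U+200C is a format character (Cf)
console.log(PRINTABLE.test("a\u2028b")); // false, U+2028 is the line separator (Zl)
console.log(PRINTABLE.test("\f"));       // false, form feed is a control character (Cc)
```

On older engines without property escapes, the regex literal itself is a syntax error, so a library such as XRegExp is still the fallback there.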
Solution 4:
Looks like JavaScript has changed to some degree since this question was posted?
I'm using this one:
var regex = /^[\u0020-\u007e\u00a0-\u00ff]*$/;
console.log( regex.test("!\"#$%&'()*+,-./:;<=>?@[] ^_`{|}~")); //should output "true"
console.log( regex.test("Iñtërnâtiônàlizætiøn")); //should output "true"
console.log( regex.test("☃💩")); //should output "false"