What characters are valid for JavaScript variable names?

I want to create a small "extension library" for my non-JavaScript users here at work (who all seem to be squeamish when it comes to the language). I love how jQuery and Prototype both use the $ dollar sign, and since I use jQuery, I'm looking for another nice single-character symbol to use.

I realize that I could just test out a number of characters, but I'm hoping to narrow down my list of characters to start with (in consideration of future integration with another popular library, perhaps).

To quote Valid JavaScript variable names, my write-up summarizing the relevant spec sections:

An identifier must start with $, _, or any character in the Unicode categories “Uppercase letter (Lu)”, “Lowercase letter (Ll)”, “Titlecase letter (Lt)”, “Modifier letter (Lm)”, “Other letter (Lo)”, or “Letter number (Nl)”.

The rest of the string can contain the same characters, plus any U+200C zero width non-joiner characters, U+200D zero width joiner characters, and characters in the Unicode categories “Non-spacing mark (Mn)”, “Spacing combining mark (Mc)”, “Decimal digit number (Nd)”, or “Connector punctuation (Pc)”.

I’ve also created a tool that will tell you if any string that you enter is a valid JavaScript variable name according to ECMAScript 5.1 and Unicode 6.1:

P.S. To give you an idea of how wrong Anthony Mills' answer is: if you were to summarize all these rules in a single ASCII-only regular expression for JavaScript, it would be 11,236 characters long. Here it is:

// ES5.1 / Unicode 6.1

From the ECMAScript specification in section 7.6 Identifier Names and Identifiers, a valid identifier is defined as:

Identifier :: 
    IdentifierName but not ReservedWord

IdentifierName :: 
    IdentifierStart
    IdentifierName IdentifierPart 

IdentifierStart :: 
    UnicodeLetter
    $
    _
    \ UnicodeEscapeSequence 

IdentifierPart :: 
    IdentifierStart
    UnicodeCombiningMark
    UnicodeDigit
    UnicodeConnectorPunctuation
    \ UnicodeEscapeSequence 

UnicodeLetter
    any character in the Unicode categories “Uppercase letter (Lu)”, “Lowercase letter (Ll)”, “Titlecase letter (Lt)”, 
    “Modifier letter (Lm)”, “Other letter (Lo)”, or “Letter number (Nl)”. 

UnicodeCombiningMark
    any character in the Unicode categories “Non-spacing mark (Mn)” or “Combining spacing mark (Mc)” 

UnicodeDigit
    any character in the Unicode category “Decimal number (Nd)” 

UnicodeConnectorPunctuation
    any character in the Unicode category “Connector punctuation (Pc)” 

UnicodeEscapeSequence
    see 7.8.4. 

HexDigit :: one of 
    0 1 2 3 4 5 6 7 8 9 a b c d e f A B C D E F

which creates a lot of opportunities for naming variables and also in golfing. Let's try some examples.

A valid identifier can start with a UnicodeLetter, $, _, or \ UnicodeEscapeSequence. A Unicode letter is any character from these categories (see all categories):

  • Uppercase letter (Lu)
  • Lowercase letter (Ll)
  • Titlecase letter (Lt)
  • Modifier letter (Lm)
  • Other letter (Lo)
  • Letter number (Nl)

This alone allows some crazy possibilities. Here are working examples; if one doesn't work in every browser, call it a bug, because it should.

var ᾩ = "something";
var ĦĔĽĻŎ = "hello";
var 〱〱〱〱 = "less than? wtf";
var जावास्क्रिप्ट = "javascript"; // ok that's JavaScript in Hindi
var KingGeorgeⅦ = "Roman numerals, awesome!";

Basically, in regular expression form: [a-zA-Z_$][0-9a-zA-Z_$]*. In other words, the first character can be a letter or _ or $, and the other characters can be letters or _ or $ or numbers.
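As a quick check, the pattern above can be anchored and run with RegExp.prototype.test. This is only a minimal sketch: the helper name isAsciiIdentifier is made up for illustration, and it deliberately ignores reserved words and non-ASCII letters.

```javascript
// Anchored version of the ASCII-only pattern from the answer above.
const ASCII_IDENTIFIER = /^[a-zA-Z_$][0-9a-zA-Z_$]*$/;

// Hypothetical helper: true when `name` fits the ASCII-only pattern.
function isAsciiIdentifier(name) {
  return ASCII_IDENTIFIER.test(name);
}

console.log(isAsciiIdentifier("$"));      // true
console.log(isAsciiIdentifier("_lib2"));  // true
console.log(isAsciiIdentifier("2fast"));  // false (starts with a digit)
console.log(isAsciiIdentifier("my-lib")); // false (hyphen is not allowed)
```

For a single-character library name in the ASCII range, this leaves $, _, and the 52 letters.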

Note: While other answers have pointed out that you can use Unicode characters in JavaScript identifiers, the actual question was "What characters should I use for the name of an extension library like jQuery?" This is an answer to that question. You can use Unicode characters in identifiers, but don't do it. Encodings get screwed up all the time. Keep your public identifiers in the 32-126 ASCII range where it's safe.

Before JavaScript 1.5: ^[a-zA-Z_$][0-9a-zA-Z_$]*$

In English: It must start with a dollar sign, an underscore, or one of the letters of the 26-letter alphabet, upper or lower case. Subsequent characters (if any) can be any of those or a decimal digit.

JavaScript 1.5 and later * : ^[\p{L}\p{Nl}$_][\p{L}\p{Nl}$\p{Mn}\p{Mc}\p{Nd}\p{Pc}]*$

This is more difficult to express in English, but it is conceptually similar to the older syntax, with the addition that the letters and digits can be from any language. After the first character, additional underscore-like characters (collectively called “connectors”) and combining marks (“modifiers”) are also allowed. (Other currency symbols are not included in this extended set.)

JavaScript 1.5 and later also allows Unicode escape sequences, provided that the result is a character that would be allowed in the above regular expression.
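A small sketch of what that means in practice. The helper parses is hypothetical; it only asks the engine to parse a snippet via new Function and never runs it.

```javascript
// \u0061 is the escape for "a", so this declares a variable named `abc`.
var \u0061bc = 1;
console.log(abc); // 1

// The escaped and literal spellings refer to the same binding.
ab\u0063 = 2;
console.log(abc); // 2

// Hypothetical helper: true when the engine can parse `source`.
function parses(source) {
  try {
    new Function(source); // parsed only, never called
    return true;
  } catch (e) {
    return false;
  }
}

// \u0031 is the digit 1: fine later in the name, not as the first character.
console.log(parses("var a\\u0031;"));   // true
console.log(parses("var \\u0031abc;")); // false
```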

Identifiers also must not be a current reserved word or one that is considered for future use.
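One way to check a candidate name against the reserved words of the engine you are running is to let the parser decide. This is a rough sketch under assumptions: isUsableName is a made-up helper, it only accepts ASCII-shaped names (so arbitrary strings are never handed to the parser), and it relies on new Function being available, which a strict Content Security Policy may block.

```javascript
// Hypothetical helper: asks the engine to parse `var <name>;` in strict mode.
function isUsableName(name) {
  // Shape check first; non-ASCII names are simply rejected by this sketch.
  if (!/^[a-zA-Z_$][0-9a-zA-Z_$]*$/.test(name)) return false;
  try {
    new Function('"use strict"; var ' + name + ';'); // parsed, never called
    return true;
  } catch (e) {
    return false; // the engine refused the name
  }
}

console.log(isUsableName("$"));     // true
console.log(isUsableName("class")); // false (reserved word)
console.log(isUsableName("enum"));  // false (reserved for future use)
console.log(isUsableName("let"));   // false (reserved in strict mode)
```

Because the check runs in strict mode, it also rejects eval and arguments, which are not reserved words but cannot be declared there.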

There is no practical limit to the length of an identifier. (Browsers vary, but you’ll safely have 1000 characters and probably several more orders of magnitude than that.)

Links to the character categories:

  • Letters: Lu, Ll, Lt, Lm, Lo, Nl
    (combined in the regex above as “L”)
  • Combining marks (“modifiers”): Mn, Mc
  • Digits: Nd
  • Connectors: Pc

*n.b. This Perl-style regex is intended to describe the syntax only. It won't work in JavaScript engines that lack Unicode property escapes, which were only added to the language in ES2018 and require the u flag. (For older engines, there are some third-party packages that claim to add such support.)
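In engines that do implement ES2018 Unicode property escapes (an assumption about your runtime), the same pattern can be tried as written by adding the u flag. A minimal sketch; the constant name UNICODE_IDENTIFIER is made up, and reserved words are not checked.

```javascript
// The pattern from above; the `u` flag enables the \p{...} classes.
// L = all letter categories, Nl, Mn, Mc, Nd, Pc as listed above.
const UNICODE_IDENTIFIER =
  /^[\p{L}\p{Nl}$_][\p{L}\p{Nl}$\p{Mn}\p{Mc}\p{Nd}\p{Pc}]*$/u;

console.log(UNICODE_IDENTIFIER.test("ĦĔĽĻŎ"));       // true
console.log(UNICODE_IDENTIFIER.test("KingGeorgeⅦ")); // true (Ⅶ is a letter number)
console.log(UNICODE_IDENTIFIER.test("_private1"));   // true (_ is connector punctuation)
console.log(UNICODE_IDENTIFIER.test("2fast"));       // false (starts with a digit)
console.log(UNICODE_IDENTIFIER.test("€uro"));        // false (€ is a currency symbol)
```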

JavaScript Variables

You can start a variable with any letter, $, or _ character. As long as it doesn't start with a number, you can include numbers as well.

Start: [a-zA-Z], $, _

Contain: [a-zA-Z], [0-9], $, _


You can use _ for your library so that it will stand side-by-side with jQuery. However, there is a configuration you can set so that jQuery will not use $. It will instead use jQuery. To do this, simply set:


